Since it was introduced in the 1800s, standardised testing in Australian schools has attracted controversy and divided opinion. In this series, we examine its pros and cons, including appropriate uses for standardised tests and which students are disadvantaged by them.
In recent years, we have seen a global surge in standardised testing as nations attempt to improve student outcomes. Rich nations, as well as many middle- and low-income nations, have participated in international assessments such as the Programme for International Student Assessment (PISA), and also developed their own national standardised assessments. But can such assessments improve student outcomes?
Information from standardised tests is too limited to improve outcomes
The National Assessment Program – Literacy and Numeracy (NAPLAN) was introduced in Australia in 2008. It is a standardised test administered annually to all Australian students in Years 3, 5, 7 and 9. These tests are supposed to perform two functions: provide information to develop better schooling policies, and provide teachers with information to improve student outcomes.
However, a decade on and many millions of dollars later, student outcomes on NAPLAN have shown little improvement. Australia’s performance on international assessments such as PISA has actually fallen over these years. Standardised testing has not produced a positive effect on student learning outcomes.
Supporters of standardised testing see NAPLAN as necessary to know which schools and school systems are doing well and which ones are not. It is undoubtedly useful to know if certain parts of the country (such as regional or rural areas), or certain student populations (for example, students with an immigrant or low-SES background), are underperforming. Such information is also crucial when it comes to arguing for resource redistribution, as we see in debates about Gonski.
However, there are clear limits to what NAPLAN can tell us. While it helps us understand schooling at the system level, the information gained from NAPLAN about individual students, classrooms and schools is too limited and error-prone to be of use.
For instance, there is a limit to the number of questions NAPLAN can ask to assess a particular student’s skill or understanding. It may determine that a student cannot perform addition using “carrying over” based on their performance on just one or two such items on the 40-item test. With so few items per skill, a single slip or lucky guess can change the verdict, so the error margins on these judgements are very high.
Such errors may be neutralised at a system level, when the test is administered to a sufficiently large sample of students, but when used at the level of individual students, classrooms or schools, NAPLAN assessment data is seriously flawed.
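To make the contrast between individual and system-level error concrete, here is a rough sketch using the textbook standard error of a proportion. It is an illustration only, not NAPLAN's actual scoring model: the function name `standard_error`, the 70% success rate and the figure of 10,000 students are all hypothetical, and the sketch assumes every response is an independent trial with the same chance of success.

```python
import math


def standard_error(p: float, n_items: int, n_students: int = 1) -> float:
    """Standard error of an observed proportion correct.

    Assumes each response is an independent Bernoulli trial with the
    same success probability p. This is a simplification for
    illustration, not the model NAPLAN uses.
    """
    return math.sqrt(p * (1 - p) / (n_items * n_students))


# Hypothetical chance of answering a "carrying over" item correctly.
P_TRUE = 0.7

# One student judged on two items of this type.
one_student = standard_error(P_TRUE, n_items=2)

# The same two items averaged over a hypothetical 10,000 students.
system_wide = standard_error(P_TRUE, n_items=2, n_students=10_000)

print(f"one student, 2 items:     +/- {one_student:.3f}")
print(f"10,000 students, 2 items: +/- {system_wide:.4f}")
```

Under these assumptions, the uncertainty for a single student is about 0.32 on a 0-to-1 scale, which is too wide to say much about that student, while the uncertainty for the 10,000-student average is about 0.003, a hundred times smaller.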
Assessment versus standardised testing
Assessment is integral to the teaching process and occurs almost constantly in good classrooms. Teachers have a range of assessment techniques, including questioning during the course of a lesson, setting assignments, using data from standardised testing, and developing more formal exams. These different assessment techniques fulfil a variety of different purposes: diagnosing student knowledge, shaping student learning and assessing what has been learned.
Increasingly, teachers are encouraged to individualise their teaching in order to accommodate the needs of individual students. This focus on “inclusion” extends to assessment, and teachers are expected to provide a variety of formats and opportunities for students to demonstrate their learning. Education policy statements, such as the 2008 Melbourne Declaration on Educational Goals for Young Australians, emphasise valuing student diversity.
Standardised assessments, on the other hand, assume that particular levels of achievement are expected of certain ages or year levels. Students are then classified as meeting, exceeding or being below these expectations. This flies in the face of the realities that teachers observe daily in their classrooms: students do not present themselves as “standardised” humans.
Some Year 9 students perform at the same level as some Year 5, and possibly some Year 3, students.
By this logic, the notion of providing a standardised NAPLAN test for all Year 3, 5, 7 and 9 students is inappropriate.
Teachers who see their students all year long will always have a deeper knowledge of their students than point-in-time standardised tests can offer. Teachers can make better, more nuanced, more useful and more timely assessments of their students. They may choose to include standardised assessments in the suite of approaches they use, but NAPLAN alone should not be privileged over teacher assessments.
Despite this, enormous amounts of money and time have been spent training teachers to use NAPLAN results to inform their teaching. This not only provides an unnecessary and misleading distraction to already over-burdened teachers but also undermines their own professional knowledge and judgement.
Stepping up accountability doesn’t necessarily translate to better outcomes
One of the goals of NAPLAN was to enhance accountability. By judging all schools on the same measure, comparing schools with similar populations, and then making these comparisons public, it was expected that all schools would lift their game.
This strategy assumed that schools could improve but were choosing not to, and that the inducement of market logics (such as school choice) would motivate all schools to do better. It also ignored the many out-of-school factors, such as poverty and geography, that affect the ability of teachers and schools to improve student outcomes.
The other logic was that schools that performed worse could learn from schools that were doing better. Besides minimising the importance of local factors to student learning and suggesting there are universal “silver bullets”, setting schools in competition with one another hardly provides incentives for better performing schools to share their knowledge.
Blame alone is not the answer
Accountability is important and standardised testing can inform policies and improve accountability. But to function as an instrument of accountability, these tests should not be high-stakes, high-stress or high-visibility, particularly since they are so error-prone at the student, classroom and school levels.
The use of sample-based tests, such as the United States’ National Assessment of Educational Progress (NAEP), may instead provide useful information by state and territory, as well as by categories such as social capital, ethnicity and gender. This information could highlight problematic areas, and trigger closer and more targeted explorations.
To get this type of information, the tests need not be conducted every year, since effects of any reforms are seldom evident in one year. The error margins also make year-on-year comparisons of limited value. Sample-based tests will also remove the pressures placed on schools and students, which have proven so detrimental.
As recent NAPLAN results have shown, “blame and shame” alone does not improve student learning. Indeed, focusing solely on NAPLAN scores distracts from broader efforts to provide teachers, schools and school systems with the support needed to ensure all students are given the best chance to learn and succeed.
To date, NAPLAN has largely been used by politicians and the education system to hold teachers and schools accountable. But accountability can work both ways. If NAPLAN is to be used, we should also use it to hold the education system and politicians accountable for the resources and funding they provide to schools and to the local communities they serve. Perhaps then we would see some real and sustained improvements in student outcomes.
The authors do not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and have disclosed no relevant affiliations beyond their academic appointment.