Delivering effective education during a shutdown: Reflections on the experience of Ranking and Grading
In the first year of my twenty years of teaching, I worked with my Head of Department on creating a set of mock exams from a range of past papers. We used indicative marks based on exam board grade boundaries common at the time to assign grades; these gave predictions well below the grades those students went on to achieve in the final exams.
To help me give more relevant feedback to students, parents and the school, I started a spreadsheet which collated performance in the mock against actual outcomes, and continued to maintain this data over 19 years.
I modified the test as specifications came and went, but the basis and indeed some of the content remained constant, including multiple choice questions which gave good diagnostic indication of student understanding.
As a result I could discuss performance and likely outcome with students with some confidence and a defined basis for my expectation. It was also very useful, when parents questioned predicted grades, to be able to qualify the prediction on the basis of (for example) ten years’ experience, with students achieving a particular mark in the test generally achieving a particular grade in the final exam.
This approach was also the basis for any end of year ranking, using an assessment with a provenance that was understood by me and by my department, identifying both common misconceptions and exceptional responses.
Although it is not possible to create this scenario retrospectively, there may be end of topic tests or other activities which have existed from year to year and could be used to inform ranking and grade prediction in centres this year. For example, my mentor on my first PGCE placement, who loved statistics and analysis, reviewed all the GCSE end of topic tests and identified that one of the biology tests consistently had a strong correlation with GCSE outcomes!
Without this, a past paper or other similar test can be used for ranking, but the issue that follows is how to assign appropriate grades. Using exam board 2019 papers with the associated mark scheme is one way to allocate grades. Areas of caution are firstly in marking, where teachers should mark exactly to the mark scheme and be cautious in any allocation of benefit of the doubt or interpretation, and secondly in considering what enhancement occurs between the mock and the actual exam. This is more challenging as it may vary from pupil to pupil, but in my experience a half grade enhancement gave a reasonable estimate, and the use of a fixed value, rather than attempting individual student judgement, avoids potential bias or inconsistency.
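One way to realise the fixed half-grade enhancement is to lower each grade boundary to the midpoint between it and the boundary below, so a mock mark half a grade short of a boundary still earns that grade. The sketch below shows this; the boundary values are invented for illustration and are not real 2019 boundaries.

```python
# Sketch of a fixed half-grade enhancement: every boundary is lowered
# to the midpoint between it and the boundary below, so the same uplift
# applies to all students. Boundary marks below are hypothetical.

boundaries = {1: 9, 2: 26, 3: 44, 4: 62, 5: 80, 6: 99, 7: 118, 8: 135, 9: 152}

def predicted_grade(mark, boundaries):
    """Return the predicted grade for a mock mark, with half-grade uplift."""
    grades = sorted(boundaries)          # grades 1..9, ascending
    lowered = {}
    prev = 0
    for g in grades:
        b = boundaries[g]
        lowered[g] = b - (b - prev) / 2  # midpoint with the boundary below
        prev = b
    for g in reversed(grades):
        if mark >= lowered[g]:
            return g
    return 0                             # below grade 1: ungraded

# A mock mark of 110 falls short of the (hypothetical) grade 7 boundary of
# 118, but is within half a grade of it, so the prediction is a 7.
print(predicted_grade(110, boundaries))  # → 7
```

Because the same rule is applied to every student, the uplift cannot be accused of favouring individuals, which is the point made above about avoiding bias or inconsistency.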
A word of caution also in using past paper questions is to be aware of the differences in level of demand for questions, as identified in a blog I wrote whilst working for OCR. It is possible to construct tests which on first use give particularly high or low marks depending on the topics and associated level of demand. Therefore, for standardisation without historic data I suggest using any past paper in its entirety.
For ranking on tiered papers at both key stage 3 and 4, as a department we created mock exams with a greater proportion of common questions than in the actual assessments. With around 50% of the marks common to all candidates, we used these in isolation to rank the entire cohort. The full test marks were then used for higher and lower tier groups to give greater discrimination at those higher and lower grades. Similarly, the common questions in GCSE assessments in any year can be used to inform the boundary between foundation and higher and clarify appropriate comparable grades.
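The two-stage ordering described here can be sketched as a simple sort: the common-question mark places every candidate in a single cohort rank, and the full-paper mark then discriminates between candidates who are level on the common questions. The student records below are hypothetical.

```python
# Sketch: ranking a tiered cohort on the marks common to both papers,
# with the full-paper mark as the discriminator between students who
# score the same on the common questions. All records are hypothetical.

students = [
    {"name": "A", "tier": "higher",     "common": 38, "full": 71},
    {"name": "B", "tier": "foundation", "common": 31, "full": 55},
    {"name": "C", "tier": "higher",     "common": 31, "full": 64},
    {"name": "D", "tier": "foundation", "common": 24, "full": 49},
]

# Primary key: common-question mark (descending), so the whole cohort
# is comparable; secondary key: full-paper mark, for finer discrimination.
ranked = sorted(students, key=lambda s: (-s["common"], -s["full"]))
print([s["name"] for s in ranked])  # → ['A', 'C', 'B', 'D']
```

In practice the full-paper discrimination would be applied within each tier group, as described above, rather than across the whole list; the sketch simply shows the principle of the two-level key.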
In the current situation the strength of the submission, both of rank and grade, is underpinned by a process of teacher judgement based on identifiable data. Identify the evidence you have, even if in your opinion it feels less than ideal, identify the process of converting evidence to outcomes, and follow that process. This will support teachers both in answering any questions raised by students and parents and should the exam board request information in support of the submitted grades. As teachers we often underestimate the value of the information we have; we hold a notion of the ideal and of how, if only we had more data, we could make even better judgements.
The challenge here, particularly in a new centre, or with a significant number of new staff, is in making the most of the information which is available. It could be useful and affirming to obtain external support to review evidence and to give assurance on judgements made.
Research shows that teachers are very good at making comparable judgements for groups of students to create a rank order. Reflecting on my own experience, the challenge as Head of Department was in aligning the rankings of different teachers, and, for combined science, in agreeing the relative weighting of tests which may not have a comparable level of demand: one subject's test marks could be much higher than another's, skewing the ranking in favour of those skilled in a particular science. The notion of “level-ness” was used at the time of the National Strategies, and considered in the use of descriptors and on the basis of teacher experience. The challenge is to retain comparable outcomes based on data rather than anticipation of potential.
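One common way to stop a generously marked subject from skewing a combined ranking, offered here as a sketch rather than the method I used at the time, is to standardise each subject's marks to z-scores before adding them, so each subject contributes on the same scale. The marks below are hypothetical.

```python
# Sketch: standardising each subject's marks to z-scores before
# combining, so a subject whose raw marks run high does not dominate
# the combined-science ranking. Marks below are hypothetical.

from statistics import mean, pstdev

def zscores(marks):
    """Rescale marks to mean 0 and standard deviation 1."""
    m, s = mean(marks), pstdev(marks)
    return [(x - m) / s for x in marks]

biology = [62, 70, 78, 86]   # hypothetical: this paper marks generously
physics = [31, 35, 39, 43]   # hypothetical: this paper marks harshly

# After standardising, each subject contributes equally to the total,
# even though the raw biology marks are roughly double the physics marks.
combined = [b + p for b, p in zip(zscores(biology), zscores(physics))]
print([round(c, 2) for c in combined])
```

Ranking on the combined z-score rewards being strong relative to the cohort in each science, which is the comparability the paragraph above is after.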
This is the big shift for students to accept this year: that their outcome is based on their work to date, rather than, for example, the promise of exceptional attention to revision leading to a significantly higher grade. Similarly, the implication is that centres will be judged on what they have demonstrably achieved previously. Concerns about changes in cohort or teaching and learning should not be an issue if the grades submitted are based on dependable data. Any positive change in pedagogy should be evident in the comparative test data from year to year, and as such should not require additional justification. The same should be true for higher attaining cohorts, who should have higher marks year on year for the same tests or assignments.
From my reading and understanding of the Ofqual documentation there is a strong steer for the use of objective evidence as the basis for judgement, and if the process of reaching the judgement is also clearly documented, centres should feel secure that the final outcomes will be appropriate and comparable.
Neil Wade is a consultant in science education and assessment and a member of the ASE 11-19 committee. His previous roles have included Head of Science & Assistant Head at Benjamin Britten High School, Lowestoft as well as Lead Subject Advisor (STEM) at OCR.