Colleges should gather more data to equitably assess and grade students in the future (opinion)
The year 2020 will be remembered as the year of COVID-19. Not just historically, but also officially — on transcripts. Institutions of higher education initially responded similarly to the pandemic by closing their doors and switching to remote learning. But their grading policies, hurriedly made in response to the crisis, varied wildly. That became the source of the nickname “the asterisk semester,” which slowly became the asterisk year.
Now, as the pandemic blasts past the year mark, students are still fighting for more lenient grading with mixed success. Some colleges are offering relief in the form of flexible grading schemes. Others assert that these do more harm than good. All seem to be struggling to make fair policies with little hard evidence to back them up.
The result is that divergent grading policies across institutions will confuse recruiters trying to compare applicants accurately. To prevent more asterisk years, especially as we move into 2021 and the upcoming semester, colleges and universities should create contingency plans, preferably data-driven ones. The key issue that such a policy should address is that remote learning poses different challenges for each student, making it especially difficult to quantify “fairness” when it comes to student assessment.
For example, last spring, some institutions, like Columbia University in New York City, decided to make all courses pass-fail. There are currently several variations on that approach across the country, and while different colleges have called it by different names (credit-no credit, pass-fail, satisfactory-unsatisfactory and so on), they mean essentially the same thing: students who meet expectations get credit, others don’t. Either way, the grade does not affect a student’s GPA.
However, pass-fail grading can be frustrating for some students. A GPA-neutral pass does not help students who need to raise their GPA for scholarships or other reasons, and some opponents contend that mandatory pass-fail will, in fact, harm those students. A fair middle ground seems to be letting students choose whether to receive a letter grade or a pass-fail grade.
“I don’t think that there is really an ideal or one-size-fits-all solution. Giving students choice, I think is maybe closest to the best because it does allow individual students based on their individual circumstances to choose,” Erin Hardin, director of undergraduate studies and associate head of the psychology department at the University of Tennessee in Knoxville, told me in an interview in March. The University of Tennessee offers A through F grades for most courses, with the option to switch to a three-tiered system of satisfactory-credit-no credit.
Preferably, colleges and universities would base their policies on evidence. However, it’s difficult to find research on how grading systems affect student learning. “What would that evidence look like?” Hardin asks. “We can measure things like motivation and actual grades and objective learning, and then if we move to pass-fail … we can measure those things again and compare.” But most institutions hadn’t had time to run these studies before the end of the semester; they had to rely only on what little evidence they already had.
Chris McMorran and Kiruthika Ragupathi from the National University of Singapore have conducted one such study. They looked at how students and faculty were affected by a “gradeless” semester for incoming students in 2014. After surveying more than 3,000 students and 500 faculty, they found that both students and faculty had issues with motivation and “poor learning behaviors and attitudes.” That was despite both groups favoring the idea of gradeless learning. Part of the problem seems to be historical. The authors note that one gradeless semester “cannot easily undo decades of grade-centric thinking.”
When I asked about the origin of the grading system at the University of Tennessee, Robert Hinde, vice provost for academic affairs, described to me a process in which the university built its pandemic-response grading system by modifying the existing one to maximize flexibility. But if institutions everywhere took that approach, we’d see a flood of distinct grading systems, which might cause problems. Future graduate admissions offices would be burdened with matching a particular grade to its meaning from a particular university. “There are going to be three, four or five years’ worth of students who have gone through this extreme experience,” Hinde says. His advice to admissions offices is to “try to understand why an institution made the decisions they did about implementing a modified grading policy.” Natalie Campbell, who was the student body president when lockdowns started, agreed, saying, “Admissions offices just have to understand that this semester is not an accurate reflection of a student’s performance.”
Even when students choose their grading scheme, issues can arise. For example, at the University of Tennessee, students had to decide which system they wanted during a 10-day window in April, although final grades weren’t posted until May. Other institutions have allowed requests for letter grades even after final scores were released. Such discrepancies don’t sit well with some people.
“It just strikes me as unfair,” says Allison Stanger, visiting professor of government studies at Harvard University. She argues in an opinion piece that it’s unrealistic “to believe that the playing field is level under these circumstances.” Even if student performance is reduced, she thinks a mandatory pass-fail policy is the most inclusive option. “What you’re really dividing it up into is engaged and not engaged,” she says. And in response to arguments that pass-fail reduces student engagement, she says, “I’ve seen no drop-off in motivation … I’m seeing students do extraordinary things … I’m really quite impressed at how my students have risen to the challenge of adapting to these new circumstances from a variety of different backgrounds.”
The Need for Evidence-Based Policies
In any case, administrators are looking forward to acquiring data — for example, how many students chose satisfactory-credit-no credit grading and how many decided to stick with A-F grading — in order to make at least somewhat better-informed decisions about grading policies in the coming semester. At institutions such as the University of Tennessee where students can choose, it will be particularly useful to learn what percentage of students chose to keep the default grading scheme and how their choice affected their GPA. For example, if more GPAs fell than rose, then colleges and universities might be able to use the difference to estimate the percentage of their students who were meaningfully disrupted by online learning. This “percentage disrupted” estimate will be different for different institutions and would be a useful statistic for policy decisions.
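The “percentage disrupted” idea above can be made concrete with data institutions already hold. As a hedged sketch, the function below compares each student’s pre-pandemic GPA with their remote-semester GPA and counts meaningful drops; the threshold and all GPA figures are invented for illustration, not real institutional data.

```python
# Hypothetical sketch: estimating the share of students meaningfully
# disrupted by remote learning from GPA records an institution already has.
# The threshold and all GPA values below are invented for illustration.

def percent_disrupted(gpas_before, gpas_after, threshold=0.0):
    """Fraction of students whose remote-semester GPA fell by more than
    `threshold` points relative to their pre-pandemic GPA."""
    assert len(gpas_before) == len(gpas_after)
    dropped = sum(1 for b, a in zip(gpas_before, gpas_after)
                  if b - a > threshold)
    return dropped / len(gpas_before)

# Invented example: five students' GPAs before and during remote learning.
before = [3.2, 3.8, 2.9, 3.5, 3.1]
during = [2.8, 3.8, 2.4, 3.6, 3.0]
print(percent_disrupted(before, during, threshold=0.2))  # 0.4
```

Choosing the drop threshold is itself a policy decision: too low and normal semester-to-semester variation counts as disruption, too high and genuinely struggling students are missed.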
In Stanger’s words, “It really makes you reflect on what we’re doing and why we’re doing it. If the faculty see this as a learning experience and an opportunity to rethink the things we’re not doing so well and think about how we could do them better, it could really be a game changer for higher education.” With the right information, institutions would be able to make better, evidence-based policies and contingency plans before the next crisis.
So what would — or should — that evidence look like? As mentioned previously, motivation and engagement can be measured, for example through surveys, but developing, deploying and analyzing surveys can be a long process. It would be better to use data that colleges and universities already collect. Institutions should then turn the claims made for and against pass-fail grading into hypotheses and test them.
Here are some examples.
- Hypothesis 1: Pass-fail grading provides a safety net for students’ GPAs. Does it? What is the difference in GPA between the last two semesters and previous cohorts?
- Hypothesis 2: Pass-fail grading makes students less competitive on the job market. Does it? What is the unemployment rate for recent graduates? Of course, unemployment is up due to the pandemic, but these data could be presented as a fraction of the national unemployment rate. Then how does that fraction compare to previous cohorts?
- Hypothesis 3: Pass-fail is more inclusive. In one sense this is trivially true if even a single student has trouble completing coursework, but how much more inclusive is it? What is the scale of the problem? How does student performance on standardized exams compare to previous semesters? And within that distribution, are all students affected, or just some? The answer could be the difference between a blanket pass-fail policy and a case-by-case one.
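Testing a claim like Hypothesis 1 need not be elaborate. As a minimal sketch, assuming a registrar can export term GPAs for two cohorts, the snippet below compares a pandemic-era cohort against a prior one using only the standard library; the cohort GPAs are invented for illustration.

```python
# Hedged sketch: turning Hypothesis 1 ("pass-fail is a GPA safety net")
# into a simple before/after comparison. Cohort GPAs are invented.
from statistics import mean, stdev

def gpa_shift(cohort_prev, cohort_pandemic):
    """Mean GPA difference between a pandemic-era cohort and a prior
    cohort, plus a crude standardized effect size."""
    diff = mean(cohort_pandemic) - mean(cohort_prev)
    pooled = (stdev(cohort_prev) + stdev(cohort_pandemic)) / 2
    return diff, diff / pooled

prev = [3.1, 3.4, 2.8, 3.6, 3.0, 3.3]       # pre-pandemic cohort
pandemic = [3.3, 3.5, 3.1, 3.7, 3.2, 3.4]   # optional pass-fail cohort
shift, effect = gpa_shift(prev, pandemic)
print(f"mean shift: {shift:+.2f} GPA points (effect size {effect:.2f})")
```

A positive shift would be consistent with the safety-net claim, though on real data a proper significance test and controls for cohort composition would be needed before drawing conclusions.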
At present, decisions about grading policies seem to be guided by logic and intuition but very little data. This is the pandemic that caught us by surprise. We should collect the data to inform these policies so we don’t get caught off guard again.
Justin Westerfield is a former teaching assistant at the University of Tennessee, Knoxville.