2018 TEP Statement on Student Evaluations of Teaching
UO student evaluations of teaching (SET) recently changed in substantive ways. These changes are described in detail on the Teaching Evaluation pages. Part of the rationale that spurred the Office of the Provost and the University Senate to support these changes is preserved below. This historical information may be particularly useful for faculty who are less familiar with research on SETs that indicates a lack of correlation between SET scores and student learning, and that documents the impact of bias on SET scores.
The University of Oregon, led by the University Senate and Provost’s office, values student feedback on their learning experiences as part of the new Continuous Improvement and Evaluation of Teaching System. The system aligns evidence from multiple sources (student experience surveys, instructor reflections, and peer reviews) to teaching quality categories defined by the August 2019 MOU on teaching evaluation (Professional, Inclusive, Engaged, and Research-Informed) to mitigate bias and ensure evaluation supports the development of UO’s teaching culture.
Student Evaluation of Teaching
Students are invited to complete student evaluations of teaching (SETs) for their classes at the end of each term at the University of Oregon. In principle, SETs have a dual purpose. Faculty can use SET results to help them identify areas of their teaching that need attention and improvement; that is, they have a formative purpose. SETs also have a summative purpose: they are used to inform evaluations of a faculty member’s teaching as part of decisions about tenure and promotion, contract renewal, and merit raises.
The latter purpose, especially, relies on the assumption that SETs are a valid measure of teaching effectiveness (assumed to be related to student learning). The research literature on SETs is extensive and stretches back nearly 100 years, but over that time little consensus has emerged about whether there is in fact a correlation between SET ratings and student learning, or even how one should measure student learning.
Many, but not all, studies show a modest positive correlation between SET results and student learning. However, recent work, including a careful meta-analysis of previous results, indicates that there is no correlation between SET ratings and student learning after controlling for sample size and publication bias.
Other problems arise as well. For example, there are indications that students often do not interpret questions and terminology on SETs in the same way faculty do, so care must be taken with the wording of questions and the interpretation of results. Persistent questions also remain regarding students' ability to assess teaching effectiveness, the use of SETs to compare faculty in the absence of information about the spread of scores within a relevant group of faculty, and whether student response rates on non-mandatory SETs accurately reflect the true distribution of student opinion. In addition, there is evidence that SET scores vary depending on class size, the level of the class, the discipline, and the prior preparation of the students.
Most disturbing, though, are results indicating that SETs show bias by gender, race, and ethnicity, with women, African-American, and Latino faculty receiving lower scores on SETs than their white male colleagues.
While there is debate about the validity, utility, and fairness of SETs, there is agreement in the research literature that if they are used at all, SETs should be only one of several tools used to assess teaching. Peer reviews, self-evaluations, administrator reviews, student interviews, and alumni ratings are alternative strategies that can be combined to create a more representative picture of a faculty member's teaching.