This is my first significant incursion into the issue of Value-Added Models (VAMs). Since there have been entire books and many research studies devoted to this complex issue, it is difficult to capture the intricacies of VAMs, particularly their role in teacher evaluation, in a shorter blog format. I am going to take a stab here but this will certainly not be my last foray into this topic. I believe it is one of the more important conversations that we must have with our state legislators (and hopefully new state superintendent) over the next twelve months.

Many states and school districts have adopted VAMs as part of their educational accountability systems. The chart below shows the states that have passed legislation seeking to tie student test results to teacher and administrator evaluations.

The goal of these value-added models is to estimate effects of individual teachers or schools on student achievement while accounting for differences in student background. VAMs are increasingly promoted or mandated as a component in high-stakes decisions such as determining compensation, evaluating and ranking teachers, hiring or dismissing teachers, awarding tenure, and closing schools.

I am pleased to announce that it has been a very bad month for value-added models. I am hopeful that the events of the past thirty days are a precursor to future revisions to Oklahoma’s Teacher and Leader Effectiveness Model, specifically the portion tied to quantitative measures.

On April 8th, the American Statistical Association (ASA) issued a cautionary statement about the use of VAMs for assessing teacher effectiveness. In the executive summary, the ASA makes the following recommendations: (emphasis is mine)

* The ASA endorses wise use of data, statistical models, and designed experiments for improving the quality of education.

* VAMs are complex statistical models, and high-level statistical expertise is needed to develop the models and interpret their results.

* Estimates from VAMs should always be accompanied by measures of precision and a discussion of the assumptions and possible limitations of the model. These limitations are particularly relevant if VAMs are used for high-stakes purposes.

* VAMs are generally based on standardized test scores, and do not directly measure potential teacher contributions toward other student outcomes.

* VAMs typically measure correlation, not causation: Effects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.

* Under some conditions, VAM scores and rankings can change substantially when a different model or test is used, and a thorough analysis should be undertaken to evaluate the sensitivity of estimates to different models.

* VAMs should be viewed within the context of quality improvement, which distinguishes aspects of quality that can be attributed to the system from those that can be attributed to individual teachers, teacher preparation programs, or schools. Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality.
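A rough back-of-the-envelope sketch, not part of the ASA statement, shows why a small teacher share of score variability makes individual ratings noisy. It assumes a deliberately simplified setup: total score variance is 1, a teacher's true effect is the same every year, students are assigned at random, and a teacher's "rating" is just the class average. The function name and the class size of 25 are illustrative assumptions, not anything taken from a real VAM.

```python
def expected_year_to_year_correlation(teacher_share: float, class_size: int) -> float:
    """Expected correlation between two years of class-average scores for the same teachers.

    Simplifying assumptions (hypothetical, for illustration only):
    - total student score variance is 1
    - teacher_share of that variance comes from a stable teacher effect
    - everything else is independent student-level noise
    """
    if not 0.0 <= teacher_share <= 1.0:
        raise ValueError("teacher_share must be between 0 and 1")
    if class_size < 1:
        raise ValueError("class_size must be at least 1")
    signal = teacher_share                        # variance of the true teacher effect
    noise = (1.0 - teacher_share) / class_size    # variance of the averaged student noise
    return signal / (signal + noise)


if __name__ == "__main__":
    # Span the 1% to 14% range quoted in the ASA summary.
    for share in (0.01, 0.05, 0.14):
        r = expected_year_to_year_correlation(share, class_size=25)
        print(f"teacher share {share:.0%}: expected correlation {r:.2f}")
```

Under these assumptions the expected correlation runs from about 0.20 when teachers account for 1% of the variability to about 0.80 at 14%. This toy setup is likely optimistic, since it ignores non-random student assignment and classroom-level disruptions, both of which would add further noise to a rating.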

I encourage you to read the entire seven-page summary HERE. The ASA report is just one of several published over the past few years that urge caution on the use of VAMs to make high-stakes decisions about teachers, administrators, and schools. These reports from highly respected research organizations have caused several states to reconsider their assessment plans.

In February, legislators in Washington State repealed the use of quantitative measures in their teacher evaluation system. This decision ran counter to the language in the state’s original federal waiver from NCLB. Last week, Secretary of Education Arne Duncan (HERE) confirmed that his department would not be renewing Washington’s waiver and would be forcing the state to conform to the original requirements of NCLB, a law that Duncan himself called “an educational train wreck.” Rather than allowing Washington to make the decisions it feels are in the best interest of its citizens, Duncan is ordering the state back onto the train.

On the same day as Duncan’s statement, legislators in Tennessee (HERE) also voted to prevent student growth on tests from being used to revoke or not renew a teacher’s license. Duncan has been hailing Tennessee as a demonstration of the “success” of Race to the Top, in which test-based evaluation of teachers is key. What happens now? Will Arne pull their waiver as well or ask for the return of RTTT funding?

Meanwhile, U.S. Department of Education spokeswoman Dorie Nolt confirmed that the department is continuing to work with Michigan and other states with waivers to support their reform work.

“Every state that received flexibility from the prescriptive mandates of No Child Left Behind law – including Michigan – committed to creating a teacher and leader evaluation and support system that uses multiple measures, including student growth based on statewide assessments and other factors such as observations of teacher practice,” Nolt said.

In Michigan this week, the legislature will be debating the implementation of VAM-based teacher evaluations in their state. Nolt commented, “We are concerned that Michigan doesn’t yet have the authority to ensure that districts include student growth based on statewide assessments as one of multiple measures in their teacher and leader evaluation and support systems. We know state lawmakers are working to address this and avoid jeopardizing Michigan’s waiver.”

In Texas, lawsuits have begun over the use of VAMs to reward and punish teachers. On Wednesday (HERE), seven Houston Independent School District teachers and their union filed a lawsuit against the district. In this case, the teachers and their union are focusing on the fact that teacher value-added ratings fluctuated immensely from year to year. For example, one of the plaintiffs, Andy Dewey, a social studies teacher, received ratings in 2012 that were high enough to earn him a bonus. His results the next year dropped significantly. As stated in the article: “If, as VAM supporters hold to be true, teachers have substantial effect on student scores, how can a teacher get it perfectly correct one year, and get it all wrong the next?”

Contrary to what supporters of value-added measures say, even if you set aside the technical and methodological concerns, there is absolutely no evidence that using value-added measures as a part of teacher evaluations has any effect on student learning. There is, however, a great deal of research pointing to potentially harmful, unintended consequences of using standardized tests in any high-stakes manner. Those consequences include:

1. National studies of value-added models (VAM) find that they do not produce stable ratings of teachers. For example, different statistical models (all based on reasonable assumptions) yield different effectiveness scores. Researchers have found that how a teacher is rated changes from class to class, from year to year, and even from test to test. I will elaborate on this point in my next post.

2. There is no evidence that evaluation systems that incorporate student test scores produce gains in student achievement. Student test scores have not been found to be a strong predictor of the quality of teaching as measured by other instruments or approaches.

3. The Oklahoma End-of-Instruction and Grades 3-8 Assessments are designed to evaluate student learning, not teacher effectiveness or student learning growth. Using them to measure either of those is akin to using a meter stick to weigh a person: you might be able to develop a formula that links height and weight, but there will be plenty of error in your calculations.

4. Evaluating teachers of non-tested grades and subjects with school-wide measures is highly inaccurate and unfair. Would it make sense to attribute 35 percent of a reading teacher’s evaluation to the performance of the school’s band and orchestra programs? It doesn’t make sense to reverse this connection either. Likewise, it is not comparable or fair to allow teachers of non-tested subjects like science or art to develop their own pre- and post-assessments to determine their own VAM measure. These teachers control the assessment instrument and can easily manipulate the measure to obtain a predetermined outcome. Just tell the kids to blow off the pretest (no grade) and then use the posttest as your final exam, which has grade implications.

5. Schools will be reluctant to allow students to take advanced classes out of fear that the student’s score might adversely affect teacher and leader evaluations. At Jenks Middle School, we allow many seventh and eighth grade students to enroll in Algebra I or Geometry. By continuing this practice, we risk losing their high scores from the grade-level math assessments. A student may score advanced on the 6th grade math OCCT but only proficient on an Algebra EOI in 7th grade. Will this count against the algebra teacher despite the fact that the student was advanced two years in math?

6. Teachers will be incentivized to avoid students with health issues, students with disabilities, English Language Learners, or students suffering from emotional issues. Research has shown that no model yet developed can adequately account for all of these ongoing factors. Teachers will become much more cognizant of the students they are assigned and will be less likely to accept difficult students.

7. Teachers and administrators will shop for students and classes in order to teach students who are more likely to provide them with desired academic growth and test scores. Teaching will shift to focusing more on “bubble” students or “money” students. These are the students who have been identified as having the most potential for growth. The other students receive less instruction and teacher attention as a result.

8. The relationship between students and teacher will change. Instead of “teacher and student versus the exam,” it will be “teacher versus students’ performance” on the exam. Students who display apathy as a result of personal challenges and lack of support will be avoided.

9. Collaboration among teachers will be replaced by competition. With a “value added” system, an 8th grade math teacher has little incentive to collaborate with her 7th grade colleagues to make sure upcoming students score well on the 7th grade math test, because incoming students with high scores would make her job more challenging. When competition replaces collaboration, every student loses.

10. Training costs for administrators will rise even further as these new TLE components are implemented. Additionally, the added time requirements for teacher evaluations may actually prompt schools to hire additional administrators to comply with the new mandates.

11. Tax dollars will be diverted to outside companies (Battelle for Kids) in the areas of test development, exam security and data analysis. These dollars diverted to testing companies may well range into the tens of millions of dollars statewide at the same time that funding for instruction is cut.

12. Administrative decisions will be made to drop non-tested subjects like art, music, drama, and PE. With a focus on end-of-year testing, there inevitably will be a narrowing of the curriculum as teachers focus more on test preparation and skill-and-drill teaching. Enrichment activities in the arts, music, civics, and other non-tested areas will diminish.

13. There will be a negative effect on morale among teachers and administrators.

14. Teaching will become more didactic and teacher-centered rather than student-centered or 21st century-oriented.

15. There will be increased levels of frustration for students as they are subjected to more and more standardized tests.

16. There will be increased student apathy and boredom due to the disconnect between content relevancy and what is tested.

17. Teachers will continue their exodus from a profession which no longer offers them autonomy, trust, or respect. When the job becomes all about raising test scores instead of nurturing the cognitive and emotional growth of children, it becomes just a job and not the fulfillment of one’s life passion.

18. Potential teachers will choose other careers because it is no longer about teaching content and students they care about; it has become more about playing the game to get high test scores. In some schools and districts, teaching has become programmed and scripted and not creative, engaging and self-fulfilling any more.

19. Holding administrators and teachers accountable for test scores in a current state environment where education funding is below 2009 levels and where state leaders are more focused on cutting taxes than funding essential services violates the Cardinal Rule of Accountability, which states “Hold people accountable for what they control.” Forcing districts to cut remedial services, reduce instructional assistants, and increase class sizes while holding them more accountable for student performance is unethical.

20. A saying often attributed to Albert Einstein puts it best: “Not everything that counts can be counted, and not everything that can be counted counts.” VAMs ignore the significant impact that teachers have on students that can never be measured with a standardized assessment.

If you are like me and have limited time to delve into all of the studies out there on value-added models, I encourage you to read THIS ONE. It is truly outstanding and is written in a manner that can be easily understood by educators and policy makers.

The study is entitled: “Reliability and Validity of Inferences About Teachers Based on Student Test Scores” and was presented by Dr. Edward H. Haertel of Stanford University to the Educational Testing Service (ETS) during their 2013 William Angoff Lecture Series. His review is insightful, comprehensive, and balanced.

People who still support high-stakes accountability and the use of VAMs will continue to ignore or minimize any objections to their use. The massive increase in testing and its use for high-stakes personnel decisions under federal and state policy is negatively impacting our schools, classrooms, students, teachers, and parents. The question becomes, at what point are policymakers going to realize the damage being done to public education?

I would love to see our legislators grow a backbone next year and move to permanently repeal this federal intrusion into how we evaluate our educators. VAM is junk science and should NEVER be used to make high-stakes decisions about our teachers and leaders.

Arne Duncan and his Department of Education have overstepped their constitutional authority and are using OUR own tax dollars to force our compliance with their harmful mandates. It will be up to each of us to add our voice to this issue early next year. This is a fight that we must win.

If we do not, be prepared to buy one of these t-shirts.

