
1/24/12

Teaching Is An Art (So Is Lawyering). VAM Can't Work: Updated

I came across a quote today from Sandra Day O'Connor:
"Attorney errors come in an infinite variety and are as likely to be
utterly harmless in a particular case as they are to be prejudicial. They
cannot be classified according to likelihood of causing prejudice. Nor can
they be defined with sufficient precision to inform defense attorneys
correctly just what conduct to avoid. Representation is an art, and an act
or omission that is unprofessional in one case may be sound or even
brilliant in another."
Yeah, teaching too. VAM can't work. It just can't.

Update: So I admit a friend and I are having an email exchange about this science vs. art thing. Here is why I say teaching --and lawyering and doctoring-- is more art than science:
Lawyers, doctors, teachers all have basic knowledge without which they could not practice the art. These professions are art in the same way jazz is art--it requires knowledge of music, but then you get to be creative.

When 2 or 3 different people could perform the job differently and still end up with a good expected outcome, that implies there is more than science to it; there must be art.

My surgery required dumping my guts onto the table. I am sure there are a few ways to do that--some good, some bad--with different outcomes depending on which is chosen.

Same with trying a case, or teaching a concept, or sewing up my gut.
Update II: Here is the rest of the email exchange. My friend, referred to below as "The Law," is a lawyer. I have summarized her emails down to just the pertinent questions I am responding to.


TFT:
The science part of teaching is understanding how kids learn, not the subject matter (though in the upper grades subject-matter knowledge is clearly crucial, it's still not the sole science part). And how kids learn varies; science has a hard time pinning much down in this domain, leaving it to art and the situational awareness that comes with practice.

Aren't the best trial lawyers performance artists as well as highly knowledgeable about precedents, torts, and whatever else you lawyers have to know--the stuff you learn in law school and then promptly realize wasn't all that helpful, because the only way to get good at trial lawyering is to do it? And we measure trial lawyers by wins and losses, right? Not by their actual performance in the courtroom. Right? And surgeons are rated on survival rates, not on procedure--unless the outcome was bad, and then the procedures get looked at, right? All this sounds like teaching--we look at outcomes. Except that for teachers, as for family doctors, much of the outcome depends on things they don't control--diet, homework, and the rest.

You can't measure art, really, can you? I mean, perhaps in the most rudimentary way--painters should use paint and understand something about form, shadow, line, and all that stuff (the science of the art), but one person's art is another person's garbage, right?

Art certainly isn't VAMable, I don't think.

Can we measure my progress by looking at (name redacted) [a middle class, white, gifted student who loved my class and was challenged, and who was tender to the Hispanic student. Sweet.]? Or should we look at (name redacted) [a Hispanic student whose father was in jail, who was homeless off and on during the year, and who scored poorly but whose attitude toward life seemed to improve in my class], whose life was basically devastated from birth? [Middle class student] would have advanced without me. [Hispanic student] didn't advance much, but his sense of self, I think, got better in my class. Can we measure [Hispanic student]'s sense of self? I don't think so.

I think teaching is a lot like the 1984 case you write about--it's a judgment call reserved for those in charge--professional judgment. There is no standard we can measure against, so we have to measure against what professionals have gleaned over their years of practicing their art.

Perhaps my use of Art and Science is too broad, but I don't know how else to separate the two domains. I also think that there are fewer rules for teachers than for other professions. Teaching is more like being a therapist than a doctor or lawyer. There are standards of care, policies about privacy and pedagogy (therapy), but each patient (class) is different and will be taught (therapized) differently. In both cases the professional is steeped in the science underpinning their profession, but the actual doing of it seems more like art--the thing the science-knowledge frees you to do.

How's that?
The Law:
Your first sentence answers one of my original questions: I wanted to know whether there was a science to the teaching itself, as opposed to the subject matter.
TFT:
VAM can't control for family attributes (SES). Of the factors that affect a child's ability to learn (more accurately, to do well on a test, which is NOT an accurate measure of the child's true ability), most knowledgeable folks say only about 10 to 30% come from school; the rest come from home, as [made obvious] by [Hsp student] and [MC student], among others.

The test--the high-stakes test at the end of the year--is what VAM uses. That fact alone makes VAM useless, as one test on one day does not accurately reflect much of anything about the teacher or the student. I suppose that if the whole class did incredibly well, or badly, one could generalize about the teacher. But that's obvious. The trouble comes when VAM is used to differentiate between teachers who, on the whole, are relatively similar. VAM does not have the power to do it--it's too prone to error. It is not a usable measure, because the variables can't be controlled the way industry controls its inputs (materials/students).

Reformers would have you believe that there is a science to teaching (pedagogy) and charters have figured it out. And that's bullshit. Charters have figured out how to control inputs. There is no science of pedagogy, really. That's my argument--pedagogy is an art. Teaching is an art. Sure, it has some science behind it--brain development, motor development, some stable psychological concepts, but for the most part, it's art.

So the reformers abuse science's power by giving it more authority than it deserves in this domain, and they belittle the art of teaching by scripting teachers with curricula that claim to be research-based (science) when they aren't, cuz there ain't no science they can actually point to; the "research" is usually not actual research but a working paper from the publisher or a CMO-funded meta-review. Remember, Everyday Math is "research based," but most mathematicians pillory it for its stupidity. It was pushed through after the board of the What Works Clearinghouse was packed.

The actual research performed over the past 70 years shows, unequivocally, that home factors make or break a kid. Not teachers. Not schools. Not curricula. Home is where the issues are. And that is where poverty lives.

The reform movement uses bullshit disguised as science (the NYT article on that latest "study" being a perfect example). They can't acknowledge poverty, because that would undercut their scheme: the claim that they know how to save kids with a new pedagogy supposedly on display in their high-performing charters. Except few of them perform well, and the ones that do control their inputs. Ask KIPP, Aspire, HCZ, HSA and the rest. They've all been in trouble for gaming their inputs.

How's that?
The Law:
Or is good teaching like pornography -- I know it when I see it?
TFT:
Yes. It's exactly like pornography--you know it when you see it. Seriously. Like your lawyer scenario. Porn, teaching, lawyering--non-VAMable.



Activists, educators and academics you should be aware of include:

Dr. Diane Ravitch
Dr. Deborah Meier
Dr. Stephen Krashen
Dr. Shaun Johnson
Anthony Cody
Leonie Haimson
Matt Damon
Jon Stewart
P.L. Thomas



Here are some links to experts. Some are a bit long, but you can and should read them!

--Richard Rothstein looks at "An Overemphasis on Teachers"

--and Rothstein again, with others:
Narrowing the Achievement Gap for Low-Income Children: A 19-Year Life Cycle Approach

By Richard Rothstein, Tamara Wilder and Whitney C. Allgood | 2008

--One and another by Jim Horn (of Cambridge College) on VAM.

5/31/11

Value Added Measures: Fail (Shanker Blog): Updated

Much of the criticism of value-added (VA) focuses on systematic bias, such as that stemming from non-random classroom assignment (also here). But the truth is that most of the imprecision of value-added estimates stems from random error. Months ago, I lamented the fact that most states and districts incorporating value-added estimates into their teacher evaluations were not making any effort to account for this error. Everyone knows that there is a great deal of imprecision in value-added ratings, but few policymakers seem to realize that there are relatively easy ways to mitigate the problem.

This is the height of foolishness. Policy is details. The manner in which one uses value-added estimates is just as important – perhaps even more so – than the properties of the models themselves. By ignoring error when incorporating these estimates into evaluation systems, policymakers virtually guarantee that most teachers will receive incorrect ratings. Let me explain.

Each teacher’s value-added estimate has an error margin (e.g., plus or minus X points). Just like a political poll, this error margin tells us the range within which that teacher’s “real” effect (which we cannot know for certain) falls. Unlike political polls, which rely on large random samples to get accurate estimates, VA error margins tend to be gigantic. One concrete example is from New York City, where the average margin of error was plus or minus 30 percentile points. This means that a New York City teacher with a rating at the 60th percentile might “actually” be anywhere between the 30th and 90th percentiles. We cannot even say with confidence whether this teacher is above or below average.
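
[A quick sketch, in Python, of the error-margin arithmetic in the paragraph above--mine, not the Shanker Blog's. The numbers are their NYC example: a 60th-percentile rating with a plus-or-minus 30 point margin.]

```python
def rating_interval(percentile, margin):
    """Range within which the teacher's 'real' percentile could fall."""
    return max(0, percentile - margin), min(100, percentile + margin)

def distinguishable_from_average(percentile, margin, average=50):
    """True only if the entire interval lies above or below the average."""
    low, high = rating_interval(percentile, margin)
    return not (low <= average <= high)

low, high = rating_interval(60, 30)
print(f"Rated at the 60th percentile; 'real' effect between {low} and {high}")
print("Confidently above or below average?",
      distinguishable_from_average(60, 30))  # False
```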

...
[Update! I forgot the following paragraph!!]

Now, here’s the problem: In virtually every new evaluation system that incorporates a value-added model, the teachers whose scores are not significantly different from the average are being treated as if they are. For example, some new systems sort teachers by their value-added scores, and place them into categories – e.g., the top 25 percent are “highly effective,” the next 25 percent are “effective,” the next 25 percent are “needs improvement,” and the bottom 25 percent are “ineffective.”
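
[Again mine, not theirs: a handful of made-up ratings sorted into the four hard 25-point categories just described, flagging every teacher whose NYC-sized error bar still straddles the average.]

```python
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.uniform(0, 100, size=20)   # made-up VA percentile ratings
margin = 30                              # NYC-sized margin of error

labels = ["ineffective", "needs improvement", "effective", "highly effective"]
for p in np.sort(ratings):
    label = labels[min(3, int(p // 25))]   # hard cutoffs at 25/50/75
    note = ("  <- indistinguishable from average"
            if p - margin <= 50 <= p + margin else "")
    print(f"{p:5.1f}  {label}{note}")
```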
Read the whole thing @Shanker Blog

5/4/11

The American Mathematical Society's John Ewing On The Misuse Of VAM

...
Value-Added Models

In the past two decades, a group of statisticians has focused on addressing the first of these four problems. This was natural. Mathematicians routinely create models for complicated systems that are similar to a large collection of students and teachers with many factors affecting individual outcomes over time.

Here’s a typical, although simplified, example, called the “split-plot design”. You want to test fertilizer on a number of different varieties of some crop. You have many plots, each divided into subplots. After assigning particular varieties to each subplot and randomly assigning levels of fertilizer to each whole plot, you can then sit back and watch how the plants grow as you apply the fertilizer. The task is to determine the effect of the fertilizer on growth, distinguishing it from the effects from the different varieties. Statisticians have developed standard mathematical tools (mixed models) to do this.
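
[To make that concrete, here is a minimal sketch of a split-plot analysis--my illustration, with made-up data, using statsmodels' MixedLM as a stand-in for the "standard mathematical tools" the article mentions.]

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for plot in range(10):
    fert = rng.choice([0.0, 1.0, 2.0])   # fertilizer level, whole plot
    plot_effect = rng.normal(0, 1)       # random plot-to-plot variation
    for variety in range(4):             # one variety per subplot
        growth = (5 + 2.0 * fert + 0.5 * variety
                  + plot_effect + rng.normal(0, 0.5))
        rows.append(dict(plot=plot, variety=variety, fert=fert, growth=growth))
df = pd.DataFrame(rows)

# Fixed effects: fertilizer and variety. Random effect: the plot.
model = smf.mixedlm("growth ~ fert + C(variety)", df, groups=df["plot"])
print(model.fit().summary())             # 'fert' coef should come out near 2.0
```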

Does this situation sound familiar? Varieties, plots, fertilizer…students, classrooms, teachers? Dozens of similar situations arise in many areas, from agriculture to MRI analysis, always with the same basic ingredients—a mixture of fixed and random effects—and it is therefore not surprising that statisticians suggested using mixed models to analyze test data and determine “teacher effects”. This is often explained to the public by analogy.

One cannot accurately measure the quality of a teacher merely by looking at the scores on a single test at the end of a school year. If one teacher starts with all poorly prepared students, while another starts with all excellent, we would be misled by scores from a single test given to each class. To account for such differences, we might use two tests, comparing scores from the end of one year to the next. The focus is on how much the scores increase rather than the scores themselves. That’s the basic idea behind “value-added”.
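
[The gain-score idea with purely illustrative numbers of my own: a single end-of-year test favors class A's teacher, but the gains favor class B's.]

```python
# A single end-of-year test would favor class A's teacher,
# but the gain (end minus start) favors class B's.
class_a = {"start": 85, "end": 90}   # well-prepared students
class_b = {"start": 40, "end": 55}   # poorly prepared students

def gain(cls):
    return cls["end"] - cls["start"]

print("Class A gain:", gain(class_a))   # 5
print("Class B gain:", gain(class_b))   # 15
```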

But value-added models (VAMs) are much more than merely comparing successive test scores. Given many scores (say, grades 3–8) for many students with many teachers at many schools, one creates a mixed model for this complicated situation. The model is supposed to take into account all the factors that might influence test results—past history of the student, socioeconomic status, and so forth. The aim is to predict, based on all these past factors, the growth in test scores for students taught by a particular teacher. The actual change represents this more sophisticated “value added”—good when it’s larger than expected; bad when it’s smaller.
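
[And a toy version of the prediction step just described--emphatically not Sanders's actual model. It regresses this year's simulated scores on past factors only, then calls each teacher's average residual their "value added". Since these simulated teachers have no true effect at all, the numbers it prints are pure noise.]

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
prior = rng.normal(50, 10, n)      # last year's score
ses = rng.normal(0, 1, n)          # socioeconomic proxy
teacher = rng.integers(0, 20, n)   # 20 teachers; no true teacher effect
score = 5 + 0.9 * prior + 3.0 * ses + rng.normal(0, 5, n)

# Least-squares prediction from past factors only.
X = np.column_stack([np.ones(n), prior, ses])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
residual = score - X @ beta        # actual minus expected

# "Value added" per teacher: the average residual of their students.
for t in range(3):
    print(f"teacher {t}: 'value added' = {residual[teacher == t].mean():+.2f}")
```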

The best-known VAM, devised by William Sanders, is a mixed model (actually, several models), which is based on Henderson’s mixed-model equations, although mixed models originate much earlier [Sanders 1997]. One calculates (a huge computational effort!) the best linear unbiased predictors for the effects of teachers on scores. The precise details are unimportant here, but the process is similar to all mathematical modeling, with underlying assumptions and a number of choices in the model’s construction.

History

When value-added models were first conceived, even their most ardent supporters cautioned about their use [Sanders 1995, abstract]. They were a new tool that allowed us to make sense of mountains of data, using mathematics in the same way it was used to understand the growth of crops or the effects of a drug. But that tool was based on a statistical model, and inferences about individual teachers might not be valid, either because of faulty assumptions or because of normal (and expected) variation.

Such cautions were qualified, however, and one can see the roots of the modern embrace of VAMs in two juxtaposed quotes from William Sanders, the father of the value-added movement, which appeared in an article in Teacher Magazine in the year 2000. The article’s author reiterates the familiar cautions about VAMs, yet in the next paragraph seems to forget them:
Sanders has always said that scores for individual teachers should not be released publicly. “That would be totally inappropriate,” he says. “This is about trying to improve our schools, not embarrassing teachers. If their scores were made available, it would create chaos because most parents would be trying to get their kids into the same classroom.”

Still, Sanders says, it’s critical that ineffective teachers be identified. “The evidence is overwhelming,” he says, “that if any child catches two very weak teachers in a row, unless there is a major intervention, that kid never recovers from it. And that’s something that as a society we can’t ignore” [Hill 2000].
...
John Ewing

8/18/10

Richard Rothstein On Value Added Assessments (They Can't Work)

Read the first few pages and you will be enlightened!

Rothstein and Campbell's Law
