Why we need a better means of evaluating our nation's youngest children
Illustration: Roxanna Bikadoroff
Last fall and again in the spring, the government administered a standardized literacy and math test to all children in the Head Start program. It's being given again this year. Four-year-olds are asked to count objects, name alphabet letters and simple geometrical shapes, understand directions, characterize facial expressions, and identify animals, body parts, and other objects in pictures.
It is hard to discern why the Bush administration insisted on the test over the objections of most leading early-childhood experts and even members of its own Head Start advisory panel.
Perhaps it is nothing more than a reflexive decision of administration ideologues who see tests and more tests as the solution to every conceivable educational problem—or worse, a way to expose the academic failures they fundamentally believe to plague the public school system in America.
There are certainly some legitimate issues to address. One is that the government spends nearly $7 billion annually on Head Start, and we should know what we're getting for our money.
From national studies, it appears that Head Start graduates have better adolescent and adult outcomes than low-income children who weren't in the program; they are more likely to graduate from high school and attend college, they have higher earnings, and they are less likely to commit crimes. But it also appears that Head Start children, especially blacks, may get an initial cognitive boost that soon fades away, so by fourth grade their reading and math scores may be no higher than their peers'.
Another challenge for the program is a growing consensus among early-childhood experts that 4-year-olds are capable of better literacy and mathematics performance than was previously thought. Contrary to experts' thinking a generation ago, preschoolers can begin to read and do math, as any parent of a literate, middle-class 4-year-old knows.
But a standardized test, like that now administered by Head Start, is a poor way to address these challenges. Indeed, it can make things worse.
President Bush has assured educators that he considers it "absurd" for 4-year-olds to take tests like those given in elementary schools; yet in important respects, the flaws of the Head Start test are similar to, and perhaps more severe than, those of standardized tests for older children. As yet, nobody knows what the consequences of doing poorly on the test could be, because the federal Head Start Bureau, though determinedly pushing ahead, still can't say how results might be used. But the fear of adverse consequences alone is bound to create incentives to "teach to the test," as high-stakes exams invariably do.
Administration officials like to say that if the test assesses appropriate literacy skills, teaching to it can't be bad. But this fails to consider that a 20-minute test (that's the length of the Head Start exam) can't possibly reflect fairly the full breadth of an adequate curriculum. It must inevitably change program emphases.
Consider the items calling for identification of alphabet letters, in matched upper- and lowercase pairs, like "Aa" or "Nn." Research shows that young children who can recognize and name letters are more likely to read later on. So why shouldn't a test give Head Start teachers incentives to teach the alphabet? The reason is that research showing that letter recognition predicts reading success is based on assessing children who learned letters through natural literacy activities, like having stories read to them or playing with picture books. There is no evidence that memorizing alphabet letters out of context predicts later reading skill. But the test will lead teachers to spend more time on alphabet drills and less on reading—just the opposite of what Head Start needs.
Head Start was never intended to be primarily an academic program, but one that prepares low-income children for school by developing their health and their social, emotional, and physical skills, as well as math and reading readiness. The Bush administration claims to support this broad definition of Head Start's goals, and denies it intends to make the program academic. Yet just as rational teachers will shift their instruction to drilling letters and numbers if those are mostly what is tested, so Head Start programs will shift their focus if academics are emphasized in a test-based accountability system. "The administration says it supports all the goals of Head Start, but this test, in its present form, is sending a very different message," says Jacqueline Jones, who knows something about standardized tests. She heads early-childhood and literacy initiatives at the Educational Testing Service and was a member of the Head Start advisory panel.
In response to such criticism, Head Start officials claim they are now trying to standardize measures of 4-year-olds' social and emotional development, but don't want to include such items on a test until they have been "validated"—i.e., proven to predict later school success. This claim is consistent with the administration's proclivity to invoke "scientific" standards in support of favored programs but to ignore science when it contradicts policy preferences.
After all, the literacy items in the Head Start test have not themselves been validated for the manner in which they are being used. In general, very little (only 25 percent) of the variance in first- or second-grade academic performance can be predicted by tests in preschool. For the Head Start test in particular, some items have been borrowed from tests for older children, and some were validated only in combination with other items in a larger test. Early-childhood educators do have test items that assess social and emotional development or fine and gross motor skills. For example, a child's ability to control impulses, a good predictor of whether a 4-year-old will benefit from elementary school academic instruction, can be assessed by items like asking a child to delay opening a wrapped gift when the tester leaves the room. Motor skills can be assessed by seeing if a child can move a toy turtle (slowly) or toy rabbit (rapidly) along a meandering path sketched out on a piece of paper.
These haven't been scientifically validated either, but that makes them little different from the reading items. Including social and emotional items on the Head Start test would at least signal to Head Start teachers that the government was not trying to get them to make academic drills their only priority. An even better signal would be given if each child's assessment had to include a teacher's report of whether the child was up-to-date on regular pediatric and dental visits and had been given comprehensive vision and hearing screening—measures that are among Head Start's legislated objectives and that have at least as much to do with 4-year-olds' later school success as alphabet recognition.
If, as officials sometimes insist, the goal is to assess Head Start program quality and not to evaluate individual children, there is a better system already in place—one ignored by the administration in its compulsion to test first and wonder why later. Currently, federal officials evaluate the quality of every Head Start program in the nation triennially, sending teams of as many as 25 monitors—experts in management and finance, early childhood pedagogy, social-emotional development, health, and nutrition—for a full week to Head Start centers. Results of their investigations are forwarded to the government. Programs must develop plans to correct deficiencies in any area and submit to remonitoring to verify that flaws have been corrected. If the deficiencies are severe, or remain uncorrected, the government ends its contract with the Head Start operator and seeks bids for other organizations to run the community's program—a regular enough occurrence to ensure that Head Start programs take the reviews quite seriously.
Among the standards that Head Start programs must meet to satisfy these review teams is whether each child has been individually evaluated at least three times during the year in all the domains that Head Start should cover, including knowing alphabet letters and one-digit numbers, but also other important school-readiness skills like whether the child knows how to take turns or how to handle disappointments. Head Start teachers are required to show what kind of progress children are making by keeping samples of their work and notes of conversations and observations that document students' skill in each of the academic, social, and emotional areas in which children are expected to grow. Records must indicate whether the child has had regular medical and dental checkups. The one thing the review teams do not demand of teachers is that they give children a sit-down test, inappropriate for 4-year-olds, of decontextualized math and reading skills.
There are ways this accountability system could be improved. Policymakers could join the expert teams, for example, to familiarize themselves with the challenges faced by early childhood programs. The monitoring standards could be revised to require that Head Start programs, consistent with what is now known about children's development, have somewhat higher expectations for academic skills without needlessly downgrading other important goals. And the system could require a higher level of skills in instruction and assessment from Head Start teachers—an elusive goal so long as funding for Head Start is so sparse that many teachers have no more than a high school education and are paid accordingly.
Yet even with their flaws, Head Start program reviews constitute the most comprehensive and high-quality accountability system in American education today. Rather than asking Head Start to ape the standardized testing regime of the No Child Left Behind law for K-12 education, we'd be better advised to ask elementary and secondary schools to submit to the kind of accountability already characteristic of the Head Start program.