
in the collection The Eggplant

Princeton Review Ranks State Accountability Systems: Is This a Study or a Joke?
Princeton Review states that their ratings "are based on information provided by each state, including legislation, press releases, overviews for parents, testing manuals, reports explaining the significance of test scores, and telephone interviews." They also "relied on the findings of the American Federation of Teachers' study, Making Standards Matter 2001 for some information regarding the alignment of state standards to state assessments."

Princeton Review sent its analysis to each state's Director of Assessment for a review of its accuracy and completeness. Six state departments of education--Iowa, Kansas, Maryland, Montana, Oklahoma, and Tennessee--chose not to respond, "some emphatically so," according to Princeton Review.

If Princeton Review could not find the required information and the state refused or was otherwise unable to provide it, "we assigned a score of zero: after all, our underlying premise, and that of accountability in general, is that knowing is always better than not knowing." The asterisks indicate zero ratings that were assigned because information "was not forthcoming."

Letter grades for each criterion were assigned by equating the scores to "the class A-F distribution."

So some states HAD to fail. Sounds like a high stakes system all right.
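The objection above can be made concrete with a small sketch. The report says only that grades were "equated to the class A-F distribution"; the quintile cut points below are an assumption for illustration, not Princeton Review's actual formula. The point is that any fixed rank-based distribution guarantees failing grades regardless of absolute performance.

```python
# Sketch of grading on a fixed "class" curve: each score gets a letter
# from its rank position in the distribution, so the bottom slice
# receives an F no matter how well everyone scores.
# The quintile cut points are assumptions for illustration.

def curve_grades(scores):
    """Map each distinct score to a letter by its rank percentile (best first)."""
    ranked = sorted(scores, reverse=True)
    grades = {}
    for score in scores:
        pct = ranked.index(score) / len(ranked)  # 0.0 = top of the class
        if pct < 0.2:
            grades[score] = "A"
        elif pct < 0.4:
            grades[score] = "B"
        elif pct < 0.6:
            grades[score] = "C"
        elif pct < 0.8:
            grades[score] = "D"
        else:
            grades[score] = "F"
    return grades

# Even when every score is high in absolute terms, the lowest fifth fails:
print(curve_grades([95, 94, 93, 92, 91]))
# → {95: 'A', 94: 'B', 93: 'C', 92: 'D', 91: 'F'}
```

Under such a scheme a state scoring 91 out of 100 still receives an F, which is exactly the "high stakes" dynamic the commentary points out.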

Testing the Testers 2003: An Annual Ranking of State Accountability Systems Executive Summary

During the winter of 2002-2003, The Princeton Review conducted Testing the Testers 2003, its second Annual Ranking of State Accountability Systems. Unlike other studies, ours is not primarily concerned with the rigor of academic standards or of the tests that measure them. Rather, we focused on the policies that determine the overall character and effectiveness of each accountability system. Properly conceived and well implemented, these policies will tend to produce systems that are consistent, secure, open to public scrutiny, and flexible enough to improve over time. We also believe they will tend to encourage and support an evolution to better and more effective schools.

As the stakes for testing rise, and with the pressure of the federal No Child Left Behind Act (NCLB), accountability systems increasingly affect what gets taught and how. As a result they will strongly influence how schools develop over the next several years. Simply put, good accountability systems will tend to result in better schools, and bad systems will create worse ones. The purpose of Testing the Testers is to highlight good and bad accountability practice with the hope of helping the overall tide to rise. By “good” we mean accountability systems that will lead to improvement not only in test scores but also in other measures of school quality; that will support educator professionalization; that will make school a more satisfying and rewarding experience for students; and, importantly, that will be able to improve and adapt as political and pedagogical realities change. Raising test scores is not that difficult if raising scores is all you want to do, and you are willing to sacrifice the rest of what school means in order to do so. That, to us, would be bad accountability.

We collected data on twenty-two relevant indicators from each state and the District of Columbia. Each indicator was grouped under one of four major criteria, and each state received a score of zero, one, or two points on each indicator depending upon how its program performed. The criteria were:

Academic Alignment: High-stakes tests are aligned to academic content knowledge and skills as specified by the states’ curriculum standards.
Test Quality: The tests are capable of determining that those curriculum standards have been met.
Sunshine: The policies and procedures surrounding the tests are open, and open to ongoing improvement.
Policy: Accountability systems will tend to affect education in a way that is consistent with the goals of the state.
These criteria were weighted at 20%, 15%, 30%, and 35% respectively, and the raw scores were scaled accordingly to give each state and the District of Columbia a ranking from one to fifty-one (the highest possible weighted score was 100). Each state was also assigned letter grades on the A-F scale for each of the four criteria.
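The scoring mechanics described above can be sketched as follows. The weights (20/15/30/35) come from the report; the indicator counts and the sample state's raw points are assumptions for illustration, since the report does not publish per-indicator data here.

```python
# Sketch of the weighted-score calculation: each criterion's raw score
# (a sum of 0/1/2 indicator points) is scaled to its weight, and the
# four weights sum to 100, the maximum possible weighted score.

WEIGHTS = {"alignment": 20, "test_quality": 15, "sunshine": 30, "policy": 35}

def weighted_score(raw, max_raw):
    """raw and max_raw map each criterion to points earned / points possible."""
    total = 0.0
    for criterion, weight in WEIGHTS.items():
        total += weight * raw[criterion] / max_raw[criterion]
    return round(total, 1)

# Hypothetical state; the indicator counts per criterion are assumed.
raw = {"alignment": 8, "test_quality": 5, "sunshine": 10, "policy": 12}
max_raw = {"alignment": 10, "test_quality": 6, "sunshine": 12, "policy": 16}
print(weighted_score(raw, max_raw))  # → 79.8
```

Note how the weighting means a point lost under Policy (35%) costs more than a point lost under Test Quality (15%), which is why the report cautions that rankings in the middle quintiles are sensitive to the weights chosen.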

The best programs are:

Rank  State  Weighted Score  Alignment (20%)  Test Quality (15%)  Sunshine (30%)  Policy (35%)
1     NY     88.5            B+               A                   B               A-
2     MA     85.7            B-               A                   A-              B+
3     TX     84.3            B-               B+                  A-              A-
4     NC     84.0            B-               A                   B               A-
5     VA     81.7            A                A                   B+              B-
6     LA     81.0            B-               A                   B+              B+
6     FL     81.0            B-               A                   B+              B+
8     AZ     80.2            B-               A                   C+              A-
8     OK     80.2            B-               A                   B               B+
10    CA     79.7            B+               A                   B               B-

The worst programs are:

Rank  State  Weighted Score  Alignment (20%)  Test Quality (15%)  Sunshine (30%)  Policy (35%)
41    KS     58.2            D                A                   C+              C+
42    IN     56.8            D                A                   C               C+
43    HI     55.5            C-               B+                  C-              B-
44    WY     54.5            F                A                   C               B-
45    ND     54.3            C-               B+                  C-              C+
46    WI     53.2            C-               A                   C-              C+
47    WV     52.2            D                A-                  F               B-
48    SD     49.8            B-               A                   F               C
49    RI     48.5            C-               A                   F               B-
50    MT     29.0            F                B-                  F               C-

Only Virginia received two A’s, and no state received an A for either of our most significant criteria, Sunshine and Policy. Nearly 30% of states received overall scores of 65 or lower, and of the individual grades given to the bottom-performing twenty states, nearly 40% were C or lower. On the positive side, forty-six programs received grades of B+ or better for the quality of the test instruments themselves, with only Utah scoring lower than a B-.

Although the rankings are affected by the weighting we applied (especially for those states in the middle three quintiles), most states tend to do things well or poorly with some consistency across all indicators, regardless of weighting. Most reasonable weightings (including no weighting at all) do not drastically alter the composition of the top or bottom rankings. Rankings for unscaled scores are presented in the body of this report, and readers are encouraged to download the data spreadsheet and formulate their own weightings and judgments.

State Rankings:


1. NY
2. MA
3. TX
4. NC
5. VA
6. LA
6. FL
8. AZ
8. OK*
10. CA
11. SC
12. MS
13. PA
14. UT
15. MN
16. CO
17. NV
17. TN*
19. IL
20. ME
20. OR
22. OH
23. KY
24. WA
25. AK
26. MI
27. ID
28. NJ
29. AR*
30. CT
31. NE
32. VT
33. AL
34. MO
35. MD*
36. DE
37. NM
38. NH
39. DC
40. GA
41. KS*
42. IN
43. HI
44. WY
45. ND
46. WI
47. WV
48. SD
49. RI
50. MT*

— Princeton Review
Testing the Testers 2003

May 2003

