Education Policy Analysis Archives
Volume 10 Number 16 March 20, 2002 ISSN 1068-2341
A peer-reviewed scholarly journal
Editor: Gene V Glass
College of Education
Arizona State University
Copyright 2001, the EDUCATION POLICY ANALYSIS ARCHIVES.
Permission is hereby granted to copy any article
if EPAA is credited and copies are not sold.
Articles appearing in EPAA are abstracted in the Current
Index to Journals in Education by the ERIC Clearinghouse on
Assessment and Evaluation and are permanently archived in
Resources in Education.
Quantifying Quality: What Can the U.S. News and World Report
Rankings Tell us About the Quality of Higher Education?
Marguerite Clarke
Boston College
Citation: Clarke, M. (2002, March 20). Quantifying Quality: What can the U.S. News and World
Report rankings tell us about the quality of higher education? Education Policy Analysis Archives,
10(16). Retrieved [date] from http://epaa.asu.edu/epaa/v10n16/.
Abstract
Since their first appearance in 1983, the U.S. News and World Report
rankings of colleges and graduate schools have generated much
discussion and debate, from some declaring them among the best
rankings ever published to others describing them as shallow, inaccurate,
and even dangerous. The research presented here addresses two of the
most common criticisms of the methodology used to produce these
rankings. In particular, this study answers the following questions: What
is the extent of change in U.S. News' ranking formulas across years and
what are the implications for interpreting shifts in a school's rank over
time? How precise is the overall score that U.S. News uses to rank
schools and what are the implications for assigning schools to discrete
ranks? Findings confirm critics' concerns in each of these areas,
particularly in relation to the ranking of graduate schools of education.
Based on these results, five recommendations are made for improving
the interpretability and usefulness of the rankings.
Introduction
Every year, U.S. News and World Report's (U.S. News) rankings of the academic quality of colleges
and graduate schools hit the newsstands (Note 1). Their arrival brings delight to some and dismay
to others, depending on whether their institution rose or fell in the quality ratings. An improved
ranking can lead to increased donations from proud alumni and more and better qualified students
in next year's applicant pool (Monks and Ehrenberg, 1999). A fall can lead to tighter alignment of
institutional benchmarks and goals with ranking criteria and pressure on admissions staff to bring in
"better" applicants (Mufson, 1999). All the while, a question goes unanswered: What do these
rankings really tell us about the quality of higher education?
As a step toward answering this question, I examine two common criticisms of the methodology
that U.S. News uses to rank colleges and graduate schools. These are: (1) constant changes to the
formula make it impossible to interpret yearly shifts in a school's rank in terms of change in its
relative academic quality (Levin, 1999; Pellegrini, 1999), and (2) the score used to assign schools to
ranks is overly precise, creating a vertical column where a group might more properly exist
(Machung, 1998; Smetanka, 1998). The first section of this article gives a brief introduction to the
U.S. News rankings as well as the questions addressed by this study. The next section outlines the
methodology used to answer these questions and the results of the analyses. The final section
presents conclusions and recommendations.
Before proceeding, a caveat is in order. While many have questioned the overall concept of
academic quality rankings as well as the validity of the different indicators and weights used, I
suspend judgment on these issues to focus on the extent to which methodological problems may
impact the interpretation of the U.S. News rankings.
Background on the U.S. News Rankings
U.S. News published its first rankings of the academic quality of colleges in 1983, the same year
that the National Commission on Excellence in Education released A Nation at Risk, its influential
report blasting the quality of education in America. Based on a survey of college presidents, the
magazine listed Stanford, Harvard, and Yale as the top three national universities and Amherst,
Swarthmore, and Williams as the top three national liberal arts colleges. By 1987, U.S. News had
moved to a multidimensional approach, weighting and combining information on faculty
accomplishments, student achievements, and institutional academic resources to produce an overall
score on which to rank colleges. Rankings of graduate schools of business, engineering, law, and
medicine/primary-care also appeared in this year and used a similar weight-and-sum approach
(rankings of graduate schools of education did not appear until 1994).
The most recent rankings still use this basic approach. At the undergraduate level, schools are
categorized by mission and region (e.g., national universities, national liberal arts colleges, regional
universities, and regional liberal arts colleges). Up to sixteen pieces of information are collected on
schools in each category, including academic reputation; freshmen retention and graduation rates;
average test scores for entering students; per-student spending; and alumni-giving rate. These
indicators are standardized, weighted, and summed to produce an overall score on which to rank
schools in each category against their peers.
At the graduate level, schools are categorized by type—business, education, engineering, law, and
medicine/primary-care. Depending on the type of school, data on up to fourteen
indicators—including test scores, research expenditures, graduate employment rates, and
reputation—are collected. Similar to the undergraduate rankings, the indicators are standardized,
weighted, and summed to produce an overall score on which to rank schools in each category
against their peers. Detailed information on the indicators and methodology that U.S. News uses to
rank colleges and graduate schools is found in Appendix A. (Note 2)
Criticisms of the U.S. News Rankings
Almost two decades after their first publication, the college and graduate school rankings are
among U.S. News' top issues in terms of sales generated (K. Crocker, personal communication,
March 19, 1999). This demand has made them the focus of much criticism and debate, especially
among the institutions that are the subject of the rankings. In addition to questioning the overall
concept of ranking higher education institutions, much criticism has focused on the methodology
used to produce the rankings. Gerhard Casper, then President of Stanford University, focused on
some of these methodological concerns in a letter of protest he wrote to the editor of U.S. News in
1996:
Could there not, though, at least be a move toward greater honesty with, and service to,
your readers by moving away from the false precision? Could you not do away with
rank ordering and overall scores, thus admitting that the method is not nearly that
precise and that the difference between #1 and #2 - indeed, between #1 and #10 - may
be statistically insignificant? Could you not, instead of tinkering to "perfect" the
weightings and formulas, question the basic premise? Could you not admit that quality
may not be truly quantifiable, and that some of the data you use are not even truly
available (e.g., many high schools do not report whether their graduates are in the top
10% of their class)? Parents are confused and looking for guidance on the best choice
for their particular child and the best investment of their hard-earned money. Your
demonstrated record gives me hope that you can begin to lead the way away from
football-ranking mentality and toward helping to inform, rather than mislead, your
readers. (Note 3)
Casper's questions about the "football ranking mentality" employed by U.S. News go to the heart of
the debate over college and graduate school rankings. If, as Casper states, "the difference between
#1 and #2 - indeed, between #1 and #10 - may be statistically insignificant," what are the
implications for the way in which the overall scores for schools are used to put them in rank order?
In addition, if the weights and formula are constantly being "tinkered" with, how should one then
interpret change in a school's rank from year to year?
Others have voiced these methodological concerns. In particular, critics have noted that yearly
formula changes make it almost impossible to interpret shifts in a school's rank in terms of change
in its relative academic quality: a college that is ranked 4th one year and 7th the next may have had
no change in its performance relative to other schools, yet still have moved because of changes in
the ranking methodology (Levin, 1999; Machung, 1998; Pellegrini, 1999). U.S. News' response to
this issue has been that they prefer to make incremental changes every year to produce the "best possible rankings" rather than to use the same indicators every year to facilitate precise year-to-year comparisons.
Critics have also pointed out that the use of overall scores to rank schools magnifies small—and
often insignificant—differences among schools, and that small changes by the school or the
magazine can move a college half a dozen places up or down the ranking list (Crenshaw, 1999).
U.S. News acknowledged this issue in 1998 when it began rounding overall scores to the nearest
whole number in recognition, the editors noted, of the fact that small differences after the decimal
point may reflect non-significant differences between schools (Thompson and Morse, 1998).
Subsequently, the number of schools tied for overall score (and thus rank) increased dramatically.
While much criticism and debate has focused on the methodology used to produce the rankings, the
majority of research has focused on the extent to which the rankings are used by students and
parents (e.g., Art and Science Group, 1995; McDonough, Antonio, Walpole, and Perez, 1998) or
their effect on institutions (e.g., Monks and Ehrenberg, 1999). The research presented here
addresses the two methodological concerns outlined above. In particular, this study answers the
following questions:
1. What is the extent of change in U.S. News' ranking formulas across years and what are the implications for interpreting shifts in a school's rank over time?
2. How precise is the overall score that U.S. News uses to rank schools and what are the implications for assigning schools to discrete ranks?
Methods and Results
Tracking Changes in Ranking Formulas across Years
In order to gauge the extent of change in the U.S. News ranking formulas over time, year-to-year
changes to the indicators used in each formula were tracked across rankings published between
1995 and 2000 inclusive. Four types of changes were identified and tracked over this six-year
period: changes in the weight assigned to an indicator; the removal of an indicator from a formula;
the addition of an indicator to a formula; and, changes in an indicator's definition or methodology.
Rankings examined included business, education, engineering, law, and medicine/primary-care at
the graduate level and national university and national liberal arts college at the undergraduate.
Changes in weights, methodology, and the addition or removal of indicators were generally easy to
track, although it was not possible to fully track changes in weights at the undergraduate level as
this information was not included until the 1998 edition of the guidebook. Changes in indicator
definition were harder to identify as the wording for a definition could differ from one year to the
next, while the underlying meaning might not. The following rule was used to identify an indicator
definition change:
1. The new wording must contain additional detail such as a date, money amount, percent, or other precise information not previously stated or implied.
2. If the new wording does not include such detail, it should be recognized as changed by U.S. News in the guidebook text.
Analyses focused on the types of changes that were made to the formula for each ranking, the total
number of these changes across time, the proportion of non-change in each ranking formula, and
the extent to which the amount of change in a ranking formula was related to the amount of
movement in the relative ranks for schools in that ranking across the same time period.
Table 1 summarizes changes in the indicators used for each ranking from 1995 to 2000. The
number of changes for each ranking, by type and overall, is shown in columns two through eight.
The national university and national liberal arts college changes are shown in one column as they
use the same formula. The final column in Table 1 reflects the total number of changes across all
seven rankings (i.e., business, education, engineering, law, medical, national university/liberal arts,
and primary care), again broken down by type.
Table 1
Changes in U.S. News Ranking Indicators, 1995-2000
Type of Change         | Business | Education | Engineering | Law      | Medical | National University/Liberal Arts | Primary Care | Total
Definition/Methodology | 4 (50)*  | 4 (67)    | 3 (37.5)    | 10 (72)  | 4 (100) | 4 (50)                           | 3 (60)       | 32 (60)
Weight                 | 3 (37.5) | 2 (33)    | 3 (37.5)    | 1 (7)    | 0       | 2 (25)                           | 2 (40)       | 13 (25)
Addition               | 0 (0)    | 0 (0)     | 1 (12.5)    | 1 (7)    | 0       | 1 (12.5)                         | 0            | 3 (6)
Removal                | 1 (12.5) | 0 (0)     | 1 (12.5)    | 2 (14)   | 0       | 1 (12.5)                         | 0            | 5 (9)
Total                  | 8 (100)  | 6 (100)   | 8 (100)     | 14 (100) | 4 (100) | 8 (100)                          | 5 (100)      | 53 (100)

*Column percentages are in parentheses.
Most changes were weight or definition/methodology changes, comprising 85 percent of all
changes occurring over the six editions. Very few indicators were added to or removed from the
ranking formulas, suggesting that U.S. News generally retained the same set of indicators for each
ranking, but consistently refined and redefined these indicators over the years. (Of course, this
redefining process can also change an indicator substantially).
The rate of change varied widely across rankings. While most rankings had between 6 and 8
formula changes over the six editions, the law rankings experienced 14 and the medical rankings
only 4 changes over the same period. Several reasons account for the larger number of changes in
the law ranking's indicators, including U.S. News' responses to the complaints of law schools (who
tend to complain more than other schools) and the release of new types of quality-related
information by the American Bar Association.
While a ranking (e.g., the law rankings) may have experienced a large number of changes relative
to other rankings, these changes may be concentrated in a small group of indicators that are
constantly being refined. Different rankings of schools also use different numbers of indicators to
compute their overall score, and thus two rankings that experience the same types and number of
changes may differ in the number of indicators left unchanged overall. Figure 1 shows the
proportion of unchanged indicators for each ranking between 1995 and 2000 inclusive.
Figure 1. Proportion of Indicators Remaining Unchanged in Each US News Ranking,
1995-2000.
The undergraduate rankings (both national university and national liberal arts college) have the
largest proportion (.73 approximately) of unchanged indicators. In contrast, only about one third of
the law school indicators remained unchanged. For most rankings, about half to two thirds of the
indicators remained unchanged over the six editions. This suggests that while it may not be always
possible to interpret changes in a school's overall rank across years, it is possible to track
performance on individual indicators that have remained unchanged across the years. Most of the
unchanged indicators are related to selectivity (e.g., test scores and the proportion of applicants
accepted into the program) and institutional resources (e.g., student-faculty ratios).
In Table 2, an X indicates when it is possible to make cross-year comparisons for a ranking. The
criteria used to make this determination include the four types of indicator changes discussed above
as well as more general formula changes. The latter occurred twice over the six editions examined
here: In 1998 when overall scores were rounded to the nearest whole number, and in 1999 when a
school's performance on each indicator was standardized before obtaining the overall rank score.
While it was not possible to make cross-year comparisons for most rankings over the six years, the
last column in Table 2 suggests that the ranking formulas may be stabilizing. Between 1999 and
2000, there were no changes in the formulas used to rank schools of education, engineering, law,
and medicine, suggesting that change in a school's rank between 1999 and 2000 could be
interpreted in terms of change in its relative academic quality.
Table 2
Ability to Make Comparisons Across Years for a Ranking, 1995-2000
Ranking 1995-1996 1996-1997 1997-1998 1998-1999 1999-2000
Business X
Education
X
Engineering
X
Law
X
Medical X X
X
National Liberal Arts
National University
Primary Care X
X
It is important to remember that even when a formula appears to remain stable across years, there
can still be difficulties with cross-year interpretation of ranks. This is due to problems with the accuracy of the information obtained: critics have pointed out several errors that have arisen from mistakes (both accidental and deliberate) in reporting by institutions and from the differing ways in which schools compute figures for certain indicators (Machung, 1998; Smetanka, 1998; Stecklow, 1995; Wright, 1990-91). U.S. News has tried to reduce the error introduced by these practices by cross-checking data sent in by schools with data collected by debt-rating agencies, investors, and national organizations such as the National Collegiate Athletic Association, and by tightening up its survey questions, but issues still remain.
The final stage of the comparability analysis examined the extent to which the amount of change in
a ranking formula is related to the amount of movement in schools' ranks for that ranking across
years. Table 3 shows the correlation (r) between the 1995 and 2000 ranks for the top-fifty schools in
each ranking in 1995.
Table 3
Correlation between 1995 and 2000 Ranks for the
Top-Fifty Schools in 1995, By Ranking
Ranking Correlation (r)
Business .89
Education .72
Engineering .88
Law .92
Medicine .88
National Universities .95
National Liberal Arts College .94
Primary Care .08
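As a minimal illustration of how a correlation of this kind is computed, the snippet below uses a short set of hypothetical ranks rather than the actual U.S. News data (the values and the number of schools are made up):

```python
import numpy as np

# Hypothetical 1995 and 2000 ranks for the same ten schools (1 = top rank);
# the analysis reported in Table 3 used the top-fifty schools in each ranking.
ranks_1995 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
ranks_2000 = np.array([2, 1, 4, 3, 7, 5, 6, 10, 8, 9])

# Pearson correlation (r) between the two years' ranks.
r = np.corrcoef(ranks_1995, ranks_2000)[0, 1]
print(round(r, 2))
```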
There is no definite relationship between the amount of change in the indicators for a ranking and
the correlation between the 1995 and 2000 ranks for the top-fifty ranked schools in 1995. For
example, while law schools experienced the most change in their indicators over the six editions of
U.S. News, there was not much difference (r = .92) in the rank ordering of the top-fifty law schools
in 1995 and their ordering in 2000. While the indicators used for the other rankings experienced varying amounts of change, those rankings still show a high degree of similarity (with r's between .88 and .95) in the rank ordering of their top 50 schools in 1995 and 2000. The main exceptions to
this are the education (r = .72) and primary-care (r = .08) rankings. The low correlation between the
primary-care rankings in 1995 and 2000 can be explained by changes in the population of schools
that U.S. News included in these rankings during this time period. In contrast, the low (relative to
the other rankings) correlation between the 1995 and 2000 ranks of the top-fifty schools of
education in 1995 is linked to the fact that 16 of the top 50 schools in 1995 had experienced large
changes in rank–of ten or more–by the 2000 edition. Table 4 shows the 16 schools of education.
The first six schools all experienced a decline in rank, ranging from a drop of 10 places for the
University of Southern California and the University of Iowa to a drop of 22 places for Syracuse
University. The remaining schools all improved their rank since 1995. Improvement ranged from an
increase of 10 places for Rutgers University to a jump of 30 places for Arizona State University.
Table 4
Schools of Education with the Biggest Differences in U.S. News Rank between 1995 and 2000 (a)

School                                    | 1995 | 1996       | 1997 | 1998 | 1999       | 2000       | Change in Rank Between 1995 and 2000
University of Iowa                        | 20   | 22         | 14   | 15   | 27         | 30         | -10
University of Southern California         | 23   | 27         | 26   | 30   | 31         | 33         | -10
University of Georgia                     | 15   | 10         | 15   | 19   | 18         | 26         | -11
SUNY-Buffalo                              | 39   | 45         | 43   | 47   | 46         | Not Ranked | At least -12
Boston University                         | 31   | 37         | 32   | 43   | Not Ranked | 46         | -15
Syracuse University                       | 28   | 41         | 46   | 45   | 46         | 50         | -22
Rutgers State University-New Brunswick    | 49   | 33         | 29   | 30   | 33         | 39         | +10
University of Minnesota-Twin Cities       | 25   | 7          | 9    | 11   | 10         | 14         | +11
University of Pittsburgh, Main Campus     | 44   | Not Ranked | 43   | 34   | 37         | 33         | +11
Temple University                         | 33   | 30         | 34   | 28   | 20         | 20         | +13
George Washington University              | 45   | 39         | 37   | 30   | 34         | 30         | +15
University of Michigan-Ann Arbor          | 22   | 9          | 8    | 6    | 8          | 7          | +15
University of North Carolina-Chapel Hill  | 32   | 32         | 31   | 28   | 22         | 17         | +15
University of Texas-Austin                | 27   | 19         | 12   | 13   | 11         | 12         | +15
New York University                       | 40   | 28         | 23   | 19   | 16         | 12         | +28
Arizona State University-Main Campus      | 47   | 29         | 39   | 27   | 24         | 17         | +30

(a) This table does not include schools that were not ranked in 1995 but appeared in the top 50 in the
2000 edition.
Cross-year data for the top-fifty schools in 1995 in other rankings were also examined to assess the
extent to which similar movements in rank occurred (only data for the top 25 schools of
medicine/primary-care and the top 40 national liberal arts colleges were available). Only nine
business schools, one engineering school, eight law schools, no medical or primary-care schools,
three national liberal arts colleges and two national universities differed by ten or more places in
their 1995 and 2000 ranks.
It is not clear why there was more movement among schools of education compared to other types
of schools. If changes in indicators (i.e., weight, definition, or other changes) are not responsible,
movement could be due to changes in schools' performance on the indicators or errors or
inconsistencies in the information reported by schools. Unfortunately, it is difficult to identify the
real reasons for these movement patterns among schools of education over time, as well as why
these differ from other rankings, as U.S. News did not print much information on schools'
performance on the individual indicators until 1999.
Estimating Error or Uncertainty around the Overall Score
There is no universally agreed-upon set of information for creating academic quality rankings.
Thus, various ranking efforts use indicators that differ in whole or in part from those used by others
even when attempting to rank the same schools. It is not difficult to imagine that slight changes in
the set of indicators used–such as the addition or removal of a single indicator–may move a school
up or down a ranking, depending on how it performs on the indicator relative to other schools. To
gauge the effect of slight changes in the set of indicators on the stability of the overall score and
subsequent ranking for a school, a technique called jackknifing (Efron and Tibshirani, 1993) was
applied to the data for the top-50 schools in each of the 2000 business, education, law, national
liberal arts college, and national university rankings. (Note 4)
First, a baseline regression model was created for each of the rankings, with schools' overall scores
as the dependent or outcome variable and the indicators used for each ranking as the independent or
predictor variables. The overall fit of the model to the data was assessed in terms of the adjusted R
Squared. Values of .9 and above were considered a good fit, meaning that the overall score
predicted by the model for a school was highly correlated with the score produced by U.S. News'
ranking formula, and that the regression model was an effective substitute for the weights-and-sum
formula used by U.S. News. All models met this criterion, with adjusted R Squared values of .99 for the national liberal arts college and national university models, .98 for the business school and law school models, and .95 for the education school model. (Note 5)
An approximation to a standard error for each school's overall score was obtained using the
following formula (Efron and Tibshirani, 1993): (Note 6)
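In the standard form given by Efron and Tibshirani (1993), with $\hat{\theta}_{(i)}$ denoting the overall score predicted for a school by the regression model that omits the $i$-th of its ranking's $n$ indicators, the jackknife error estimate is

$$\widehat{se}_{\text{jack}} = \sqrt{\frac{n-1}{n}\sum_{i=1}^{n}\left(\hat{\theta}_{(i)}-\hat{\theta}_{(\cdot)}\right)^{2}}, \qquad \hat{\theta}_{(\cdot)}=\frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{(i)}.$$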
The removal of one indicator at a time for the jackknife regression models did not seem to affect
substantially the overall adjusted R Squared in most instances. For example, for each of the 9
models estimated using the law school data, the adjusted R Squared never varied by more than .01
from the adjusted R Squared for the overall model (i.e., .98), suggesting that the indicators are
contributing fairly similar information to the estimation of the overall score. As a result, the
jackknife standard errors are quite small, varying, in the case of law schools, from a low of .74 for
the University of Michigan, Ann Arbor to a high of 3.06 for Harvard University. A similar range of
standard error values was obtained for all rankings except for schools of education. The regression
model for schools of education was not as robust to changes in indicators and the adjusted R
Squared dropped considerably (by .13) when one indicator in particular—Research
Expenditure—was removed. The resultant jackknife standard errors for schools of education are
therefore quite large, varying from a low of 1.78 for Stanford University to a high of 11.98 for the
University of Southern California.
Differences in the standard errors for individual schools are due to differences in how the removal
of different indicators from the equation affects the prediction of their overall score. For schools
that have large standard errors, the removal of certain indicators makes it much harder to predict the
overall score they received from U.S. News. For schools with smaller standard errors, the removal of
indicators does not appreciably reduce the precision of estimation of their overall score. This
suggests that schools are differentially affected by the presence or absence of certain indicators in
terms of their overall score and subsequent rank.
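A minimal sketch of this leave-one-indicator-out procedure is given below; the data, the number of schools and indicators, and the ordinary-least-squares helper are illustrative stand-ins, not the actual U.S. News inputs or the author's code:

```python
import numpy as np

def ols_fitted(X, y):
    """Fit ordinary least squares (with an intercept) and return fitted values."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return X1 @ beta

def jackknife_se(X, y):
    """Approximate an error estimate for each school's overall score by refitting
    the regression with one indicator removed at a time."""
    n_schools, n_indicators = X.shape
    # theta[i] holds the overall scores predicted by the model that omits indicator i.
    theta = np.array([ols_fitted(np.delete(X, i, axis=1), y)
                      for i in range(n_indicators)])
    theta_bar = theta.mean(axis=0)   # mean prediction across the leave-one-out models
    n = n_indicators
    return np.sqrt((n - 1) / n * ((theta - theta_bar) ** 2).sum(axis=0))

# Illustrative use: 50 schools and 9 indicators (the number used in the law ranking).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 9))             # stand-in standardized indicator values
y = 60 + X @ rng.uniform(1, 5, size=9)   # stand-in overall scores
print(jackknife_se(X, y).round(2))
```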
This error estimate was then used in a t-test to assess the extent to which one school's overall score
was significantly different from that of another. The t-test formula employed was: (Note 7)
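With $x_A$ and $x_B$ denoting two schools' overall scores and $se_A$, $se_B$ the corresponding jackknife error estimates, a comparison of this kind takes the familiar two-sample form

$$t = \frac{x_A - x_B}{\sqrt{se_A^{2} + se_B^{2}}},$$

with significance judged against a Bonferroni-adjusted level (Note 7).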
The results of these comparisons are summarized in Tables 5 through 9, which are in the form of Excel spreadsheets. In each table, schools are ordered by their overall ranking score across the
heading and down the rows. Read across the row for a school in order to compare its performance
with the schools listed in the heading of the chart. The symbols indicate whether the overall score
of the school in the row is significantly lower than that of the comparison school in the heading
(arrow pointing down), significantly higher than that of the comparison school (arrow pointing up),
or if there is no statistically significant difference between the two schools (circle). The blank
diagonal represents where a school is compared against itself.
If there were no error around the overall scores for schools, Tables 5 through 9 would only consist
of arrows pointing up and down, except for instances where two schools have the same overall
score and are tied for rank. This is not the case. For example, in the business school rankings
comparison table (Table 5) Harvard is listed first in the row and heading as it has the highest overall
score among business schools. However, reading across the row, it appears that Harvard's overall
score of 100 is not significantly different from that of nine other schools that are ranked beneath it.
These include Stanford, which is tied for first rank with Harvard with an overall score of 100, and
University of California, Berkeley, ranked tenth with a score of 90. Only schools ranked below
tenth have scores that are significantly lower than Harvard's.
Tables 5 - 9
Statistical Significance of Comparisons of Overall Scores
in Five Areas (Data in the Form of Excel Spreadsheets)
Table 5: Business
Table 6: Education
Table 7: Law
Table 8: Liberal Arts
Table 9: National Universities
In general, when the overall score for a school is compared to that of every other school in its
ranking (top-fifty schools only), three groups emerge: schools that score significantly higher,
schools that score significantly lower, and schools with scores that are not significantly different.
This pattern is consistent across all the comparison tables. For example, among the business
schools in Table 5, three distinct groupings emerge. The first group comprises 10 schools at the top
of the rankings, extending from first-ranked Harvard to tenth-ranked University of California,
Berkeley. These schools have scores that are not significantly different from each other but that are
significantly higher than all other schools' scores. The second grouping extends from
eleventh-ranked Dartmouth, University of California Los Angeles, and the University of Virginia to
nineteenth-ranked Carnegie Mellon. These schools have scores that are not significantly different
from each other but that are significantly lower than the top-ranked schools in the first group and
significantly higher than the lower-ranked schools in the third grouping. The third group is the
largest. It comprises 31 schools, extending from twentieth-ranked Indiana University to
forty-eighth-ranked University of Georgia, University of Illinois-Urbana-Champaign, and the
University of Notre Dame. These schools all have scores that are not significantly different from
each other but that are significantly lower than the scores of schools in the first two groups.
This three-groupings pattern is evident for all rankings except schools of education. There are only
two groupings evident in Table 6. The first group comprises the top-three-ranked schools of
education—Harvard University, Stanford University, and Teachers College, Columbia University.
These schools have scores that are not significantly different from each other but that are
significantly higher than the scores for almost all other schools in the top fifty. The second group of
schools extends from fourth-ranked University of California-Berkeley to the four schools tied for
fiftieth rank. These schools all have scores that are not significantly different from each other but
that are significantly lower than the scores of most schools in the top group. This two-grouping
effect occurs because schools of education are more sensitive to changes in the indicators used than
other types of schools. This results in larger standard errors around their overall score and fewer
significant differences between the scores of neighboring schools.
Conclusions and Recommendations
The results of these analyses show that, given the number and annual nature of changes to each
ranking formula, it is generally not possible to interpret year-to-year shifts in a school's rank in
terms of change in its relative academic quality. Depending on the ranking, it is possible to make
cross-year comparisons of a school's relative performance on between a third and three-quarters of
the individual indicators used. While not experiencing much change to their ranking formula over
time, schools of education have experienced markedly more movement in their ranks than other
schools. It is not evident why this has occurred or what it says about the U.S. News rankings as a
measure of the relative quality of these schools. The overall rate of change in the ranking formulas
appears to be slowing and it was possible to make cross-year comparisons of schools' ranks for
almost all rankings between 1999 and 2000.
The results of the error analyses call into question the use of overall scores to assign schools to
individual ranks. The analyses show that when interpreting scores for school with the aid of their
standard errors, precision blurs and schools start to group in bands rather than discrete ranks. The
results confirm the critics' sense of unease at the precision of a single score, particularly in the case
of the education rankings.
At least five recommendations can be made for improving the interpretability and usefulness of the
U.S. News rankings.
First, U.S. News needs to stabilize their ranking methodology. This is particularly important since the rankings are annual in nature and imply some kind of comparability. A related issue to consider is whether the rankings need to be annual at all. While there is an obvious commercial value to annual rankings, particularly ones that keep changing the winners, it is doubtful whether there is an educational or consumer value.

Second, U.S. News needs to recognize the uncertainty around schools' overall scores. The results of this analysis suggest that it would be more accurate to group schools in bands than to assign them discrete ranks. This approach would avoid the misleading effect that small changes in a school's rank from year to year produce on the public's perception of its academic quality.

Third, the rankings of schools of education need to be reassessed since they do not seem to "hold together." Better comparisons might emerge if these schools were divided into two more conceptually coherent groups (e.g., those that are primarily research oriented and those that are primarily teacher-training oriented). U.S. News already does this for schools of medicine—i.e., there is an overall ranking of medical schools as well as a ranking of schools that focus on the training of primary-care physicians.

Fourth, in order to be accountable to consumers, U.S. News needs to make available all data used to create the rankings. Currently, U.S. News publishes full information only for the top-ranked schools and little or no information on lower-ranked schools. While space constraints may make it difficult to publish this information in the magazine, no such restrictions apply on the U.S. News website.

A final general recommendation is that U.S. News should adopt a model similar to that used by Consumer Reports for reporting its quality ratings. Consumer Reports rates products, but does not allow the product manufacturers to use these ratings in their advertising. Similarly, U.S. News should not allow schools to use their ratings in their promotional materials or other advertising. This approach might relieve some of the tension and debate that currently surrounds the rankings and make their annual arrival on newsstands a less stressful event for the higher education community.
Notes
1.
The term "rankings," as used in this article, refers to a list of schools or universities that are
ordered according to their overall score on a formula created by U.S. News. Thus, the business
rankings are a list of business schools ordered according to their overall score on a formula that
U.S. News uses to rank graduate schools of business, and the national university rankings are a list
of schools ordered according to their overall score on a formula that U.S. News uses to rank
national universities. The year appended to a ranking is the calendar year in which it was released,
i.e., the 2000 education rankings were published in the year 2000.
2.
It is worth noting that several of these indicators—such as test scores, reputation, research
expenditure, and faculty awards—have been used traditionally to measure quality (Hattendorf,
1993; Webster, 1986). The U.S. News rankings differ from most other rankings in that they assign
weights to these indicators in order to combine them and produce a composite score.
3.
The full text of this letter is available at:
http://www-portfolio.stanford.edu:8050/documents/president/961206gcfallow.html
4.
No data was available for schools below the top-50 for most of the rankings.
5.
U.S. News does not make available in its magazine or on its website all the data it uses to rank
schools, nor is this information available on request. On average, each ranking is missing
information on two or three indicators. This was not a problem for this analysis, since the available
indicators, as indicated by the adjusted R Squared values, almost perfectly replicated the overall
scores produced by U.S. News. Thus, very little information was lost.
6.
While the "error estimate" obtained is not strictly a standard error, since the indicators are not
randomly sampled, it may still be viewed as a general indication of the uncertainty around an
overall score due to changes in the indicators used to compute that score. In addition, it is probably
a conservative estimate of the uncertainty around scores as the indicators chosen by U.S. News tend
to be highly correlated. A random sample from the population of indicators would probably be less
highly correlated, which would result in larger standard errors around schools' overall scores.
7.
Since there are, on average, 50 schools in each ranking, around 49 t-test comparisons were made
for each school in the rankings. In order to control for the increased probability of a significant
finding due to chance alone, a Bonferroni adjustment was applied.
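For illustration only (the nominal significance level is not stated in the article; .05 is assumed here), the Bonferroni-adjusted per-comparison level with m = 49 comparisons per school would be

$$\alpha^{*} = \frac{\alpha}{m} = \frac{.05}{49} \approx .001.$$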
8.
For more information see http://www.usnews.com/usnews/edu/college/corank.htm
9.
U.S. News uses a modification of the classification system developed by the Carnegie Foundation
for the Advancement of Teaching in order to classify colleges and universities. The Carnegie
system is a generally accepted classification system for higher education.
References
Art and Science Group. (1995). Influence of U.S. News rankings on college choice. StudentPOLL: Market Intelligence for Higher Education, 1(1).
Crenshaw, A. B. (1999). Colleges by the numbers. Washington Post, August 29, p. H01.
Efron, B., and Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman and
Hall.
Hattendorf, L. C. (ed.) (1993). Educational rankings annual. Detroit, MI: Gale Research Inc.
Levin, S. (1999). Ignore college ranking—become an educated consumer. Washington Parent
Magazine. Available at: http://www.washingtonparent.com/articles/9712/rankings.htm
Machung, A. (1998). What's at stake: College rankings and the new media. Change, July/August,
pp. 13-16.
McDonough, P. M., Antonio, A. L., Walpole, M., and Perez, L. X. (1998). College rankings:
Democratized knowledge for whom? Research in Higher Education, 39(5), pp. 513-37.
Monks, J., and Ehrenberg, R. G. (1999). The impact of U.S. News and World Report college
rankings on admissions outcomes and pricing policies at selective private institutions. Cambridge,
MA: National Bureau of Economic Research.
Mufson, S. (1999). Rankings all-important to GWU. Washington Post, Sunday, March 14, p. A01.
Pellegrini, F. (1999). Those bouncing college rankings—a 101. Time, August 31.
Smetanka, M. J. (1998). Magazine's new ratings on colleges don't rank high at "U." Star Tribune,
February 20, p. 7B.
Stecklow, S. (1995). Cheat sheets: Colleges inflate SATs and graduation rates in popular
guidebooks: Schools say they must fib to US News and others to compete effectively: Moody's
requires the truth. The Wall Street Journal, Wednesday, April 5, pp. 1, A8.
Thompson, J. J., and Morse, R. J. (1998). An explanation of the U.S. News rankings: Putting the
numbers into context. U.S. News and World Report America's Best Colleges 1998, pp. 66-68.
Webster, D. S. (1986). Academic quality rankings of American colleges and universities.
Springfield, IL: Charles C. Thomas.
Wright, B. A. (1990-91). The rating game: How the media affect college admission. The College
Board Review, No. 158, Winter, pp. 12-17, 31.
About the Author
Marguerite Clarke
Boston College
Phone: 617-552-0665
Fax: 617-552-8419
Marguerite Clarke is an Assistant Professor of Research at Boston College. She has a Ph.D. in
educational research, measurement, and evaluation as well as degrees in bilingual/multicultural and
elementary education. Her research interests include policy and technical issues surrounding
large-scale testing and accountability programs; the impact of testing on teaching and learning; and
the relationship between test use and educational opportunity and access for different student
populations.
Appendix A
Current U.S. News College and Graduate School Ranking Methodology
The current method that U.S. News uses to produce college rankings has three basic steps. (Note 8)
First, colleges in the U.S. are placed into categories based on mission and region. (Note 9) Colleges
within each category are ranked separately. Second, U.S. News collects data from each school on up
to 16 separate indicators of what it believes reflects academic quality. As Table 10 indicates, each
indicator is assigned a weight in the ranking formula that reflects the judgement of U.S. News about
which measures of quality matter most. Column 4 of Table 10 shows the weight that each indicator
(shown in column 3 of Table 10) receives within its category and column 2 shows the weight this
category receives in the overall ranking formula. For example, a school's acceptance rate is 15
percent of its Student Selectivity category score or rank, and the Student Selectivity category
contributes 15 percent to a school's overall score and rank.
Indicators are standardized and then combined (using weights) to produce an overall score for each
school. These scores are re-scaled. The top school is assigned a value of 100, and the other schools'
weighted scores are calculated as a proportion of that top score. Final scores for each ranked school
are rounded to the nearest whole number and ranked in descending order. U.S. News publishes the
individual ranks of only the top schools; the remainder is grouped into tiers.
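A minimal sketch of this scoring step (standardize each indicator, weight and sum, re-scale so that the top school receives 100, and round) is given below; the indicator values and weights are made up, and the z-score standardization and simple proportional re-scaling are assumptions, since the magazine's exact computations are not published:

```python
import numpy as np

def overall_scores(indicators, weights):
    """indicators: schools x indicators array (higher taken as better for every column);
    weights: one weight per indicator, summing to 1."""
    # Standardize each indicator across schools (z-scores assumed).
    z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)
    weighted_sum = z @ weights
    # Re-scale so the top school scores 100 and others are a proportion of that score.
    # With zero-mean z-scores, low toy schools can come out negative; the published
    # scores are all positive, so U.S. News presumably re-bases the weighted sums in
    # a way the article does not describe.
    rescaled = 100 * weighted_sum / weighted_sum.max()
    return np.round(rescaled)

# Five hypothetical schools and three hypothetical indicators.
indicators = np.array([
    [1400.0, 98.0, 90.0],
    [1350.0, 95.0, 85.0],
    [1300.0, 90.0, 80.0],
    [1250.0, 85.0, 75.0],
    [1200.0, 80.0, 70.0],
])
weights = np.array([0.4, 0.2, 0.4])   # hypothetical weights
scores = overall_scores(indicators, weights)
print(scores)                   # top school = 100
print(np.argsort(-scores) + 1)  # schools listed from highest to lowest score (1-based)
```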
Table 10
U.S. News Indicators and Weights for the 2000 College Rankings (a)

Ranking Category            | Category Weight | Indicator                            | Indicator Weight
Academic Reputation         | 25%             | Academic Reputation Survey           | 100%
Student Selectivity         | 15%             | Acceptance Rate                      | 15%
                            |                 | Yield                                | 10%
                            |                 | High School Standing—Top 10%         | 35%
                            |                 | SAT/ACT Scores                       | 40%
Faculty Resources           | 20%             | Faculty Compensation                 | 35%
                            |                 | Faculty With Top Terminal Degree     | 15%
                            |                 | Percent Full-time Faculty            | 5%
                            |                 | Student/Faculty Ratio                | 5%
                            |                 | Class Size, 1-19 Students            | 30%
                            |                 | Class Size, 50+ Students             | 10%
Retention Rate              | 20%             | Average Graduation Rate              | 80%
                            |                 | Average Freshmen Retention Rate      | 20%
Financial Resources         | 10%             | Educational Expenditures Per Student | 100%
Alumni Giving               | 5%              | Alumni Giving Rate                   | 100%
Graduation Rate Performance | 5%              | Graduation Rate Performance          | 100%

(a) These indicators and weights are for the national liberal arts and national university rankings only.
A similar methodology is employed for the graduate school rankings. U.S. News collects data from
each program on indicators of what it believes reflect academic quality. Each indicator is assigned a
weight based on U.S. News' judgment about which measures matter most. Data are standardized,
and standardized scores are weighted, totaled, and re-scaled so that the top school receives 100;
other schools receive a percentage of the top score. Schools are then ranked based on the score they
receive.
The five major disciplines examined yearly are business, education, engineering, law, and medicine.
Master's and doctoral programs in areas such as the arts, sciences, social sciences, humanities,
library science, public affairs, and various health fields are ranked only by reputation and are
generally evaluated every third year. The specific indicators and weights used for rankings within
each of the five major disciplines are outlined in Tables 11 through 15.
Table 11
U.S. News Indicators and Weights for the 2000 Business Rankings

Ranking Category    | Category Weight | Indicator                                       | Indicator Weight
Reputation          | 40%             | Academic Survey                                 | 60%
                    |                 | Non-academic Survey                             | 40%
Placement Success   | 35%             | Mean Starting Salary and Bonus                  | 40%
                    |                 | Employment at Graduation and Three Months Later | 20% and 40%
Student Selectivity | 25%             | Mean Graduate Management Admission Test Scores  | 65%
                    |                 | Mean Undergraduate Grade Point Average          | 30%
                    |                 | Proportion of Applicants Accepted               | 5%
Table 12
U.S. News Indicators and Weights for the 2000 Education Rankings

Ranking Category    | Category Weight | Indicator                                                                        | Indicator Weight
Reputation          | 40%             | Academic Survey                                                                  | 60%
                    |                 | Non-academic Survey                                                              | 40%
Student Selectivity | 20%             | Average Verbal, Analytical and Quantitative GREs                                 | 30% each
                    |                 | Proportion of Applicants Accepted                                                | 10%
Faculty Resources   | 20%             | Ratio of Full-time Doctoral and Master's Degree Candidates to Full-time Faculty  | 25% and 20%
                    |                 | Percent of Faculty Given Awards                                                  | 20%
                    |                 | Number of Doctoral and Master's Degrees Granted in the Past School Year          | 15% and 10%
                    |                 | Proportion of Graduate Students Who Are Doctoral Candidates                      | 10%
Research Activity   | 20%             | Total Research Expenditures                                                      | 75%
                    |                 | Research Expenditures Per Faculty Member                                         | 25%
Table 13
U.S. News Indicators and Weights for the 2000 Engineering Rankings

Ranking Category    | Category Weight | Indicator                                                                        | Indicator Weight
Reputation          | 40%             | Academic Survey                                                                  | 60%
                    |                 | Non-academic Survey                                                              | 40%
Student Selectivity | 10%             | Average Quantitative and Analytical GREs                                         | 45% each
                    |                 | Proportion of Applicants Accepted                                                | 10%
Faculty Resources   | 25%             | Ratio of Full-time Doctoral and Master's Degree Candidates to Full-time Faculty  | 25% and 10%
                    |                 | Proportion of Faculty Members of NAE                                             | 25%
                    |                 | Number of Ph.D. Degrees Granted in the Last School Year                          | 20%
                    |                 | Proportion of Faculty Holding Doctoral Degrees                                   | 20%
Research Activity   | 25%             | Total Research Expenditures                                                      | 60%
                    |                 | Research Expenditures Per Faculty Member                                         | 40%
Table 14
U.S. News Indicators and Weights for the 2000 Law Rankings

Ranking Category    | Category Weight | Indicator                                               | Indicator Weight
Reputation          | 40%             | Academic Survey                                         | 60%
                    |                 | Non-academic Survey                                     | 40%
Student Selectivity | 25%             | Median LSAT Scores                                      | 50%
                    |                 | Median Undergraduate GPA                                | 40%
                    |                 | Proportion of Applicants Accepted                       | 10%
Placement Success   | 20%             | Employment Rates at Graduation and Nine Months Later    | 30% and 60%
                    |                 | Bar Passage Rate                                        | 10%
Faculty Resources   | 15%             | Average Expenditures Per Student for Instruction etc.   | 65%
                    |                 | Student to Teacher Ratio                                | 20%
                    |                 | Average Expenditures Per Student for Financial Aid etc. | 10%
                    |                 | Total Number of Volumes in Law Library                  | 5%
Table 15
U.S. News Indicators and Weights for the 2000 Medicine and Primary-Care (in parentheses where different) Rankings

Ranking Category                      | Category Weight | Indicator                                                                                                                                                    | Indicator Weight
Reputation                            | 40%             | Academic Survey                                                                                                                                              | 50% (60%)
                                      |                 | Non-academic Survey                                                                                                                                          | 50% (40%)
Student Selectivity                   | 20%             | Mean MCAT Scores                                                                                                                                             | 65%
                                      |                 | Mean Undergraduate Grade Point Average                                                                                                                       | 30%
                                      |                 | Proportion of Applicants Accepted                                                                                                                            | 5%
Faculty Resources                     | 10%             | Ratio of Full-time Science and Clinical Faculty to Full-time Students                                                                                        | 100%
Primary Care Rate (Primary Care only) | 30%             | Percentage of MDs from a School Entering Primary-care Residencies, Averaged over 1997, 1998, and 1999                                                        | 100%
Research Activity (Medicine only)     | 30%             | Total Dollar Amount of National Institutes of Health Research Grants Awarded to the Medical School and its Affiliated Hospitals, Averaged for 1998 and 1999  | 100%
Copyright 2002 by the Education Policy Analysis Archives
The World Wide Web address for the Education Policy Analysis Archives is epaa.asu.edu
General questions about appropriateness of topics or particular articles may be
addressed to the Editor, Gene V Glass, [email protected] or reach him at College
of Education, Arizona State University, Tempe, AZ 85287-2411. The
Commentary Editor is Casey D. Cobb: casey[email protected] .
EPAA Editorial Board
Michael W. Apple
University of Wisconsin
Greg Camilli
Rutgers University
John Covaleskie
Northern Michigan University
Alan Davis
University of Colorado, Denver
Sherman Dorn
University of South Florida
Mark E. Fetler
California Commission on Teacher Credentialing
Richard Garlikov
hmwkhelp@scott.net
Thomas F. Green
Syracuse University
Alison I. Griffith
York University
Arlen Gullickson
Western Michigan University
Ernest R. House
University of Colorado
Aimee Howley
Ohio University
Craig B. Howley
Appalachia Educational Laboratory
William Hunter
University of Calgary
Daniel Kallós
Umeå University
Benjamin Levin
University of Manitoba
Thomas Mauhs-Pugh
Green Mountain College
Dewayne Matthews
Education Commission of the States
William McInerney
Purdue University
Mary McKeown-Moak
MGT of America (Austin, TX)
Les McLean
University of Toronto
Susan Bobbitt Nolen
University of Washington
Anne L. Pemberton
apembert@pen.k12.va.us
Hugh G. Petrie
SUNY Buffalo
Richard C. Richardson
New York University
Anthony G. Rud Jr.
Purdue University
Dennis Sayers
California State University—Stanislaus
Jay D. Scribner
University of Texas at Austin
Michael Scriven
scriven@aol.com
Robert E. Stake
University of Illinois—UC
Robert Stonehill
U.S. Department of Education
David D. Williams
Brigham Young University
EPAA Spanish Language Editorial Board
Associate Editor for Spanish Language
Roberto Rodríguez Gómez
Universidad Nacional Autónoma de México
roberto@servidor.unam.mx
Adrián Acosta (México)
Universidad de Guadalajara
adrianacosta@compuserve.com
J. Félix Angulo Rasco (Spain)
Universidad de Cádiz
felix.angulo@uca.es
Teresa Bracho (México)
Centro de Investigación y Docencia
Económica-CIDE
bracho dis1.cide.mx
Alejandro Canales (México)
Universidad Nacional Autónoma de
México
canalesa@servidor.unam.mx
Ursula Casanova (U.S.A.)
Arizona State University
José Contreras Domingo
Universitat de Barcelona
Jose.Contreras@doe.d5.ub.es
Erwin Epstein (U.S.A.)
Loyola University of Chicago
Eepstein@luc.edu
Josué González (U.S.A.)
Arizona State University
josue@asu.edu
Rollin Kent (México)
Departamento de Investigación
Educativa-DIE/CINVESTAV
rkent@gemtel.com.mx
kentr@data.net.mx
María Beatriz Luce (Brazil)
Universidad Federal de Rio Grande do
Sul-UFRGS
lucemb@orion.ufrgs.br
Javier Mendoza Rojas (México)
Universidad Nacional Autónoma de
México
javiermr@servidor.unam.mx
Marcela Mollis (Argentina)
Universidad de Buenos Aires
Humberto Muñoz García (México)
Universidad Nacional Autónoma de
México
humberto@servidor.unam.mx
Angel Ignacio Pérez Gómez (Spain)
Universidad de Málaga
Daniel Schugurensky
(Argentina-Canadá)
OISE/UT, Canada
dschugurensky@oise.utoronto.ca
Simon Schwartzman (Brazil)
Fundação Instituto Brasileiro de Geografia e Estatística
simon@openlink.com.br
Jurjo Torres Santomé (Spain)
Universidad de A Coruña
jurjo@udc.es
Carlos Alberto Torres (U.S.A.)
University of California, Los Angeles
torres@gseisucla.edu