Choropleth Maps and Census Data


Dr. James R. Carter, Geography-Geology Department

Illinois State University, Normal, IL, USA



Geography relies heavily on data collected by the various censuses of the world, including the data from the U.S. Bureau of the Census. Many geographers around the world are employed by census organizations because censuses address many geographical questions. Thus, students of Geography need to understand the nature of data collection, processing, aggregation and presentation.

The choropleth map is based on predefined areal units, such as states, counties, census tracts, etc. Here is an example of a choropleth map created on the U.S. Bureau of the Census American FactFinder web site. This map is based on state-level data and has been reduced to about 80% of its original size for this discussion.







The map above portrays data for the 50 states and Puerto Rico. The subject is the ratio of the number of males in the population to the number of females. As reported in the Census 2000 data, the values are given as the number of males per 100 females. The values range from a low of less than 90 males per 100 females in one or more of the eastern states to more males than females in Nevada and Alaska. The map does not tell us exact values for each state; rather, it aggregates the states into categories and portrays those categories. This map is the default map produced for this variable at the web site. It employs natural breaks to aggregate the data into the five categories.
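The measure itself is simple arithmetic. As a minimal sketch in Python, with a hypothetical function name and made-up counts that are not Census figures:

```python
def males_per_100_females(males, females):
    """Return the sex ratio in the form the map uses: males per 100 females."""
    if females <= 0:
        raise ValueError("female count must be positive")
    return 100.0 * males / females


# Made-up counts for illustration only, not Census figures.
print(males_per_100_females(1070, 1000))  # 107.0, more males than females
print(males_per_100_females(890, 1000))   # 89.0, fewer males than females
```

A value above 100 means males outnumber females in that area, and a value below 100 means the reverse.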

The power of a map is that it should be able to show spatial patterns, if any such pattern exists.  It is quite apparent that the ratio of males to females at the state level of aggregation is not random.  The western portion of the U.S. has a higher ratio than the eastern part of the U.S.  And, there are finer distinctions that appear on the map.

The map above is but one view of the ratio of males to females across the U.S., and we should learn not to base our knowledge on a single map. Different maps of the same variable will give different perspectives on the finer details of the distribution of this variable. It is true that on different maps the highest state will always fall into the highest category and the lowest state will fall into the lowest category. But in between there may be significant variations in the images portrayed by different maps. For example, here are the same data broken into four classes. In this example the four classes are based on equal intervals.
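Equal-interval classes divide the range between the lowest and highest values into classes of the same width. The sketch below is only an illustration of that idea, not the method the FactFinder site actually runs. The function names are hypothetical. The sample holds just seven areas: five values are those quoted on this page, and the Nevada and Alaska values are assumptions consistent with "almost 104" and "more than 106".

```python
def equal_interval_breaks(values, n_classes):
    """Return the n_classes - 1 interior class limits for equal-width classes."""
    low, high = min(values), max(values)
    width = (high - low) / n_classes
    return [low + width * i for i in range(1, n_classes)]


def class_index(value, breaks):
    """Return the 0-based class: the count of limits the value reaches or exceeds."""
    return sum(value >= limit for limit in breaks)


# Males per 100 females. Nevada and Alaska are assumed values (see lead-in).
ratios = {
    "District of Columbia": 89.0,
    "California": 99.3,
    "Montana": 99.3,
    "North Dakota": 99.6,
    "Arizona": 99.7,
    "Nevada": 103.9,   # assumption
    "Alaska": 107.0,   # assumption
}

breaks = equal_interval_breaks(list(ratios.values()), 4)
print(breaks)  # [93.5, 98.0, 102.5]
for name, value in ratios.items():
    print(name, value, "class", class_index(value, breaks))
```

With this small sample the lowest and highest areas land in the lowest and highest classes, as they must, while the middle values crowd into a single class.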

This map looks quite different from the other map simply because it employs a different color scheme and uses a series of only four colors.  On the other hand, most states on this map appear to be aggregated into the same group of states as they were on the map above with five classes and the green color sequence.

Changing the number of classes to six gives a map that is quite different in appearance. Alaska now stands apart from all other states with the highest ratio. Nevada is also alone in the next to highest class. It appears that no state falls into the lowest class and indeed no state does. The lowest ratio belongs to the District of Columbia, with a value of 89.0. That information was masked in the other maps, and finding it requires going to a table of data. And because the District of Columbia is so small, it is not shown on this map. Giving equal representation to all areas is an inherent problem with choropleth maps when the geographic areas vary greatly in size.
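Because a small unit such as the District of Columbia can vanish on the map, it helps to tabulate which areas fall into each class. The sketch below does this for six classes. The class limits are an assumption (six equal classes running from 89 to 107), the function name is hypothetical, and the Nevada and Alaska values are assumptions consistent with the text rather than quoted figures.

```python
from bisect import bisect_right
from collections import defaultdict


def members_by_class(data, breaks):
    """Map each 0-based class index to the sorted names of the areas in it."""
    groups = defaultdict(list)
    for name, value in data.items():
        # bisect_right counts the limits that are <= value.
        groups[bisect_right(breaks, value)].append(name)
    return {i: sorted(groups.get(i, [])) for i in range(len(breaks) + 1)}


# Males per 100 females. Nevada and Alaska are assumed values.
ratios = {
    "District of Columbia": 89.0,
    "California": 99.3,
    "Montana": 99.3,
    "North Dakota": 99.6,
    "Arizona": 99.7,
    "Nevada": 103.9,   # assumption
    "Alaska": 107.0,   # assumption
}

# Assumed interior limits for six equal classes between 89 and 107.
breaks = [92.0, 95.0, 98.0, 101.0, 104.0]

table = members_by_class(ratios, breaks)
for index, names in table.items():
    print("class", index, names)
```

In this sample the lowest class holds only the District of Columbia, which is exactly the kind of detail a table reveals and a small-scale map can hide.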

In this map most of the states in the western part of the U.S. fall into the third and fourth classes while the states of the eastern U.S. fall into the second and third classes. Indeed, each map shows a somewhat different perspective on the spatial variation of the same variable.  

In the maps above the data were aggregated into classes based on equal intervals or natural breaks. Determining whether there are natural breaks, and where they fall, requires an analysis of the data. This technique is treated well in many textbooks, including our text Elements of Cartography, 6th ed. Below is a graph showing the distribution of the values. Note that only the District of Columbia lies below 90, and at the top end of the graph we see Nevada at almost 104 and Alaska at more than 106.

Are there logical breaks in this distribution? The extremes stand apart, but in some classifications these extremes will be clustered with other values. Most of the states are bunched between 92 and 102. There are no obvious places to break the data, but in most classifications the data should be broken up. Consider the possibilities.
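One common way to search for natural breaks is to choose the grouping that keeps the values within each class as close together as possible. The sketch below does this in the Fisher-Jenks style, minimizing the total within-class sum of squared deviations by dynamic programming. It is an illustration only: whether the FactFinder site uses this exact procedure is not stated here, the function name is hypothetical, and the Nevada and Alaska values are assumptions consistent with the text.

```python
def natural_break_groups(values, n_classes):
    """Split the sorted values into n_classes contiguous groups that minimize
    the total within-class sum of squared deviations from the class mean."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and the number of values")

    def ssd(i, j):
        """Sum of squared deviations of data[i:j] (simple, not optimized)."""
        chunk = data[i:j]
        mean = sum(chunk) / len(chunk)
        return sum((v - mean) ** 2 for v in chunk)

    inf = float("inf")
    # cost[k][j]: best total for splitting data[:j] into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    # start[k][j]: index where the last of those k classes begins.
    start = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = cost[k - 1][i] + ssd(i, j)
                if candidate < cost[k][j]:
                    cost[k][j] = candidate
                    start[k][j] = i

    # Walk back from the full data set to recover the groups.
    groups, j = [], n
    for k in range(n_classes, 0, -1):
        i = start[k][j]
        groups.append(data[i:j])
        j = i
    return groups[::-1]


# Males per 100 females; 103.9 (Nevada) and 107.0 (Alaska) are assumed values.
sample = [89.0, 99.3, 99.3, 99.6, 99.7, 103.9, 107.0]
groups = natural_break_groups(sample, 4)
print(groups)  # [[89.0], [99.3, 99.3, 99.6, 99.7], [103.9], [107.0]]
```

On this small sample the extremes separate into their own classes and the bunched middle values stay together. With all the states included, the middle of the distribution has no such clear gaps, which is the difficulty raised above.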

Two other ways to aggregate data for choropleth maps are to use round numbers or to divide the data into quantiles. Quantiles is the generic term for putting an equal number of states into each category. The two most common forms of quantiles are quartiles, where one-fourth of the units are in each category, and deciles, where one-tenth of the units are in each category.
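A quantile classification can be sketched by ranking the areas and cutting the ranked list into groups of nearly equal size. The function below is a hypothetical, rank-based illustration, not the procedure of any particular mapping site. The Nevada and Alaska values are assumptions consistent with the text.

```python
def quantile_classes(data, n_classes):
    """Assign each area a 0-based class by rank, so that class sizes
    differ by at most one. Ties are split by the order of the input."""
    ranked = sorted(data, key=data.get)  # names from lowest to highest value
    n = len(ranked)
    return {name: rank * n_classes // n for rank, name in enumerate(ranked)}


# Males per 100 females. Nevada and Alaska are assumed values.
ratios = {
    "District of Columbia": 89.0,
    "California": 99.3,
    "Montana": 99.3,
    "North Dakota": 99.6,
    "Arizona": 99.7,
    "Nevada": 103.9,   # assumption
    "Alaska": 107.0,   # assumption
}

quartiles = quantile_classes(ratios, 4)
for name, index in quartiles.items():
    print(name, ratios[name], "class", index)
```

Note what happens to California and Montana in this sketch: both have a value of 99.3, yet a strict equal-count rule places them in different classes. Any quantile scheme has to decide how to treat such ties, and that decision changes the map.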

Below is another version of this map in which the categories are based on round numbers.


On this map the top class is the ratio of 105 and higher. The next class is 100-105. Thus the two shades of green show those states where males outnumber females. Strictly, the numbers do not work out exactly this way, but they come close. The mapping program does not permit defining class limits to greater precision than whole numbers. Thus, Arizona (99.7) and North Dakota (99.6) have been put into the class of 100-105, while California (99.3) and Montana (99.3) have been put into the class of 95-99. Data precision and rounding will always have some impact on data classification.
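The effect of whole-number class limits can be reproduced by rounding each value before classifying it. The sketch below is an assumption about how such a program might behave, not a description of the FactFinder software: it rounds halves upward, sends a rounded value of exactly 105 to the top class, and lumps everything under 95 into a single class, none of which is stated on this page. The names are hypothetical.

```python
import math
from bisect import bisect_right

# Lower limits of the classes named in the text, as whole numbers.
LOWER_LIMITS = [95, 100, 105]
# "below 95" is an assumed catch-all; the other labels follow the text.
LABELS = ["below 95", "95-99", "100-105", "105 and higher"]


def round_number_class(value):
    """Round to the nearest whole number (halves upward), then classify."""
    whole = math.floor(value + 0.5)
    return LABELS[bisect_right(LOWER_LIMITS, whole)]


# The four values quoted in the text.
for name, value in [("Arizona", 99.7), ("North Dakota", 99.6),
                    ("California", 99.3), ("Montana", 99.3)]:
    print(name, value, round_number_class(value))
```

Arizona and North Dakota round up to 100 and join the 100-105 class even though females outnumber males there, while California and Montana round down to 99 and stay in the 95-99 class.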

Below is an example of aggregating the data by quantiles, in this case quintiles: five groups with the same number of states in each group.

So which one of these five maps is the best representation of the Ratio of Males to Females based on Census 2000 data?  Sorry to inform you but there is no best.  Each of these maps is equally valid.  Depending on what you want to do with the map, one map might be better than another for that purpose.  And, if you want to get an appreciation for the spatial variation of this variable, then you should look at all of these maps, a graph of the data and perhaps more.




Thankfully, we have online mapping programs that permit users to quickly change the appearance of a map and explore the spatial patterns of the data. The U.S. Bureau of the Census American FactFinder web site has the program that was used to produce the maps shown above. CIESIN has a web site with similar choropleth mapping tools, and in some ways its software is more flexible. As of June 2003 the CIESIN site has data for only the 1990 Census, but it is still a very instructive site.

I have put together two pages showing examples of a variety of maps created using two online packages, and how to work through the packages. One page shows how to work through the U.S. Bureau of the Census American FactFinder web site to produce choropleth maps like those above.

The other page shows how to work through the CIESIN web site employing the Java 3.0 mapping engine. CIESIN is the acronym for the Center for International Earth Science Information Network at Columbia University.

In Spring 2002 and in subsequent semesters, students have been assigned to go to these two web sites and create their own web pages based on the maps they created at these online sites. Many students have completed this exercise. Here are links to the pages of two of those students. Laura has one perspective on the assignment and Tom has another.

Both of these are good examples of what I wanted students to create as they learn what goes into making effective choropleth maps.  I should note there were other good pages, but I have decided to feature only these two.




Return to the master pages of James R. Carter, the author of this page