Is The World Getting Better?
The complexities of this seemingly simple question appear quickly, even with a surface analysis: “What metrics will be assessed?” “Across what timeframe?” “‘Better’ in comparison to what, and for whom?” and so on. The Millennium Project developed its State of the Future report as a “compelling overview of humanity’s present situation, challenges and opportunities, potentials for the future, and actions and policies that could improve humanity’s outlook.” To this end, they release an annual State of the Future Index (SOFI) that helps assess the contributing and detrimental factors for a better world across 28 dimensions. Data spanning more than 20 years are extrapolated 10 years into the future to assess the world’s rate of improvement. Thankfully, on some levels, such as decreasing global poverty or increasing electricity from renewables, the world is projected to improve. On others, however, including renewable freshwater and terrorism incidents, the world is projected to worsen. To impact policy, raise global awareness, and ultimately create meaningful change, visualizations such as this one help to answer not only whether the world is getting better, but also where resources should be allocated to maximize the effect.
Scholars and practitioners agree that data visualizations should be engaging in their presentation, illuminate substantial insights, be accessible and egalitarian in their representations, and be transparent in data collection and method. While The Millennium Project does visualize the change in the SOFI over 30 years, as well as the 28 dimensions from which it is derived, the visualizations can be improved across the four criteria listed below:
“The Data Doesn’t Speak For Itself”; rather, it is the task of the visualization to craft a narrative that guides and engages the audience toward insights both apparent and discoverable. To that end, the new presentation includes an animated informational video optimized for social media that encourages viewers to form a relational dialogue with the visualization. As detailed in the “Exemplars” section of this report, standout visualizations create this dialogue through their use of color, line weight, hierarchy of composition, interaction, and callouts to guide the reader through a data-driven narrative. Similarly, this new data visualization contextualizes the SOFI among the nations it represents. Users are able to interactively determine the granularity of analysis by browsing through all 28 dimensions that comprise the SOFI within a single nation, across multiple nations, and at the world level.
Arguably the biggest contribution of this new data visualization is the inclusion of National Comparison SOFIs. While it is important to know to what extent the combined SOFI dimensions (poverty, health expenditure, life expectancy, etc.) are changing global outcomes, The Millennium Project’s current visualization is limited by its inability to prompt action from individual nations on nation-specific goals. Now, through the comparison of extrapolated SOFIs between nations and the world, a clearer depiction emerges of which nations are exceeding or falling behind global progress. Moreover, this comparison extends to all 28 dimensions that comprise the SOFI. By highlighting this information in scalar color values over a geographic base-map, the visualization more clearly identifies global regions of successful efforts and areas that need improvement.
The Millennium Project’s mission, to “improve thinking about the future and make that thinking available through a variety of media for feedback to accumulate wisdom about the future for better decisions today” is better realized with this new informational video that describes its objectives. Intentionally sharable, and optimized for social media, it invites viewers to see themselves as both citizens impacted by and contributors to the dimensions that make up their nations’ SOFIs, while simultaneously positioning them as members of a global society, with a responsibility to encourage world-wide improvement. When they are directed to the new visualization, they are given the chance to explore the data contextualized within both of these personas - a national, and a human.
This visualization helps expose the known unknown. Many nations within the visualization have incomplete (or nonexistent) historical data. Data transparency strengthens the power of this visualization, because it highlights where researchers should focus their efforts to create more accurate analysis. Further, it enables policy makers to leverage that knowledge for international change. Additionally, this accompanying report summarizes the methodological approach of calculating National Comparison SOFIs and makes the accumulated data more accessible for those wanting to repeat the work.
This visualization builds heavily on the methodology used in The Millennium Project's biennial report on the State of the Future. Within it, they generate a State of the Future Index that measures 28 expert-selected dimensions on which to evaluate the increasing or decreasing trend of the world’s betterment. The report gives guidelines for individual nations to calculate National Comparison SOFIs, which utilize variants of the same 28 dimensions for uniform comparisons. Select countries of Europe and America, as well as Azerbaijan, the Democratic Republic of Timor-Leste, Kuwait, the Republic of South Korea, and Turkey have all created their own National Comparison SOFIs.
Additional data visualizations are shown below, noted for their novelty of presentation, level of engagement, and clarity in revealing insight.
This table outlines the record of all data used in the final visualization. The visualization sources from The Millennium Project's proprietary data set to visualize the State of the Future Index for 2014-2015, and gathers data from the Carbon Dioxide Information Analysis Center, the Footprint Network, Freedom House, the Global Terrorism Database, the International Labor Organization, Wikipedia, and The World Bank to populate the 28 dimensions that comprise the National Comparison SOFIs for 214 nations. These sources provided 68,292 retrievable records. Using methods outlined by The Millennium Project for National Comparison SOFIs, an additional 44,912 were extrapolated using various regression methods. In total, the data set contains 186,470 records spanning 30 years. Data set attributes are summarized by dimension category. Data sources and code for extraction and processing are available through the links within the table below:
Select Two or More Countries: Control + Click (Mac: Command + Click)
Zoom/Pan (Map View): Use the Navigation Icon on the left to select Zoom, Pan, and Multi-select options.
Missing data are still data. When data are not available, their absence skews our perception of visualizations and can lead to false assumptions and, ultimately, poor policy choices. Missing data highlight the disparity between developing and developed nations. Those with the means are able to calculate accurate metrics, while those without have sparse accounts. Some metrics, such as terrorism incidents, were difficult to capture; however, they themselves have no effect on the global state of the future.
Most countries have a projected SOFI that is better than the current global average. This is encouraging because it means countries are projected to have more prosperous futures. However, the rate at which nations are improving varies significantly. As the graphs indicate, certain nations are meeting and exceeding their normalized values within the 28 dimensions that comprise the National SOFI, while others are stagnating.
There is no single dimension in which all nations are improving. While at the turn of the century fossil fuel production was the greatest determinant of a nation's SOFI, access to fresh water sources now provides the greatest increase to a nation’s future outlook. In contrast, elements such as terrorism incidents, bio-capacity, and fossil fuel production had the least effect on the global state of the future.
Foreign direct investment spending is increasing at the fastest rate of all the SOFI dimensions; this is in sharp contrast to economic inequality, which is in global decline but is projected to continue declining at the same rate as it has for the last 20 years.
With this information, policy makers might ask, “Why is the rate of foreign direct investment spending increasing, while economic inequality declines at a nearly stagnant rate?” or “Is there a relationship between the lack of clean water contributing to an uncertain or negative state of the future, and the increase in terrorism incidents?” At the national level, nations with a disproportionately high need for freshwater (Yemen, Haiti, Tanzania, Angola) have seen some of the largest gains in Health Expenditure per Capita, and likewise in Life Expectancy at Birth.
It is our hope that this visualization tool, and the non-proprietary data made available via Tableau, will allow policy makers and the public to form insights relevant to the goals of their organizations.
Data Analysis & Validation Challenges/Solutions
To maintain coherence with The Millennium Project’s State of the Future Index methodology for the global SOFI, data were sourced from the same databanks used for each of the 28 dimensions that comprise the SOFI. As such, certain nations have both missing and incomplete records for their National Comparison SOFIs (see “National Comparison SOFIs”). This link shares all extracted data files. Here is a summary of the data validation challenges and solutions for each data set and their combination into the final visualization:
Data Sets: Each data set below is listed with its extraction method and the cleaning process used to prepare the data for National Comparison SOFI normalization.
Extraction Method: The CDIAC maintains published global and national estimates of Fossil-Fuel CO2 emissions from 2005 - 2014 as a free downloadable Microsoft Excel file.
Cleaning: The downloaded file uses Country Codes according to ISO 3166 standards. Missing data was coded as [..]. SQL queries were used to parse missing data and pair with country codes.
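As a rough illustration of the cleaning step above, the following Python sketch maps the [..] missing-data marker to an empty value and keys each record by country code. The original work used SQL queries; Python is substituted here only for illustration. The column names (country_code, year, value), the function name, and the sample rows are hypothetical and are not taken from the CDIAC file.

```python
import csv
import io

MISSING_MARKER = ".."  # missing-value marker described in the text


def parse_emissions(csv_text):
    """Return (country_code, year, value) tuples; value is None when missing."""
    records = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        raw = row["value"].strip()
        # Treat the marker as "no data" instead of trying to parse it as a number.
        value = None if raw == MISSING_MARKER else float(raw)
        code = row["country_code"].strip().upper()
        records.append((code, int(row["year"]), value))
    return records


# Made-up sample rows, for illustration only.
sample = """country_code,year,value
AAA,2013,100.5
BBB,2013,..
"""
records = parse_emissions(sample)
```

Keeping missing values as None, rather than dropping the rows, preserves the record of where data are absent, which the visualization relies on.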
Extraction Method: The Footprint Network licenses a “public data package” for non-commercial purposes to academics. It contains the 2012 Ecological Footprint and biocapacity results for all countries included in the National Footprint Accounts of 2016 (n=186).
Cleaning: The downloaded file uses ISO 3166-2 standards for country names. Missing data is coded as [“”]. SQL queries were used to parse missing data and pair with country names.
Extraction: Freedom House maintains annual data reports from 1973 to 2016, ranking nations as “Free,” “Partially Free,” or “Not Free” on a scale of 1-7, based on its assessment of each nation’s political and civil liberties. These data come as an Excel download.
Cleaning: In Freedom Ranking, The Millennium Project methodology recommends taking the lower of the two assessments between political and civil liberties, and assigning free, partially free, and not free as the nation’s freedom score. These assignments were made and the file was then paired with the ISO 3166-2 standards for country names. Manual editing was necessary as certain countries (n=6) had name changes prior to 1973, and the original nation names were used in the downloaded table.
Extraction: The Global Terrorism Database offers an extensive list of armed military conflict between Nation-States and groups without internationally recognized geo-political affiliation. The database, however, is under a license restriction that prevents the files from being shared via a public website: “Unauthorized Publication of the Data: No part of the GTD may be republished on any website or accessible for public download in any format without the express permission of a GTD staff member. In addition, no part of the GTD may be distributed for any commercial purpose, nor with the intent that the data be used in any commercial enterprise, without the express permission of a GTD staff member. START reserves the right to withhold this permission.” Further, it states that: “Unauthorized Reconstruction of the Data: The GTD website allows the USER to download the entire GTD dataset in a portable file format through the Download page. In addition, the GTD's World Wide Web interface allows the USER to download information on up to 1000 incidents at a time. However, the USER is not permitted to use this feature or any other method to reconstruct the original data set upon which the GTD interface is based. This includes the compilation of more than 1000 incidents from the GTD interface by either manual or automated methods.” Given Tableau Public’s configuration, it is possible to reconstruct multi-variate data structures from their downloaded files, and we thus decided to omit this data.
Extraction: The International Labor Organization maintains records of labor policy, practice, and public opinion from over 187 nations. These records are compiled through their searchable database, and may be extracted as Comma Separated Values and Excel files. The data obtained were used for calculating unemployment, high-skilled employment, and social unrest.
Cleaning: Each separate data file contained the appropriate metric sorted by gender. Values were combined, and paired with the ISO 3166 standard country code. No national record contained missing data.
Extraction: The most utilized source for data collection was The World Bank. Similar to The International Labor Organization, The World Bank maintains a record of policy, practice, and opinion from 214 nations. These records are compiled through their searchable database, and may be extracted as Comma Separated Values and Excel files. For this visualization, a C# program was executed to mine the source and parse the records into fields by category, year, country, ISO 3166 country code, and value.
Cleaning: Records were compiled by nation, ISO 3166 country code, year, category, and value. This formed the basis by which all other data were cleansed.
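The record layout described above (nation, ISO 3166 country code, year, category, value) can be sketched as a simple "long" format, with one record per nation, category, and year. The Python below is only an illustration under assumptions: the original parsing was done in C#, and the Record type, the to_long helper, and the example values are hypothetical.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    nation: str
    iso_code: str
    year: int
    category: str
    value: Optional[float]  # None marks a missing observation


def to_long(nation, iso_code, category, values_by_year):
    """Flatten one wide row (year -> value) into one Record per year, sorted by year."""
    return [
        Record(nation, iso_code, year, category, values_by_year[year])
        for year in sorted(values_by_year)
    ]


# Made-up values, for illustration only.
rows = to_long("Kuwait", "KWT", "Life Expectancy at Birth", {2001: 73.0, 2000: None})
```

A single shared layout like this lets records from the other sources be joined on country code and year.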
National Comparison SOFIs
Extrapolation: The extrapolation model varied by nation and SOFI dimension. The model used to extrapolate the world SOFI was compared against polynomial regression (degree n<4) to determine which had the lower standard error on the known data values.
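The comparison among polynomial candidates could be sketched as follows. This is a minimal sketch under assumptions, not the code used for the visualization: it fits polynomials of degree 1 through 3 by ordinary least squares, takes "standard error" to mean the residual standard error, and leaves out the world SOFI model, which is not specified here. All function names are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns [c0, c1, ...] for c0 + c1*x + c2*x^2 + ..."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def standard_error(xs, ys, coeffs):
    """Residual standard error: sqrt(SSE / (points - coefficients))."""
    sse = sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    return (sse / (len(xs) - len(coeffs))) ** 0.5


def best_polynomial(xs, ys, max_degree=3):
    """Return (standard_error, degree, coeffs) for the lowest-error candidate."""
    candidates = []
    for degree in range(1, max_degree + 1):
        if len(xs) <= degree + 1:  # not enough points to estimate the error
            break
        coeffs = fit_polynomial(xs, ys, degree)
        candidates.append((standard_error(xs, ys, coeffs), degree, coeffs))
    return min(candidates)


# Made-up series: x is years since the first observation, y follows x squared.
xs = [0, 1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]
error, degree, coeffs = best_polynomial(xs, ys)
next_value = predict(coeffs, 6)  # extrapolate one step ahead
```

Expressing years as offsets from the first observation (0, 1, 2, ...) rather than as calendar years keeps the normal equations numerically well behaved.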
Data Non-Dimensionalization and Normalization: To non-dimensionalize and normalize the 28 metrics, we followed the guidelines published by The Millennium Project’s National Comparison SOFI Report. While that report contains the full methodology, relevant extracts are compiled below:
Non-dimensionalizing the variables. This is needed since, obviously, values of different variables, with different units and from different data sources cannot be simply added to one another. For non-dimensionalizing most indexes (including the SOFI) use the formula: X = (actual value of the variable - MIN) / (MAX - MIN). The best and worst values of the variables were identified in the TMP Global Lookout Panel.
Weighting the variables, needed because not all of the variables are of equal importance. The values were provided by the Millennium Project for use in National Comparison SOFIs and obtained from a global expert panel.
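The two extracts above can be illustrated with a short Python sketch. The first function applies the quoted formula directly. The second combines the normalized values as a weighted average, which is an assumption on our part, since the extract says only that the variables are weighted. The function names, dimension keys, and numbers are hypothetical.

```python
def non_dimensionalize(value, min_value, max_value):
    """Apply X = (value - MIN) / (MAX - MIN) from the quoted extract."""
    if max_value == min_value:
        raise ValueError("MAX and MIN must differ")
    return (value - min_value) / (max_value - min_value)


def weighted_index(normalized, weights):
    """Weighted average of normalized values (assumed aggregation rule).

    Both arguments are dicts keyed by dimension name.
    """
    total_weight = sum(weights[name] for name in normalized)
    weighted_sum = sum(normalized[name] * weights[name] for name in normalized)
    return weighted_sum / total_weight


# Made-up inputs, for illustration only.
x = non_dimensionalize(75, 50, 100)
index = weighted_index({"a": 0.5, "b": 1.0}, {"a": 1.0, "b": 3.0})
```

For dimensions where a lower raw value is better, such as poverty, the normalized value would need to be oriented so that higher always means better before aggregation; how the source report handles this is not shown here.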
With such a large data set, it was important that the presentation highlight three main components: all 28 dimensions that comprise the National Comparison SOFI; a nation’s projected value for each dimension relative to other countries in the world; and the projected SOFI of every nation. This was achieved through the use of the publicly available Tableau software platform. Data were entered into the appropriate fields for mapping and temporal analysis. These were then mapped with action and parameter functions to create the interactivity between the selected nation and the corresponding visualizations. A complete breakdown of the visualization components can be explored through the online worksheet here:
Challenges & Opportunities
The World Bank and Millennium Project sites both contain large data sets and accurate visualizations that communicate the broad overview of the project’s mission. What is missing, however, is a compelling narrative that communicates how the data directly relate to the viewer. There is no clear definition of audience. A long-term goal would be to create an animation in the spirit of "If The World Were 100 People" and "Corruption is Legal in America" that showcases our visualizations by way of a narrative that highlights the import of the data. For instance, could the concept of “Time” form a linking thread between the 15 Global Challenges and 28 domains that comprise the State of the Future Index? How might communicating the buying power of $1.00 encourage an affluent nation to contribute to the needs of a developing nation? These concepts are inspired by the data, but rely heavily on artistic techniques of both composition and storytelling.
While the stand-alone interactive visualization has a nearly immediate response time, the web version is markedly slower. Interaction design theory strongly encourages real-time, immediate feedback for an engaging user experience. Unfortunately, our present limitations in backend development prevented us from providing this level of engagement. Future versions would work towards optimization.
Similarly, very early in the design process, we had an idea of what we wanted the data to express, and how we thought best to visualize it. For stability in code and flexibility in editing, we opted for a third-party design/development tool, Tableau. While the efficiency benefits were readily apparent, we were limited in some of the design features that we wanted to implement: scrolling auto-sort navigation of all countries and all dimensions, animated visualization of data trend lines across time, embedded links to metadata sources as tooltips, and opportunities for citizen scientists to become engaged with the mission of the project.
We would like to thank Jerome Glenn of The Millennium Project for his expertise and dedication to the mission of improving the state of the world; Katy Börner for her instruction and encouragement to never cease looking for insights, and pushing the boundaries of information visualization; and both Ashish Shindure and Michael Ginda for their 0th hour assistance with the IVMOOC course.
"2015-16 SOF - The Millennium Project." 2015. 26 Apr. 2016 <http://www.millennium-project.org/millennium/201516SOF.html>
"About the Millennium Project." 2010. 28 Apr. 2016 <http://www.millennium-project.org/overview.html>
Boden, T.A., G. Marland, and R.J. Andres. 2016. Global, Regional, and National Fossil-Fuel CO2 Emissions. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, Tenn., U.S.A. doi 10.3334/CDIAC/00001_V2016
Börner, Katy. Atlas of knowledge: Anyone can map. MIT Press, 2015.
"Data Doesn't Speak for Itself - Harvard Business Review." 2014. 28 Apr. 2016 <https://hbr.org/2014/04/data-doesnt-speak-for-itself/>
"Essentials of Sociology | W. W. Norton & Company." 29 Apr. 2016 <http://books.wwnorton.com/books/Essentials-of-Sociology>
Klanten, Robert. Data flow: visualising information in graphic design. Ferdi Van Heerden. Gestalten, 2009.
"National SOFIs - The Millennium Project." 28 Apr. 2016 <http://www.millennium-project.org/millennium/SOFI-national.pdf>
Nielsen, Jakob. Usability engineering. Elsevier, 1994.
"Tableau Software: Business Intelligence and Analytics." 29 Apr. 2016 <http://www.tableau.com/>
Rendgen, Sandra. "Information graphics." (2012).
Tufte, Edward R, and E Weise Moeller. Visual explanations: images and quantities, evidence and narrative. Cheshire, CT: Graphics Press, 1997.
Tufte, Edward R. "Beautiful evidence." New York (2006).