Data Adventure #3: Comparing Data About Two or More Cities

Posted on 07/01/19 by LA Counts


This guide is for novice users interested in analyzing data that varies by city. Organizations, LA County, and city governments collect all sorts of city data, from the number of animals in shelters to the number of students receiving free or reduced-price lunch.

While programming experience helps with this instructable, it is not required. (Please see our first and second instructables for information on the tools used in this exercise, and the final instructable for information on APIs.)

Say you are curious about the number of Internet hotspots and public computers in cities, like those in the library pictured above! Does Alhambra have more free wireless hotspots than Long Beach? To answer this question, you'll need a convenient way to compare and graph counts of public Wi-Fi hotspots in multiple cities. In this guide, you'll learn how to compare access to public internet hotspots in different cities in LA County. Once you learn how to do this, you can use the same technique to analyze similar types of datasets.
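To make the idea of "comparing counts by city" concrete, here is a minimal sketch in pandas. The rows, the column names (`name`, `city`), and the helper `count_by_city` are made up for illustration; the real LA Counts file may use different column names and will have many more rows.

```python
import pandas as pd

# Made-up rows for illustration only (NOT real LA Counts data).
# Column names "name" and "city" are assumptions about the file layout.
hotspots = pd.DataFrame({
    "name": ["Site A", "Site B", "Site C", "Site D", "Site E"],
    "city": ["Alhambra", "Long Beach", "Long Beach", "Alhambra", "Long Beach"],
})

def count_by_city(df, city_column="city"):
    """Hypothetical helper: number of rows per city, largest count first."""
    return df[city_column].value_counts()

counts = count_by_city(hotspots)
print(counts)
```

Each row stands for one hotspot location, so counting rows per city gives the number you would compare or graph. With the real file, you would load it with `pd.read_csv` instead of typing rows by hand.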

Ingredients Used in This Notebook

A dataset from LA Counts on locations for public internet access. When you open the CSV file (see our last instructable for how to view and clean CSV files), you'll see columns similar to those in the image below. Remembering the last guide, does the data look clean, complete, and consistent? It does! We also selected this file because it was recently updated, is in CSV format, and has a field for different cities.

A Jupyter Notebook like this one, hosted on Google Colab.

Free Python libraries (numpy, pandas, and plotly). These are accessible within Jupyter Notebooks, so you don't need to download them.

Your smarts!
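If you would rather check "clean, complete, and consistent" in code than by eye, a rough sketch with pandas might look like the following. The CSV text and its column names (`name`, `city`, `zip`) are invented stand-ins, not the actual LA Counts file, and the sample deliberately includes a blank cell and a messy city name so the checks have something to find.

```python
import io
import pandas as pd

# Invented CSV text standing in for the downloaded file (NOT real data).
# Column names are assumptions; check the real file's header row.
csv_text = """name,city,zip
Site A,Alhambra,91801
Site B,long beach ,90803
Site C,Long Beach,
"""

# Read everything as text so ZIP codes keep their digits exactly as written.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Complete: count the missing values in each column.
missing = df.isna().sum()

# Consistent: trim stray spaces and normalize capitalization of city names,
# so "long beach " and "Long Beach" are treated as the same city.
df["city"] = df["city"].str.strip().str.title()

cities = sorted(df["city"].unique())
print(missing)
print(cities)
```

A nonzero entry in `missing` points to a column with blank cells, and a short, tidy list in `cities` suggests the city field is spelled consistently enough to group on.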
