Wednesday, January 29, 2014

How to build a Google Fusion Tables map

These are the basic steps we talked about in class today for building a Google Fusion Tables map. There are lots of nuances, but this covers just the basics. There are four steps: upload, set feature styles, set the info window and publish.

  • Start with a good data set (we plotted points using this Dangerous Dogs list). While you could start with data that has a standard address, Google may or may not do a good job geocoding that data. It's best if you have good Latitude and Longitude for each record. You can have a separate column each for Latitude and Longitude, or put them in the same column with a comma between them.
    • Texas A&M has a free Geocoding service that is wonderful. Once it codes everything, it also tells you how closely it matched each address. Watch out for anything that was matched only by ZIP code, as those points aren't accurate and you'll need to fix them manually.
    • The Classic Google Maps makes geocoding a single address easy. Just find the place on the map, then right-click at that place and choose What's here, and it will put the Lat,Long in the search bar.
  • Log into Fusion Tables. You have to have a regular free Google account (in other words, not your UTMail account).
    • Fusion Tables is connected to Google Drive. Go under Create and see if Fusion Tables is listed. If not, go to "Get more apps" and find it and add it.
    • That said, I suggest you bookmark this link that shows only FT tables. It's hard to find otherwise, and is useful in searching for public tables.
  • In Drive, go File > Create > Fusion Table (or choose New Table in the showtables list), which will take you to a window where you can upload your .xls file, or find one of your existing Google Spreadsheets.
  • Let the wizard guide you through the upload until you see the rows of data.
    • Note which columns are colored yellow, as FT thinks those are Locations. Sometimes you have to go under Edit > Change Columns, find the right Lat/Long column and change the type to "Location".
  • Most times, FT will recognize the location columns (Lat/Long) and create a tab called "Map of [whatever]". If you don't have Lat/Long or shapes (covered later), then it might try to geocode based on an address field it finds. It might also ask for help to make it better, like adding a city or state, etc.
  • Once you have the map and your points are plotted, it's time to work on the feature styles and info window.
    • While looking at the map tab, look left for Feature Styles. This is where you can pick what kind of marker you want to use for points. You can use a single style, or use different markers based on a column that has numeric values. You can also set the marker by name using a column.
    • Click on Set Info Window to change what shows up in the pop-up window when you click on features in the map. The Automatic side lets you check fields on and off in the window. But if you want to edit what the label says, or add some simple HTML, then go to Custom and change the info window there.
  • Once all that is set, there are two steps to publish.
    • Click on the Share button at top right and set it to "View" for "Anyone with link can view" or for "Public." But DON'T USE THE URL THEY GIVE YOU THERE.
    • Instead, now choose the menu under the Map tab and go to Publish. There you can get a link, iframe or JavaScript embed to use in your blog.
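
If you'd rather check your coordinates before you upload, the Latitude and Longitude advice in the first step can be sketched in a few lines of Python. This is just one way to do it, not something we did in class. The function name, the column names and the sample rows are all made up for illustration. It builds the single "lat,long" column described above and flags records that are missing a value or have the two numbers swapped.

```python
def to_location(lat, lng):
    """Return 'lat,lng' for one record, or None if the pair is not a usable coordinate."""
    try:
        lat_f, lng_f = float(lat), float(lng)
    except (TypeError, ValueError):
        return None  # blank or non-numeric value
    # Latitude must fall in [-90, 90] and longitude in [-180, 180].
    if not (-90 <= lat_f <= 90 and -180 <= lng_f <= 180):
        return None
    return f"{lat_f},{lng_f}"


# Hypothetical sample rows; column names are assumptions, not the class data.
rows = [
    {"Address": "100 Main St", "Latitude": "30.2672", "Longitude": "-97.7431"},
    {"Address": "200 Elm St", "Latitude": "", "Longitude": "-97.75"},
    {"Address": "300 Oak St", "Latitude": "-97.7431", "Longitude": "30.2672"},  # swapped
]

# Add the combined column that Fusion Tables can read as a Location.
for row in rows:
    row["Location"] = to_location(row["Latitude"], row["Longitude"])

# Anything listed here needs a manual fix before upload.
needs_fixing = [row["Address"] for row in rows if row["Location"] is None]
print(needs_fixing)
```

A swapped pair only gets caught when the longitude is too big to be a latitude, which happens to be true for Texas, so treat this as a first pass and still eyeball the map.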

Using shapefiles for data

Later in class, we published using shapes. I'll go over how to get your own shape files in a future class, but in this case we merged some Census data with an existing shapefile that was already public in Fusion Tables. This is a really abbreviated version:
  • Upload your data file to Fusion Tables. Note the GEOID field, which is a field unique to each county.
  • Go either to the advanced search in Drive or the Showtables page and search for Texas County Shapefile. If it doesn't come up, make sure you are searching for "public tables." Once you open that file, copy the URL into your clipboard.
  • Go back into your census data file, go under File > Merge, then put in the URL of the county shapefile and hit next.
  • Match up the GEOID column in your data file to the GEOID10 column in the shapefile and merge them. You can keep all the columns for this demonstration.
  • A map tab was created for you upon the merge because you have a "geometry" column with all the shape information.
  • Now you can use Set Feature Style, and set the polygon fills to color by a bucket on the Median Age field. You can also set up an Automatic Legend there.
  • Set up your Info Window with the important information, publish your map like you did above and BAM!, you are done.
Easy sneazy.
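
Before you merge, it can help to see which GEOID values won't find a partner in the other table. Here is a minimal Python sketch of that check, assuming you've exported both tables to rows you can read. The function name is hypothetical, and the sample rows, including the Median Age numbers, are invented for illustration.

```python
def unmatched_keys(data_rows, shape_rows, data_key="GEOID", shape_key="GEOID10"):
    """Return (keys only in the data file, keys only in the shape table)."""
    # Compare as stripped text so 48001 and "48001 " count as the same key.
    data_ids = {str(row[data_key]).strip() for row in data_rows}
    shape_ids = {str(row[shape_key]).strip() for row in shape_rows}
    return sorted(data_ids - shape_ids), sorted(shape_ids - data_ids)


# Made-up rows: the third GEOID lost a digit somewhere along the way.
census = [
    {"GEOID": "48001", "Median Age": 35.0},
    {"GEOID": "48003", "Median Age": 36.0},
    {"GEOID": "4805", "Median Age": 37.0},
]
shapes = [
    {"GEOID10": "48001", "geometry": "(polygon data)"},
    {"GEOID10": "48003", "geometry": "(polygon data)"},
    {"GEOID10": "48005", "geometry": "(polygon data)"},
]

missing_from_shapes, missing_from_data = unmatched_keys(census, shapes)
print("In data but not in shapes:", missing_from_shapes)
print("In shapes but not in data:", missing_from_data)
```

If both lists come back empty, every county in your data should line up with a shape when you merge.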

Now, here is the dirty little secret. The Fusion Table part is the easy part. It's getting the data formatted the way you want it before you import it that is the trick. You have to watch things like:

  • ID numbers that start with 0. Those fields have to be set as text fields in Excel, or you'll drop the leading zeros. ZIP codes on the East Coast are a common culprit. So are school codes.
  • Fields you have to merge on need to be identical. If you are merging by county name, you can't have La Salle County in one file and LaSalle County in another.
  • Fusion Tables takes .kml (Keyhole Markup Language) for shapes, but most government agencies supply them in the multi-file .shp format. You can convert them at shpescape.com, or with QGIS, which we will learn later in the course.
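
The first two items on that list can also be handled outside of Excel. Below is a small Python sketch, assuming your data is a CSV with columns named zip and county (both names, and the sample rows, are made up). Python's csv module reads every value as text, so nothing gets dropped on the way in. The two helper functions are hypothetical: one pads a ZIP code back out to five digits, and the other builds a comparison key so you can spot county names that differ only by spacing or capitalization.

```python
import csv
import io


def restore_zip(value):
    """Pad an all-digit ZIP code back to five characters; leave anything else alone."""
    digits = str(value).strip()
    return digits.zfill(5) if digits.isdigit() else digits


def county_key(name):
    """Lowercase the name and remove all whitespace, for comparison only."""
    return "".join(name.lower().split())


# Made-up sample standing in for a real file on disk.
sample = io.StringIO("zip,county\n2134,La Salle County\n78701,LaSalle County\n")
rows = list(csv.DictReader(sample))  # every value arrives as a string

fixed_zips = [restore_zip(row["zip"]) for row in rows]
print(fixed_zips)

# The same key for two different spellings means one of them needs editing.
same_county = county_key(rows[0]["county"]) == county_key(rows[1]["county"])
print(same_county)
```

The comparison key only helps you find the mismatches. The merge still needs the actual values to be identical, so you have to pick one spelling and fix the other file.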

Saturday, January 4, 2014

Data-related blogs, sites and Twitter accounts

In no particular order, and sure to change, these are blogs and sites that you should read regularly for ideas, lessons and knowledge about data and journalism.


Twitter accounts and people worth following
Some Tableau-centric data blogs