Thursday, May 16, 2013

One Bedroom Apartment Listing Prices, San Francisco, 2012

If you click on a neighborhood in the map, you get an estimate of what you *should* be making to live there, based on the assumption that rent is 1/3 of your income.
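The arithmetic behind that estimate is simple: if rent is one third of income, the implied annual income is the monthly rent times three, times twelve months. Here is a minimal sketch of it. The function name and the $2,500 example rent are illustrative assumptions, not values taken from the map.

```python
def implied_annual_income(monthly_rent, income_to_rent_ratio=3):
    """Annual income at which monthly_rent is 1/income_to_rent_ratio of monthly income."""
    monthly_income = monthly_rent * income_to_rent_ratio
    return monthly_income * 12


# Hypothetical example: a $2,500/month one bedroom.
example_income = implied_annual_income(2500)
print(example_income)
```

Under the one-third rule, a $2,500 listing implies an income of $90,000 a year.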

The map lists the median listing price for a one bedroom apartment in the second half of 2012 for each neighborhood. The marbled effect comes from the data itself -- the higher the listing prices in a particular area, the darker the blue. So if you are surprised by the high median listing price in Bayview, zoom in on that neighborhood and you will see that the high rents are clustered around redevelopments near Mission Bay and Candlestick Park. I also visualized the distribution of listing prices as a violin plot to give you a better sense of this:

One Bedroom Rental Listings - Distribution by Neighborhood
This map got a lot of attention: it was lavished with praise by Burrito Justice, called 'terrifying' by the Huffington Post, and lightly teased by Gizmodo because I left in a few stray geocodes that landed in Golden Gate Park (why not?). It was posted twice in a row on Reddit, in the San Francisco subreddit, once with a link to my Tumblr, where I explained a bit about my methodology and sourcing, and once with a link to the MapBox hosting site.

I learned a lot from the comments there. People were respectful, if skeptical, particularly in the thread with no contextualizing metadata. It is weird to read about yourself referred to as 'they', but you do see how readily people slide into othering when they depersonalize you in that way. It was nice to see how much more intelligent the discussion got in the thread that linked to my Tumblr, since people there had the metadata explaining how the map was created. And most of all, it was interesting to see how many people did not get what the map is about. Hopefully, I have done a better job explaining it here than I did in my original Tumblr posting, which was hastily executed as I had a lot of other stuff going on at the time.

I acquired the data used for this analysis last December and immediately began work on an analysis, which I had already completed when this happened:

In February, SFist published this infographic/advertisement for Zumper displaying the average rent in San Francisco.

SFist Map


Sorry guys. Blech. Average is the wrong statistic and one month of data is not enough. Rather than sit on my hands, I used leftover data from the set acquired from PadMapper to try to make a better map. The results are what you see above.

Methodology details, from mapnostic.tumblr.com:
Here is why my analysis of devastatingly high housing prices in San Francisco is better than all the others that came out in the past two weeks:
1. I used a dataset that spans a longer time range than one month, so that the sample sizes per neighborhood are larger. My data spans the second half of 2012.
2. I filtered for duplicates, so that the same apartment listed multiple times does not skew the data.
3. I used median instead of mean, which makes sense when you are dealing with a lot of outliers. You should be suspicious every time you see some shocking statistic about the “average rent” in San Francisco. The high end skews things quite a bit.
4. I am actually explaining my methods and assumptions.
5. Instead of a silly choropleth, I made a super-pretty heat map based on the spatial variability of listing prices. The interpolation is based on ordinary kriging, and I underlay it with contours to bring out the variation a bit more.
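To make points 2 and 3 concrete, here is a minimal sketch of filtering duplicates and then comparing median to mean for one neighborhood. The listings, the `listing_id` key used to spot reposts, and the function names are all made up for illustration; this is not the actual PadMapper data or the exact duplicate filter used for the map.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical listings: (listing_id, neighborhood, monthly_rent).
listings = [
    ("a1", "Mission", 2400),
    ("a1", "Mission", 2400),  # same apartment posted a second time
    ("a2", "Mission", 2600),
    ("a3", "Mission", 2500),
    ("a4", "Mission", 9000),  # high-end outlier
]


def dedupe(rows):
    """Keep only the first occurrence of each listing_id (assumed duplicate key)."""
    seen = set()
    unique = []
    for row in rows:
        if row[0] not in seen:
            seen.add(row[0])
            unique.append(row)
    return unique


def rent_stats(rows):
    """Return {neighborhood: (median_rent, mean_rent)}."""
    by_hood = defaultdict(list)
    for _, hood, rent in rows:
        by_hood[hood].append(rent)
    return {hood: (median(rents), mean(rents)) for hood, rents in by_hood.items()}


stats = rent_stats(dedupe(listings))
print(stats["Mission"])
```

In this toy sample the single $9,000 listing pulls the mean up to $4,125, while the median stays at $2,550, close to what most of the listings actually ask.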
As with several previous projects, I designed this map in TileMill, am hosting it via MapBox, and overlaid it on a hard-working OpenStreetMap. For the raster processing I used ArcGIS and QGIS. The data was kindly donated by PadMapper.com.