Building AthleteDataViz - Using Hosted Services (Part 1)

As part of an ongoing series, we will cover the interesting steps needed to build awesome visuals as part of the AthleteDataViz web application.

Download the project code on GitHub at https://github.com/ryanbaumann/athletedataviz

First, the architecture.  Where are we going?  We started by sketching out the functional requirements of the idea and the potential tools we could use to implement them (on a whiteboard, of course!).

Whiteboards.  Reality_to_whiteboard_efficiency = 1.0

How can we take this list of functions and tools and turn it into something concrete?

First, we look for "groupings" of functionality.  Fetching, storing, analyzing, and visualizing data all require some web infrastructure and a software platform to build on.  Since that infrastructure spans all of our functions (fetch, store, analyze, etc.), we should look for a tool that can help us manage the whole workflow, if one exists.

Luckily, the Platform as a Service (PaaS) business has been booming, and there are a ton of options for hosting your app, database, and servers.  The PaaS options we researched included Red Hat's OpenShift, Heroku, DigitalOcean, and Amazon AWS Elastic Beanstalk.  We spent significant time on OpenShift and Heroku in particular.  In the end, Heroku won the PaaS battle thanks to its excellent developer documentation and pricing roughly equivalent to OpenShift's.  DigitalOcean and AWS Elastic Beanstalk were good options, but they require more "sysadmin" time: those services mainly provide the server infrastructure, not the tools to manage the software you build on top of it.
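
As a rough illustration of how little Heroku asks of you, a Flask app served by gunicorn typically needs only a one-line Procfile telling Heroku how to start the web process (assuming the application object lives in a module named app):

```
web: gunicorn app:app
```

Heroku detects a Python app from its requirements.txt, installs the dependencies, and runs that command on deploy; the equivalent setup on a bare VPS would be ours to script and maintain.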

Next, we looked at the Fetch Data, Store Data, and Analyze Data functions.  We needed a web framework that could handle the back end for all three.  We chose Python with the Flask web framework because it easily handles fetching and storing data and is especially strong for analysis - it was also the programming language our team had the most experience with.
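
To make that concrete, here is a minimal sketch of what a Flask back end covering those functions could look like.  The route names, the in-memory store, and the summary statistic are hypothetical placeholders, not the actual AthleteDataViz code:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory stand-in for the real data store (PostGIS in the actual app)
ACTIVITIES = []

@app.route('/fetch', methods=['POST'])
def fetch_data():
    """Accept activity data posted by the client and store it."""
    activity = request.get_json()
    ACTIVITIES.append(activity)
    return jsonify({'stored': len(ACTIVITIES)}), 201

@app.route('/analyze', methods=['GET'])
def analyze_data():
    """Return a simple summary statistic over the stored activities."""
    total_distance = sum(a.get('distance', 0) for a in ACTIVITIES)
    return jsonify({'activities': len(ACTIVITIES),
                    'total_distance': total_distance})

if __name__ == '__main__':
    app.run(debug=True)
```

Running the module starts a development server; the real app swaps the in-memory list for the PostGIS database described below.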

Moving on to Visualize Data - there are lots of options here.  In essence, we don't want to limit ourselves to any one tool.  Rather, we want to set up a framework where we can pick the best data viz tool for the job and plug our users' data into it.  That means setting requirements for the Store and Analyze Data functions so they can easily adapt to deliver data to a new viz source.  For that reason, we chose to store our data in a PostGIS database, which lets us perform geospatial and time-series analysis and return the data to the viz tool in exactly the format it needs.  For example, Mapbox can retrieve the data via JavaScript as GeoJSON; Tableau can query it as lat/long pairs and use its own mapping engine; and Plotly can query it as JSON for time-series visualization.
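
As a sketch of that idea, PostGIS can serialize geometries straight to GeoJSON with ST_AsGeoJSON, so the back end can hand Mapbox exactly the format it wants.  The example below uses psycopg2 and assumes a hypothetical activity_points table with geom and recorded_at columns and placeholder connection details:

```python
import json
import psycopg2

# Connection string and table/column names are placeholders for illustration.
conn = psycopg2.connect("dbname=adv user=adv_user")

def points_as_geojson(activity_id):
    """Return an activity's GPS points as a GeoJSON FeatureCollection
    ready for a client like Mapbox to render."""
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT ST_AsGeoJSON(geom), recorded_at
            FROM activity_points
            WHERE activity_id = %s
            ORDER BY recorded_at
            """,
            (activity_id,))
        features = [
            {"type": "Feature",
             "geometry": json.loads(geom),
             "properties": {"recorded_at": ts.isoformat()}}
            for geom, ts in cur.fetchall()]
    return {"type": "FeatureCollection", "features": features}
```

The same database could just as easily expose a lat/long view for Tableau or a plain JSON time series for Plotly - the storage layer stays the same, only the query changes.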

Finally, Buy Product.  To serve our customers with a great product, we need excellent partners who turn images into amazing products like glass prints, metal prints, t-shirts, and more.  We also need services to handle purchases and a point-of-sale system.  While we continue to research and evaluate business partners, we took a stab at finding well-established businesses that can meet our customers' needs.

A bit of risk, a bit of luck, held together by a flexible but well-thought-out plan - it's all part of trying to turn an idea into a business!

In part 2, we'll dive into some of the aspects of building the web application using Python, Flask, and PostGIS.