Luke Stanke

Data Science – Analytics – Psychometrics – Applied Statistics

Impressions of Climate Change

NFL Analytical MVP 2016

Which NFL tickets are hottest?

5 tips for mobile dashboards (that are good for any device)

Designing mobile data tools can be intimidating, particularly because we think we don’t have a lot of space to tell the same story we would on other devices. The format of data tools – including dashboards – on phones can appear rather limiting, but that’s just a myth. While it would be nice to be device agnostic – to ignore the different ways data is now consumed (phones, tablets, desktops, blah, blah, blah) – we just can’t. It’s not best practice. We consume information differently on each device, so we need to design around each experience. Given the shifting landscape of how we consume information (hint: it’s increasingly mobile), we need to develop appropriate data tools now. With that, here are five device-agnostic data tool development best practices I follow (prioritized here because of mobile design).

Tell a clear and direct, but guided story.

This is my number one rule for any dashboard or data tool. Remember that our tools should answer the initial question asked by our stakeholders. But as we answer the question we should also shine a light on what is likely a deeper, actionable issue. To get to the actionable issue we need to provide context – and this means allowing users to “explore” the data. I use the term explore loosely: we want to give users the feel that they are diving into the data and blazing their own trail, but in reality we have curated the data and are guiding them through the story. This approach is similar to the one followed by the authors of Data Fluency. Don’t be afraid to come up with a storyboard of how you want to guide stakeholders.

Use the least astonishing interaction: scroll first, then swipe or tap.

Consider the principle of least astonishment: if a necessary feature has a high astonishment factor, it may be necessary to redesign the feature. This means keeping it simple and choosing what audiences expect, which is usually scrolling down/swiping up. From a storytelling point of view, scrolling down/swiping up keeps the story on a single page. This makes going back and re-reading or re-interpreting something a lot easier than swiping or tapping.

When it comes to dropdown menus, use parsimony.

First, try to avoid dropdowns altogether. But if you need a number: limit yourself to three dropdown menus. If you can get users to the exact source of data they need, do that. Dropdowns that apply filters or parameters make visualizations complicated. It’s not that dropdowns are bad; they just need to be customized for a mobile device. Affording each dropdown its necessary space takes away from the functionality of the data tool. Don’t forget that dropdown menus have low discoverability – meaning you have to touch a dropdown menu to know it’s there. One last thing: users like to see all of the options in a dropdown, so consider that space as well.

Cursors don’t exist, so skip hover functionality.

With desktop dashboards and other data tools we often hide additional data in tooltips (that extra stuff that shows up in a small box when we hover over or click on something). Sometimes tooltips show up on hover of a cursor – when you are using a desktop with a mouse, of course – and other times they show up with a click or a tap. Once again, consider discoverability: since it’s impossible to tell when a visualization is going to have a tooltip – unless you explicitly state it – it’s best to avoid them.

Let the data and visualizations breathe.

Keep your visualization organized: grids are good. Larger fonts are good. Consistent formatting is good. And don’t forget white space is good. Yes. White space. We don’t need to cram things in – but you already knew that. If we give our audiences some space to process the findings, then we don’t need to simplify our data tools down to a few basic charts. More complexity should be accompanied by more whitespace!

Just one last thing: designing for a mobile format is a blessing. We don’t feel the same obligation to fill empty space on a dashboard with things that don’t add value. Mobile tools force us to think about what’s important and actionable. This pressure allows us to make better visualizations and better tools, which should – if done right – lead to better outcomes for our stakeholders.

Most Downloaded R Packages of 2016

I was curious which packages are downloaded most from CRAN. A quick Google search of “most downloaded R packages 2016” produces outdated articles (see here, here, and here). Luckily, the RStudio CRAN mirror keeps logs of the number of users downloading each package each day. With a few lines of code the data can be downloaded and aggregated; the aggregated data was then placed in Tableau, where packages were given weekly ranks. Here are the results (with a sketch of the download step below):
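The exact script isn’t reproduced in the post, but the download-and-count step looks roughly like this. This is a minimal sketch, assuming the mirror’s public log layout (one gzipped CSV per day) and the readr and dplyr packages:

library(dplyr)

##  The mirror publishes one gzipped CSV of download logs per day
url <- 'http://cran-logs.rstudio.com/2016/2016-06-01.csv.gz'
tmp <- tempfile(fileext = '.csv.gz')
download.file(url, tmp, mode = 'wb')

##  Count downloads per package for a single day; looping over all
##  days of 2016 and binding the results gives totals for the year
readr::read_csv(tmp) %>%
  dplyr::count(package, sort = TRUE) %>%
  head(10)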

Considering the Possibilities: The Story Behind My IronViz Submission

Getting Started: The data

When developing my visualization for the IronViz competition I wasn’t sure where to begin. I thought that I needed a good dataset that would stand out from all the other competitors’. From early submissions, I saw a lot of competitors using similar datasets—I’m no different in this regard, but I think my process for getting there was much different. The data I decided to use was more a product of my design philosophy for this competition than of finding a unique, distinguishing dataset. I actually considered a number of really interesting ideas—all of which I pursued at some point during the month—but I decided to go a different route altogether.

Originally, I really focused on trying to find a very unique dataset. I scraped data from the webz to find the locations of major grocery stores and then determined the distance from the centroid of every census tract to the nearest grocery store. The plan was to look for trends in this data and make a color-coded clickbait map. I was actually almost done with the proof-of-concept when I stumbled upon a different dataset: as a Blue Apron user, I considered scraping recipe ratings and gathering insights about the best and worst meals. Finally, I gathered geodata on food deserts in the United States at the census-tract level, intending to identify patterns between food deserts and the characteristics of the people who live in them. I decided to save this as a future project, as well.

What bothered me about all of this: I was finding really cool datasets that would be very interesting in general, but they wouldn’t really showcase the qualities of Tableau Desktop. In one last search, I came across a National Geographic visualization about food consumption. I thought it was really interesting and wanted to see the underlying data. Luckily, National Geographic provided a link to FAOSTAT, a site that contains information about food consumption around the world. I wanted data about overall food habits in my data tool, but I kept focusing on and wondering about world meat consumption. I knew that meat consumption was a big issue—as populations develop they consume more meat, and generally speaking meat has a large impact on the environment. It also fit nicely with the theme #foodtipsmonth, so meat consumption became the focus of my visualization, as did showcasing the possibilities of data visualization in Tableau.

Tableau: The Focus of the Competition

Because it’s in written form, I feel like the context of this next section might be missed: Tableau is a great tool, but in some areas it falls short. Below, I discuss what I feel are some of those shortfalls.

Story Points. When developing this visualization—this data tool—I really wanted to tell the story of meat consumption across the world, the differences in meat consumption by major world regions, and the impact it has on the environment. I also wanted to highlight some features I wish were present in Tableau but, generally, are still not possible without a lot of work. I tried really hard to hack as much of the data tool as I could; I really think this speaks more to the competition than to having an interesting dataset.

The final design I came up with reminds me of my first-ever encounter with Tableau. I wanted to make a dynamic report combining both text and data-based graphics—because I believe we should go beyond just a dashboard; we as developers should create a tool with some context. Other experts like Stephanie Evergreen agree. When we create dashboards with lots of filters on the right side of the screen and multiple visuals scattered on a single page, the story within the data gets lost. I wanted to go back to the style of my first project in Tableau—a mini report with interesting graphics that enhance the story.

I really wanted to use the Story feature in Tableau. I’ve used it once before, but I found the functionality rather limited. Story Points uses a lot of screen real estate. The borders are too large. The captions take up too much space. And what if I wanted to put the caption at the bottom of the page? Or what if I didn’t want captions altogether? I see the concept as perfect both for mobile design and as a best practice for focusing users. I think it’d be really cool to use it on mobile, where swiping left or right moves to the next story point. I see all of these as opportunities to highlight in this competition. When I saw a tweet from Jewel Loree on how others use story points, it was the tipping point.

Interactive Text. In part because of my design philosophy, I think copy (text) is an integral part of letting data tell the story. It’s always been my desire to have text that is interactive and linked to charts in Tableau. For instance, if the text focuses on a specific subset of the data, and that subset is color-coded in a graphic, I might want to highlight that information by hovering my mouse over the text. When I do this, it highlights a specific part of the chart. This way the graphic and the copy are linked and interactive. We often see this right now in some of the HTML/JavaScript/D3.js designs at the New York Times, Wall Street Journal, and other major online news sources. With the data tool I developed here, I tried to create interactive text by embedding visualizations within the text to show exactly what I want. It’s not perfect, but I think the concept is there.

R Server. Finally, it’d be great to showcase R’s capabilities in Tableau Public, but R integration isn’t available there, so all the cool stuff I do in R has to be done ahead of time. This isn’t a big deal, but it’d be nice to showcase the possibilities. I use R A LOT. Often, if I want to do some cool graphic in Tableau, I prep the data ahead of time in R. This is no different. But it’d be great to be able to use it on Public. Here is 1/3 of the code I used for the IronViz challenge:

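##  meat_raw: FAOSTAT food-balance data loaded in an earlier step (object name assumed)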
meat_summary <- meat_raw %>%

##  Select only meat data from 1992 to 2011
  dplyr::filter(
    stringr::str_detect(Item, 'meat') | stringr::str_detect(Item, 'Meat') | stringr::str_detect(Item, 'Seafood'),
    Year <= 2011, Year >= 1992,
    Item != 'Meat'
  ) %>%

##  Edit variables
  dplyr::mutate(
    Item = ifelse(Item == 'Meat, Aquatic Mammals', 'Meat, Other', Item),
    Region = ifelse(Country == 'World', 'World', Region)
  ) %>%

##  Select specific columns
  dplyr::select(Region, Country, Item, Year, Value, Population, Total) %>%

##  Set aggregation level
  dplyr::group_by(Region, Country, Item, Year, Population) %>%

##  Create aggregated variables
  dplyr::summarise(
    Value = sum(Value),
    Total = sum(Total)
  ) %>%

##  Set new aggregation levels
  dplyr::ungroup() %>%
  dplyr::group_by(Region, Year, Item) %>%

##  Create aggregated variables
  dplyr::summarise(
    `Value Wt` = sum(Value*Population)/sum(Population),
    Total = sum(Total),
    Population = sum(Population)
  ) %>%
  dplyr::ungroup() %>%

##  Left join meat_id -- for sorting meats
  dplyr::left_join(meat_id) %>%

##  Order rows in data frame
  dplyr::arrange(Region, Year, meat_id) %>%

##  Set aggregation levels
  dplyr::group_by(Region, Year) %>%

##  Get cumulative sums of variables (for stacked bar charts)
  dplyr::mutate(
    ValueStacked = cumsum(`Value Wt`),
    TotalStacked = cumsum(Total)
  ) %>%

##  Set aggregation levels
  dplyr::ungroup() %>%
  dplyr::group_by(Region, Year) %>%

##  Create stream chart data (centered stacked values).
  dplyr::mutate(
    ValueSteam = ValueStacked - (max(ValueStacked)/2),
    TotalSteam = TotalStacked - (max(TotalStacked)/2)
  )

Other Design Considerations

Mobile or Desktop? I struggled with the layout. Should it be for mobile? Or should it be a larger tool made for a desktop? When I think about how most information is consumed today, I think mobile design. First, the constraint forces the designer to create a tool that clearly and concisely tells the story at hand. Second, designers are forced to create clean, relatively simple data graphics—whether it’s a bubble chart, a line graph, or some other way of displaying data. With desktop tools, we data tool designers are more likely to make lazy decisions and might not think about the full data tool experience.

Plus, a competitive event like the IronViz Feeder necessitates a zig while the competition zags.

Scrolling vs. Drilling down. When it comes to mobile design, you basically have two options: scroll or drill down. I’d normally design for scrolling. I really think that’s just what people like to do. Look at most websites today: most have infinite scrolling to keep the user engaged. The only reason I chose the drill-down route was to show off how I’d like to see Story Points look in the future.

Is seafood a meat? According to FAOSTAT: no. But if I asked a vegetarian whether they’d call it meat: maybe, but generally yes. When creating the tool, I had to decide whether or not to include seafood as a meat. Clearly, overconsumption of seafood, particularly fish, is an issue. So in the end I did include seafood and fish. Most other articles out there don’t consider seafood a meat, so my story is slightly different.

Bacon Charts. I was playing around with the data and made a mistake in R doing the calculations, but I liked the end result: I thought it generally resembled bacon. I added a cosine function in Tableau to give it a wave and came up with a food-themed bacon chart. In this chart, the whitespace represents an earlier time when meat consumption was lower. It also looks like bacon. MMmmmm, bacon. See, sometimes mistakes can be good. Bacon.

Testing for multiple browsers. I use Tableau Desktop on a Mac and I check my work on Tableau Public using Chrome. I always worry about rendering—fonts in particular. They tend to render differently on different systems, so when doing my work I also check how things render in Internet Explorer on a PC. Doing the quality assurance on this project was not fun; I kept finding that my interactive text boxes were not aligning in Internet Explorer like they were in Chrome. This meant lots of tweaks. Over a few hours I worked out all of the minutiae.

Polar Clock in Tableau

About/How to

If you’ve read anything about data visualization you probably have heard of Mike Bostock. He was the genius behind many great New York Times graphics and created D3.js – one of the best data visualization tools available. Recently he offered updated versions of his polar clocks, which were originally inspired by Gabriel Mak’s PolarClock screen saver. Seeing this, I challenged myself to build the clock in Tableau.

Bostock's Polar Clock II

Not knowing how to do this, I decided to build a versatile dataset that could handle the job. The dataset has 606 rows, broken into three 202-row groups for hours, minutes, and seconds. Within each group of 202 rows, 201 are devoted to the trail/tail and 1 row is devoted to the main time location. Each row has 6 columns: 1) an index column; 2) a degrees column, where row 201 for a given group is equal to 90; 3) a radius column, with all values set to 1; 4) a size column, which I intended to use for setting the width of the lines; 5) a factor column to indicate whether a row is the main time point or the tail; and 6) a segment column – either hour, minute, or second. A sketch of how such a scaffold could be built is below.
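For reference, here’s one way to build that scaffold in R. This is a minimal sketch: the column names and the exact degree scaling are assumptions based on the description above, not taken from the original file.

library(dplyr)

##  One plausible construction of the 606-row scaffold (names assumed)
polar_data <- tidyr::crossing(
  segment = c('hour', 'minute', 'second'),   #  three 202-row groups
  index   = 1:202
) %>%
  dplyr::mutate(
    factor  = dplyr::if_else(index == 202, 'point', 'tail'),     #  1 main point, 201 tail rows
    degrees = dplyr::if_else(index == 202, 0, index / 201 * 90), #  row 201 reaches 90
    radius  = 1,                                                 #  all radii set to 1
    size    = 1 - index / 202                                    #  candidate line-width taper
  )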

Polar Clock Data Screen Shot

Once loaded into Tableau, we just need to figure out the current time. This can be done with the NOW() function. From there, we need to get the current hour, minute, or second. Once again, Tableau has a great function called DATEPART() that can pull any part out of a date. Here is how you do it for minutes:

//  Get current minute
DATEPART('minute', NOW())

After getting this information, we just need to place it on the clock in the correct position — for the sake of this tutorial we will use degrees. On a clock face, 45 minutes is positioned 75% of the way around — or 270º from the start point. Degrees for hours are a bit more complicated — though not much — since the clock face shows 12 hours but a day has 24. I just used the modulo operator (%), which returns the remainder of a division. Here are the variables we’ll use later:

[Hour]
// Hours in degrees
(((DATEPART('hour', NOW()) + (DATEPART('minute', NOW())/60)) % 12)/12)*360

[Minute]
// Minutes in degrees
(DATEPART('minute',NOW())/60)*360

[Second]
// Seconds in degrees
(DATEPART('second',NOW())/60)*360

Once we know this, it’s just a bit of geometry to identify the x and y coordinates. Here’s the general formula for finding x and y coordinates given radius and degrees:

[x]
//  Get x coordinates from degrees and radius.
[radius]*SIN(RADIANS([degrees]))

[y]
//  Get y coordinates from degrees and radius.
[radius]*COS(RADIANS([degrees]))

Since we have all of the information about hours, minutes, and seconds in a column called [segment], we need to specify the coordinates for hours, minutes, and seconds in one column using a CASE function. I also want to give hours, minutes, and seconds each a different radius so they don’t overlap on my clock. I’m going to hard-code hours to a radius of 0.8, minutes to 1.0, and seconds to 1.2. When I make my clock this will put hours on the innermost ring and seconds on the outermost ring. Here’s the code for the x coordinates only:

[X]
//  X coordinates from time.
CASE [segment]
WHEN 'hour' THEN .8*SIN(RADIANS([Hour]))
WHEN 'minute' THEN SIN(RADIANS([Minute]))
WHEN 'second' THEN 1.2*SIN(RADIANS([Second]))
END

If I create a visual using this data you’ll see it’s not anything special.

Polar Clock - Incomplete Pt 1

I really need to add the tail to make it interesting. I can do this by modifying the code just slightly.

[X]
//  X coordinates from time with a tail.
CASE [segment]
WHEN 'hour' THEN .8*SIN(RADIANS([Hour]-([Deg])))
WHEN 'minute' THEN SIN(RADIANS([Minute]-([Deg])))
WHEN 'second' THEN 1.2*SIN(RADIANS([Second]-([Deg])))
END

Using this function, we can begin to have something that resembles the polar clock.

Tableau Polar Clock - Incomplete part 2

This could be good enough, but we want to create a parameter to control the tail length — let’s call the parameter [Length] to make it easier — and set the range from 0.1 to 2. We just need to add it to our [X] variable:

[X]
//  X coordinates from time with a tail.
CASE [segment]
WHEN 'hour' THEN .8*SIN(RADIANS([Hour]-([Deg]*2*[Length])))
WHEN 'minute' THEN SIN(RADIANS([Minute]-([Deg]*2*[Length])))
WHEN 'second' THEN 1.2*SIN(RADIANS([Second]-([Deg]*2*[Length])))
END

To make things easier, let’s also add a background to the chart. Here is the image I used:

Polar Clock Background

If you set the min and max values to -1.5 and 1.5 for x and y it produces this chart:

Tableau Polar Clock - Incomplete Pt. 3

One last thing we want to add: a parameter to change the line width of the tails. We want it to either be tapered or blunt. So let’s create a [Taper] parameter where the values range from 0 to 1. Then we can write a function that uses the taper parameter:

[varied taper]
// Varies line width
([Taper]*[Size])+((1-[Taper])*(1-([Index]/202)))

Once I have all this information, I can create a polar clock with parameters to control the tail. I’ll also throw in some additional styling.

This is excellent, except for one thing — the clock only has the correct time when you load it. I want it to tick. Luckily, you can do this with Tableau’s JavaScript API. UPDATE: I’m providing the API script written by Allan Walker and Chris DeMartini below — here is the article that provides some additional guidance.

  var viz1;

  var onePh = document.getElementById("tableauViz1");


  var oneUrl = "https://public.tableau.com/views/PolarClock/PolarClock?:showVizHome=no&:display_spinner=no&:jsdebug=n&:embed=y&:display_overlay=no&:display_static_image=no&:animate_transition=yes"; //Polar Clock

  var oneOptions = {
    width: "105%",
    height: "112%",
    hideTabs: true,
    hideToolbar: true,
    onFirstInteractive: function () {}
  };

  viz1 = new tableauSoftware.Viz(onePh, oneUrl, oneOptions);

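  // Refresh the embedded viz every 500 ms so the clock ticks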
  function clock() {
    viz1.refreshDataAsync();
    setTimeout(clock, 500);
  }

  clock();

Adding this to my page — and doing some stuff to make it happen in WordPress — produced this as my final result:

Voronoi Diagrams with Weather Data

I wrote some code to make a Voronoi diagram from point data using NOAA weather stations. The original goal was to find the closest weather station given any point in the US. I decided to show some examples of the output in Tableau, with the underlying computation done in R. A sketch of the approach is below.
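The original script isn’t included in the post, so here is a minimal sketch of the idea in R using the deldir package; the station IDs and coordinates below are made up for illustration.

library(deldir)

##  Hypothetical station coordinates (illustrative only)
stations <- data.frame(
  id  = c('KMSP', 'KORD', 'KDEN', 'KSEA'),
  lon = c(-93.22, -87.90, -104.67, -122.31),
  lat = c(44.88, 41.98, 39.86, 47.45)
)

##  Voronoi tessellation of the station locations: each tile is the
##  region closer to its station than to any other station
vor   <- deldir(stations$lon, stations$lat)
tiles <- tile.list(vor)

##  vor$dirsgs holds the tessellation edges; exported to CSV, they
##  can be drawn in Tableau as lines or polygons
head(vor$dirsgs)

Note that deldir works in flat Euclidean coordinates, so for a true nearest-station map the longitude/latitude values should be projected first.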

Who is the best running back?