work data visualization
PEOPLE ON TEAM
DATA VIS COURSE
At DePaul University, HCI students are allowed to take DSC 465 Data Visualization, a Data Science course, as an elective. The professor required a group project at the end of the course, utilizing our newfound knowledge of data and visualizations. I will concentrate on one of the visualizations I created for this project.
The final project in the course was a group effort to build an original collection of visualizations for an appropriate dataset chosen by the group.
Our datasets, entitled “Biodiversity in National Parks”, are from the National Park Service. One dataset contains information on all 56 National Parks including park name, acreage, latitude and longitude. The second dataset contains information for 119K+ species within these National Parks, including category, abundance, and conservation status.
We drilled down into the data, found a statistically significant relationship between region and endangered species, and told this story with our visualizations.
Our efforts and final report were rewarded with a letter grade of A.
data exploration, statistical analysis, ux design, front-end development
process and methods
understand • define
- Our datasets included 56 National Parks, 119,248 species, 14 categories of species (e.g. mammals), and 8 levels of conservation status (e.g. endangered).
- We ran into some issues with our datasets, including a major one where some of the data was offset by a column because of incorrectly handled quotation marks.
- I quickly realized that we needed to introduce a new variable into the dataset, because it would be difficult to fit 56 parks into a single visualization (outside of a map).
- I introduced a new variable, Region, into the datasets. We started with 5 regions:
- We found the data heavily skewed towards the West region, which included both Alaska and the Pacific Islands. So I found a new set of 8 climatic regions from the Department of the Interior Science Center:
- North Central
- South Central
- Pacific Islands
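As a rough sketch of the two data steps above (not the code we actually used), the snippet below reads a parks file with Python's `csv` module, which respects quoted fields, rejects any row whose field count does not match the header, and then attaches a Region value. The sample rows, the acreage values, and the `REGION_BY_PARK` lookup are all made up for illustration; only the region names come from the list above.

```python
import csv
import io
from collections import Counter

# Made-up sample in the rough shape of the parks file (acreage values are placeholders).
SAMPLE = '''Park Name,Acres
"Hawaii Volcanoes National Park","100,000"
Badlands National Park,"100,000"
Big Bend National Park,"100,000"
'''

# Hypothetical park-to-region lookup, for illustration only.
REGION_BY_PARK = {
    "Hawaii Volcanoes National Park": "Pacific Islands",
    "Badlands National Park": "North Central",
    "Big Bend National Park": "South Central",
}


def load_parks(text):
    """Parse CSV text, reject misaligned rows, and add a Region column."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for line_no, fields in enumerate(reader, start=2):
        # A row with too many or too few fields usually means a quoting problem.
        if len(fields) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(fields)}"
            )
        row = dict(zip(header, fields))
        row["Region"] = REGION_BY_PARK.get(row["Park Name"], "Unknown")
        rows.append(row)
    return rows


parks = load_parks(SAMPLE)

# Counting parks per region makes a skew toward one region easy to spot.
region_counts = Counter(park["Region"] for park in parks)
```

Checking the field count on every row is a cheap way to catch the kind of column offset described above before it reaches a chart.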
explore • ideate
- Our exploratory analysis included bar charts, contingency tables, box plots, heat maps, tree maps and bubble charts for a range of variables within the datasets. Below are a few of the ones I created.
The visualizations are exploratory, which means they are quick and rough and created in bulk. Their purpose is to find a story, not to be displayed or used in a report/presentation.
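To illustrate the contingency-table step, here is a minimal sketch that cross-tabulates region against conservation status and computes a chi-square statistic for independence. The write-up above does not say which test we ran, so treat the chi-square test as an assumption; the records and the names "Region A" and "Region B" are made up and are not results from the project.

```python
from collections import Counter

# Made-up (region, status) records, for illustration only.
records = (
    [("Region A", "Endangered")] * 10
    + [("Region A", "Other")] * 90
    + [("Region B", "Endangered")] * 30
    + [("Region B", "Other")] * 70
)


def contingency(pairs):
    """Return row labels, column labels, and a table of observed counts."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if region and status were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


row_labels, col_labels, table = contingency(records)
stat, dof = chi_square(table)
```

With one degree of freedom, a statistic above roughly 3.84 would be significant at the 0.05 level.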
explore • prototype
- A sunburst chart would be a good fit for our hierarchical and categorical data. A sunburst chart is built in rings with the inner ring being the top level categories.
On GitHub I found SunburstR, which can be used to create this type of chart in R.
Below is an image and a link to the first draft of the chart built using this code:
- From here, I edited the data to compress the 3 levels into just 2 and manually edited the resulting html to function differently for my prototype.
After class was over, I updated the prototype to include an onboarding sequence to help orient the user to the graph and showcase the interactive element.
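The data reshaping behind the sunburst can be sketched as follows. Sunburst sequence charts of this kind are typically fed one hyphen-delimited path per row plus a count, so the snippet builds those paths from (region, category, status) rows and then compresses three levels into two. The rows are made up, and which two levels to keep (region and status here) is an assumption for illustration, not a record of what the prototype used.

```python
from collections import Counter

# Made-up (region, category, status) rows, for illustration only.
rows = [
    ("Pacific Islands", "Bird", "Endangered"),
    ("Pacific Islands", "Bird", "Endangered"),
    ("Pacific Islands", "Mammal", "Threatened"),
    ("North Central", "Mammal", "Endangered"),
]


def sequence_counts(records, levels):
    """Count records per hyphen-joined path built from the chosen levels."""
    counts = Counter()
    for record in records:
        # Hyphens inside a label would be read as level separators, so replace them.
        parts = [record[i].replace("-", " ") for i in levels]
        counts["-".join(parts)] += 1
    return sorted(counts.items())


# Full hierarchy: region > category > status.
three_level = sequence_counts(rows, (0, 1, 2))

# Compressed hierarchy: region > status.
two_level = sequence_counts(rows, (0, 2))
```

Each ring of the chart corresponds to one position in the path, so dropping a level from the path removes a ring.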
- We haven’t even scratched the surface of this dataset.
- Categorical data, while not glamorous, can still be statistically significant and hold important information.
- Graphs suited to categorical data can actually be interesting and look great, and can, in some cases, be easier for a general audience to understand than a scatterplot.
- It’s easy to crank out random bar charts, but it isn’t easy to isolate a story worth telling, and it is even more difficult to tell that story with professional-looking visualizations. I have a greater respect for the polished, well-functioning visualizations we have seen as examples in class and in the textbooks.
- It is also easy to screw up your data! In this class I “broke” our data in a few ways - through manipulation into another form, not understanding how the calculations in Tableau work, or just resizing things in Illustrator. Luckily I caught my own issues (I think?), but it was a great lesson to see how easy it is to do this and to always be on the lookout for things that don’t add up.
- With 119K rows of data, it is hard to “look” through it to find anomalies or issues, and I wish I were more skilled at spotting things like that. I need to practice in this area.
- I started my Master’s degree with DePaul in the HCI program, but I find myself (after this class and HCI 512 Infographics) being drawn to this data vis world - hopefully there are great jobs out there where I get to work with both UX and data visualization.