The Relevance of Data and Visualization in the COVID-19 Crisis
Part 1: Visualization through Spotfire: From Insights to Information
Author: Julius Gamboa
We are now in the 3rd week of community quarantine here in the Philippines. My mind is racing on a Tuesday evening around 10:30 PM. I can’t help but feel restless. While I enjoy the comfort of my home, the real heroes are out there risking their lives for me.
There’s also a lot of uncertainty. I want to do something right now, but I know that staying put at home is the biggest help I can give. Still, I want to do more. What can I do? Is there a way I can help? While front liners are doing what they do best, maybe I should too.
One of the best ways to be prepared for these kinds of situations is to have the right information at the right time. That’s it! I work in IT and one of the key areas I focus on is Data Analytics.
Now it’s all about getting insights from the data to support decision-making. For the first part of this mini-series, my goal is simply to start visualizing the COVID-19 data to help paint a picture of the local situation, and perhaps it could somehow inform others. As the saying goes, “A picture paints a thousand words”, so let me get started right away.
First, I need data. Where do I get it? I found a good data set in a Facebook group called “Data Science Philippines Discussion Group”. Yes, it’s public and anyone can request to join. Credits to John Rich Nicdao and his post in the group, where I learned how to get the data.
The data is being updated daily. The following shows the data set in Google Sheets format. It has two sheets. The first is the “Cases” sheet, which contains row-by-row information about each reported case.
The second sheet named “Historical” contains a daily summary of the data by reported cases, deaths, recoveries etc.
As a Business Intelligence & Data Analytics Technology Manager, I know a data visualization tool called “Spotfire” that I can use for the data set. The first thing I want to do is see a picture of what is happening. The second sheet is perfect for Time-Series Analysis/Forecasting. Time-Series Analysis, at its most basic, means plotting the data as a line against time.
I downloaded a copy of the Google Sheets data and fired up the data visualization tool on my laptop. You can download a trial copy of the software with the link below.
*FREE TRIAL* button
Adding data is as easy as browsing the file in my “Downloads” directory and choosing the downloaded Google sheet file. The tool takes care of parsing the file.
After loading the data set in the tool, I am presented with several options.
One of the cool features of this tool is that I can immediately start searching the data, and it will instantly try to answer based on what it finds in the data. I start by clicking “Explore by searching” and a Google-like search bar pops up, as shown below.
I type the word “status” in the search bar and it immediately returns some recommended visualizations.
It is also interesting to see that the tool detected a possible relationship between the column “Status” and another column named “Health Status”, and it suggests that I visualize the data using a bar chart. I click that bar chart recommendation and voila! I have created a visualization in a matter of seconds.
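For readers who want to see what such a bar chart is counting, here is a rough sketch in plain Python. It is not how Spotfire works internally. The column names “Status” and “Health Status” come from the data set, but the sample rows and their values are made up for illustration, and the helper names `count_by` and `text_bar_chart` are my own.

```python
from collections import Counter

# Made-up sample rows for illustration only; the real "Cases" sheet has
# one row per reported case and its actual values may differ.
cases = [
    {"Status": "Confirmed", "Health Status": "Admitted"},
    {"Status": "Confirmed", "Health Status": "Admitted"},
    {"Status": "Confirmed", "Health Status": "Recovered"},
    {"Status": "Confirmed", "Health Status": "Died"},
]


def count_by(rows, column):
    """Count how many rows fall under each distinct value of `column`."""
    return Counter(row[column] for row in rows)


def text_bar_chart(counts):
    """Render the counts as one line of '#' marks per category, largest first."""
    return "\n".join(
        f"{label:<10} {'#' * n} ({n})" for label, n in counts.most_common()
    )


health_counts = count_by(cases, "Health Status")
print(text_bar_chart(health_counts))
```

A bar chart is essentially this count per category drawn as bars, which is why the tool can recommend one as soon as it sees a column with only a few distinct values.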
After working for a short while, I was able to craft the following dashboard.
Mapping is also easy with the tool: you can overlay the data points on top of a map. I have also added a “feature” layer to outline the regions on the map. This is done using shape files, a standard format used by Geographic Information Systems (GIS). Shape files for the Philippines can easily be found with a Google search, and there are many free sources to download from. I got mine from the link below:
As I mentioned before, the second sheet is perfect for Time-Series Analysis/Forecasting. However, there is a slight issue with the data set I obtained. Notice that the first few rows have data for January 29 and 30, then skip January 31, and there are only 2 records for February.
In order to do forecasting, the data points have to be spaced at regular intervals, meaning the data must have a row for each date even if no cases were recorded on that date. So, I modified the data set and added rows to fill in every missing date of each month, as shown below.
Now that the data is updated, I can add it to Spotfire the same way I added the “Cases” data from the first sheet. I simply add a line chart and click on the out-of-the-box “Forecast” feature.
The tool automatically generates a fitted line using the Holt-Winters algorithm, along with a forecast line, as shown below. I have also tweaked the settings to show the confidence interval and to mark the value 5000 on the vertical axis.
Again, after just a few seconds, I came up with the following charts.
Interpreting this, it looks like the country will breach the 5000-case mark about a week from now (April 7). Another line chart shows that the number of deaths is higher than the number of recoveries, which might imply that we haven’t reached the peak of the curve yet.
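I added the missing rows by hand, but the same step can be sketched in a few lines of Python. This is only an illustration: the function name `fill_missing_dates` and the sample numbers are hypothetical, and filling gaps with 0 assumes the column holds daily new counts. For a cumulative total, you would carry the previous day's value forward instead.

```python
from datetime import date, timedelta


def fill_missing_dates(daily_counts, fill_value=0):
    """Return (date, value) pairs for every day between the first and last date.

    `daily_counts` maps a date to the value recorded on that date.
    Dates with no record get `fill_value`.
    """
    if not daily_counts:
        return []
    current, end = min(daily_counts), max(daily_counts)
    filled = []
    while current <= end:
        filled.append((current, daily_counts.get(current, fill_value)))
        current += timedelta(days=1)
    return filled


# Illustrative values only, not the real figures from the data set.
sparse = {
    date(2020, 1, 29): 1,
    date(2020, 1, 30): 1,
    date(2020, 2, 2): 1,
}

regular = fill_missing_dates(sparse)
for day, value in regular:
    print(day.isoformat(), value)
```

With the gaps filled, every row is exactly one day apart, which is the regular spacing that forecasting methods expect.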
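To give a feel for what a fitted line and a forecast line are, here is a simplified sketch of Holt's linear trend method, the non-seasonal relative of Holt-Winters. It is not Spotfire's implementation: it leaves out the seasonal component and the confidence interval, and the smoothing parameters `alpha` and `beta` are arbitrary values I picked for illustration.

```python
def holt_linear_forecast(series, alpha=0.5, beta=0.3, horizon=7):
    """Holt's linear trend method (double exponential smoothing).

    Returns (fitted, forecast):
      fitted   - one-step-ahead predictions for the observed points
      forecast - predictions for the next `horizon` points
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")

    # Simple initialisation: the first value is the level,
    # the first difference is the trend.
    level = series[0]
    trend = series[1] - series[0]
    fitted = [level]

    for value in series[1:]:
        # Prediction made before seeing the new value.
        fitted.append(level + trend)
        previous_level = level
        # Blend the new observation with the previous prediction.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend

    # Extend the last level along the last trend.
    forecast = [level + h * trend for h in range(1, horizon + 1)]
    return fitted, forecast


# Illustrative series only: a perfectly straight line is continued exactly.
fitted, forecast = holt_linear_forecast([10, 20, 30, 40, 50], horizon=3)
print([round(x, 2) for x in forecast])
```

On real case counts the series is not a straight line, so the result depends heavily on the chosen parameters, which is one reason the forecast should be read together with its confidence interval.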
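Reading the breach date off the chart amounts to finding the first forecast point at or above the 5000 line. A minimal sketch of that lookup follows. The function name `first_crossing` is hypothetical and the forecast values are made up, so the date it prints says nothing about the real data.

```python
from datetime import date, timedelta


def first_crossing(forecast, first_forecast_date, threshold):
    """Return the first date whose forecast value reaches `threshold`.

    `forecast` holds one value per day, starting at `first_forecast_date`.
    Returns None if the threshold is never reached within the forecast.
    """
    for offset, value in enumerate(forecast):
        if value >= threshold:
            return first_forecast_date + timedelta(days=offset)
    return None


# Made-up forecast values for illustration only.
example_forecast = [3000, 3400, 3900, 4400, 4800, 5100, 5600]
crossing = first_crossing(example_forecast, date(2020, 4, 1), 5000)
print(crossing)
```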
Things look gloomy right now. If there is any comfort I can find despite everything that is happening, it is a post I recently saw about the situation of Noah in the Bible: a family in ‘lockdown’.
For 40 days and 40 nights in the Ark, Noah’s family was confined in a boat. There were no windows, no balconies, no terraces, no internet, no phone, no TV, no YouTube, Facebook, or Netflix. They only heard the rain. They spent their time praying, loving each other and caring for the animals. God the Father took care of them, as Noah was a man of faith and obeyed His word. Remember, even though there is an ocean of viruses out there and life seems like a stormy ride, our God is watching over us! Do not be afraid! Be faithful to Him and wait patiently. The rain will stop one day. A rainbow will shine and all will be well again.
In my next post, I will try to do a more in-depth analysis of the situation.
Why not take this quarantine/lockdown period as an opportunity to learn something new?
*Let’s learn together* button
How will you benefit from learning this? You’re going to take one step toward learning Data Analytics, an essential skill in today’s Information Age. This skill is needed not just by individuals but by organizations as well. In these kinds of situations, a well-formed analytics framework can help make decisions in time and as needed. I’d like to quote some paragraphs from the following article written by Dean Stoecker:
“In the context of a global pandemic, the rules of the game change and businesses are suddenly asked to produce the same output with broken operations, limited resources and less clarity on what our world will look like in the next week, let alone the next month, quarter or year. In turbulent times, good information, good data and the capacity to derive good decisions with analytics are more critical than ever.
Analytics can serve as a stabilizer. Whether you’re a data worker in healthcare tasked with optimizing hospital capacity, in transportation tasked with re-evaluating flight schedules or in manufacturing tasked with determining supplier capacity, analytics can serve as both a trusted advisor and your most powerful defence when making a decision that has significant consequences. Data analytics, like many scientific practices, is often associated with fact, logic and precision, rather than emotion; and yet, it informs human decisions that impact personal outcomes (which patient gets admitted, whose flight home gets cancelled or who gets laid off when their company can’t meet production schedules.)”
Let’s get better together!