Tableau 10.3 adds PDF connector

Anyone remember the days of having to manually re-type PDF tables?

Before the days of open government data policies and other efforts to share data in useful formats, data was often locked up in PDF tables that required some heavy lifting to move into more analyzable formats. Various scraper wikis (that were at one point free) felt like game changers when I learned about them at Open Data Day a few years back. I've also been grateful for the CSV downloads that sites like the World Bank Databank provide to streamline the process of analyzing and visualizing that data.

Today, we've come a long way toward storing, sharing, and archiving data in more accessible formats, but some data is still housed in PDF tables. Visualization platforms like Tableau and Power BI are consistently rolling out new ways to connect to different data sources, but those updates often add connections to big data or enterprise-level systems that aren't as commonly used in the global health and development world.

In the new Tableau Desktop 10.3, you can connect directly to a PDF table to extract, analyze, and visualize your data. From Tableau's announcement:

"The PDF connector will allow you to connect to PDF files, identify tables, and let you treat this like any other data source within Tableau. With this connector, pre-processing data from PDF documents by brute force or copy-pasting is a thing of the past. Now you can connect to PDF documents like you can a text file, leverage all of Tableau’s awesome capabilities (cross data-source joins, parameters, and more), and build impactful visualizations with ease."

Currently, Tableau Desktop 10.3 is available only through a beta testing program. We'll post an update when the release is live. And remember, even if you don't have a Tableau Desktop license, you can always download Tableau Public, which gives you the same software with one limitation: you should use it only to visualize public-facing data.

