08/07/2015

Overcoming the challenges of data-driven journalism

by Courtney Stanley, ONA Intern

Writing a data-driven story challenges journalists to turn spreadsheets full of numbers or documents full of legal jargon into informative and interesting narratives. Two Ohio journalists discuss overcoming the challenges of data-driven journalism.

Where to begin

Data-driven stories typically take root in one of two ways, according to Doug Livingston, education writer at the Akron Beacon Journal: exploring and developing an issue you know exists or searching through data for clues on potential stories.

Ideas for fresh and relevant issues can come from knowing what is important to your readers, expanding on recent stories, or localizing wider reports with more data.

“I spend some time aimlessly wandering through reams of data,” Livingston said. He said that if you get stuck on a data set, it helps to rearrange the information and look at the same numbers in different ways.

Difficult data

Not all data is created equal. Organizations and groups store their data in different ways, and some data can be missing, compromised, or unavailable to the public.

Jill Riepenhoff, projects reporter at The Columbus Dispatch, requested code violations from a governmental agency and received a spreadsheet with around 300,000 rows of information for a five-year time span.

Riepenhoff was baffled. The spreadsheet took weeks to examine, but she finally found out what was wrong—the way the city maintained records created an enormous amount of data. One call to the city about a pothole on a private street had produced around 300 code violations, one for every resident on the street, which made 300 rows of information.

It was the most frustrating data set she’s worked with. “It looked like an electronic cabinet people just threw things in,” she said. But after all of the work, she found what she needed and the governmental agency decided to invest in a new system for their data storage.
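For readers who work with spreadsheets in code, the pothole example can be sketched as a simple count of rows per complaint. This is only an illustration, not the method Riepenhoff used: the field layout (complaint ID, street, resident) and every value below are invented, and a real data set would need its own way of recognizing which rows belong to the same call.

```python
from collections import defaultdict

# Hypothetical rows: (complaint_id, street, resident).
# All names and values are invented for illustration.
rows = [
    ("C-1", "Elm Ct", "Resident A"),
    ("C-1", "Elm Ct", "Resident B"),
    ("C-1", "Elm Ct", "Resident C"),
    ("C-2", "Oak St", "Resident D"),
]


def count_rows_per_complaint(records):
    """Return a dict mapping each complaint ID to the number of rows it produced."""
    counts = defaultdict(int)
    for complaint_id, _street, _resident in records:
        counts[complaint_id] += 1
    return dict(counts)


rows_per_complaint = count_rows_per_complaint(rows)
distinct_complaints = len(rows_per_complaint)

print(len(rows), "rows,", distinct_complaints, "distinct complaints")
```

A large gap between the number of rows and the number of distinct complaints is a hint that the records are inflated by the way they were entered rather than by the number of incidents.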

Livingston has had similar problems. When comparing data on graduation and test rates across the country, he has found data is not always comparable state by state because students are held to different standards across the country. Livingston suggested finding a reliable common denominator, in this case a nationally standardized test, to compare states.

Riepenhoff said one of the hardest parts of writing data-driven journalism is working around a lack of data.

“Sometimes you know there’s something wrong, but the hard part is going about exposing it,” she said.

In a series on credit reporting problems, Riepenhoff said she struggled to find data because the credit companies were private industries. She found a way to end-run the private companies by looking at consumer complaints to the attorney general. “Purely by dumb luck,” she said. Those dumb luck moments can save a story with no data to fuel the theory.

Compiling new data

In some cases, Livingston and Riepenhoff said, you’ll have to compile your own data.

While working on a series about adults placed under legal guardianship, Riepenhoff said she had to compile data from 1,500 case files by hand because the files were not kept electronically.

Instead of regurgitating issues the public already has data on, Livingston said he tries to research and write about data that no one else is talking about. This often means compiling data from many different sources, he said.

One topic he has looked into is students being hit by cars as they walk to and from school. Livingston looked up police reports of accidents, checked who was at fault, confirmed school enrollment numbers, and more to create the data he used to draw conclusions about the likelihood of a student being hit by a car on their walk to or from school.
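One way to turn compiled counts like these into a comparable figure is a rate per 1,000 enrolled students. The sketch below assumes that approach; the article does not say how Livingston calculated likelihood, and the district names and numbers are invented placeholders, not real data.

```python
def rate_per_1000(incidents, enrollment):
    """Return incidents per 1,000 enrolled students."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    return incidents * 1000 / enrollment


# Invented placeholder figures: (students hit by cars, enrollment).
districts = {
    "District A": (4, 20000),
    "District B": (3, 6000),
}

rates = {
    name: rate_per_1000(incidents, enrollment)
    for name, (incidents, enrollment) in districts.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} per 1,000 students")
```

In this made-up example, District B has fewer incidents but a higher rate, which is why dividing by enrollment matters before comparing schools or districts of different sizes.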

If data on the topic already exists, Livingston said he likes to do his own research, which he can check against the “so-called experts” rather than just reporting on their findings.

Connecting to your audience

When Riepenhoff and her team talked to their editor about their credit report story, even he worried about how they could possibly make community members care about the story. Even though it was an important topic that affected nearly everyone, “There couldn’t be anything more boring,” Riepenhoff said with a laugh.

Riepenhoff and her colleagues had to find a way to make the facts and figures real to people, and to do that they needed humans at the center of their story.

Through consumer reports and lawsuits, Riepenhoff and her team found and contacted people affected by credit report mistakes and stolen identities. The stories they heard were just what they needed to convey the reality of bad credit reports to the public.

They heard the story of a woman who, while trying to buy a car, found out her credit report mixed up her name with someone on a terrorist watch list. She was detained for hours while the car dealership said they were calling the FBI.

Another woman whose credit report was mixed with someone else’s with a similar name tried to fix her situation for 12 years before getting it straightened out—she couldn’t buy a car, couldn’t get loans for her daughter’s college, and almost lost her house.

Without data a story has no credibility, but without a human element readers won’t care about the story, Riepenhoff said.

Livingston agreed, saying it’s important to translate not only the meaning of the story, but also its impact on different community members. Ask what your story means to parents, taxpayers, property owners, lawmakers, policy advisors, and others, he said.

Once you find the human face behind the data, Livingston stressed it is important not to misrepresent that person by manipulating their story to fit your data.

The outcomes of data reporting

“We’ve been very lucky—our work has definitely led to many changes for the better,” Riepenhoff said. Her work with The Dispatch has helped put abusive attorneys under investigation or indictment, kick-started a 31-state attorney general investigation of credit companies, and pushed governmental agencies into better record keeping. Her partner, projects reporter Mike Wagner, has worked on exposing faulty DNA systems resulting in five or six wrongly convicted men being released from prison.

Other times the long and difficult data search produces fewer results.

Livingston was reviewing a report of national graduation rates when he found several thousand online charter school students in Ohio weren’t included. He said it would have been a disservice to the community to report the national numbers knowing thousands of Ohio students were excluded.

Although the hours spent checking the report did not result in an article, Livingston said the time was well-spent because he prevented the public from being misinformed.

Who does data reporting?

“I’m very privileged to work at the Dispatch where I have months to work on a story,” Riepenhoff said, but she added that it is a misconception that all data-driven stories take months to report.

Livingston said data reporter positions are a “disappearing luxury” in newsrooms. The Akron Beacon Journal has no designated data writing or researching positions. Instead, it relies on its reporters to find and investigate these types of stories, as Livingston does as education reporter.

Livingston said having the trust of his editors allows him to do the stories he thinks are most important for the public. “I’m not pressured to do certain stories,” he said.

Riepenhoff agreed that beat reporters should embrace data journalism. As experts of their sections, she said, “They’re the ones who can kick over the rocks that expose wrongdoing.”