Columbia Journalism Review: What is machine learning and why should I care?

YOU MAY NOT REALIZE IT, but you’ve probably already used machine learning technology in your journalism. Perhaps you used a service like Trint to transcribe your interviews, punched in some text for Google to translate, or converted the Mueller Report into readable text. And if you haven’t used it yourself, machine learning is probably at work in the bowels of your news organization, tagging text or photos so they can be found more easily, recommending articles on the company website or social media to optimize their reach or stickiness, or trying to predict who to target for subscription discounts.

Machine learning has already infiltrated some of the most prosaic tasks in journalism, speeding up routine work and making possible stories that might otherwise have been too onerous to report. We’re already living the machine-learning future. But, particularly on the editorial side, we’ve only begun to scratch the surface.

To be clear: I’m not here to hype you on a fabulous new technology. Sorry, machine learning is probably not going to save the news industry from its financial woes. But there’s nonetheless a lot of utility for journalists to discover within it. What else can machine learning do for the newsroom? How can journalists use it to enhance their editorial work in new ways? And what should they be wary of as they take up these powerful new tools?

The phrase “machine learning” describes a kind of finely crafted and engineered tool: software that learns to perform a task from examples rather than from hand-written rules. Trint, for example, is able to transcribe an audio clip because its algorithm has learned how patterns of sound correspond to patterns of letters and words. Such algorithms are trained on many hours of manually transcribed audio, and can then perform transcription on new samples of audio.
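For readers who want to see the idea in miniature, here is a toy sketch in Python of learning from labeled examples. It is not how Trint or any real transcription system works. The two-number "sound features," the labels "ah" and "ss," and the function names `train` and `predict` are all invented for illustration, and the method (averaging the examples for each label, then picking the closest average) is simply one of the simplest ways to learn from examples.

```python
import math
from collections import defaultdict


def train(examples):
    """Learn one average feature vector (a centroid) per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in totals]
        for label, totals in sums.items()
    }


def predict(model, features):
    """Return the label whose learned centroid is closest to the new input."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical, hand-labeled training data: (made-up sound features, label).
labeled_examples = [
    ([0.2, 0.9], "ah"),
    ([0.3, 0.8], "ah"),
    ([0.9, 0.2], "ss"),
    ([0.8, 0.1], "ss"),
]

model = train(labeled_examples)

# A new, unlabeled sample is assigned the label of the nearest learned pattern.
print(predict(model, [0.25, 0.7]))
```

The shape is the same as in the transcription case: people label examples by hand, the program extracts patterns from those labels, and the learned patterns are then applied to inputs it has never seen.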