Extracting Protest Events from Newspaper Articles with ChatGPT

Neal Caren, Kenneth T. Andrews, Rashawn Ray

August 2023

Abstract

This research note examines the ability of a large language model (LLM), ChatGPT, to extract structured data on protest events from media accounts. In an analysis of 500 articles on Black Lives Matter protests, we find that, after iterative prompt refinement on a training dataset, ChatGPT can produce data comparable to or better than a hand-coding method, with an enormous reduction in time and at minimal cost. While the technique has limitations, LLMs show promise and deserve further study for use in protest event analysis.

Type

Preprint

Publication

SocArXiv

Neal Caren

Associate Professor of Sociology