Uncovering Patterns in White Supremacist and Male Supremacist Online Discourses

August 21, 2023
Computational and quantitative tools are needed to handle the volume of communications on online platforms that foster hate. This post elaborates on work that CAH postdoc Michael Miller Yoder presented at the annual meeting of the Association for Computational Linguistics (ACL) and the Workshop on Online Abuse and Harms (WOAH) in July 2023. Preprints of these papers are available at links here and here.
Michael Yoder, CAH postdoc, discussing his poster with attendees.

To work against hateful ideologies, we must understand what is happening on the online platforms that develop and circulate them. This includes the narratives that emerge (a discourse perspective) and the connections and interactions among users (a social network perspective). Approaches from computational social science and natural language processing can help surface and analyze these narratives and interactions at the large scale of online social media and forums. Such methods can also be used to develop automated tools that identify and measure the spread and influence of hateful ideologies from online discourse and interactions. The two projects I discuss here focus on white supremacist extremism, an enduring hateful ideology, and on male supremacism in the more recent "incel" (involuntary celibate) social movement.

Our project focusing on white supremacy has two main contributions: a large text dataset of white supremacist extremist content, and a text classifier trained to distinguish white supremacist content from other online talk. We first assembled a large text dataset from a wide variety of online sources: platforms explicitly formed to promote white supremacy (such as Stormfront and Iron March) and text from white supremacist organizations on mainstream platforms such as Twitter and Discord. This dataset, available to researchers through a vetting process, can facilitate research on the dynamics of narratives and communities that draw people into white supremacist ideology. At over 200 million words, it can also be used to train machine learning classifiers to identify the nuances of white supremacist discourse.

Practitioners have called for hate speech classifiers that focus on specific ideologies (see our CAH white paper on research needs from practitioners working against hate and extremism). We take first steps in this direction by training classifiers to distinguish white supremacist text from online texts with similar structures (such as forums or tweets) that do not feature white supremacist ideologies. This approach outperforms prior work on white supremacist text identification. We also find that including anti-racist texts as counter-examples helps mitigate the bias against mere mentions of marginalized identities that commonly arises in hate speech classifiers.
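To make the classification setup concrete, here is a minimal, illustrative sketch of the general approach: train a bag-of-words classifier to separate one category of posts from structurally similar posts. This is not the authors' actual model or data; a simple Naive Bayes stand-in and innocuous placeholder topics are used instead of real forum text.

```python
# Minimal sketch (assumed, not the authors' pipeline): a bag-of-words
# Naive Bayes classifier that separates one category of posts from
# structurally similar posts. Innocuous stand-in topics replace real data.
import math
from collections import Counter

def train(docs_by_label):
    """Count word frequencies per label; return counts and shared vocabulary."""
    counts = {lbl: Counter(w for d in docs for w in d.split())
              for lbl, docs in docs_by_label.items()}
    vocab = {w for c in counts.values() for w in c}
    return counts, vocab

def predict(text, counts, vocab):
    """Return the label with the highest add-one-smoothed log-likelihood."""
    scores = {}
    for lbl, c in counts.items():
        total = sum(c.values())
        scores[lbl] = sum(
            math.log((c[w] + 1) / (total + len(vocab)))
            for w in text.split() if w in vocab)
    return max(scores, key=scores.get)

docs = {
    "target": ["planting tomatoes in raised garden beds",
               "soil mix for vegetable garden beds"],
    "other":  ["slow cooker recipe for weeknight dinners",
               "simmer the tomato sauce before serving"],
}
counts, vocab = train(docs)
print(predict("starting a vegetable garden", counts, vocab))  # → target
```

In the actual work, the "other" class would include structurally similar non-supremacist forum text and anti-racist counter-examples, so the model learns ideology rather than surface mentions of identity terms.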

Online communities of those calling themselves "involuntary celibates" (incels) are known for extreme misogyny that in some cases has led to mass violence targeting women. Because incels are a relatively new far-right social movement, we were interested in identifying some of the changing dynamics of their online discourse. In this project, we focused on identity terms as a lens into how incels frame their own identities and those of other identity-based groups. Across a dataset of over 6 million posts, we found that almost 30% of the identity terms used were novel and relatively specific to this community, which has implications for the automatic detection of this new form of misogyny. Though mentions of women are unsurprisingly very frequent across posts, we found that mentions of other marginalized identities, including Black people, Jewish people, and LGBTQ+ people, increased from 2017 to 2021. Users who are central in the platform's network of interactions appear to lead these changes. Analyzing the contexts in which these identities are mentioned, we found that these groups are often portrayed through harmful stereotypes. These findings suggest the consolidation of this newer movement with broader far-right ideologies.
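The core measurement here can be sketched in a few lines: tally mentions of identity terms per year across timestamped posts to track shifts in which groups a community talks about. The term list and posts below are toy placeholders, not the paper's actual lexicon or data.

```python
# Hedged sketch (assumed workflow, not the paper's code): count identity-term
# mentions per year in timestamped posts. Terms and posts are illustrative
# placeholders; the real study used a much larger, community-derived lexicon.
from collections import Counter, defaultdict

IDENTITY_TERMS = {"women", "normies"}  # placeholder lexicon

posts = [
    (2017, "the normies will never understand"),
    (2019, "women and normies ignore us"),
    (2021, "women and normies again"),
]

mentions = defaultdict(Counter)
for year, text in posts:
    for token in text.lower().split():
        if token in IDENTITY_TERMS:
            mentions[year][token] += 1

for year in sorted(mentions):
    print(year, dict(mentions[year]))
```

Comparing these per-year counts, broken down by how central a user is in the interaction network, is the kind of analysis that surfaced the trends described above.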
