Yesterday I got my first 🥕. I officially joined the swimmers club!
Confused? Meet ‘algospeak’, a coded language sometimes used online by people trying to circumvent content moderation systems. Some call it the rise of a new internet-driven Aesopian language, or even a modern form of steganography; others refer to it as a new form of ‘Voldemorting’. Regardless of terminology, society’s tendency to record everyday life and upload it to social media is putting unprecedented strain on platforms, which are responsible for speedily reviewing and accurately removing harmful content that violates their guidelines.
The situation with algospeak is complex: a given word can have multiple meanings and can be used in a variety of contexts. While bad actors do use it to promote violence and spread misinformation, not all algospeak use equates to harm. For some communities it is the only way to talk safely about sensitive subjects. Either way, it presents a true moderation challenge.
The practice of using coded language to subvert communication systems is not new; people have encoded language in different ways for centuries. Today, simple examples include puns, or the visual codes often found in memes.
So, what is unique about algospeak? It is a byproduct of social media’s algorithmic filtering. When we use communication technologies we inevitably adapt to their limitations, often without noticing that we end up communicating on the technology’s terms, not our own. In the 2000s, texting, enabled by the numeric keypad, became mainstream. But the character limit per text message led to the rise of ‘txtspk’, with abbreviations such as brb, afaik and ttyl, along with emoticons, becoming widespread.
All this to say that today’s communication culture is controlled by the mediums we use and governed by the underlying algorithms. As social media platforms are used to translate ideas into shareable media, they moderate creator content, deciding what gets down-ranked, demonetized or goes viral. Consequently, creators feel compelled to tailor their content not to their followers, but to the algorithm itself. Hence, the term ‘algospeak’ — it’s as if you are talking to the algorithm.
So why do content creators need to speak to a platform’s algorithm? It’s usually to gain more visibility. A platform’s algorithm may decrease a creator’s visibility for a) legal or political reasons or b) social, moral and community reasons.
Today, common examples of content at risk of being flagged include:
At first glance, these are all legitimate reasons to flag content. However, the problem is more complex. Currently, most moderation algorithms are unable to distinguish between a creator talking about their experience with hate speech and a bad actor spreading hate speech. The result? Content is flagged either way and creators use algospeak, ‘just in case’.
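To make this failure mode concrete, here is a minimal sketch in Python of a context-blind keyword filter. It is a hypothetical toy, not any platform’s actual system, and the blocklist and example posts are invented for illustration.

```python
# Minimal sketch of a context-blind keyword filter (hypothetical toy,
# not any platform's actual moderation system). It flags any post that
# contains a blocked term, regardless of who is speaking or why.

BLOCKED_TERMS = {"hate speech"}  # illustrative blocklist only

def is_flagged(post: str) -> bool:
    text = post.lower()
    return any(term in text for term in BLOCKED_TERMS)

posts = [
    "I want to talk about my experience of receiving hate speech online.",  # benign
    "Join us in spreading hate speech against them.",                       # harmful
]

for post in posts:
    print(is_flagged(post), "->", post)

# Both posts print True: the filter sees the term but not the context,
# which is exactly the false-positive pattern that pushes creators
# towards algospeak 'just in case'.
```

Real systems are far more sophisticated than a keyword list, but the underlying difficulty of inferring intent from surface text remains.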
In fact, a survey shows that spikes in algospeak often occur during polarising social events, with usage following this logic:
Certain, often marginalised, online communities resort to algospeak as a way to communicate safely, without their content being flagged or down-ranked. For example, LGBTQ communities may avoid the words ‘gay’ or ‘lesbian’, believing they trigger demonetization, so they may replace ‘lesbian’ with ‘le dollar bean’ and ‘LGBTQ’ with ‘leg booty’.
Similarly, conversations about women’s health (e.g. pregnancy, menstruation etc.) are reportedly being down-ranked, resulting in creators replacing ‘nipples’ with ‘nip nops.’
It is problematic when benign use of certain terms results in moderation false positives, pushing these communities to increase their use of code words. There is a big difference between sharing experiences about sexuality and identity and being explicitly homophobic. The problem of content moderation systems treating these instances as interchangeable is often down to a poor grasp of context.
Part of the challenge with algospeak is that it can also be used with bad intentions, like spreading misinformation, promoting violence and putting someone’s life at risk. For example, an anti-vax Facebook group with over 200,000 members used the carrot emoji to hide content from Facebook’s moderation system. Regardless of the legitimacy of COVID-19 vaccines, the problem is that these groups shared unverified claims about people dying from the vaccine.
In case you are wondering who decides when and how to use algospeak, in this instance the group’s ‘rules’ stated that “members have to use code words for everything, specifically the c word, v word or b word” (c for covid, v for vaccine and b for booster). Here, emojis are used as code words, which poses yet another interesting content moderation challenge.
In fact, researcher Hannah Rose Kirk reveals how social media content moderation algorithms are primarily trained on words and text rather than emojis. To compensate for this gap, Hannah and her team developed HatemojiCheck, a tool designed to tackle emoji-based hate.
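To see why this matters in practice, here is a hypothetical sketch, unrelated to HatemojiCheck’s actual implementation, of a text-only pipeline whose preprocessing silently strips emojis, letting an emoji standing in for ‘vaccine’ slip past a keyword check.

```python
# Hypothetical illustration of the emoji gap, not HatemojiCheck itself:
# a pipeline that keeps only letters, digits and spaces before matching
# keywords never even sees the emoji standing in for 'vaccine'.
import re

BLOCKED_TERMS = {"vaccine", "booster"}  # illustrative blocklist only

def normalise(post: str) -> str:
    # Lowercase, then drop everything that is not a letter, digit or space.
    return re.sub(r"[^a-z0-9 ]", "", post.lower())

def is_flagged(post: str) -> bool:
    return any(term in normalise(post) for term in BLOCKED_TERMS)

print(is_flagged("The vaccine caused this."))  # True: the plain word is caught
print(is_flagged("The 💉 caused this."))       # False: the emoji slips through
```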
This is an important and much needed advancement. In the USA, the Drug Enforcement Administration (DEA) published a poster revealing how drug dealers exploit this moderation gap, using emojis as code words for specific drugs.
The biggest challenge? These trends have an extremely short lifespan and rapid turnover: by the time a code word lands on the DEA’s desk, new emojis and terms are already in use.
Content creators are the most affected by content moderation systems, as they often make a living from online platforms. Without sufficient transparency into algorithmic decision-making, creators turn to creative misuse: the practice of re-imagining tactical ways of using a technological tool. Zuck Got Me For acts as a manual for creative misuse, featuring a ‘cemetery of banned content’, ‘tips for posting without getting flagged’ and an ‘algorithms explained’ section.
Today, the more platforms strengthen their moderation systems, the more certain users adopt conventions that allow them to circumvent these restrictions. A study analysing 2.5 million Instagram posts from a pro-eating disorder (pro-ED) community reveals that the more the platform tightened its moderation efforts, the more lexical variations the community used to communicate, and the higher the engagement between its members grew.
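This dynamic is easy to reproduce in miniature. In the sketch below, the blocked term and its variants are illustrative stand-ins rather than the study’s actual lexicon; an exact-match blocklist misses simple spelling variations, while similarity matching recovers only some of them and becomes more error-prone as its threshold is loosened.

```python
# Sketch of why exact-match blocklists lose to lexical variation.
# The blocked term and its variants are illustrative stand-ins, not
# taken from the study's actual lexicon.
from difflib import SequenceMatcher

BLOCKED = "thinspo"  # illustrative blocked term
variants = ["thinspo", "th1nspo", "thynsp0", "thinspooo"]

def exact_match(text: str) -> bool:
    return BLOCKED in text

def fuzzy_match(text: str, threshold: float = 0.75) -> bool:
    # Compare each whitespace-separated token against the blocked term.
    return any(
        SequenceMatcher(None, BLOCKED, token).ratio() >= threshold
        for token in text.split()
    )

for v in variants:
    print(f"{v:10s} exact={exact_match(v)} fuzzy={fuzzy_match(v)}")

# Exact matching only catches the original spelling. Similarity matching
# catches some variants but still misses others, and lowering the
# threshold to catch those starts flagging unrelated words as well:
# the familiar trade-off between false negatives and false positives.
```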
When it comes to algospeak, automated content moderation systems are usually responsible for either flagging benign examples (false positives), or failing to detect the coded language spreading harm (false negatives). Often, this boils down to these systems failing to understand context. There are three major challenges to consider:
Algospeak is designed to circumvent automated moderation systems, and today it is largely successful. Attempts to catch new ways of communicating often result in over-flagging, which drives creators to find new ways to encode their messages. Creators will increasingly adopt algospeak to discuss sensitive issues, making the dialogue accessible only to those who can decode it. We must remember that not all use of algospeak equates to harm, but also that relying on it fractures our ability to communicate collectively. When developing moderation algorithms, we must move towards a more nuanced understanding of a post’s context, so that people who use algospeak benignly no longer need to resort to it.
For more posts like these follow us on Twitter, LinkedIn and Medium. Stay tuned for more interviews and discussions with people working across the Trust & Safety and Brand Safety space.
At Unitary we build technology that enables safe and positive online experiences. Our goal is to understand the visual internet and create a transparent digital space.
For more information on what we do you can check out our website or email us at contact@unitary.ai.