People often think about content moderation as reactive in nature — that we only take content down when it’s flagged by our systems or people. In reality, the bulk of our work focuses on the future. There’s a long process that’s designed to give our teams visibility into emerging issues before they reach, or become widespread on, our platform.
That valuable visibility is driven by our Intelligence Desk, a team within YouTube’s Trust & Safety organization. These specialized analysts identify potentially violative trends — whether new vectors of misinformation or dangerous internet challenges — and the risks they pose. They’re also regularly monitoring ongoing threats like extremist conspiracy theories, both tracking their prevalence across media and evaluating how they morph over time.
These insights then inform how current or future policies would handle these emerging threats. For example, based on evidence gathered by the Intelligence Desk, we updated our hate and harassment policies to better combat harmful conspiracy theories on our platform.
How do we make sure policies are enforced consistently?
The implementation of a new policy is a joint effort between people and machine learning technology. In practice, that means in order for a policy to be successfully launched and enforced, people and machines need to work together to achieve consistently high levels of accuracy when reviewing content.
We start by giving our most experienced team of content moderators enforcement guidelines (a detailed explanation of what makes content violative) and asking them to differentiate between violative and non-violative material. If the new guidelines allow them to achieve a very high level of accuracy, we expand the testing group to include hundreds of moderators across different backgrounds, languages and experience levels.
At this point, we begin revising the guidelines so that they can be accurately interpreted across the larger, more diverse set of moderators. This process can take a few months, and is only complete once the group reaches a similarly high degree of accuracy. These findings then help train our machine learning technology to detect potentially violative content at scale. As we do with our content moderators, we test models to understand whether we’ve provided enough context for them to make accurate assessments about what to surface for people to review.
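One way to picture this accuracy check is to score each moderator's decisions against an expert-labeled "golden set." This is a hypothetical sketch of the idea, not YouTube's internal tooling; all names and labels here are illustrative:

```python
def agreement_rate(reviewer_labels, golden_labels):
    """Fraction of decisions where a reviewer matches the expert label.

    Both arguments are parallel lists of decisions such as "violative"
    or "ok". Illustrative only: real review decisions are far richer
    than a binary label.
    """
    matches = sum(r == g for r, g in zip(reviewer_labels, golden_labels))
    return matches / len(golden_labels)

# Experienced reviewers set the "golden" answers; a newer moderator
# (or a model) is then scored against them.
golden = ["violative", "ok", "ok", "violative", "ok"]
trainee = ["violative", "ok", "violative", "violative", "ok"]
score = agreement_rate(trainee, golden)  # 4 of 5 decisions agree
```

Only when this kind of agreement score is consistently high across the larger, more diverse group would a guideline be considered ready.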
After this testing period, the new policy can finally launch. But the refinement continues in the months that follow. Every week, our Trust & Safety leadership meet with quality assurance leads from across the globe (those responsible for overseeing content moderation teams) to discuss particularly thorny decisions and review the quality of our enforcement. If needed, guideline tweaks are then drafted to address gaps or to provide clarity for edge cases.
How do people and machines work together to enforce our policies?
Once models are trained to identify potentially violative content, the role of content moderators remains essential throughout the enforcement process. Machine learning works at scale, flagging content that may be against our Community Guidelines and nominating it for review. Content moderators then help confirm or deny whether the content should be removed.
This collaborative approach helps improve the accuracy of our models over time, as models continuously learn and adapt based on content moderator feedback. And it also means our enforcement systems can manage the sheer scale of content that’s uploaded to YouTube (over 500 hours of content every minute), while still digging into the nuances that determine whether a piece of content is violative.
For example, a speech by Hitler at the Nuremberg rallies with no additional context may violate our hate speech policy. But if the same speech were included in a documentary that decried the actions of the Nazis, it would likely be allowed under our EDSA (Educational, Documentary, Scientific or Artistic) guidelines, which make exceptions for otherwise violative material when enough context is included, such as in an educational video or historical documentary.
This distinction may be more difficult for a model to recognize, while a content moderator can more easily spot the added context. This is one reason why enforcement is a fundamentally shared responsibility — and it underscores why human judgment will always be an important part of our process. For most categories of potentially violative content on YouTube, a model simply flags content to a content moderator for review before any action may be taken.
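The flag-then-review loop described above can be sketched in a few lines. Everything here is hypothetical (the class names, the score, the threshold); it only illustrates the shape of the process, where a model nominates content but a person makes the call, and that decision feeds back into training:

```python
from dataclasses import dataclass, field

REVIEW_THRESHOLD = 0.7  # hypothetical confidence cutoff for nomination


@dataclass
class Upload:
    video_id: str
    model_score: float  # hypothetical classifier confidence that content is violative


@dataclass
class ReviewQueue:
    pending: list = field(default_factory=list)
    training_labels: list = field(default_factory=list)

    def triage(self, upload: Upload) -> None:
        # The model only nominates; nothing is removed at this step.
        if upload.model_score >= REVIEW_THRESHOLD:
            self.pending.append(upload)

    def moderator_decision(self, upload: Upload, violative: bool) -> str:
        # A human confirms or denies the nomination, and the decision
        # becomes a label that helps the model improve over time.
        self.training_labels.append((upload.video_id, violative))
        return "remove" if violative else "keep"


queue = ReviewQueue()
clip = Upload("abc123", model_score=0.92)
queue.triage(clip)
# The moderator recognizes documentary context the model missed.
action = queue.moderator_decision(clip, violative=False)
```

The key design point is that the model's score routes content to people rather than triggering removals directly, which is how human judgment stays in the loop for nuanced cases like the EDSA example above.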
How do we measure success?
We’re driven in all of our work to live up to our Community Guidelines and further our mission to allow new voices and communities to find a home on YouTube. Success on this front is hard to pin down to a single metric, but we’re always listening to feedback from stakeholders and members of our community about ways we can improve — and we continuously look to provide more transparency into our systems and processes (including efforts like this blog).
To measure the effectiveness of our enforcement, we release a metric called our violative view rate, which looks at how many views on YouTube come from violative material. From July through September of this year, that number was 0.10% – 0.11%, which means that for every 10,000 views, between 10 and 11 were of content that violated our Community Guidelines.
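The relationship between the reported percentage and the per-10,000 figure is simple arithmetic, shown here as a quick back-of-envelope check (the helper function is ours, not part of any published methodology):

```python
def views_per_10k(vvr_percent: float) -> float:
    """Convert a violative view rate (as a percent) into violative views per 10,000 views."""
    return vvr_percent / 100 * 10_000


low = views_per_10k(0.10)   # lower bound of the reported range: 10 views
high = views_per_10k(0.11)  # upper bound of the reported range: 11 views
```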
We also track the number of appeals submitted by creators in response to videos that are removed (an option available to any creator on YouTube), as this helps us gain a clearer understanding of the accuracy of our systems. For example, during the same time period mentioned above, we removed more than 5.6 million videos for violating our Community Guidelines and received roughly 271,000 removal appeals. Upon review, we reinstated about 29,000 of those videos.
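From those figures, two derived rates follow directly (the variable names are ours, and the percentages are rounded):

```python
removed_videos = 5_600_000   # videos removed in the quarter
appeals_filed = 271_000      # removal appeals received
videos_reinstated = 29_000   # appeals that led to reinstatement

appeal_rate = appeals_filed / removed_videos            # ~4.8% of removals were appealed
reinstatement_rate = videos_reinstated / appeals_filed  # ~10.7% of appeals succeeded
```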
And while metrics like appeals, reinstatements, and our violative view rate aren't a perfect measure of consistency or accuracy, they're still pivotal in benchmarking success on an ongoing basis.
Community Guidelines are concerned with language and expression — two things that, by their very nature, evolve over time. With that shifting landscape, we’ll continue to regularly review our policy lines to make sure they’re drawn in the right place. And to keep our community informed, we’ll be sharing further how we’re adapting in the months ahead.
By Matt Halprin, Vice President, Global Head of Trust & Safety and Jennifer Flannery O'Connor, Vice President, Product Management