
AI underpinned by an invisible and exploited workforce

The AI industry employs large numbers of data labellers in countries in the Global South such as the Philippines, Venezuela, Kenya and India. Workers in these countries face stagnating or shrinking wages, and often earn far below the minimum wage. Many data labellers work in overcrowded and dusty environments which pose a serious risk to their health. They also often work as independent contractors, lacking access to protections such as health care or compensation.


In dusty factories, cramped internet cafes and makeshift home offices around the world, millions of people sit at computers tediously labelling data. These workers are the lifeblood of the burgeoning artificial intelligence (AI) industry. Without them, products such as ChatGPT simply would not exist. That’s because the data they label helps AI systems “learn”.


But despite the vital contribution this workforce makes to an industry which is expected to be worth US$407 billion by 2027, the people who comprise it are largely invisible and frequently exploited.


What is data labelling?

Data labelling is the process of annotating raw data, such as images, video or text, so that AI systems can recognise patterns and make predictions. Self-driving cars, for example, rely on labelled video footage to distinguish pedestrians from road signs. Large language models such as those behind ChatGPT rely on labelled text to understand human language. Without these labelled datasets, AI systems would be unable to function effectively.


Tech giants like Meta, Google, OpenAI and Microsoft outsource much of this work to data labelling factories in countries such as the Philippines, Kenya, India, Pakistan, Venezuela and Colombia. China is also becoming a global hub for data labelling.


Outsourcing companies that facilitate this work include Scale AI, iMerit and Samasource. These are very large companies in their own right. For example, Scale AI, which is headquartered in California, is now valued at US$14 billion.


Cutting corners

Major tech firms like Alphabet (the parent company of Google), Amazon, Microsoft, Nvidia and Meta have poured billions into AI infrastructure, from computational power and data storage to emerging computational technologies.


Large-scale AI models can cost tens of millions of dollars to train. Once deployed, maintaining these models requires continuous investment in data labelling, refinement and real-world testing.

But while AI investment is significant, revenues have not always met expectations. Many industries continue to view AI projects as experimental, with unclear paths to profitability.


In response, many companies are cutting costs in ways that affect those at the very bottom of the AI supply chain, who are often highly vulnerable: data labellers.


Low wages, dangerous working conditions

One way companies involved in the AI supply chain try to reduce costs is by employing large numbers of data labellers in countries in the Global South such as the Philippines, Venezuela, Kenya and India. Workers in these countries face stagnating or shrinking wages. For example, the hourly rate for AI data labellers in Venezuela ranges from 90 cents to US$2. In comparison, in the United States, the rate is between US$10 and US$25 per hour. In the Philippines, workers labelling data for multibillion-dollar companies such as Scale AI often earn far below the minimum wage.


Some labelling providers even resort to child labour.


But there are many other labour issues within the AI supply chain. Many data labellers work in overcrowded and dusty environments which pose a serious risk to their health. They also often work as independent contractors, lacking access to protections such as health care or compensation.


The mental toll of data labelling work is also significant, with repetitive tasks, strict deadlines and rigid quality controls. Data labellers are also sometimes asked to read and label hate speech or other abusive language or material, which has been shown to have negative psychological effects.

Errors can lead to pay cuts or job losses. But labellers often face a lack of transparency about how their work is evaluated. They are often denied access to performance data, which hinders their ability to improve or contest decisions.


Edel Rodriguez

Based on article written by Ganna Pogrebna, originally published in The Conversation



Committee secretary


Espen Løken

Espen Løken has been secretary of the prize committee since the prize was established in 2010. He is an international advisor at the union "Styrke", responsible for the Arthur Svensson prize.

Forbundet Styrke

Torggata 15, 0181 Oslo

espen.loken@styrke.no


© 2018 - Styrke
