Speaking different languages can be an insurmountable barrier to communication. Meta’s leaders are determined to make it as easy as possible for people from different countries and cultures to connect, both to increase interactions on the company’s social networks and to make the metaverse more attractive in the future. Meta researchers have been working for years on sophisticated artificial intelligence (AI) models capable of translating multiple languages. Today they presented NLLB-200, a pioneering system capable of translating 200 languages in real time, twice as many as the best system Meta has had so far.
“The AI modeling techniques we used help achieve high-quality translations,” Meta founder and CEO Mark Zuckerberg said in a post today on his Facebook account. “To give an idea of the scale of the program, the 200-language model has more than 50 billion parameters. We trained it using the Research SuperCluster, one of the fastest supercomputers in the world.” The NLLB-200 system, whose name stands for No Language Left Behind, is ready to perform 25 billion daily translations across all Meta applications, according to the young tycoon.
The tool can translate both spoken and written language. The company presents it as a model for the 4 billion people who speak languages that are not widespread on the Internet (where English dominates and Mandarin, Spanish, Portuguese and Arabic are also widely used). The 200 supported languages include 55 African languages, many of which were not available in any machine translator until now.
The company’s intention is that, in the future, Meta’s augmented reality glasses will be able to translate in real time and show captions visible only to the person wearing them. Google is working along the same lines, as it revealed in May when it presented a similar glasses prototype.
The model on which NLLB-200 is based is inspired by M2M-100, which was presented in 2020 and introduced a fundamental improvement: translations are done directly from the source language to the target language, without going through English. Because English is the most widespread language on the Internet, it is also the one that feeds most of the global databases used to train natural language processing systems. As a result, translators used to convert any language to English first and then translate it into another, resulting in a great loss of nuance and meaning.
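The difference between pivoting through English and translating directly can be illustrated with a minimal sketch. The word tables and function names below are invented for illustration and have nothing to do with the actual NLLB-200 or M2M-100 models; they only assume that Spanish and French distinguish a formal and an informal "you" while English does not.

```python
# Toy word tables, invented for illustration only; not real NLLB-200 or M2M-100 data.
# Spanish distinguishes informal "tú" from formal "usted"; English has only "you".
ES_TO_EN = {"tú": "you", "usted": "you"}
EN_TO_FR = {"you": "tu"}                   # English gives no clue about formality
ES_TO_FR = {"tú": "tu", "usted": "vous"}   # a direct table keeps the distinction


def translate_via_english(word: str) -> str:
    """Pivot translation: source language -> English -> target language."""
    return EN_TO_FR[ES_TO_EN[word]]


def translate_direct(word: str) -> str:
    """Direct translation: source language -> target language."""
    return ES_TO_FR[word]


def translation_directions(n_languages: int) -> int:
    """Number of ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)


print(translate_via_english("usted"))  # formality is lost on the way through English
print(translate_direct("usted"))       # formality survives
print(translation_directions(200))     # ordered language pairs among 200 languages
```

In this toy case the formal "usted" comes out as the informal "tu" when it passes through English, while the direct table returns "vous". The last function shows why direct translation is demanding: 200 languages give 200 × 199 = 39,800 possible translation directions.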
Making this leap requires millions of sentences meticulously matched across different language combinations. The problem is that many languages are underrepresented on the Internet. Meta gives the example of Swedish and Lingala, a language spoken in the Democratic Republic of Congo, the Republic of Congo, the Central African Republic and South Sudan. The European language, used by 10 million Swedes and Finns, has some 2.5 million articles on Wikipedia; the African language, spoken by 45 million people, has only 3,260.
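To put that gap in proportion, the figures quoted above can be turned into speakers per Wikipedia article. This is a back-of-the-envelope sketch that uses only the numbers in the article; the dictionary layout and function name are illustrative.

```python
# Speaker and article counts as quoted in the article.
LANGUAGES = {
    "Swedish": {"speakers": 10_000_000, "articles": 2_500_000},
    "Lingala": {"speakers": 45_000_000, "articles": 3_260},
}


def speakers_per_article(speakers: int, articles: int) -> float:
    """How many speakers there are for each Wikipedia article."""
    return speakers / articles


for name, data in LANGUAGES.items():
    ratio = speakers_per_article(data["speakers"], data["articles"])
    print(f"{name}: about one article per {ratio:,.0f} speakers")
```

By this measure Swedish has roughly one article for every 4 speakers, while Lingala has roughly one for every 13,800.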
To solve this problem, Meta researchers have developed a model capable of getting more out of each sentence it analyzes, while also increasing the size of the databases that feed the algorithm.
The company has decided to release the NLLB-200 model and its training code openly to help other researchers improve their translation tools and develop new technologies.