Mandarin has long been near-impossible for computers to translate. That is changing, say Rachael Layfield and Amy Wang of Brunswick's Beijing office
The US State Department estimates it takes 2,200 class hours to reach full professional proficiency in Mandarin – that’s five hours a day, five days a week, for 88 weeks straight. For similar levels of proficiency in Spanish or Italian, it estimates 600 hours, or 24 weeks, are needed.
Languages difficult for people to learn tend to be difficult for computers to translate. For decades, computers substituted words in one language for words in another, producing predictably bad results.
But that’s changing. Computers are employing a more human-like approach to translation, and the results are drastically improving. Microsoft recently announced its translation software can now “match human performance in translating news from Chinese to English.”
The implications of this improvement are difficult to overstate: It could help remove the language barrier from business.
Machine translations of Chinese are important for two reasons. First, Mandarin is the most widely spoken language in the world. Second, translations between Mandarin and English are perhaps the most difficult for a machine to make; if AI can solve that problem, the thinking goes, it will be able to do so for other languages as well.
Word-for-word systems are woefully ill-equipped to even loosely translate Mandarin into English. A Chinese character can take on different meanings when combined with others, and a comprehensive Chinese dictionary has about 20,000 characters. The Chinese character 钱 (qian) means money, 钟 (zhong) means clock, 书 (shu) means book. However, when used together, 钱钟书 (Qian Zhongshu) is the name of a well-known Chinese author. Unless an algorithm reads all three together, it will produce babble.
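To make the problem concrete, here is a toy sketch in Python. It is an illustration only, not how any commercial system works: the three-character glossary is taken from the example above, and the function names are hypothetical. The first function substitutes one character at a time; the second tries the longest dictionary entry first, a classic segmentation heuristic that predates neural translation.

```python
# Toy glossary built from the characters discussed above (illustrative only).
GLOSSARY = {
    "钱": "money",
    "钟": "clock",
    "书": "book",
    "钱钟书": "Qian Zhongshu",  # the three characters read together
}


def translate_char_by_char(text):
    """Substitute each character independently, ignoring its neighbors."""
    return " ".join(GLOSSARY.get(ch, ch) for ch in text)


def translate_longest_match(text):
    """Greedily match the longest glossary entry at each position."""
    max_len = max(len(key) for key in GLOSSARY)
    output = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in GLOSSARY:
                output.append(GLOSSARY[chunk])
                i += size
                break
        else:
            # No entry matched: pass the character through unchanged.
            output.append(text[i])
            i += 1
    return " ".join(output)


print(translate_char_by_char("钱钟书"))   # money clock book
print(translate_longest_match("钱钟书"))  # Qian Zhongshu
```

Even the second approach only works when the name is already in the glossary, which hints at why fixed dictionaries scale poorly and why systems that learn from context have made such a difference.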
Recent progress in computer translation has been largely driven by Google’s Neural Machine Translation (NMT) system, introduced in late 2016. NMT uses an artificial neural network, which loosely mirrors how our brains solve problems: classifying, organizing and weighting information, and then adjusting based on feedback.
China’s tech sector has also entered the field in the last few years, particularly in travel-related translation. Any international visitor to Beijing, Shanghai or Guangzhou knows that some information is available in English, but a guide still comes in handy. However, for the more than 130 million Chinese tourists who traveled overseas in 2017 – a number that is expected to grow to 200 million by 2020, according to the China Tourism Academy – language remains a major barrier. Vital information in major Western cities isn’t always displayed in Mandarin, making a guide a necessity.
Products like the Sogou Travel Translator are making independent travel more feasible: The device uses the company’s expertise in natural language processing derived from big data – Sogou receives over 200 million voice requests per day, amounting to approximately 240,000 hours of data – to translate between Chinese and 24 languages, and uses Optical Character Recognition (OCR) to read menus and street signs. Another company, iFlytek, makes a pocket-sized real-time speech translator. Company Chairman Liu Qingfeng tested the device in front of media at the National People’s Congress of China. Chinese search giant Baidu has released its own pocket translator, and Microsoft is applying the technology to translating news coverage.
If some professional translators scorn AI, others are learning to work with machines, even helping to train their potential replacements. Neural network technology allows computers to learn from and adapt to human feedback in real time. Translators are the ideal tutors for these data-hungry machines and could accelerate their development.
For all the tantalizing possibilities in machine translation, real limitations remain. Unstructured conversations – the nature of many, if not most, discussions – remain problematic. So do idioms and slang. At China’s prestigious Boao Forum, where President Xi Jinping delivered the keynote speech this year, Tencent, the Chinese tech giant, debuted a system powered by AI that was supposed to translate the event in real time. The results were described by the media as unintelligible. Luckily, the Forum had a staff of human translators on hand.
Rachael Layfield, a Director, and Amy Wang, a Senior Translator, are both based in Brunswick’s Beijing office.