Let’s be honest: no matter how good your foreign language skills are, it is sometimes really helpful to type a sentence into a translation tool such as DeepL or Google Translate and have it deliver a machine translation, isn’t it? The problem: if you don’t know the foreign language well, you probably won’t notice when the AI tool produces results that would make any native speaker’s hair stand on end. One example is the much-discussed topic of gender. It shows clearly why machine translation solutions can be useful in some cases but highly error-prone in others.
If you enter a sentence such as “I visited a doctor yesterday,” for example, you automatically get “Ich war gestern beim Arzt,” where “Arzt” is the masculine form of “doctor.” In other words, DeepL and other tools apparently don’t think much of emancipation and equality. Professions are automatically gendered, that is, assigned to a “typical” gender. We at ACT Translations tried this out with a wide range of professions and languages, and kept arriving at rather sobering results. A “nurse” automatically becomes a “Krankenschwester” (a female nurse), even though the English term “nurse” is also commonly used for men in the profession. A “mechanic” is translated as “Mechaniker,” “mecánico” or “mécanicien” (all masculine), while a “physical therapist” becomes a “Physiotherapeutin” (feminine). The professor is, to nobody’s surprise, male; the “housekeeper,” on the other hand, is typically female.
The AI industry – white, male and not very diverse
So are concepts like deep learning, machine learning or AI testosterone-driven macho systems per se? Why else don’t these tools suggest female terms like “Ärztin” (female doctor) or “KFZ-Mechanikerin” (female car mechanic)? The answer is complex. For one thing, the field of artificial intelligence is dominated by white, male scientists. According to the MIT Technology Review, only 18 percent of the speakers invited to leading AI conferences are women, and only 20 percent of AI professors and ten percent of research assistants at Facebook or Google are female. But that is only one small aspect of a larger problem.
AI is based on vast quantities of data
Machine learning systems certainly can be powerful tools, but they are only as good as the data fed into them. In other words, if there is a systematic error in the data used to train a machine learning algorithm, the resulting model will reflect this.
In most cases, this is not simply a matter of deliberate bias or stereotyping. Nor is it the fault of people who may have been careless in selecting their datasets or training their models. Rather, distortions that are inherent in society show up in the results of machine translation, much as a society’s historical records capture only part of the overall picture. Datasets, in turn, pass on their bias, or their gaps, to the machine learning models that learn from them. If more men than women were physicians in the past, any machine learning model trained on historical data will “learn” that physicians are more likely to be male than female, regardless of the current gender distribution among physicians.
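To make the physician example concrete, here is a minimal sketch of how a skew in training data can end up in the output. It assumes a deliberately naive “model” that simply picks the translation it has seen most often. The training pairs and their proportions are invented for illustration and do not come from any real corpus, and real translation systems are far more sophisticated than this.

```python
from collections import Counter, defaultdict

# Hypothetical, made-up training pairs (English word, German translation).
# The 80/20 and 90/10 splits are invented purely to illustrate the mechanism.
training_pairs = (
    [("doctor", "Arzt")] * 80
    + [("doctor", "Ärztin")] * 20
    + [("nurse", "Krankenschwester")] * 90
    + [("nurse", "Krankenpfleger")] * 10
)


def train(pairs):
    """Count how often each target word was seen for each source word."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return counts


def translate(model, word):
    """Return the translation seen most often during training."""
    return model[word].most_common(1)[0][0]


model = train(training_pairs)
print(translate(model, "doctor"))  # Arzt
print(translate(model, "nurse"))   # Krankenschwester
```

The toy model never “decides” to stereotype. It only repeats whichever form dominated its training data, no matter how many female doctors or male nurses there are today.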
Typical language patterns instead of context
Machine translation models are trained on huge text corpora made up of pairs of sentences that have already been translated. Added to this are linguistic nuances that often make it difficult to provide an accurate and direct translation. When translating from English into languages such as German, French, or Spanish, gender-neutral nouns often have to be translated into gender-specific nouns. Although the word “friend” is gender-neutral in English, the term becomes “amiga” (feminine) or “amigo” (masculine) in Spanish. A human translator can ask: Are we talking about a man or a woman? The machine translation tool is not in a position to ask for this context.
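The “friend” example can be sketched in a few lines. The snippet below is a hypothetical illustration, not how any real translation tool works: the function name `translate_noun` and the mini-lexicon are assumptions made up for this sketch. It shows the alternative to silently guessing, namely returning every candidate when the gender is unknown.

```python
from typing import List, Optional

# Hypothetical mini-lexicon: a gender-neutral English noun mapped to its
# gendered Spanish forms. Only "friend" is taken from the article.
SPANISH_FORMS = {
    "friend": {"feminine": "amiga", "masculine": "amigo"},
}


def translate_noun(word: str, gender: Optional[str] = None) -> List[str]:
    """Return the form for a known gender, or all candidates if it is unknown."""
    forms = SPANISH_FORMS[word]
    if gender is None:
        # No context available: report the ambiguity instead of guessing.
        return sorted(forms.values())
    return [forms[gender]]


print(translate_noun("friend"))              # ['amiga', 'amigo']
print(translate_noun("friend", "feminine"))  # ['amiga']
```

A human translator resolves the first case by asking a question. A tool that returns only one word has made that choice on the reader’s behalf.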
Challenge: communication and the spoken word
Things get really complicated when machine tools are used to translate speeches and the like. Spoken language can involve intonation, facial expressions, repetition, sarcasm, or irony that can completely change the meaning of a sentence. Modern machine translation solutions do not accurately reflect such nuances. Other examples also show the limitations of translation tools. If you type “cream of tartar” (an acidic powder used in baking) into Google Translate, you get the literal German translation “Sahne von Zahnstein.” In the other direction, “Kernseife” used to end up as “nuclear soap” a few years ago. In this particular case, the system has since learned better and now translates it as “curd soap.”
Neural networks instead of statistical methods
Translation errors like this can be explained by the fact that Google Translate worked for many years with statistical translation methods that proceeded word by word and did not recognize context. Such failures are far rarer today than they were a few years ago. Google Translate now relies on neural networks, i.e., artificial intelligence, to capture context more accurately for many of the more than 100 languages it offers. Researchers are also working hard on the gender challenge.
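The “Kernseife” mistake is easy to reproduce in miniature. The following sketch contrasts a piece-by-piece lookup with a lookup that first checks the whole word. Both lexicons are invented for illustration, and the second function is only a stand-in for “taking more context into account.” It is not a description of how Google Translate or neural networks actually work.

```python
# Hypothetical toy lexicons; the entries are illustrative only.
WORD_LEXICON = {"kern": "nuclear", "seife": "soap"}
WHOLE_WORD_LEXICON = {"kernseife": "curd soap"}


def translate_piecewise(parts):
    """Translate each part in isolation and join the results."""
    return " ".join(WORD_LEXICON[part.lower()] for part in parts)


def translate_with_context(parts):
    """Prefer a translation of the whole compound; fall back to the parts."""
    whole = "".join(parts).lower()
    if whole in WHOLE_WORD_LEXICON:
        return WHOLE_WORD_LEXICON[whole]
    return translate_piecewise(parts)


compound = ["Kern", "Seife"]
print(translate_piecewise(compound))     # nuclear soap
print(translate_with_context(compound))  # curd soap
```

Each part is translated “correctly” on its own, yet the result is nonsense. The error only disappears once the larger unit is considered.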
A feel for language required
There’s no doubt about it: machines work really well with structured language for specific applications. Examples include weather reports, financial reports, government minutes, or sports scores. However, there are many areas where these smart tools quickly reach their limits and human translators and language experts have to get involved. Anyone who has ever tried to translate a snappy marketing text by machine will quickly see that it simply doesn’t work! Even without a machine, slogans are easily misread across languages: many customers of the perfume chain Douglas in Germany misunderstood its English slogan “Come in and find out” to mean “Come in and find your way out again.” Legal texts should also be translated by experts rather than machines and then carefully proofread again to avoid misunderstandings, or even lawsuits.
The interplay of human and machine
Even though machine translation tools have become better and more accurate in recent years, and many studies have addressed gender-specific challenges in this regard, there is still a huge amount of catching up to do in this area, as there is in the entire field of gender research and equality. So it’s a good thing that human translators have the necessary linguistic skills and intuition to help out. And there’s nothing wrong with good interaction between human and machine, is there?