It’s amazing to watch a machine translation (MT) tool such as Google Translate create a sentence in an instant. It’s funny when the translation is nonsense. But bad MT is never amusing when the translation counts. So, is the solution to just get a better MT tool? Or does MT itself have no place in the field of competent translation? Alternatively, should a real human being and qualified pro linguist always review MT?
As a translation and localization provider, I am often asked whether TrueLanguage uses machine translation (MT) or human translators. MT tools have been around for a long time, and the underlying approach has changed significantly over the years. For certain applications, modern MT tools are useful for saving time and money. Even where MT is useful, however, it would be unwise to remove the human component from the process completely.
There are currently a variety of machine translation tools available, each with a slightly different intended application. I will cover two major categories of MT tools: those designed for consumers, and those designed for businesses and specific industries.
Consumer MT Tools
All machine translation tools require a large database of language. The tools use this database to make comparisons, which allow them to translate new words and sentences. Perhaps the most common MT tool for general consumers is Google Translate. This application is extremely useful for certain types of individual consumers, such as travelers or students. Google uses general-language data to build a database for each of its many language combinations. This means that someone who is studying Spanish could find Google Translate very useful for learning vocabulary, and even phrases and sentences, for general, everyday interactions. Even many languages that are not widely used have a robust enough data set to produce accurate translations in these everyday contexts.
The main issue that arises with consumer-focused MT tools is that they are designed to be general in nature, which means that they are not very useful for specific applications, like those required by businesses that operate in a particular industry. Unfortunately, some businesses that lack this awareness use Google for their translations, which damages the company’s image in its foreign markets.
For this reason, many companies have spent years developing effective MT tools that are designed to work in these more specific contexts. The result is a variety of industry-specific machine translation tools that are useful for industries such as law, ecommerce, eLearning, technology, healthcare and many others.
MT Tools Designed for Business Application
Looking at the history of machine translation tools helps explain how useful current tools can be. The first MT tools designed specifically for business applications were developed as long ago as the 1970s and used rule-based machine translation (RBMT). Basically, developers approached MT by creating a set of grammatical rules between a source and target language. These rule sets were used to convert text from the source to the target language. In theory, the concept works well as long as the content is not too technical or specific. However, the rule database is difficult and expensive to maintain well enough to keep producing accurate content.
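To make the rule-based idea concrete, here is a minimal Python sketch, assuming a tiny English-to-Spanish lexicon and a single reordering rule. The lexicon, the part-of-speech tags, and the function name translate_rbmt are all invented for illustration; a real RBMT system holds thousands of hand-written rules, which is exactly why it is costly to maintain.

```python
# Toy rule-based translation (English -> Spanish).
# The lexicon and the single grammar rule are illustrative assumptions,
# not taken from any real RBMT product.
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "is": ("es", "VERB"),
    "fast": ("rápido", "ADJ"),
}


def translate_rbmt(sentence: str) -> str:
    words = sentence.lower().split()
    # Step 1: dictionary lookup; unknown words pass through untranslated.
    tagged = [LEXICON.get(word, (word, "UNK")) for word in words]
    # Step 2: apply one grammar rule: English ADJ + NOUN becomes Spanish NOUN + ADJ.
    output = []
    i = 0
    while i < len(tagged):
        is_adj_noun = (
            i + 1 < len(tagged)
            and tagged[i][1] == "ADJ"
            and tagged[i + 1][1] == "NOUN"
        )
        if is_adj_noun:
            output.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    return " ".join(output)


print(translate_rbmt("The red car is fast"))  # el coche rojo es rápido
```

Every new word needs a lexicon entry and every new grammatical pattern needs another rule, so coverage grows only as fast as people can write and test those rules.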
In the late 1980s, statistical machine translation (SMT) models began to emerge, and they gradually replaced RBMT tools. With SMT, rules are replaced with a statistical model that is built by analyzing a training set (essentially a database) of language and is then used to extrapolate a translation of new text. The advantages of SMT over RBMT include a built-in learning process and models that can be adapted easily by adding to or changing the training set.
Though SMT was a major improvement, it has its own set of challenges and limitations. SMT requires an extensive training set of source and target language pairs, which is both time consuming and expensive to develop. In addition, SMT approaches translation largely at the phrase level and does not have a set of rules like the earlier model. The result: SMT engines are very good at translating individual phrases and words accurately, but overall fluency and grammar suffer. This grammatical inaccuracy is not a problem if you are using an SMT tool for personal use, but it results in an extensive post-editing investment for outward-facing content.
In recent years, a new approach has emerged based on neural machine translation (NMT) tools. Unlike SMT, NMT tools look at an entire sentence and analyze associations between phrases that do not appear next to one another. This results in greater fluency and grammatical correctness than earlier SMT models provided.
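The phrase-level behavior described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the aligned phrase pairs are hand-made, the names build_phrase_table and translate_smt are hypothetical, and the sketch leaves out the language model and reordering components that real SMT engines also use.

```python
from collections import Counter, defaultdict

# Tiny hand-made training set of aligned phrase pairs (an assumption for
# illustration; real systems extract these from very large parallel corpora).
ALIGNED_PHRASES = [
    ("good morning", "buenos días"),
    ("good morning", "buenos días"),
    ("good morning", "buen día"),
    ("thank you", "gracias"),
    ("my friend", "mi amigo"),
    ("my friend", "mi amiga"),
    ("my friend", "mi amigo"),
]


def build_phrase_table(pairs):
    """Estimate P(target | source) by relative frequency in the training set."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return {
        source: {t: c / sum(targets.values()) for t, c in targets.items()}
        for source, targets in counts.items()
    }


def translate_smt(sentence, table, max_len=3):
    """Greedy left-to-right translation, one phrase at a time."""
    words = sentence.lower().split()
    output, i = [], 0
    while i < len(words):
        # Try the longest known phrase starting at position i.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in table:
                # Pick the most probable translation of this phrase alone.
                output.append(max(table[phrase], key=table[phrase].get))
                i += n
                break
        else:
            output.append(words[i])  # unknown word: pass through unchanged
            i += 1
    return " ".join(output)


table = build_phrase_table(ALIGNED_PHRASES)
print(translate_smt("Good morning my friend", table))  # buenos días mi amigo
```

Each phrase is chosen on its own statistics, with no view of the rest of the sentence, which is why agreement and word order across phrase boundaries tend to need post-editing. Changing the model only requires changing ALIGNED_PHRASES and rebuilding the table.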
Challenges of NMT
One of the challenges of NMT tools is that errors in terminology are common. While looking at the broader context results in higher fluency, this broad view leads to a decrease in terminology accuracy.
An additional challenge is that, in order to work well, an NMT tool needs a massive data set of source and target language pairs. Most experts agree that about half a million words per language should be the standard. That means that about one million words in total (half a million each for the source and target languages) must be available for use in creating the NMT data set. For a language pair like English and Spanish, this is relatively easy, and an NMT tool is very useful. However, for a pair like English and Russian, it is much more challenging. This is why it is important to consider which language pairs are feasible to use with an NMT tool.
In addition to sheer volume, good results from NMT depend on the quality of the database and its relevance to the application. Simply adding content to increase the volume of the data set won’t do much if that content is of poor quality, and may even result in inferior translations. Relevance is also important: an NMT engine designed for industrial engineering will not perform well when translating legal documents.
Another challenge with NMT, as with any machine translation tool, is that it is effective for certain applications but not for others. For example, NMT engines are very effective for technical language applications where the language is controlled, as long as the data set is sufficient. On the other hand, even with its advances, an NMT engine is ineffective for, say, marketing materials, since these materials are less controlled and often require transcreation.
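As a rough way to check a language pair against that figure, the short Python sketch below counts the words on each side of a set of sentence pairs. It is only an approximation: the function name corpus_report is hypothetical, and splitting on whitespace is an assumed stand-in for a proper word count, which varies by language.

```python
# Per-language word target quoted in the article.
TARGET_WORDS_PER_LANGUAGE = 500_000


def corpus_report(pairs):
    """Count words on each side of a list of (source, target) sentence pairs.

    Whitespace splitting is a rough approximation of a word count.
    """
    source_words = sum(len(source.split()) for source, _ in pairs)
    target_words = sum(len(target.split()) for _, target in pairs)
    return {
        "source_words": source_words,
        "target_words": target_words,
        "total_words": source_words + target_words,
        # Both sides must reach the target, so test the smaller one.
        "meets_target": min(source_words, target_words) >= TARGET_WORDS_PER_LANGUAGE,
    }


sample = [("the red car", "el coche rojo"), ("thank you", "gracias")]
print(corpus_report(sample))
```

A report like this only measures volume. It says nothing about whether the sentence pairs are accurate or relevant to the intended subject area.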
Advantages of NMT
The good news about NMT is that, despite all of these challenges, it is a great improvement over older SMT models and is a promising technology as it continues to develop. NMT tools are good at maintaining overall context within a sentence or even beyond the sentence level. Rather than analyzing language simply as a set of rules or word by word, NMT tools look at language as a whole. This results in greater accuracy in the language produced.
Today, sophisticated NMT tools are available to translation service providers, and are useful for a variety of applications and industries. These are great tools for increasing productivity and managing costs for both service providers and customers. As companies continue to develop NMT engines, the issues will diminish and the benefits will increase, especially the usefulness of NMT for a wider range of industries.
However, the truth is that, even with the advantages of NMT, human translators are still required to review the translations produced by NMT tools to ensure their accuracy and appropriateness. Human translators remain the best option because they provide the best translation quality.
How TrueLanguage Can Help
TrueLanguage always uses human translators and proofreaders because they produce the best translations in any industry. However, our own translation tools include a variety of commercially available NMT engines. We help you complete a full evaluation to decide whether these tools would be useful and advantageous for your organization. If your organization meets the criteria necessary for an NMT engine to reduce your costs, we help you implement these tools as part of our translation process. Contact us today to talk more about our services and how TrueLanguage can help you improve your translation process.