With the advent of “next-generation” machine translation (Neural Machine Translation, or NMT for short), there is a lot of discussion about how it will revolutionise the world of translation. We certainly believe in using technology to assist the translation process where appropriate, and feel that there are many areas where it can hugely benefit end users, and others where it should be used sparingly. For texts such as technical manuals, NMT makes a lot of sense: it saves time and money, and ensures consistency of terminology. For areas which, at present, require value judgements from humans (such as marketing or creative media), this isn’t necessarily the case.

Assessment report text often contains a lot of repetition, so we are able to leverage our translation memory systems to ensure that blocks of repeated text only have to be translated once, which leads to cost savings for our clients and ensures consistency of vocabulary. Long, text-heavy reports may therefore be an area where machine translation could be successfully applied to the testing process to introduce further efficiencies.
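For readers who like to see the idea in concrete terms, the translation memory principle described above can be sketched in a few lines of Python. This is a minimal illustration only, not a description of any particular translation memory product: the function name, the placeholder translator and the sample report sentences are all hypothetical, and the sketch matches exact repeats only, whereas real systems typically also offer approximate ("fuzzy") matches.

```python
from typing import Callable, Dict, List, Optional


def translate_with_memory(
    segments: List[str],
    translate: Callable[[str], str],
    memory: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Translate segments, reusing stored translations for exact repeats."""
    memory = {} if memory is None else memory
    translated = []
    for segment in segments:
        key = " ".join(segment.split())  # normalise whitespace only
        if key not in memory:
            memory[key] = translate(segment)  # new text: translated once
        translated.append(memory[key])  # repeats reuse the stored wording
    return translated


# Hypothetical stand-in for a human translator; it records each request.
requests: List[str] = []


def placeholder_translator(text: str) -> str:
    requests.append(text)
    return f"[translated] {text}"


# Hypothetical report text in which the first sentence is repeated.
report = [
    "This score is in the average range.",
    "The candidate prefers working in a team.",
    "This score is in the average range.",
]
result = translate_with_memory(report, placeholder_translator)
```

In this sketch the repeated sentence is sent to the translator only once, and both occurrences receive identical wording, which is where the cost saving and the consistency of vocabulary come from.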

Using NMT means that a much larger volume of text, which wasn’t previously financially viable to translate, can now be localised to the benefit of others around the world. It may therefore be that, with a sufficiently high volume of pre-existing, professionally translated psychometric report text, NMT will be able to provide suitable results when coupled with human post-editing (reviewing of the machine-translated output). However, the training data, which is specific to particular subject areas or types of assessment, has to come from somewhere. The only place it can come from is previous translations of your intellectual property, or that of others (which certainly isn’t always desirable). Of course, this isn’t an issue where material is out in the public domain, such as technical manuals and websites, but it can become a big issue when you are talking about assessments and exams, where the material tends to be much smaller-scale and more niche.

Having said this, as things stand at the moment, there are types of text which do not easily lend themselves to NMT and for these, it is very important to use properly qualified and experienced human translators. The types of text that we usually work on are those which require human intervention to decipher exactly what is meant and to understand the nuances of particular words and phrases as well as the context.

For example, in a personality questionnaire or a clinical assessment, it is very difficult for a machine to understand which of the numerous possible interpretations of the word “upset” should be used in a particular translation – should it be “sad”, or should it be “annoyed”? Likewise, for verbal reasoning assessments, questions involving synonyms where the distractors have been chosen on the basis of their visual or semantic links with the correct answer option will not function correctly if translated by a machine.

Take a look at the following sample of a typical verbal reasoning item:

“He found the task assigned by his supervisor onerous…”

1) Which of the following words is a synonym of “onerous”?

  • superfluous
  • strenuous
  • scrupulous

Given the visual similarity in English between the answer options and the word in the stimulus, the question is likely to be more obscure in English than a direct translation into another language would be (“onerous” could become “difficult”, “superfluous” could become “not necessary” and “strenuous” could become “tiring”).

If we rely on direct machine translation and post-editing of these answer options, the difficulty level will likely decrease significantly, for several reasons. Firstly, as short answer options and questions such as these offer little context for a machine to interpret, it will struggle to produce translations as coherent as those it produces for more prosaic text.

Secondly, a given language may not have an equally high-register translation of “onerous” to obscure the correct answer. Likewise, the distractors have presumably been chosen in English for the very reason that they share common letters, sounds, collocations with other words, or a certain level of complexity.
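The point about shared letters can be made concrete with a rough sketch. The Python below simply measures the ending that each answer option shares with the stimulus word, first for the English options and then for the plain-English glosses used earlier as stand-ins for direct translations. The helper name is hypothetical, and a shared ending is only a crude proxy for visual similarity, not a psychometric measure of item difficulty.

```python
def shared_ending(first: str, second: str) -> str:
    """Return the run of letters that both words have in common at the end."""
    length = 0
    while (
        length < min(len(first), len(second))
        and first[-1 - length] == second[-1 - length]
    ):
        length += 1
    return first[len(first) - length:]


# English stimulus word and answer options from the sample item.
english_endings = {
    option: shared_ending("onerous", option)
    for option in ["superfluous", "strenuous", "scrupulous"]
}

# Glosses from the example above, standing in for direct translations
# (these are illustrative stand-ins, not real target-language text).
gloss_endings = {
    option: shared_ending("difficult", option)
    for option in ["not necessary", "tiring"]
}
```

Every English option shares the ending “ous” with “onerous”, whereas the glosses share no ending at all with “difficult”. Once that surface resemblance is lost, the distractors no longer compete with the correct answer in the way the item writer intended.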

Tests such as these rely on some form of adaptation of the items (although admittedly, this will depend on the nature of the items themselves). Given the volume of items for translation, machine translation followed by this kind of adaptation will not necessarily lead to cost savings compared with using human translators and revisers in the first place.

So, to summarise: while we look forward to leveraging developments in the technology for certain materials, we feel that for our core business of test translation, localisation and adaptation, there is still a way to go before NMT can enter mainstream use.