Machine Translation On Local Computer: the AI Way or Drained-Wallet Way

Short Introduction

Are you aware that you can have a fully fledged machine translation engine running on your own laptop?

I know. Who needs this at a time when Google offers translation for free on its site? Not to mention Microsoft, which is always a step behind Google in the Internet game.

Yet try to use either the Google or the Microsoft translation service from your own application, and you are in trouble. You have two options, and in the most common use case you lose with both:

Strategy no. 1

You drive the publicly available web interface by generating HTTP requests automatically. This strategy almost works, and it is implemented in several Python libraries. Unfortunately, both Google and Microsoft are resilient to such “attacks” and can block your requests. You can then catch the failures, delay your HTTP requests for a second or two, and try again. It is a classic cat-and-mouse game, and in the long run the cats win almost every time.
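To make the catch, delay, and retry loop concrete, here is a minimal sketch. The name `call_with_backoff` and its parameters are made up for illustration, and `fetch` stands for whatever function sends your HTTP request; none of this comes from a specific scraping library. The delay doubles after each failure, which is one common choice rather than the only one.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds or max_attempts is reached.

    fetch is a hypothetical zero-argument function that performs the
    request and raises an exception when the request is blocked.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            # Out of attempts: let the last error reach the caller.
            if attempt == max_attempts - 1:
                raise
            # Wait 1x, 2x, 4x, ... the base delay before trying again.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Even with a loop like this, you are only slowing the game down: once the service decides to block you for good, every attempt fails and the last exception is raised to your application.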

Strategy no. 2

Pay for a translation API. Both Google and Microsoft will give you an API key, and you can use it within your application as much as you need and pay for what you have used. What puzzles me is that the translations the API returns differ from the ones on their free web pages, and the differences are enormous. I have been there, tried them, and was surprised by the results. I can’t say this for sure because I have no proof from the inside, but, looking at them as black boxes, they seem to run different engines for the paid and the free translation services. Oh, and they are not as cheap as they might seem if you only look at the per-character prices. I just paid $45 to Microsoft for translation services used by a tiny test application that ran several translations over a text that weighs 7 KB. Try to put this in production, with texts of 200+ KB (a realistic scenario for my application), run the translations several times, since manual tuning is needed here and there while producing the final result, and just try to calculate the costs.
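If you want to do that calculation, a rough sketch could look like the one below. It assumes per-character billing and roughly one character per byte, so 200 KB is about 200,000 characters. The function name `estimate_cost` is made up, and the price you pass in is a placeholder: take the real number from your provider's current price list.

```python
def estimate_cost(
    chars_per_text: int,
    runs: int,
    price_per_million_chars: float,
) -> float:
    """Estimate the bill for translating one text `runs` times.

    price_per_million_chars is whatever your provider charges per
    million characters; it is an input here, not a quoted price.
    """
    total_chars = chars_per_text * runs
    return total_chars / 1_000_000 * price_per_million_chars


# Placeholder price, NOT a real quote from any provider.
PLACEHOLDER_PRICE = 20.0

# One 200 KB text (about 200,000 characters), translated 5 times
# while tuning the result by hand.
cost_per_text = estimate_cost(200_000, runs=5, price_per_million_chars=PLACEHOLDER_PRICE)
```

Multiply the result by the number of texts you process per month, and the per-character price stops looking small.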

Neither Google nor Microsoft is consistent in this game either: they change their translation REST APIs, and you have to spend unplannable time keeping your integration working.

Of course, there are smaller players in this game that specialize in translation only. They are far better than both Google and Microsoft. Take a look at https://www.deepl.com/translator. It is almost the same story again: good quality on the free web translation. They also provide an API for automated translation, and they are consistently good there. And expensive: twice as costly as the paid Google or Microsoft translation services.

Long story short: you either lose on quality with Google or Microsoft, or you get high quality and reliability from DeepL and let it drain your wallet.

Strategy no. 3: Use AI on your local computer.

Back to the original question: who needs this at a time when free translation is available on the web? The answer is: you do, if you need it inside your own application.

So, you need an automated translation service that is reachable from within your application. What you are translating is not that complicated: it’s not an essay, nor Shakespeare into Russian. Let’s say you need to translate customer feedback. If you are lucky, that is a bunch of short texts, and you need them translated for some kind of automated analysis.

It is entirely possible to set up and run machine translation locally on your computer. Of course, you cannot compete with the web-based translation of Google, Microsoft, or DeepL on quality and speed. But for a use case where all you need is automatic translation of texts that are neither very complex nor very large, all you need is:

  1. A Python environment with the transformers library installed,
  2. A willingness to wait a couple of seconds per translation, depending on the speed of your local machine. If you have a CUDA-enabled GPU, you’ll wait about 10 times less.
  3. A few lines of Python code, as follows:
from transformers import pipeline
translation = pipeline("translation_en_to_de")
translated_text = translation('I love AI', max_length=10)[0]['translation_text']
print(translated_text)

Yes, you counted right: 4 lines of code.
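For the customer feedback scenario, the same four lines can be wrapped into two small helpers. This is only a sketch: `build_translator` and `translate_all` are names I made up, and the `max_length=200` default is an arbitrary assumption you should tune to your texts. The `device` argument is the standard way to tell a transformers pipeline to use the first CUDA GPU (`0`) or the CPU (`-1`).

```python
from typing import Callable, Iterable, List


def build_translator(task: str = "translation_en_to_de", use_gpu: bool = False):
    """Create a translation pipeline on the CPU or on the first CUDA GPU."""
    # Imported here so the helper below can be used (and tested)
    # without transformers being loaded.
    from transformers import pipeline

    device = 0 if use_gpu else -1  # 0 = first GPU, -1 = CPU
    return pipeline(task, device=device)


def translate_all(
    texts: Iterable[str],
    translator: Callable,
    max_length: int = 200,  # assumed limit, adjust to your texts
) -> List[str]:
    """Translate the texts one by one and return the translated strings."""
    results = []
    for text in texts:
        output = translator(text, max_length=max_length)
        results.append(output[0]["translation_text"])
    return results
```

You would call it as `translate_all(feedback_texts, build_translator(use_gpu=True))`, where `feedback_texts` is your own list of strings. The first call downloads the model, so expect it to take longer than the following ones.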

Enjoy your slower but free AI lunch.

Stojancho Tudjarski

ML and AI enthusiast, learning new things all the time and looking at how to make something useful with them.