You'd think everyone would be aware of the fact that automatic machine translation tools -- such as Google Translate, Babelfish, or Systran -- do not create polished, publishing-quality translations. They are useful tools, but what you get is a "gist translation" which requires human correction, sometimes a lot of it.
Still, more and more sites are offering automatically generated translations of their content. Many of the users of such services are probably completely unaware of what gibberish they are putting on their web sites. And it would be just their own problem if it were not for the fact that these junk pages end up in Google results.
In order to avoid improving their search engine rankings, I will not link directly to any such sites, but an example is at www dot raymond dot cc slash blog slash archives slash 2007 slash 06 slash 04 slash mac-osx-tiger-theme-on-windows slash fi slash.
In the interest of illustrating the problem, here is my carefully human-generated translation back to English of the Finnish text on that page. For orthogonality I have put in Finnish words where the translator had left in an English word. This particular translation tool also left in some grammatical particles which I have simply written as PARTICLE. I have not referred to the English original text. Here's the first couple of paragraphs.
I-LETTER stake olet' hear multiple PARTICLE occasionally that lot PARTICLE Window Outlook connection has to cope descend Son OSX. Want so that checkered broken Son OSX connection itself only special language' gives only that is special language' put Intelligence translation PARTICLE Son OSX degrees? Your endure preemption be so that install Son OSX subject malli jälkeen Window.
dobee lienee cause joltain excellent Son OSX subject guest OSX Jaguar duration Windows.
Laugh or cry? Your call.
But seriously, the providers of these tools and services should make sure they don't end up polluting search engine results. The Robots Exclusion Standard explains how to do this. Basically, put in a "noindex" meta tag on the generated page.
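For the tool providers, here is a minimal sketch of what that could look like, assuming the tool has each generated page available as an HTML string before it is served. The tag itself, `<meta name="robots" content="noindex">`, is the standard robots meta tag; the function name `add_noindex` and the regex-based approach are just illustrative assumptions, not how any particular translation plugin works.

```python
import re

# The standard robots meta tag that asks crawlers not to index a page.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def add_noindex(html: str) -> str:
    """Return html with a robots noindex meta tag right after the opening <head>.

    Hypothetical helper for illustration only.
    """
    # Leave the page alone if it already carries a robots meta tag.
    if re.search(r'<meta\s+name=["\']robots["\']', html, flags=re.IGNORECASE):
        return html

    # Insert the tag directly after the first opening <head ...> tag.
    new_html, count = re.subn(
        r'(<head\b[^>]*>)',
        lambda m: m.group(1) + "\n" + NOINDEX_TAG,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        # No <head> found: refuse rather than guess where the tag belongs.
        raise ValueError("page has no <head> element")
    return new_html


if __name__ == "__main__":
    page = "<html><head><title>Machine translation</title></head><body></body></html>"
    print(add_noindex(page))
```

A real tool would more likely set this in its page template than patch the markup afterwards; the point is only that the machine-translated variants get the tag while the original page stays indexable.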
Some of the translations are kind of peculiar. The word "akkuna" means window but it's markedly dialectal. The word "etuosto-oikeus" is very specific to stock trading; it does mean "preempt" but in the very narrow context of shareholders having the right to buy shares before they are sold to a third party (it quite literally means "preemptive buying rights"). In fact googling for 'translation "etuosto-oikeus"' mainly seems to bring up other samples of badly translated pages, probably produced by the same tool. I first assumed this one originated from Google Translate (probably via some third-party tool), but apparently not.
Update: this looks like the culprit: http://anaconda.taragana.net/angsumans-translator-plugin-pro-version-31-released/
Quotable quip: "Did you know Google loves our translated pages?" Oh noes. I rest my case.
