Amir E. Aharoni wrote:
> On Tue, Jun 9, 2009 at 23:42, Brian<brian.min...@colorado.edu> wrote:
>> Google has built in support for using its machine translation technology to
>> help bootstrap human translations of Wikipedia articles.
>>
>> http://translate.google.com/toolkit/docupload
>>
>> The benefit to Google is clear - they need sentence-aligned text in multiple
>> languages in order to bootstrap their automated system.
>>
>> This is a great example of machines helping people help machines help
>> people, etc... I'm sure this is now the most efficient way to produce high
>> quality translations of Wikipedia articles en masse.
>>
>> We should take the ToS to make sure the translated text can be CC-BY-SA
>> licensed.
>
> OK, after a bit of drama in this discussion, i actually tried this toolkit.
>
> Then i tried to translate [[Art critic]] from English into Hebrew.
> There were a few pleasant surprises, but on the whole the machine
> translation was bad to the point of being unusable. It is much easier
> to translate it using vi.

I tried translating [[Astronomy]] and [[Eothyrididae]] (at least, the part of it that is in English) to Serbian and was pleasantly surprised. Sure, literally every sentence needed major corrections, but for me it was still much easier to do that than to translate from scratch.

> I *had* to make very deep changes to paragraph structure - not to
> mention sentence structure -, and not just because the Hebrew
> Wikipedia has a different MOS, but because it's the basis of the

This is then apparently a case of English→Hebrew translation working worse than English→Serbian (possibly because Hebrew is a non-Indo-European language)? I have never had to make any changes to paragraph structure, only occasional changes to sentence structure (I'd say I had to change the structure of about 10% of sentences, and another 10% had an uncommon structure that I let slide).

> Hebrew language. A text without these changes would be next to
> unreadable. I doubt that a document which is changed so deeply is very

While I would probably delete an article dumped straight from a machine translation, I still find the output fully understandable. To illustrate:

  Then i tried to translate [[Art critic]] from English into Hebrew.
  There were a few pleasant surprises, but on the whole the machine
  translation was bad to the point of being unusable. It is much easier
  to translate it using vi.

translates to:

  Tada sam pokušao prevesti [[umetnički kritičar]] sa engleskog na
  hebrejskom. Bilo je nekoliko ugodnih iznenađenja, nego na ceo mašina
  prevod je loš do tačke da je neupotrebljiva. To je mnogo lakše prevesti
  preko VI.

I would retranslate this to broken English like:

  Then i tried to translate [[Art critic]] from English into Hebrew's.
  There were a few pleasant surprises, than on entire machine's
  translation was bad to the point of being unusably. Much easier
  translated via VI.

and the correct version would be (I highlighted the changes):

  Tada sam pokušao prevesti [[umetnički kritičar]] sa engleskog na
  *hebrejski*. Bilo je nekoliko ugodnih iznenađenja, *ali u celini*
  *mašinski* prevod je loš do tačke da je *neupotrebljiv*. *Mnogo je*
  lakše prevesti *ga* *pomoću vi-ja*.