Re: [Wikimediaindia-l] Philosophical view on Google translated articles

2010-04-21 Thread BalaSundaraRaman
Hi Samuel,

Thanks for the clarification. Good to know that the foundation is in the know.

Ravi and I have acted as interlocutors with the Google team for Tamil 
Wikipedia. We have exchanged several emails and have had one conference call 
with the Google team. During these communications, we have conveyed clear 
bullet-pointed requirements that are the bare minimum necessities to meet our 
guidelines and are very much doable. Of these, to be fair, they did address 
some of our issues, but not the most important ones.

The most important of the issues stem from the pillars of Wikipedia and we 
absolutely can't compromise on that. For Google, the required outcome is the 
number of words in Indian languages SEOed from their query logs. For the 
translators, it's the money that they'll get for each word translated. For 
Wikipedia, the basic necessity is readable and meaningful content added through 
a process that doesn't subvert the Wiki way.

Following is a summary:
1. The quality is abysmal: too mechanical, and ungrammatical more than 50% of 
the time. [To set the context for Samuel (who might assume that it works like 
it does for European languages): the toolkit is nowhere near ready for Indian 
languages and doesn't do any translation as such. It's the translators who do 
that, and it's unimaginable that a native speaker writes those words (they are 
not sentences).]
2. The process is hands-off; the translators don't even read the pages that 
they've dumped.
3. The pages are broken, with countless erroneous redlinks and missing 
templates due to an easy-to-fix bug in the kit.
4. The basic premise of the team is 'something's better than nothing'. It's 
not. Having no article on a subject is better than having an unreadable text of 
2000 words on that subject.
5. Their process requirement: you can pick subjects and give guidelines, but we 
can't guarantee anything. We don't carry any responsibility to improve the 
articles once dumped and we don't want you to mess with them. Of course, on the 
last point, they have backed down. They agreed to have a look at talk page 
feedback, but only one translator (of nearly 20-30) has responded so far. This 
is CLEARLY unacceptable and our editors have said so in so many words.

I also request the community here and the foundation folks to reflect on the 
policy issues: how can we let someone post articles below any acceptable 
standard which they won't edit further? Tomorrow, if a vandal does the same, 
won't we block them? On top of this, they casually mentioned some sort of 
agreement or contract with the foundation, but declined to give any information 
regarding that. Either they don't get what Wikipedia is or they don't care 
about it.

On a positive note, we still have our channel open with them and we're going to 
propose that they approach universities or the Classical Tamil Institute in 
Chennai, which undertake such projects employing retired Tamil professors and 
teachers. We will also ask them to accept an obligation to fix issues before 
adding new articles. If they can't do that, we don't have any other option left.

- Sundar

 That language is an instrument of human reason, and not merely a medium for 
the expression of thought, is a truth generally admitted.
- George Boole, quoted in Iverson's Turing Award Lecture



----- Original Message -----
 From: Samuel Klein meta...@gmail.com
 To: wikimediaindia-l@lists.wikimedia.org
 Sent: Wed, April 21, 2010 11:41:16 PM
 Subject: [Wikimediaindia-l] Re: Philosophical view on Google translated
 articles

 Hello,

 My first post on this list, and a long one :-)  The topic of better
 supporting small language Wikipedias is one that is close to my heart.

 The foundation doesn't have any particular policy on third-party
 translations or article-writing projects.   As Achal says, every
 community is welcome to use translation tools or not as they see fit;
 and to work with outside translation groups or not as they see fit.

 Ravi's concerns are valid -- people interested in translation as a
 whole may want to discuss some of these issues on the foundation and
 translation mailing lists -- you will find that there are many
 multilingual editors who are interested in the good (and bad) uses of
 GTT and other translation tools.


 == on the use of automatic translations ==

 Automatic translations can be useful as one arrow in the quiver of a
 community of editors.  For instance, I find it helpful for translated
 pages to have an automatic category, and a large cleanup template at
 top, something like:
   this page was automatically translated by [TOOL]
   from [permalink to revision of article in another language].
   It may need cleanup to meet [[STYLE GUIDE|community standards]].

 In the case of Google and their Translation Toolkit, I think it would
 be good for Wikipedians to give them strong feedback about how they
 need to improve the tool for it to be more useful to Wikipedians.
 (and, if it is more of a nuisance than a help

Re: [Wikimediaindia-l] Philosophical view on Google translated articles

2010-04-20 Thread Ramesh N G
I have tried the Google translator on about five or six en articles,
translating them into Malayalam.

The difficulty I felt was the word-for-word translation. We have a
different style. Though we translate the content of a lot of en wiki
articles to ML, we don't do it word by word. We need to change the
sentence, or sometimes the entire explanation, to fit the way we write
ml articles.
I have dropped the Google translator option for ml wiki for now.

regards
Rameshng


On Tue, Apr 20, 2010 at 10:51 PM, Shiju Alex shijualexonl...@gmail.com wrote:
 Ravi,

 Here is my view on your questions.

 However, a majority of the Tamil Wiki community is showing stiff
 resistance for this operation.

 I too support them :). Word-for-word translation is not good from the
 reader's point of view. It may also kill the project.


 1. Is Paid editing against Wikipedia principles or spirit?

 Yes, according to me.

 2.  Besides the article count etc., Wiki is first of all a vibrant and
 healthy community of like minded individuals interacting on a friendly note.
 This is very important in a small community especially. But this operation
 creates a divide like regular Wikipedians Vs Google Translators. User - 
 User interaction has changed to User -  Google -  Translation team
 coordinator  interaction. From friendly reminders we have reached a
 complaining stage.

 Is this good for the community in the long run?

 No, it is not good for the community or for the wiki.

 3. Who benefits more from this operation? Google or Wiki? Even if it is
 assumed that it will serve the language ultimately, who has the control in
 this operation? Of course, the Wiki communities have the control but they
 haven't exercised it yet.

 This can hardly be considered partnership or collaboration. Google has not
 been transparent on this so far.

 Of course. Google will benefit more.  Wiki will get many artificial
 word-to-word translated articles from which the readers will run away.
 Google will enhance their English to Tamil translator tool through this
 exercise.

 4. Wikipedia is a volunteer project and Wikipedians contribute out of free
 will. Is this free will ensured for the Google Translators? (choice of
 articles, translation style, work load, tool for translation etc.,)

 No, since they are translating articles for money. And Wikipedians will not
 have time to go through each and every word that they translate. I think
 Tamil Wikipedians need to concentrate more on the Tamil Wikipedia article
 contest now. So who will review the Google translated articles?

 5. Use of Google translation kit is not wrong per se. But the Kit is
 partly responsible for many of the issues. Is it right to continue using
 this without cleaning up existing articles?

 I am against any type of word-for-word translation of English Wiki articles
 to any other language. We tried this on Malayalam Wikipedia 3 years
 ago. We discontinued it since we found that the exercise was creating
 artificial articles, which is not good from the reader's point of view.

 6. Not all en wiki articles are the best. Some have factual errors and
 some have bias. Especially, articles on culture, politics etc., Is it good
 to translate them as it is? A paid translator can hardly be expected or
 allowed to correct them.

 Your concern is true.

 7. Style of English and Style of Indian languages are quite different.
 Since the translation is done through the kit, we can see literal
 translations resulting in a dry or artificial style in the local language.
 This can harm the nature of the local language in long run. Translation is
 OK but not everywhere and as it is.

 True. I already mentioned this above.

 8. IS THE WIKIMEDIA FOUNDATION AWARE OF THIS OPERATION ? WHAT IS ITS VIEW?
 Its view need not be binding on the local community. Nevertheless it will be
 interesting to know.

 It is the local wiki community to decide about this.

 9. WHY HASN'T GOOGLE ANNOUNCED ABOUT THIS PROJECT OPENLY YET? Google has
 emphasized many times that it doesn't create information but only organizes
 it. Whenever it showed hints of creating information, it was highly noted
 with concern. This operation is a clear move towards information creation.

 Since they are not sure whether this project will happen.

 At the outset, this looks like a good gesture from Google to Indian
 language Wikis. But it should also be noted that Google and
 Wikimedia are both powerful entities on the Internet and one's effect on
 the other should be watched.

 Google's only aim is to enhance their English- Indian language translator
 tools. All other discussions are the ways to achieve that aim.


 Shiju



 On Tue, Apr 20, 2010 at 9:19 PM, Ravishankar ravidre...@gmail.com wrote:

 Hi all,

 An update on the Google translated articles issue in Tamil Wiki.

 We exchanged few emails with the Google team and had a conference call
 once and the progress so far has been:

 * Google gets our approval on article topics before translating.
 * Some of the 

Re: [Wikimediaindia-l] Philosophical view on Google translated articles

2010-04-20 Thread Ravishankar
Hi Shiju, Ramesh

Thanks for the input. Can you update the status in ml wiki regarding Google
translated articles? It seems it is banned in Bengali Wikipedia.

We understand that this project is run in many Indian language Wikis but we
are not sure of the exact list.

Manish,

This whole conversation is about using

http://translate.google.com/toolkit/

The Kit by itself may be somewhat flexible to edit as you mentioned. But the
paid translators don't seem to be using it.

Ravi
___
Wikimediaindia-l mailing list
Wikimediaindia-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimediaindia-l


Re: [Wikimediaindia-l] Philosophical view on Google translated articles

2010-04-20 Thread Srikanth Ramakrishnan
Manish, what you say is true. Sometimes even I use Google Translator to
translate simple words to Hindi, maybe because I don't know the word, or
because I only know the Urdu word. Whatever the reason, I still use it. The
toolkit is something I don't like, as it messes things up.

On 21 April 2010 00:42, Manish Goregaokar manishsm...@gmail.com wrote:

 The reason for sending the mail was translation memories (which I stupidly
 forgot to put in).
 An added benefit of the Kit which I forgot to mention is that you can use
 translation memories
 <http://translate.google.com/support/toolkit/bin/answer.py?answer=147863&hl=en>.
 In these, you can upload translations (in TMX
 <http://en.wikipedia.org/wiki/Translation_Memory_eXchange> format), which
 take precedence over the default memories. This drastically improves
 performance.
 For example, you can upload a TMX file (I'm not sure how to make them; try
 the third-party tools
 <http://en.wikipedia.org/wiki/Translation_Memory_eXchange#Third-party_tools>
 listed at enwiki), which contains a correctly translated passage from
 English into Malayalam, along with the originally translated passage
 (mistakes will get greatly magnified). Many such passages in the TM teach
 the Kit to work better for those using the TM (which can be shared).
 Making TMs especially helps change the style/dialect of the translation.
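
Since the message above leaves open how a TMX file is made, here is a minimal
sketch in Python that writes English/Tamil sentence pairs in the general
TMX 1.4 layout (a header plus one translation unit per pair). The function
names, the header values such as "example-script", and the sample pair are
illustrative assumptions, and whether the Toolkit accepts exactly this file
has not been checked here.

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serialises as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="ta"):
    """Return a TMX document (as a string) for (source, target) pairs."""
    tmx = ET.Element("tmx", version="1.4")
    # Header attribute values below are placeholders for illustration.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit (tu) holds one variant (tuv) per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def read_tmx(xml_text):
    """Parse a TMX string back into a list of {language: segment} dicts."""
    root = ET.fromstring(xml_text)
    return [
        {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")}
        for tu in root.iter("tu")
    ]


if __name__ == "__main__":
    # Sample pair chosen only for illustration.
    print(build_tmx([("Hello", "வணக்கம்")]))
```

Hand-checked sentence pairs are what matter in such a file; the script only
wraps them in the XML structure.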

 Actually, I support completely hand-translated documents, because, when
 hand translating, the translation flows out of you, along with the style.
 Editing machine translations inhibits your mind and doesn't allow the use of
 style except in spurts. I know this because I've tried translating stuff
 from English to French (not for WP, though), and I found the toolkit useful,
 but I decided to stop and I typed the rest of it, using Google only as a
 dictionary.
 This might be the reason the translators don't want to use the Kit.
 Manish*Earth* <http://en.wikipedia.org/wiki/User:Manishearth>
  • Talk <http://en.wikipedia.org/wiki/User_talk:Manishearth>
  • Stalk <http://en.wikipedia.org/wiki/Special:Contributions/Manishearth>



 On Tue, Apr 20, 2010 at 1:57 PM, Ravishankar ravidre...@gmail.com wrote:

 Hi Shiju, Ramesh

 Thanks for the input. Can you update the status in ml wiki regarding
 Google translated articles? It seems it is banned in Bengali Wikipedia.

 We understand that this project is run in many Indian language Wikis but
 we are not sure of the exact list.

 Manish,

 This whole conversation is about using

 http://translate.google.com/toolkit/

 The Kit by itself may be somewhat flexible to edit as you mentioned. But
 the paid translators don't seem to be using it.

 Ravi





-- 
Regards,
Arun Srikanth.
Wear a Lungi, Support the Movement, Uyirin Uyirae
Sent from my Motorola SLVR L9.