Dear Graham,

Greetings. Could you please clarify what pre-ordering means in your reply? Does it mean making the word order of the source-language sentences and the target-language sentences similar before training, so that the language pair behaves like a closely related pair?
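To make the question concrete, here is a minimal sketch of one way such a reordering could be derived. It assumes word alignments are available and moves only the source tokens into the order of their aligned target tokens, which matches the reply quoted below (the source sentence is rearranged, the target is left alone). The function name `preorder`, the toy sentence pair, and the alignment are illustrative assumptions, not something taken from Moses or Travatar. At test time no target sentence exists, so a real system would need rules or a trained reordering model instead of alignments.

```python
from typing import Dict, List, Tuple


def preorder(source: List[str], alignment: List[Tuple[int, int]]) -> List[str]:
    """Reorder source tokens to follow the order of their aligned target tokens.

    alignment holds (source_index, target_index) pairs. An unaligned source
    token inherits the position of the nearest aligned token to its left.
    """
    # Collect the target positions aligned to each source index.
    targets: Dict[int, List[int]] = {}
    for s, t in alignment:
        targets.setdefault(s, []).append(t)

    keys = []
    last = -1.0  # unaligned tokens at the start of the sentence stay first
    for i in range(len(source)):
        if i in targets:
            # Use the mean target position when a token has several links.
            last = sum(targets[i]) / len(targets[i])
        # The original index breaks ties, so equal keys keep source order.
        keys.append((last, i))

    order = sorted(range(len(source)), key=lambda i: keys[i])
    return [source[i] for i in order]


# Hypothetical English-Japanese pair:
#   source: I ate an apple
#   target: watashi wa ringo o tabeta
source = ["I", "ate", "an", "apple"]
alignment = [(0, 0), (1, 4), (2, 2), (3, 2)]
print(" ".join(preorder(source, alignment)))
```

With this toy alignment the verb moves to the end of the English sentence, giving a Japanese-like subject-object-verb order on the source side.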
On Sat, Mar 26, 2016 at 10:57 AM, Graham Neubig <neu...@is.naist.jp> wrote:

> Hi Vito,
>
> English-Japanese and Japanese-English translation are very difficult due
> to the grammatical differences between the languages.
>
> You have a couple options to overcome this problem:
> 1) If you want to use phrase-based Moses, you will have to perform some
> variety of pre-ordering, in which you rearrange the words in the source
> sentence before training/testing.
> 2) You can use a syntax-based system, either using the functionality in
> Moses (http://www.statmt.org/moses/?n=Moses.SyntaxTutorial), or using
> another decoder specifically designed for syntax-based MT such as my
> Travatar decoder (http://www.phontron.com/travatar/). I have released the
> setup for training our strongest Japanese-English and English-Japanese
> systems here: https://github.com/neubig/wat2014
>
> Regarding the different types of characters, I would leave them as-is. It
> is possible to perform normalization, which will help in a limited number
> of cases, but if you're just starting out this is really the least of your
> problems.
>
> Graham
>
> On Fri, Mar 25, 2016 at 7:51 PM, Vito Mandorino
> <vito.mandor...@linguacustodia.com> wrote:
>
>> Dear all,
>>
>> has anyone ever done experiments for English-Japanese and
>> Japanese-English translation? Do you know about useful resources for this
>> language pair, or some specific gotchas one should be aware of?
>>
>> More specifically, what is the best policy for dealing with alphabets? Do
>> you think it is a good idea to keep different alphabets (Kanji, Hiragana,
>> Katakana, ...) in the corpus, or should one try to convert Kanji into one
>> of the other alphabets?
>>
>> Best regards,
>>
>> Vito Mandorino

--
Regards,
Vishal Goyal, Ph.D., M.Tech., MCA, M.C.S.D.
Associate Professor (Stage IV), Department of Computer Science,
Punjabi University, Patiala-147002.

Machine Translation Systems:
Online Hindi to Punjabi Machine Translation Tool - http://h2p.learnpunjabi.org
Statistical Approach Based Hindi to Punjabi Machine Translation System - http://statmt.org/~vishal/hp/index.cgi - http://tdil-dc.in/hi2pu/index.cgi
Online Journal: Research Cell: An International Journal of Engineering Sciences, http://ijoes.vidyapublications.com
Book: A Simplified Approach to Data Structures, Shroff Publications and Distributors, http://www.shroffpublishers.com/detail.aspx?title=6163
_______________________________________________ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support