Dear Graham,
Greetings.
Could you please confirm my understanding of pre-ordering in your reply? I
take it to mean rearranging the words of the source-language sentences,
before training, so that their order resembles that of the target-language
sentences, which makes the pair behave more like a closely related language
pair.
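
To make the question concrete, here is a minimal sketch of what I understand
by pre-ordering. It is only a toy: the coarse tags ("NOUN", "VERB", "DET") and
the single "move verbs to the end" rule are my own assumptions for
illustration, not the method used by Moses, Travatar, or any published
pre-ordering system.

```python
# Toy pre-ordering: move English verbs to the end of the sentence so the
# source word order looks more like Japanese (subject-object-verb).
# The tag names and the single rule are illustrative assumptions only.

def preorder_svo_to_sov(tagged_tokens):
    """Return the words with all VERB-tagged words moved to the end.

    tagged_tokens: list of (word, tag) pairs with hypothetical coarse tags.
    Relative order inside the verb and non-verb groups is preserved.
    """
    non_verbs = [word for word, tag in tagged_tokens if tag != "VERB"]
    verbs = [word for word, tag in tagged_tokens if tag == "VERB"]
    return non_verbs + verbs


sentence = [("John", "NOUN"), ("ate", "VERB"), ("an", "DET"), ("apple", "NOUN")]
print(" ".join(preorder_svo_to_sov(sentence)))  # John an apple ate
```

If I have understood correctly, the same reordering would be applied to the
source side of both the training data and the test input, and the target side
would be left untouched.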

On Sat, Mar 26, 2016 at 10:57 AM, Graham Neubig <neu...@is.naist.jp> wrote:

> Hi Vito,
>
> English-Japanese and Japanese-English translation are very difficult due
> to the grammatical differences between the languages.
>
> You have a couple of options to overcome this problem:
> 1) If you want to use phrase-based Moses, you will have to perform some
> variety of pre-ordering, in which you rearrange the words in the source
> sentence before training/testing.
> 2) You can use a syntax-based system, either using the functionality in
> Moses (http://www.statmt.org/moses/?n=Moses.SyntaxTutorial), or using
> another decoder specifically designed for syntax-based MT such as my
> Travatar decoder (http://www.phontron.com/travatar/). I have released the
> setup for training our strongest Japanese-English and English-Japanese
> systems here: https://github.com/neubig/wat2014
>
> Regarding the different types of characters, I would leave them as-is. It
> is possible to perform normalization, which will help in a limited number
> of cases, but if you're just starting out this is really the least of your
> problems.
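
One possible reading of "normalization" here, assumed rather than stated in
the reply above, is Unicode width normalization. The sketch below uses
Python's standard unicodedata module with NFKC; the helper name
normalize_width is hypothetical. It folds full-width Latin letters and digits
and half-width Katakana into their usual forms, and leaves Kanji, Hiragana,
and ordinary Katakana as they are.

```python
import unicodedata


def normalize_width(text):
    """Apply Unicode NFKC normalization to a string.

    NFKC maps compatibility variants (full-width Latin letters and digits,
    half-width Katakana) to their standard forms without converting
    between Kanji, Hiragana, and Katakana.
    """
    return unicodedata.normalize("NFKC", text)


print(normalize_width("ＡＢＣ１２３"))  # ABC123
print(normalize_width("ｶﾀｶﾅ"))          # カタカナ
print(normalize_width("日本語"))        # 日本語 (unchanged)
```

Whether this helps depends on how consistent the corpus already is; it would
need to be applied identically to training and test data.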
>
> Graham
>
>
> On Fri, Mar 25, 2016 at 7:51 PM, Vito Mandorino <
> vito.mandor...@linguacustodia.com> wrote:
>
>> Dear all,
>>
>> Has anyone ever done experiments on English-Japanese and
>> Japanese-English translation? Do you know of useful resources for this
>> language pair, or any specific gotchas one should be aware of?
>>
>> More specifically, what is the best policy for dealing with the different
>> scripts? Do you think it is a good idea to keep the different scripts
>> (Kanji, Hiragana, Katakana, ...) in the corpus, or should one try to
>> convert Kanji into one of the other scripts?
>>
>> Best regards,
>>
>> Vito Mandorino
>>
>> --
>> *M. Vito MANDORINO -- Chief Scientist*
>>
>>
>>  *The Translation Trustee*
>>
>> *1, Place Charles de Gaulle, **78180 Montigny-le-Bretonneux*
>>
>> *Tel : +33 1 30 44 04 23   Mobile : +33 6 84 65 68 89*
>>
>> *Email :*  *vito.mandor...@linguacustodia.com
>> <massinissa.ah...@linguacustodia.com>*
>>
>> *Website :*
>> *www.linguacustodia.finance <http://www.linguacustodia.com/>*
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>


-- 
*Regards,*
Vishal Goyal,
Ph.D., M.Tech., MCA, M.C.S.D.
Associate Professor(Stage IV),
Department of Computer Science,
Punjabi University Patiala-147002.


*Machine Translation Systems:*
[*Online Hindi to Punjabi Machine Translation Tool -*
http://h2p.learnpunjabi.org ]
[*Statistical Approach Based Hindi to Punjabi Machine Translation System *
- http://statmt.org/~vishal/hp/index.cgi
- http://tdil-dc.in/hi2pu/index.cgi
]
*Online Journal: [Research Cell: An International Journal of Engineering
Sciences, http://ijoes.vidyapublications.com]*
*Book: A Simplified Approach to Data Structures, Shroff Publications and
Distributors*
http://www.shroffpublishers.com/detail.aspx?title=6163