Re: [Moses-support] Regarding Transliteration using Moses

Hassan Sajjad Sat, 30 Nov 2013 23:08:49 -0800

Hi Pranjal,

If the command you mentioned is the one for your transliteration system, then
you are very close to done. Rather than passing the whole word as one token,
like the following:
echo 'भारत'| ~/mosesdecoder/bin/moses -f ~/work1/train/model/moses.ini


put a space between the characters of the input word. Moses then treats each
character as a token and the word as a complete sentence, and outputs its
translation (that is, its transliteration) according to the model:
echo 'भ ा र त'| ~/mosesdecoder/bin/moses -f ~/work1/train/model/moses.ini
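
As a minimal sketch of that preprocessing step, something like the following
Python would insert the spaces. The helper name split_chars is made up for
illustration, and it assumes that splitting by Unicode code point is the
granularity your model was trained on.

```python
def split_chars(word):
    """Return a single word with a space between every Unicode code point."""
    # Iterating over a str yields code points, so a dependent vowel sign
    # such as U+093E becomes its own token, as in the command above.
    return " ".join(word.strip())


if __name__ == "__main__":
    print(split_chars("भारत"))
```

Whatever splitting you choose, apply the same one to the training word lists
and to the test input, as Prashant notes further down the thread.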

Now your output will look something like 'bh a r a t'. You can use a simple
Perl script to remove the spaces between the characters.
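
If you would rather not write Perl, a rough Python equivalent is sketched
below. The name join_tokens is hypothetical, and it assumes the decoder prints
one transliterated word per line.

```python
def join_tokens(line):
    """Remove all whitespace between the output tokens of one line."""
    # str.split() with no argument splits on any run of whitespace and
    # drops the trailing newline, so joining the pieces gives one word.
    return "".join(line.split())


if __name__ == "__main__":
    print(join_tokens("bh a r a t"))  # prints: bharat
```

Wrapping the call in a loop over sys.stdin would let you pipe the decoder
output straight through it.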

You can also generate an n-best list of transliterations by using -n-best-list
listfile 100 (the output file followed by the list size). See
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc8
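
For reference, each line of the n-best file has fields separated by ' ||| ':
the sentence number, the hypothesis, the feature scores, and the total score.
A minimal sketch for turning such lines into joined candidates follows. The
function name and the sample line (including its scores) are invented for
illustration, and it assumes the default layout with the total score in the
fourth field.

```python
def parse_nbest_line(line):
    """Split one n-best line into (sentence id, joined candidate, total score)."""
    fields = [field.strip() for field in line.split("|||")]
    sentence_id = int(fields[0])
    # The hypothesis is a sequence of space-separated characters; join them.
    candidate = "".join(fields[1].split())
    # Assumption: the total score is the fourth field (default layout).
    total_score = float(fields[3])
    return sentence_id, candidate, total_score


if __name__ == "__main__":
    # Invented sample line, not real decoder output.
    sample = "0 ||| bh a r a t ||| d: 0 lm: -10.5 tm: -2.0 w: -5 ||| -4.25"
    print(parse_nbest_line(sample))
```

Reading the list file line by line through this function gives up to 100
candidate transliterations per input word, best first.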

Cheers,
Hassan




On Sun, Dec 1, 2013 at 3:13 AM, Pratik Jain <[email protected]> wrote:

> Pranjal, as I told you in my previous reply, you can't get the whole word
> transliterated in one piece. You have to break it up, transliterate the
> pieces, and join them again.
>
>
> On Sat, Nov 30, 2013 at 12:24 PM, Pranjal Das <[email protected]> wrote:
>
>> Thank you, Prashant and Pratik. I have done the character-wise splitting
>> and then the training, as you said, but I am unable to do the
>> reassembly. For example, the letter ' भ ' is transliterated to bh and ' र '
>> is transliterated to r, but the complete word
>> भारत is not transliterated; it gives the same output, भारत. Could
>> you please guide me on how I can get the complete word?
>>  echo 'भारत'| ~/mosesdecoder/bin/moses -f ~/work1/train/model/moses.ini
>> gives the output भारत
>>
>> *Pranjal Das*
>> Department of Information Technology,
>> Institute of Science and Technology,
>> Gauhati University,Guwahati,Assam
>> Phone- +91-8399879454
>>
>>
>> On Fri, Nov 29, 2013 at 1:02 PM, Prashant Mathur <[email protected]> wrote:
>>
>>> Hi Pranjal,
>>>
>>> When you trained the system, it learned no rule for translating भारत, but
>>> it did learn translation rules for भा and रत.
>>> Having said that, if you want a good transliteration system you should
>>> separate the vowels from the consonants. For example, to split भारत, try
>>> splitting it like
>>> भ + ा + र + त  -> bh + a + ra + ta
>>> It can be even more fine-grained.
>>>
>>> And when you want to transliterate a word, you have to split the
>>> test set the same way as the training data.
>>>
>>> This is how I would do it.
>>> I don't know how Pratik did it.
>>>
>>> Just curious: have you tried translating from English to WX
>>> notation for Hindi?
>>>
>>> Prashant
>>> On Nov 29, 2013 10:33 AM, "Pranjal Das" <[email protected]> wrote:
>>>
>>>> Hi,
>>>> I have tried what you said, Pratik, but I can't get the original word.
>>>> For example, with भारत split as
>>>> भा रत -> bha rat, if I echo bha I get the word भा, but if I write
>>>> भारत in its place, it is not transliterated. Please help me out if you
>>>> have any idea how to get the complete word. I suppose the words that
>>>> are not translated must be transliterated.
>>>> Thanks
>>>>
>>>>
>>>> regards
>>>>
>>>> *Pranjal Das*
>>>> Department of Information Technology,
>>>> Institute of Science and Technology,
>>>> Gauhati University,Guwahati,Assam
>>>> Phone- +91-8399879454
>>>>
>>>>
>>>> On Wed, Nov 27, 2013 at 9:44 PM, Pratik Jain <[email protected]> wrote:
>>>>
>>>>> Just get a parallel list of words in the two languages. Break
>>>>> each word syllable-wise or simply character-wise, so that each line of
>>>>> the two files holds a single word with a space between each character
>>>>> (or each syllable). Use these two files to train the system as a normal
>>>>> translation task.
>>>>> If your word list is large enough, this works quite nicely. Try it out;
>>>>> it worked well for me.
>>>>>
>>>>>
>>>>> On Wed, Nov 27, 2013 at 10:49 PM, Hieu Hoang <[email protected]> wrote:
>>>>>
>>>>>> There's no support for transliteration in Moses. If you create one,
>>>>>> please let us know and we can add it to Moses.
>>>>>>
>>>>>>
>>>>>> On 27 November 2013 07:44, Pranjal Das <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> I have completed my translation experiments using the Moses toolkit,
>>>>>>> following the manual for installing Moses, GIZA++, and IRSTLM. I have
>>>>>>> tried out the sample models for translation and applied them to my own
>>>>>>> parallel corpus (fr-en). Now I am facing problems with transliteration,
>>>>>>> as there is no proper procedure given in the manual for transliteration
>>>>>>> using Moses. Kindly help me with the transliteration steps, or tell me
>>>>>>> where I can find them.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thank you
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> *Pranjal Das*
>>>>>>>  Department of Information Technology,
>>>>>>> Gauhati University Institute of Science and Technology, Guwahati, India
>>>>>>> Phone- +91-8399879454
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Moses-support mailing list
>>>>>>> [email protected]
>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Hieu Hoang
>>>>>> Research Associate
>>>>>> University of Edinburgh
>>>>>> http://www.hoang.co.uk/hieu
>>>>>>
>>>>>
>>>>
>>
>
