I think what you're doing has been done successfully by a few people, for numbers and named entities too.
You're replacing a number with a placeholder in the training data. Did you do the same for the LM training data? Did you also replace numbers with placeholders in the input sentence? How did you then put the original number back in the output?

On 6 June 2013 09:03, Arezki Sadoune <[email protected]> wrote:
> Dear All,
>
> I'm having some trouble translating numbers on a Moses phrase-based
> system (some were misaligned by Giza, which has led to confusion in the
> phrase table). To address that issue, I tried several approaches to
> skip numbers during translation by making them unknown to the engine; I
> ended up converting each of them to the value '0' in my corpora so all
> of them would pass through unchanged (that was my hypothesis).
>
> After this processing and training the new system, I found out that it
> was a bad idea: the translation quality was significantly worse than
> that of an engine with the actual numbers, because of the large number
> of unknowns (I suppose).
>
> If any of you have had a similar experience, it would be tremendous if
> you could help me.
>
> Looking forward to an answer.
>
> Kind regards
>
> Arezki
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
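For anyone following the thread, the placeholder round-trip being discussed can be sketched roughly like this. This is a hypothetical pre/post-processing pair, not the poster's actual scripts; the `@num@` token, the regex, and the restore-by-position strategy are all assumptions. It also assumes the decoder passes the placeholder through unchanged and in the same order (in practice you would want Moses XML markup or alignment info rather than trusting word order):

```python
import re

# Matches integers and numbers with , or . group/decimal separators.
NUM_RE = re.compile(r"\d+(?:[.,]\d+)*")
PLACEHOLDER = "@num@"  # hypothetical token; any symbol unseen in the corpus works

def mask_numbers(sentence):
    """Replace each number with a placeholder and remember the originals in order."""
    numbers = NUM_RE.findall(sentence)
    masked = NUM_RE.sub(PLACEHOLDER, sentence)
    return masked, numbers

def unmask_numbers(translation, numbers):
    """Put the original numbers back, one per placeholder, left to right."""
    out = translation
    for n in numbers:
        out = out.replace(PLACEHOLDER, n, 1)
    return out

masked, nums = mask_numbers("The invoice of 1,250 euros is due on 15 June 2013.")
# masked == "The invoice of @num@ euros is due on @num@ June @num@."
# Assuming the decoder emits the placeholders unchanged:
restored = unmask_numbers("La facture de @num@ euros est due le @num@ juin @num@.", nums)
# restored == "La facture de 1,250 euros est due le 15 juin 2013."
```

The same masking would have to be applied to the parallel training data and the LM training data, so that the placeholder is a frequent, well-aligned token rather than an unknown word (which is what the all-zeros experiment effectively created).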
