Many thanks, Barry. I'll try tokenising and truecasing the input sentence as you said.
Regards,
GG.

________________________________
From: Barry Haddow <bhad...@staffmail.ed.ac.uk>
Sent: Saturday, 2 August 2014 16:29
To: ULStudent:GIOVANNI.GALLO; Philipp Koehn
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] problem with translation output

Hi

Before decoding, you need to apply the same preprocessing steps to the input sentence as were applied during training. Normally this would mean tokenisation and either lowercasing or truecasing.

During decoding, Moses will split the sentence on whitespace only, so it will see "personas," as different to "personas", and if it has not seen the former in training (because the training data was tokenised) then it will pass it through untranslated.

cheers - Barry

On 02/08/14 12:12, ULStudent:GIOVANNI.GALLO wrote:

Hi Philipp,

Thank you for your reply. Here is an example:

Source sentence in Spanish
Es un local climatizado con capacidad para 150 personas, con una terraza para 30 personas y rodeado de una zona ajardinada.

Translated sentence into Italian
Es un locale climatizzato capienza 150 personas,|UNK|UNK|UNK con terrazza per 30 persone e circondato una zona ajardinada. |UNK|UNK|UNK

The strange thing is that if I remove the comma and the period from the source sentence, the |UNK|UNK|UNK disappear and "personas" and "ajardinada" get translated:

Es un locale climatizzato capienza 150 persone con terrazza per 30 persone e circondato una giardino

Any idea? Does it have to do with the fact that my model is unfactored?

Thank you in advance.
GG

________________________________
From: phko...@gmail.com on behalf of Philipp Koehn <pko...@inf.ed.ac.uk>
Sent: Friday, 1 August 2014 16:59
To: ULStudent:GIOVANNI.GALLO
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] problem with translation output

Hi,

this should not happen - are they really identical (uppercase/lowercase etc.)?

You can run the decoder with more verbose output (-v 2 or even -v 3) and the trace options (-t) to dig a bit deeper into what is going on.

-phi

On Fri, Aug 1, 2014 at 7:24 AM, ULStudent:GIOVANNI.GALLO <12064...@studentmail.ul.ie> wrote:

Hi everyone,

I'm running some experiments with Moses and I noticed that in translated sentences there are always one or two words/phrases that don't get translated (they appear as in the source sentence). I checked the phrase table and there are many entries corresponding to those words/phrases that might be used during decoding.

Do you have any idea what's happening here? Maybe something I need to modify in the moses.ini file?

Thank you in advance for your help.

Regards,
Giovanni Gallo.
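
Barry's point about whitespace splitting can be illustrated with a small Python sketch. It is not Moses code: `KNOWN` is a hypothetical toy vocabulary standing in for the source side of a phrase table, and `naive_tokenise` is a crude stand-in for a real tokeniser (it would, for example, wrongly split decimal numbers). It only shows why "personas," and "ajardinada." look like unseen words until the punctuation is separated.

```python
import re

# Hypothetical toy vocabulary: stands in for source-side phrase-table entries
# learned from tokenised training data.
KNOWN = {"150", "personas", ",", "con", "una", "zona", "ajardinada", "."}


def naive_tokenise(text):
    """Separate common punctuation from words (crude stand-in for a real tokeniser)."""
    return re.sub(r"([.,;:!?])", r" \1 ", text).split()


def unknown_tokens(sentence):
    """Split on whitespace only and return the tokens missing from KNOWN."""
    return [tok for tok in sentence.split() if tok not in KNOWN]


raw = "150 personas, con una zona ajardinada."
tokenised = " ".join(naive_tokenise(raw))

print(unknown_tokens(raw))        # ['personas,', 'ajardinada.']
print(tokenised)                  # 150 personas , con una zona ajardinada .
print(unknown_tokens(tokenised))  # []
```

In a real setup the input would go through the same tokenisation and truecasing (or lowercasing) that was applied to the training data, rather than a regular expression like the one above.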
_______________________________________________ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support