Many thanks, Barry. I'll try tokenising and truecasing the input sentence as 
you said.


Regards,

GG.

________________________________
From: Barry Haddow <bhad...@staffmail.ed.ac.uk>
Sent: Saturday, 2 August 2014 16:29
To: ULStudent:GIOVANNI.GALLO; Philipp Koehn
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] problem with translation output

Hi

Before decoding, you need to apply the same preprocessing steps to the input 
sentence as were applied during training. Normally this would mean tokenisation 
and either lowercasing or truecasing.

During decoding, Moses will split the sentence on whitespace only, so it will 
see "personas," as different from "personas", and if it has not seen the former 
in training (because the training data was tokenised) then it will pass it 
through untranslated.
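
To make the point concrete, here is a minimal Python sketch of the effect. It is 
only an illustration: the KNOWN set is a made-up stand-in for the source side of 
a phrase table, and toy_tokenise is a crude regex, not the Moses tokeniser. In 
practice you would run the same tokeniser and truecaser that were used on the 
training data.

```python
import re

# Hypothetical vocabulary standing in for source-side phrase-table entries.
KNOWN = {"capacidad", "para", "150", "personas", ",", "."}

def whitespace_split(sentence):
    """Split on whitespace only, which is how the decoder sees its input."""
    return sentence.split()

def toy_tokenise(sentence):
    """Crude stand-in for a real tokeniser: separate punctuation from words."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def unknown_tokens(tokens, vocab):
    """Return the tokens that have no entry in the vocabulary."""
    return [t for t in tokens if t not in vocab]

raw = "capacidad para 150 personas,"

# Untokenised input: "personas," (with the comma attached) is not in the vocabulary.
print(unknown_tokens(whitespace_split(raw), KNOWN))

# Tokenised input: "personas" and "," are looked up separately and both are known.
print(unknown_tokens(toy_tokenise(raw), KNOWN))
```

The first call reports "personas," as unknown, while the second reports nothing, 
which matches the behaviour seen when the comma and the period are removed by hand.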

cheers - Barry


On 02/08/14 12:12, ULStudent:GIOVANNI.GALLO wrote:

Hi Philipp,


Thank you for your reply. Here is an example:


Source sentence in Spanish

Es un local climatizado con capacidad para 150 personas, con una terraza para 
30 personas y rodeado de una zona ajardinada.

Translated sentence into Italian

Es un locale climatizzato capienza 150 personas,|UNK|UNK|UNK con terrazza per 
30 persone e circondato una zona ajardinada. |UNK|UNK|UNK


The strange thing is that if I remove the comma and the period from the source 
sentence, the |UNK|UNK|UNK disappear and personas and ajardinada get translated:


Es un locale climatizzato capienza 150 persone con terrazza per 30 persone e 
circondato una giardino


Any idea? Does it have to do with the fact that my model is unfactored?

Thank you in advance.


GG


________________________________
From: phko...@gmail.com on behalf of Philipp Koehn <pko...@inf.ed.ac.uk>
Sent: Friday, 1 August 2014 16:59
To: ULStudent:GIOVANNI.GALLO
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] problem with translation output

Hi,

this should not happen - are they really identical (uppercase/lowercase etc.)?

You can run the decoder with more verbose output (-v 2 or even -v 3) and the
trace options (-t) to dig a bit deeper on what is going on.

-phi



On Fri, Aug 1, 2014 at 7:24 AM, ULStudent:GIOVANNI.GALLO 
<12064...@studentmail.ul.ie> wrote:

Hi everyone,


I'm running some experiments with Moses and I noticed that in translated 
sentences there are always one or two words/phrases that don't get translated 
(they appear as in the source sentence). I checked the phrase table and there 
are many entries corresponding to those words/phrases that might be used during 
decoding. Do you have any idea what's happening here? Maybe something I need to 
modify in the moses.ini file?

Thank you in advance for your help.


Regards,

Giovanni Gallo.

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support




