On Friday 06 August 2010 15:36:14 Gary Daine wrote:
> I have a very basic-sounding question, but I've not been able to find
> any reference in the documentation.
>
> Since Moses is trained on tokenized, lowercased corpora, is it necessary
> to tokenize and lowercase the text to be translated as well (and do the
> reverse to the output)?
>
> TIA
>
> Gary

Hi Gary,

I guess it depends on the material you trained your language and
translation models on. If they were trained on lowercased, tokenized
data, then I would say yes, you need to lowercase and tokenize your
input.

regards,
-- 
Sylvain
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
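
For illustration only, here is a minimal Python sketch of the round trip discussed in this thread: tokenize and lowercase the input, decode, then roughly undo both on the output. Everything in it is an assumption made for the example: the regex tokenizer is a naive stand-in and not the tokenizer used to prepare any actual training corpus, `decoder` is a hypothetical callable standing in for whatever invokes the trained system, and the function names are made up. In practice the input should go through the same tokenizer and lowercasing that were applied to the training data.

```python
import re

# Naive stand-in tokenizer (assumption): runs of word characters are tokens,
# and every other non-space character is a token of its own.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def preprocess(line: str) -> str:
    """Tokenize and lowercase one input line, mirroring the training data."""
    return " ".join(_TOKEN_RE.findall(line)).lower()


def postprocess(line: str) -> str:
    """Rough inverse: reattach common punctuation, capitalize the first letter."""
    text = re.sub(r"\s+([.,!?;:])", r"\1", line.strip())
    return text[:1].upper() + text[1:]


def translate(line: str, decoder) -> str:
    """Preprocess, call the hypothetical decoder, then postprocess its output."""
    return postprocess(decoder(preprocess(line)))


if __name__ == "__main__":
    # Identity function used in place of a real decoder, just to show the flow.
    print(preprocess("Hello, World!"))
    print(translate("Hello, World!", lambda s: s))
```

Note that capitalizing only the first letter is a crude substitute for real recasing: casing inside the sentence (names, acronyms) is lost once the input has been lowercased and would need a proper recasing step to recover.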