OK, thank you.

Best,
Cyrine

2014-02-21 14:55 GMT+01:00 Thomas Meyer <ithurts...@gmail.com>:

> Hi,
>
> Ah, in that case it can actually cause problems: your training data should
> always be formatted in the same way as your dev/test data.
>
> 2 possibilities:
>
> - re-tokenize your training data with the current tokenizer script to have
>   the same mark-up (then retrain your system)
> - re-tokenize your dev/test data with the same (possibly older) tokenizer
>   script as was used for your training data (then run tuning/decoding)
>
> HTH,
> Thomas
>
>
> On 21 February 2014 14:49, cyrine.na...@univ-lorraine.fr <
> cyrine.na...@gmail.com> wrote:
>
>> Thank you Thomas,
>>
>> So, if I keep the text with these special characters, will it not cause
>> problems? I ask because the training corpus is without these characters;
>> only the development and test corpora are like this.
>>
>> Thank you :)
>>
>> Best
>>
>>
>> 2014-02-21 14:40 GMT+01:00 Thomas Meyer <ithurts...@gmail.com>:
>>
>>> Hi,
>>>
>>> That is not a 'problem' but XML entities
>>> <http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references>
>>> mark-up for special characters. You don't have to worry about this, as the
>>> tokenizer script does it for all characters in a consistent way.
>>>
>>> Best,
>>> Thomas
>>>
>>>
>>> On 21 February 2014 14:20, cyrine.na...@univ-lorraine.fr <
>>> cyrine.na...@gmail.com> wrote:
>>>
>>>> Hello all,
>>>>
>>>> I have a problem with the tokenizer.pl script. I get as a result a
>>>> text with some special punctuation, like this for example:
>>>>
>>>> EU 's Luxembourg-based statistical office reported
>>>>
>>>> The input file is a .txt file.
>>>>
>>>> Is there any solution for this problem?
>>>>
>>>> Thank you in advance
>>>>
>>>>
>>>> Best
>>>> --
>>>> *Cyrine*
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>
>> --
>>
>> *Cyrine NASRI*
>> *Ph.D. Student in Computer Science*
>>

--
*Cyrine NASRI*
*Ph.D. Student in Computer Science*
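
For anyone who wants to check whether the training and dev/test data carry the same entity mark-up before deciding which side to re-tokenize, here is a rough Python sketch. The entity table is an assumption about what recent versions of the tokenizer script write (for example &apos; for an apostrophe), so please compare it with your own tokenizer script. The function names are made up for illustration and are not part of Moses. It only detects a mismatch; it does not replace re-tokenizing as described in the thread.

```python
import re

# Assumed entity table (hypothetical, not read from Moses itself):
# the replacements the tokenizer script is understood to apply.
# Verify against the tokenizer version you actually use.
ENTITIES = {
    "&amp;": "&",
    "&#124;": "|",
    "&lt;": "<",
    "&gt;": ">",
    "&apos;": "'",
    "&quot;": '"',
    "&#91;": "[",
    "&#93;": "]",
}

ENTITY_RE = re.compile("|".join(re.escape(e) for e in ENTITIES))


def count_entities(lines):
    """Count how often each known entity occurs in an iterable of lines."""
    counts = {}
    for line in lines:
        for match in ENTITY_RE.findall(line):
            counts[match] = counts.get(match, 0) + 1
    return counts


def deescape(line):
    """Replace each known entity with its literal character, in one pass.

    Intended for eyeballing the data only.
    """
    return ENTITY_RE.sub(lambda m: ENTITIES[m.group(0)], line)


def markup_mismatch(train_lines, test_lines):
    """Return True if exactly one of the two corpora contains entities."""
    return bool(count_entities(train_lines)) != bool(count_entities(test_lines))
```

In practice you would pass open file handles, e.g. markup_mismatch(open("train.tok.en"), open("dev.tok.en")), where both file names are placeholders. A True result suggests the two sides were tokenized with different script versions.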
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support