Dears,

I found the problem.

At line 289 of the tokenizer.perl script, just add a space after the apostrophe in the replacement string.

The original code:

    $text =~ s/([\p{IsAlpha}])[']([\p{IsAlpha}])/$1 '$2/g;

The modified one:

    $text =~ s/([\p{IsAlpha}])[']([\p{IsAlpha}])/$1 ' $2/g;

With this modification, tokenizing a file gives the same output as tokenizing a single segment.

Thanks

From: Ihab Ramadan [mailto:i.rama...@saudisoft.com]
Sent: Wednesday, January 14, 2015 11:14 AM
To: moses-support@mit.edu
Subject: RE: Tokenization problem

Dears,

I still have this problem. To avoid confusing the decoder, I used the no-escape parameter of the tokenizer.perl script, but an extra space is still added after the apostrophe when tokenizing files, whereas tokenizing a single segment produces no extra space.

For example:

In the file:

    which will guide you through connecting and configuring your printer's wireless connection.
    -> which will guide you through connecting and configuring your printer ' s wireless connection .

As a segment:

    which will guide you through connecting and configuring your printer's wireless connection.
    -> which will guide you through connecting and configuring your printer 's wireless connection .

I wonder why the same script generates two different outputs. I have no experience in Perl, so I could not find the line of code that behaves differently depending on whether the segment is in a file or is passed to the script as a single segment.

Please help

From: Ihab Ramadan [mailto:i.rama...@saudisoft.com]
Sent: Monday, January 5, 2015 10:09 AM
To: moses-support@mit.edu
Subject: Tokenization problem

Dears,

Using the tokenizer on the training files replaces the apostrophes with ' s (with a space), but if I use the same script to tokenize a sentence, it turns the apostrophes into 's (without a space).

This problem confuses the decoder during translation. How can I solve this problem?

Thanks

Best Regards

Ihab Ramadan | Senior Developer | Saudisoft - Egypt <http://www.saudisoft.com/>
Tel +2 02 330 320 37 Ext- 0 | Mob +201007570826 | Fax +20233032036
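
For reference, the effect of the one-character change described at the top of this thread can be reproduced outside Moses. The following is only a minimal Python sketch, not the tokenizer itself. It assumes the original replacement string had no space after the apostrophe ($1 '$2) and the modified one does ($1 ' $2). The function names are hypothetical, and [^\W\d_] is used as an approximate stand-in for Perl's \p{IsAlpha}, which Python's built-in re module does not support.

```python
import re

# Approximate stand-in for Perl's \p{IsAlpha}: word characters that are
# neither decimal digits nor the underscore. (Assumption: close enough for
# ordinary letters; it is not an exact equivalent.)
LETTER = r"[^\W\d_]"

# An apostrophe with a letter on both sides, as in "printer's".
CONTRACTION = re.compile(rf"({LETTER})'({LETTER})")


def split_original(text: str) -> str:
    """Mimic the assumed original replacement "$1 '$2" (space before only)."""
    return CONTRACTION.sub(r"\1 '\2", text)


def split_modified(text: str) -> str:
    """Mimic the modified replacement "$1 ' $2" (space on both sides)."""
    return CONTRACTION.sub(r"\1 ' \2", text)


if __name__ == "__main__":
    sample = "your printer's wireless connection"
    print(split_original(sample))  # your printer 's wireless connection
    print(split_modified(sample))  # your printer ' s wireless connection
```

Only this single rule is sketched here; the other rules of tokenizer.perl (punctuation splitting, escaping, and so on) are left out. Whichever form is chosen, the training files and the segments sent to the decoder should go through the same rule so that both see the same tokens.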
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support