Okay, really weird. wc gives me the same numbers as you, but gedit shows two other numbers, a different one for each file. There must be special characters somewhere.
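
To track down the special characters, a rough Python sketch along these lines might help. It relies on two facts: wc -l counts only "\n" bytes, while Python's str.splitlines() also breaks on characters such as U+2028, U+2029, U+0085, form feed and a bare carriage return. I am only assuming that gedit breaks on a similar set; I have not checked which characters it actually uses. The helper names (count_lines, find_extra_breaks, check_file) are made up for illustration.

```python
import sys

# Characters that str.splitlines() treats as line boundaries in addition to "\n".
# Assumption: the editor's extra line breaks come from a similar set.
EXTRA_BREAKS = "\r\x0b\x0c\x1c\x1d\x1e\x85\u2028\u2029"


def count_lines(text):
    """Return (number of "\\n" characters, number of lines per splitlines())."""
    return text.count("\n"), len(text.splitlines())


def find_extra_breaks(text):
    """List (line number, code points) for "\\n"-delimited lines that contain
    a character from EXTRA_BREAKS. Line numbers are 1-based."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        found = sorted({ch for ch in line if ch in EXTRA_BREAKS})
        if found:
            hits.append((lineno, [f"U+{ord(ch):04X}" for ch in found]))
    return hits


def check_file(path):
    """Print both line counts and the suspicious lines for one UTF-8 file."""
    # newline="" disables newline translation so "\r" stays visible.
    with open(path, encoding="utf-8", newline="") as handle:
        text = handle.read()
    newline_count, splitlines_count = count_lines(text)
    print(f"{path}: newlines={newline_count} splitlines={splitlines_count}")
    for lineno, codepoints in find_extra_breaks(text):
        print(f"  line {lineno}: {', '.join(codepoints)}")


if __name__ == "__main__":
    for file_path in sys.argv[1:]:
        check_file(file_path)
```

Run it with the two file names as arguments, e.g. news-commentary-v12.de-en.de and news-commentary-v12.de-en.en. If the two counts differ for a file, the listed lines are where to look. Note that a file with Windows line endings will report U+000D on every line.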

On 13/09/2017 at 18:52, Barry Haddow wrote:
> Hi Vincent
>
> Looks fine to me:
>
>> wc -l news-commentary-v12.de-en.*
>> 270769 news-commentary-v12.de-en.de
>> 270769 news-commentary-v12.de-en.en
>> 541538 total
>
> What are you running that shows you different line numbers?
>
> cheers - Barry
>
> On 12/09/17 10:06, Vincent Nguyen wrote:
>> Hi,
>> Is there an updated version of NCv12 for this
>> http://data.statmt.org/wmt17/translation-task/training-parallel-nc-v12.tgz
>>
>> the number of lines for de-en is not the same in the 2 languages.
>>
>> Cheers,
>> Vincent
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support