Okay, really weird.
wc gives me the same numbers as you, but gedit gives two other numbers,
a different one for each file. There must be special characters somewhere.
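
One way to check that guess is a rough sketch like the following. It assumes the files are UTF-8 and that the editor breaks lines on more characters than wc does; I have not confirmed exactly which characters gedit uses. wc -l only counts "\n" bytes, whereas Python's str.splitlines() also splits on characters such as U+2028, U+0085 or a bare "\r", so comparing the two counts shows whether such characters are present. The function name count_lines is just something made up for this illustration.

```python
import sys

# Characters that str.splitlines() treats as line boundaries in addition to "\n".
# Assumption: an editor reporting more lines than wc may be splitting on some of these.
EXTRA_BREAKS = ["\r", "\x0b", "\x0c", "\x1c", "\x1d", "\x1e", "\x85", "\u2028", "\u2029"]


def count_lines(data: bytes) -> dict:
    """Compare a wc-style newline count with a Unicode-aware line count."""
    text = data.decode("utf-8", errors="replace")
    # wc -l counts newline bytes only.
    wc_style = data.count(b"\n")
    # splitlines() also breaks on the extra characters listed above.
    unicode_style = len(text.splitlines())
    # Report how often each extra break character occurs, keyed by code point.
    extras = {f"U+{ord(c):04X}": text.count(c) for c in EXTRA_BREAKS if c in text}
    return {"wc_style": wc_style, "splitlines": unicode_style, "extra_breaks": extras}


if __name__ == "__main__":
    # Usage: python count_lines.py news-commentary-v12.de-en.de news-commentary-v12.de-en.en
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, count_lines(f.read()))
```

If the two counts differ for a file, the extra_breaks entry should point at the characters responsible. Files with Windows line endings will also report U+000D there, although "\r\n" still counts as a single break.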


On 13/09/2017 at 18:52, Barry Haddow wrote:
> Hi Vincent
>
> Looks fine to me:
>
>> wc -l news-commentary-v12.de-en.*
>>   270769 news-commentary-v12.de-en.de
>>   270769 news-commentary-v12.de-en.en
>>   541538 total
>
> What are you running that shows you different line numbers?
>
> cheers - Barry
>
> On 12/09/17 10:06, Vincent Nguyen wrote:
>> Hi,
>> Is there an updated version of NCv12 for this archive?
>> http://data.statmt.org/wmt17/translation-task/training-parallel-nc-v12.tgz
>>
>> The number of lines for de-en is not the same in the two languages.
>>
>> Cheers,
>> Vincent
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
>

