Hi Vincent

Could you say exactly which files you are comparing?

cheers - Barry

On 04/10/16 21:20, Vincent Nguyen wrote:
>
> No, but my mistake: I was comparing against the per-year files linked
> from this page: http://www.statmt.org/wmt15/translation-task.html
>
> What is the difference between those and the wmt11 files?
>
>
>
> On 04/10/2016 at 21:46, Barry Haddow wrote:
>> Hi Vincent
>>
>> Are you comparing compressed with uncompressed files?
>>
>> cheers - Barry
>>
>> On 04/10/16 14:40, Vincent Nguyen wrote:
>>> Hi,
>>>
>>> on this link:
>>>
>>> http://www.statmt.org/wmt11/translation-task.html
>>>
>>> in the download section for monolingual data, there is:
>>>
>>> one big file: http://www.statmt.org/wmt11/training-monolingual.tgz
>>>
>>> and separate files, including the news crawls per year.
>>>
>>> However, a single file for a specific year is not the same size as
>>> the file of the same name in the big download.
>>>
>>> Expanded size for the English corpus:
>>>
>>> news2008: 4.3 GB vs 1.6 GB for the single download
>>> news2009: 5.3 GB vs 1.8 GB for the single download
>>>
>>> etc.
>>>
>>> Can someone please explain the difference?
>>>
>>> thanks
>>>
>>> Vincent.
>>>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>>
>
>


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

