[Xmldatadumps-l] corrupted files english december

2015-12-28 Thread Luigi Assom
Hello Wikiteam!

Just in time to wish you a good vacation and a happy 2016 :)

Well, I am also here about corrupted files :)

I downloaded this file three times, from different wifi networks and using
Firefox download managers:

enwiki-20151201-pages-articles-multistream.xml.bz2

and this one twice:

 enwiki-latest-pages-articles-multistream.xml.bz2


The MD5 checksum is correct (*-latest-* has the same checksum as
*-20151201-*), but the file is corrupted.


I cannot use bzip2recover because the file is too large: I would have to
recompile it, since the file exceeds its built-in maximum number of generated
tokens... and I think it is way better to get a fixed file :D


Could you please check if I am the only one having this issue?

Dumps from other languages have worked fine for me; only en-* is problematic.
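
For anyone who wants to reproduce the two checks described above, here is a
minimal sketch in Python, using only the standard library. The function names
md5_of_file and bz2_is_intact are made up for this illustration and are not
part of any Wikimedia tooling. A matching MD5 only shows that the download
equals what the server holds; it does not show that the archive itself is a
valid bzip2 file, so the second function decompresses every stream and throws
the output away, roughly what `bzip2 -t` does.

```python
import bz2
import hashlib

CHUNK = 1024 * 1024  # read 1 MiB at a time so huge dumps never sit in memory


def md5_of_file(path):
    """Return the hex MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bz2_is_intact(path):
    """Return True if every bzip2 stream in the file decompresses cleanly.

    Multistream dumps are several bzip2 streams concatenated, so a new
    decompressor is started each time the previous stream ends.
    """
    decomp = bz2.BZ2Decompressor()
    try:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                while chunk:
                    if decomp.eof:
                        # Previous stream finished; the rest starts a new one.
                        decomp = bz2.BZ2Decompressor()
                    decomp.decompress(chunk)  # output is discarded
                    chunk = decomp.unused_data if decomp.eof else b""
    except OSError:
        # Raised on invalid or damaged compressed data.
        return False
    # False for an empty file or one cut off in the middle of a stream.
    return decomp.eof
```

If md5_of_file matches the published checksum while bz2_is_intact returns
False, the damage is most likely in the file on the server rather than in the
download, which is consistent with the symptoms reported here.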
___
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l


Re: [Xmldatadumps-l] corrupted files english december

2015-12-28 Thread Luigi Assom
Ohh good good,
please let me know when ready :)

PS: is DBpedia working with Wikimedia staff, or are the two completely
separate things?
I was wondering why Wikimedia releases dumps every month, while DBpedia
releases roughly once a year.

HaPPy VaCaTIOn (also to Ariel, who is looking into this stuff before (or doing
:D ) party time :)))

see you guys & gals!


On Mon, Dec 28, 2015 at 4:05 PM, Hydriz Scholz wrote:

> Happy holidays!
>
> This issue has already been reported at T121348 [1], so you are not alone.
> Ariel is already looking into it.
>
> [1]: https://phabricator.wikimedia.org/T121348




Re: [Xmldatadumps-l] corrupted files english december

2015-12-28 Thread Federico Leva (Nemo)

Luigi Assom, 28/12/2015 16:47:

> Or ps. does dbpedia is working with wikimedia staff or are two
> completely separated things?


Completely separate.


> I was wondering why wiki release dumps every month, while ~dbpedia each
> year.


Probably because DBpedia isn't an automated process and their maps need
to be updated regularly? http://blog.dbpedia.org/?p=148 explains some of
the steps required for one of their releases; you can probably find more
on their wiki or mailing list:
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion


Nemo
