ArielGlenn added a comment.

I should point out that in the original files enwiki-20190101-pages-meta-history10.xml-p2534537p2554779.bz2 and wikidatawiki-20190101-pages-meta-history27.xml-p56428595p56649675.bz2, the revision counts were comparable (1500850 vs 1484162), but the line counts of the uncompressed content were not: 693941886 vs 28694664 lines. That's due to the very particular structure of a Wikidata entry as stored, and it seems that this structure is one of the worst cases for the standard bzip2 implementation. Anyway, more testing soon!
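
For anyone who wants to reproduce counts like these, here is a minimal sketch in Python. It is not necessarily how the numbers above were obtained. The function names are made up for illustration, and it assumes each revision opens with a <revision> tag on a line of its own, which should be checked against the actual dump. If the figures above are listed in the same order as the files, they work out to roughly 462 lines per revision for enwiki and about 19 for wikidatawiki.

```python
import bz2
import sys


def count_lines_and_revisions(lines):
    """Return (line_count, revision_count) for an iterable of byte lines."""
    line_count = 0
    revision_count = 0
    for line in lines:
        line_count += 1
        # Assumption: a revision starts with a <revision> tag alone on its line.
        if line.strip() == b"<revision>":
            revision_count += 1
    return line_count, revision_count


def count_in_bz2(path):
    """Stream a .bz2 file and count lines and revisions without unpacking it to disk."""
    # Binary mode avoids decoding costs; iteration yields one line at a time.
    with bz2.open(path, "rb") as stream:
        return count_lines_and_revisions(stream)


if __name__ == "__main__":
    # Hypothetical usage: pass one or more dump files on the command line.
    for path in sys.argv[1:]:
        lines, revisions = count_in_bz2(path)
        ratio = lines / revisions if revisions else float("nan")
        print(f"{path}: {revisions} revisions, {lines} lines, "
              f"{ratio:.1f} lines per revision")
```

Streaming the file keeps memory use flat, which matters for history dumps of this size.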


TASK DETAIL
https://phabricator.wikimedia.org/T214293

To: ArielGlenn
Cc: hoo, ArielGlenn, Nandana, Lahi, Gq86, Darkminds3113, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, Vali.matei, _jensen, Volker_E, gnosygnu, Wikidata-bugs, aude, GWicke, Dinoguy1000, Mbch331, Jay8g
_______________________________________________
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
