https://bugzilla.wikimedia.org/show_bug.cgi?id=27064

           Summary: itwiki-20110130-pages-articles.xml.bz2 is corrupted
           Product: XML Snapshots
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: Normal
         Component: General
        AssignedTo: ar...@wikimedia.org
        ReportedBy: k...@fotonauts.com
                CC: tf...@wikimedia.org


$ md5sum itwiki-20110130-pages-articles.xml.bz2
7eac57c7c521bf6f36e9a5d7ec476562  itwiki-20110130-pages-articles.xml.bz2

which is fine, according to
http://dumps.wikimedia.org/itwiki/20110130/itwiki-20110130-md5sums.txt

but...

$ bunzip2 itwiki-20110130-pages-articles.xml.bz2

bunzip2: Data integrity error when decompressing.
        Input file = itwiki-20110130-pages-articles.xml.bz2, output file = itwiki-20110130-pages-articles.xml

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

bunzip2: Deleting output file itwiki-20110130-pages-articles.xml, if it exists.

$ bunzip2 -tvv itwiki-20110130-pages-articles.xml.bz2
  itwiki-20110130-pages-articles.xml.bz2:
    [1: huff+mtf rt+rld]
    [2: huff+mtf rt+rld]
    [.... snip ....]
    [2510: huff+mtf rt+rld]
    [2511: huff+mtf rt+rld]
    [2512: huff+mtf data integrity (CRC) error in data

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
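
The two checks in the report (checksum comparison, then a full decompression test) can also be scripted. The following is only a minimal sketch in Python, not part of the original report: the function names md5_of_file and bz2_is_intact are hypothetical, and it assumes the dump file is already on local disk. Since the MD5 matches the published sum while the bzip2 CRC check fails, the download itself is probably intact, and the damage is more likely in the file as published.

```python
import bz2
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks (comparable to md5sum)."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bz2_is_intact(path, chunk_size=1 << 20):
    """Decompress the whole file and discard the output (similar in spirit
    to `bunzip2 -t`). Returns (True, None) on success, (False, message)
    if the stream is damaged or truncated."""
    try:
        with bz2.open(path, "rb") as fh:
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError) as exc:
        # OSError: invalid data (e.g. a CRC mismatch); EOFError: truncated stream.
        return False, str(exc)
    return True, None
```

Usage would be along the lines of comparing md5_of_file("itwiki-20110130-pages-articles.xml.bz2") against the entry in the md5sums file, then calling bz2_is_intact on the same path. If the first check passes and the second fails, as it does here, re-downloading is unlikely to help.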