[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #14 from Umherirrender umherirrender_de...@web.de 2010-04-06 11:46:54 UTC --- Works for me with ruwiki-20100331-pages-articles.xml. Do the page, revision and text tables all have the same number of rows (1478943)? Maybe it is an encoding problem; try appending characterEncoding=utf8 to the --output parameter. -- Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are on the CC list for the bug. ___ Wikibugs-l mailing list Wikibugs-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
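The suggestion to append characterEncoding=utf8 to the --output parameter can be sketched as a small shell helper. This is only an illustration: the helper name with_utf8 is hypothetical, and it assumes the --output value is a connection URL whose options follow a '?' and are separated by '&'.

```shell
#!/bin/sh
# Hypothetical helper: append characterEncoding=utf8 to a connection URL.
# Assumption: options follow '?' and are separated by '&'.
with_utf8() {
    case "$1" in
        # URL already has a query string: add another '&' option.
        *\?*) printf '%s&characterEncoding=utf8\n' "$1" ;;
        # No query string yet: start one with '?'.
        *)    printf '%s?characterEncoding=utf8\n' "$1" ;;
    esac
}
```

For example, with a made-up host, database and credentials, with_utf8 'mysql://localhost/wikidb?user=wiki&password=secret' prints the same URL with &characterEncoding=utf8 appended, and that result would be the value passed to --output.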
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #13 from Kein Zantezuken zantezu...@gmail.com 2010-03-30 11:36:36 UTC --- binary
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #12 from Platonides platoni...@gmail.com 2009-12-29 22:06:56 UTC --- Which option did you select for the database? utf-8, binary or backwards-compatible with mysql4?
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #9 from Kein Zantezuken zantezu...@gmail.com 2009-12-25 15:26:23 UTC --- Yeah, I found 'Операционная_система' in the SQL dump, but that's weird: the whole INSERT into page was skipped, and I can't find page_ids for any of these articles in that INSERT script. Though I do have [[Category:Операционная система]] and [[Template:Операционная система]]. Annoying. Well, anyway, the bug is INVALID; sorry for the false report.
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #10 from Kein Zantezuken zantezu...@gmail.com 2009-12-25 18:05:16 UTC --- Or, perhaps, it is valid, since mwdumper does not generate a correct SQL dump. Too many duplicate entries, but why? The DB is empty. Looks like mwdumper does something wrong.
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #11 from Kein Zantezuken zantezu...@gmail.com 2009-12-25 18:48:34 UTC --- Here is the [http://shinra.ru/kein/out.7z full log]. As you can see, all the errors relate to 'rev_comment' only, so if the generated SQL is correct there should not be any issue with missing articles. But there is.
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #4 from Platonides platoni...@gmail.com 2009-12-23 14:06:12 UTC ---
I do find the insert for Операционная система in the page table:

$ bzcat ruwiki-20091207-pages-articles.xml.bz2 | java -jar mwdumper.jar --format=sql:1.5 | grep -m 1 'Операционная_система'
INSERT INTO page (page_id,page_namespace,page_title,page_restrictions,page_counter,page_is_redirect,page_is_new,page_random,page_touched,page_latest,page_len) VALUES
(3428,0,'Эмоция','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20389302,28622),
(3432,0,'Человек_разумный','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20412105,62890),
... ...
(4590,0,'1545_год','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,17287964,3074),
(4591,0,'Операционная_система','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20354505,39406),
(4593,0,'Рим','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20389427,116221),
(4595,0,'Двоичные_приставки','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,19830413,15461)...
...(4904,0,'23_января','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20288736,13963),
(4905,0,'24_января','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20288479,14120),
(4906,0,'25_января','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20288290,29154),
(4907,0,'26_января','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20313506,16559),
(4908,0,'27_января','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20313334,14701),
(4909,0,'28_января','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20420267,22861),
(4910,0,'29_января','',0,0,0,RAND(),DATE_ADD('1970-01-01', INTERVAL UNIX_TIMESTAMP() SECOND)+0,20352730,19695);

Maybe mysql didn't accept the full line for some reason?
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #5 from Kein Zantezuken zantezu...@gmail.com 2009-12-23 17:07:26 UTC --- No, as I said, I didn't get an INSERT for that page at all. Can you compile the latest mwdumper for Windows, please? Then I can test.
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #6 from Platonides platoni...@gmail.com 2009-12-23 17:46:12 UTC --- I used the same mwdumper.jar as you. Jar files work cross platform. How were you looking for the insert?
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #7 from Kein Zantezuken zantezu...@gmail.com 2009-12-23 19:40:49 UTC --- I searched the whole dump for an INSERT into `page` with 'Операционная система'. I found many articles which I already have in the DB, but the articles which are missing from my current DB are missing from the generated SQL dump as well. Well, the only missing thing is the INSERT into `page`; old_text, old_data and old_id are there.
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #8 from Platonides platoni...@gmail.com 2009-12-23 22:05:45 UTC --- Don't look for 'Операционная система'; you must look for 'Операционная_система'. It will be in db form, with spaces converted into underscores. There will be three instances: [[Операционная система]], which is the insert line I included above, [[Category:Операционная система]] and [[Template:Операционная система]]. All of them are INSERT INTO page lines, albeit really long ones.
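The search advice above can be sketched in shell. This is a minimal illustration, not part of mwdumper: the function names to_db_key and has_page_insert are hypothetical, and the check only assumes that page rows appear on lines beginning with "INSERT INTO page " and that titles are single-quoted, as in the output quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: convert a display title to db form
# (spaces become underscores), as described in the comment above.
to_db_key() {
    printf '%s' "$1" | tr ' ' '_'
}

# Hypothetical helper: read SQL on stdin and succeed if some
# "INSERT INTO page " line contains the quoted title in db form.
has_page_insert() {
    key=$(to_db_key "$1")
    grep -F "INSERT INTO page " | grep -qF "'$key'"
}
```

With these defined, the pipeline from the thread could end in the check instead of grep -m 1: bzcat ruwiki-20091207-pages-articles.xml.bz2 | java -jar mwdumper.jar --format=sql:1.5 | has_page_insert 'Операционная система'. Because page_title does not carry the namespace prefix, the Category and Template pages with the same title would also satisfy the check.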
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #3 from Kein Zantezuken zantezu...@gmail.com 2009-12-22 22:31:04 UTC --- Ok, here is one of the missing articles: http://shinra.ru/kein/w/operating_system.xml (40Kb, UTF-8). I don't know how else I can help. Really annoying bug; it makes the whole dump useless since I can't import things properly.
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917
Platonides platoni...@gmail.com changed:

What |Removed |Added
CC   |        |platoni...@gmail.com

--- Comment #1 from Platonides platoni...@gmail.com 2009-12-21 17:30:44 UTC --- Which xml dump is it? Can you provide some of those missing articles?
[Bug 21917] mwdumper does not generates some page_id's
https://bugzilla.wikimedia.org/show_bug.cgi?id=21917 --- Comment #2 from Kein Zantezuken zantezu...@gmail.com 2009-12-21 17:38:53 UTC --- It's http://download.wikimedia.org/ruwiki/20091207/ruwiki-20091207-pages-articles.xml.bz2 Article Операционная система for example (line number 515495 in the dump).