Tim Besard wrote:
> Hi,
>
> It seems that the Dutch wikipedia contains some UTF-8 only characters,
> which crashes the parser after all due to the "system echo" in the
> exception handler. Changing the offending line to
> os.system('echo \"%s\" >> fault_articles.txt' %
> title.encode("utf8"))
> fixes the issue.
>
> Tim
>

Well, thanks a lot Tim, the error also occurred when parsing the French dump. As I said before, I'm a Python beginner, so the only way I found to avoid it was to ... remove the line and keep the counter alive.
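
A possible alternative to both removing the line and shelling out to echo, as a rough sketch only: append the title to the log file directly from Python. The function name log_fault_article is hypothetical (it is not part of the parser), and the sketch assumes that title is a unicode string, as the title.encode("utf8") call in Tim's line suggests. Only the file name fault_articles.txt comes from the original code.

```python
# -*- coding: utf-8 -*-
import io


def log_fault_article(title, path="fault_articles.txt"):
    """Append one article title to the fault log, encoded as UTF-8.

    Hypothetical helper, not part of the original parser.
    Assumes `title` is a unicode string.
    """
    # Writing the file directly avoids the shell entirely, so quotes or
    # non-ASCII characters in the title cannot break an echo command.
    with io.open(path, "a", encoding="utf-8") as log:
        log.write(title + u"\n")
```

In the exception handler this would be called as log_fault_article(title) in place of the os.system(...) line, and the counter can stay as it is.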

For information, I have finished rendering the French Wikipedia dump:

1 140 000 articles
61 faulty articles
parsing took 12 hours
rendering took 18 hours

The image weighs 1.6 GB, but it is all in one file (I'm not sure whether that is normal?). All this was done on a quad-core 2.2 GHz machine with 2 GB of RAM. I should note that the disk is NTFS; perhaps ext4 would be better (my mount process was working very hard during those runs). The image is readable by the emulator, but since it finished while I was at the office, I could only try it on a remote X display (through SSH)... I will post again later (from home) once the file is in the reader.

Some friends will host the file, and I'm working on an automated script (a weekly French image?).

See you tonight,
Thomas
from the "Wikilecteur" Team
_______________________________________________
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community