Re: [Wikireader] Error on processing the German Wikipedia
Can you reproduce this with a neutral locale?

    export LC_ALL=C

I am trying the same at the moment. I had a lot of hiccups, caused by many things, among them missing tools and not enough memory.

This is where I'm currently stuck with the German Wikipedia:

    Count: 823000
    Count: 824000
    Count: 825000
    Count: 826000
    Count: 827000
    Count: 828000
    Count: 829000
    Count: 83
    Count: 831000
    Count: 832000
    Count: 833000
    Traceback (most recent call last):
      File "./ArticleParser.py", line 203, in <module>
        main()
      File "./ArticleParser.py", line 168, in main
        process_article_text(title.encode('utf-8'), f.read(length), newf)
      File "./ArticleParser.py", line 197, in process_article_text
        newf.write(text + '\n')
    IOError: [Errno 32] Broken pipe
    make[1]: *** [parse] Error 1
    make[1]: Leaving directory `/home/tilli/wikireader/host-tools/offline-renderer'
    make: *** [parse] Error 2

I suppose it failed somewhere in PARSER_COMMAND.

Before that, the following steps went through without failure:

    make
    make DESTDIR=image WORKDIR=work XML_FILES=dewiki-20091028-pages-articles.xml index

David Reyes Samblas Martinez wrote:
> After the success with the Spanish Wikipedia (the indexing part is still pending), I started working on the German Wikipedia
> http://download.wikipedia.org/dewiki/latest/dewiki-latest-pages-meta-current.xml.bz2
> but it fails at the first step with the following error:
>
>     #make DESTDIR=image WORKDIR=work XML_FILES=dewiki-latest-pages-meta-current.xml index parse render combine
>     awk: cmd. line:1: fatal: cannot open file `work/counts.text' for reading (No such file or directory)
>     cd host-tools/offline-renderer make index \
>       XML_FILES=/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/dewiki-latest-pages-meta-current.xml RENDER_BLOCK=0 \
>       WORKDIR=/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/work DESTDIR=/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/image
>     make[1]: Entering directory `/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer'
>     ./ArticleIndex.py \
>       --article-index=/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/work/articles.db \
>       --article-offsets=/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/work/offsets.db \
>       --article-counts=/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/work/counts.text \
>       --prefix=/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/image/pedia /OE/Proyectos/tuxbrain/productos/wikireader/wikireader/dewiki-latest-pages-meta-current.xml
>     Traceback (most recent call last):
>       File "./ArticleIndex.py", line 611, in <module>
>         main()
>       File "./ArticleIndex.py", line 172, in main
>         limit = processor.process(f, limit)
>       File "/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer/FileScanner.py", line 141, in process
>         if '#' == body[0] and 'redirect' == body[1:9].lower():
>     IndexError: string index out of range
>     Flushing databases
>     Writing: files
>     Time: 0s
>     Writing: articles
>     Time: 0s
>     Writing: offsets
>     Time: 0s
>     Loading: articles
>     Time: 0s
>     Loading: offsets and files
>     Time: 0s
>     make[1]: *** [index] Error 1
>     make[1]: Leaving directory `/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer'
>     make: *** [index] Error 2
>
> Regards
> David Reyes Samblas Martinez
> http://www.tuxbrain.com
> Open ultraportable embedded solutions
> Openmoko, Openpandora, Arduino
> Hey, watch out!!! There's a linux in your pocket!!!

___
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community
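A note on the IndexError in the quoted traceback: `body[0]` raises "string index out of range" only when `body` is an empty string, so the indexer apparently met an article with no text. The sketch below shows one way such a check can be made safe with slicing. It is not the actual FileScanner.py code; only the condition itself comes from the traceback, and the helper name `is_redirect` and the sample strings are made up for illustration.

```python
def is_redirect(body):
    """Return True if the article body starts with '#redirect' (any case)."""
    # Slicing never raises on a short or empty string, unlike body[0],
    # so an empty article body simply yields False here.
    return body[:1] == '#' and body[1:9].lower() == 'redirect'


print(is_redirect(''))                           # empty body, no exception
print(is_redirect('#REDIRECT [[Berlin]]'))       # redirect page
print(is_redirect('Berlin ist die Hauptstadt'))  # ordinary article
```

Whether an empty body should be skipped, indexed, or reported is a separate decision that this sketch does not make.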
Re: [Wikireader] Error on processing the German Wikipedia
Well, the Spanish one gave me the same error before, but now it works. I'm parsing the German Wikipedia right now (Count: 173000); let's see what happens :)

Note: I'm parsing the 2009-Nov-11 dump:
http://download.wikipedia.org/dewiki/latest/dewiki-latest-pages-articles.xml.bz2

Regards
David Reyes Samblas Martinez

2009/11/20 Tilman Baumann til...@baumann.name:
> Can you reproduce this with a neutral locale? export LC_ALL=C
> [...]
Re: [Wikireader] Error on processing the German Wikipedia
David Reyes Samblas Martinez wrote:
> Well, the Spanish one gave me the same error before, but now it works.

Any idea what solved it? Or is it just random and will go away if I try again? :)

> I'm parsing the German Wikipedia right now (Count: 173000); let's see what happens :)

I would definitely be interested in the results...
Re: [Wikireader] Error on processing the German Wikipedia
Don't hold your breath :( It is failing at Count: 832000.

David Reyes Samblas Martinez

2009/11/20 Tilman Baumann til...@baumann.name:
> I would definitely be interested in the results...
Re: [Wikireader] Error on processing the German Wikipedia
David Reyes Samblas Martinez wrote:
> Don't hold your breath :( It is failing at Count: 832000.

Same error as mine?
Re: [Wikireader] Error on processing the German Wikipedia
Yes :(

David Reyes Samblas Martinez

2009/11/20 Tilman Baumann til...@baumann.name:
> Same error as mine?
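Since both runs end in the same `[Errno 32] Broken pipe`, it may help to recall what that error means: the process reading from the pipe has already exited, so the writer's traceback is only a symptom and the real failure is on the reading side. The sketch below reproduces the mechanism in isolation. It is written for Python 3, where the error surfaces as `BrokenPipeError` (the traceback in this thread comes from Python 2, which reports it as `IOError`). The child command is a made-up stand-in, not the real PARSER_COMMAND, and nothing in this thread establishes why the real one stopped.

```python
import subprocess
import sys

# Hypothetical stand-in for the downstream command: it reads one line of
# input and then exits with status 3 while the writer is still sending data.
child = subprocess.Popen(
    [sys.executable, '-c', 'import sys; sys.stdin.readline(); sys.exit(3)'],
    stdin=subprocess.PIPE,
    bufsize=0,  # unbuffered, so the failing write raises immediately
)

error = None
try:
    for count in range(1000000):
        child.stdin.write(b'article text\n')
except BrokenPipeError as exc:  # reported as IOError [Errno 32] on Python 2
    error = exc
finally:
    child.stdin.close()

# The exit status of the reader is where the real cause has to be looked for.
status = child.wait()
print('write failed:', error)
print('downstream exit status:', status)
```

In the real pipeline, the equivalent step would be to run the downstream command by hand and look at its own exit status and error output.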