David Reyes Samblas Martinez wrote:
> Don't hold your breath :( failing at Count: 832000

Same error as mine?

> David Reyes Samblas Martinez
> http://www.tuxbrain.com
> Open ultraportable & embedded solutions
> Openmoko, Openpandora, Arduino
> Hey, watch out!!! There's a linux in your pocket!!!
>
> 2009/11/20 Tilman Baumann <til...@baumann.name>:
>>
>> David Reyes Samblas Martinez wrote:
>>> Well, the Spanish one gave me the same error before, but now it works,
>>
>> Any idea what solved it? Or is it just random and will go away if I try
>> it again? :)
>>
>>> I'm parsing the de Wikipedia right now (Count: 173000), let's see what
>>> happens :)
>>
>> I would definitely be interested in the results...
>>
>>> Note: Parsing the 2009-Nov-11
>>> http://download.wikipedia.org/dewiki/latest/dewiki-latest-pages-articles.xml.bz2
>>>
>>> Regards
>>>
>>> David Reyes Samblas Martinez
>>>
>>> 2009/11/20 Tilman Baumann <til...@baumann.name>:
>>>> Can you reproduce this with a neutral locale?
>>>> export LC_ALL=C
>>>>
>>>> I'm trying the same at the moment. I had a lot of hiccups, caused by
>>>> many things, among them missing tools and not enough memory.
>>>>
>>>> This is currently where I'm stuck with the German Wikipedia.
>>>>
>>>> Count: 823000
>>>> Count: 824000
>>>> Count: 825000
>>>> Count: 826000
>>>> Count: 827000
>>>> Count: 828000
>>>> Count: 829000
>>>> Count: 830000
>>>> Count: 831000
>>>> Count: 832000
>>>> Count: 833000
>>>> Traceback (most recent call last):
>>>>   File "./ArticleParser.py", line 203, in <module>
>>>>     main()
>>>>   File "./ArticleParser.py", line 168, in main
>>>>     process_article_text(title.encode('utf-8'), f.read(length), newf)
>>>>   File "./ArticleParser.py", line 197, in process_article_text
>>>>     newf.write(text + '\n')
>>>> IOError: [Errno 32] Broken pipe
>>>> make[1]: *** [parse] Error 1
>>>> make[1]: Leaving directory `/home/tilli/wikireader/host-tools/offline-renderer'
>>>> make: *** [parse] Error 2
>>>>
>>>> I suppose it failed somewhere in PARSER_COMMAND.
>>>>
>>>> Before that, the following steps went through without errors:
>>>> make
>>>> make DESTDIR=image WORKDIR=work XML_FILES=dewiki-20091028-pages-articles.xml index
>>>>
>>>> David Reyes Samblas Martinez wrote:
>>>>> After the "success" of the Spanish Wikipedia (the indexing part is
>>>>> still pending), I started working on the German Wikipedia
>>>>> http://download.wikipedia.org/dewiki/latest/dewiki-latest-pages-meta-current.xml.bz2
>>>>>
>>>>> but it fails at the first step with the following error:
>>>>>
>>>>> # make DESTDIR=image WORKDIR=work XML_FILES=dewiki-latest-pages-meta-current.xml index parse render combine
>>>>>
>>>>> awk: cmd. line:1: fatal: cannot open file `work/counts.text' for reading (No such file or directory)
>>>>> cd host-tools/offline-renderer && make index \
>>>>>   XML_FILES="/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/dewiki-latest-pages-meta-current.xml" RENDER_BLOCK="0" \
>>>>>   WORKDIR="/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/work" DESTDIR="/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/image"
>>>>> make[1]: Entering directory `/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer'
>>>>> ./ArticleIndex.py \
>>>>>   --article-index="/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/work/articles.db" \
>>>>>   --article-offsets="/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/work/offsets.db" \
>>>>>   --article-counts="/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/work/counts.text" \
>>>>>   --prefix="/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/image/pedia" /OE/Proyectos/tuxbrain/productos/wikireader/wikireader/dewiki-latest-pages-meta-current.xml
>>>>> Traceback (most recent call last):
>>>>>   File "./ArticleIndex.py", line 611, in <module>
>>>>>     main()
>>>>>   File "./ArticleIndex.py", line 172, in main
>>>>>     limit = processor.process(f, limit)
>>>>>   File "/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer/FileScanner.py", line 141, in process
>>>>>     if '#' == body[0] and 'redirect' == body[1:9].lower():
>>>>> IndexError: string index out of range
>>>>> Flushing databases
>>>>> Writing: files
>>>>> Time: 0s
>>>>> Writing: articles
>>>>> Time: 0s
>>>>> Writing: offsets
>>>>> Time: 0s
>>>>> Loading: articles
>>>>> Time: 0s
>>>>> Loading: offsets and files
>>>>> Time: 0s
>>>>> make[1]: *** [index] Error 1
>>>>> make[1]: Leaving directory `/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer'
>>>>> make: *** [index] Error 2
>>>>>
>>>>> Regards
>>>>>
>>>>> David Reyes Samblas Martinez
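
On the IndexError in the quoted traceback: `body[0]` raises "string index out of range" only when `body` is an empty string, so an article with an empty body is enough to stop the indexer. A minimal sketch of a check that tolerates empty bodies follows. It assumes that treating an empty body as "not a redirect" is acceptable, and the helper name `is_redirect` is made up for illustration; it is not part of FileScanner.py.

```python
def is_redirect(body):
    """Return True when an article body starts with '#redirect' (any case).

    Hypothetical helper, not taken from FileScanner.py.
    """
    # Slicing never raises on a short or empty string, unlike body[0],
    # so an empty body simply compares unequal and yields False.
    return body[:9].lower() == '#redirect'
```

For any non-empty body this gives the same answer as the original two-part test, since `'#'` is unchanged by `lower()`.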
>>>>>
>>>>> _______________________________________________
>>>>> Openmoko community mailing list
>>>>> community@lists.openmoko.org
>>>>> http://lists.openmoko.org/mailman/listinfo/community
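
On the Broken pipe error: errno 32 on a write generally means the process reading the other end of the pipe has already exited, which fits the guess that something failed in PARSER_COMMAND. The sketch below is not ArticleParser.py's code. It assumes Python 3 on a POSIX system and a hypothetical helper `feed_parser`, and only shows one way to surface the child's exit status instead of the writer's traceback.

```python
import subprocess
import sys


def feed_parser(command, lines):
    """Write lines to a child's stdin; return (lines_written, exit_code).

    Hypothetical helper, not taken from ArticleParser.py.
    """
    proc = subprocess.Popen(command, stdin=subprocess.PIPE)
    written = 0
    try:
        for line in lines:
            proc.stdin.write(line.encode("utf-8") + b"\n")
            written += 1
        proc.stdin.close()
    except BrokenPipeError:
        # The reader is gone; its exit status says more than our traceback.
        print("child stopped reading after about %d lines" % written,
              file=sys.stderr)
        try:
            proc.stdin.close()  # discard whatever is still buffered
        except BrokenPipeError:
            pass
    return written, proc.wait()
```

The line count is approximate because writes are buffered. When the child dies early, the second return value is its exit code, which is usually the better place to start looking than the parent's traceback.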