Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 4:50 AM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Hi, I'm trying to generate the file for a Spanish Wikipedia on the WR.
 I compiled the source from git successfully and solved some
 annoyances with UTF-8 encoding in Python. The error was something like this:
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in
 position: ordinal not in range(128)
 This was solved by changing the default encoding from ascii to utf8 in the
 /usr/lib/python2.6/site.py file.
 After this I was able to run the following command without problems:
 make DESTDIR=image WORKDIR=work
 XML_FILES=xml-file-samples/eswiki-latest-pages-articles.xml index
 parse render combine

 Everything seemed fine for a few hours (about 6-7 h) of parsing the
 70 articles in Spanish, but then... the horror:
 Count: 38
 Traceback (most recent call last):
  File "./ArticleParser.py", line 224, in <module>
    main()
  File "./ArticleParser.py", line 172, in main
    process_article_text(title.encode('utf-8'),  f.read(length), newf)
  File "./ArticleParser.py", line 218, in process_article_text
    newf.write(text + '\n')
 IOError: [Errno 32] Broken pipe
 make[1]: *** [parse] Error 1
 make[1]: Leaving directory
 `/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer'
 make: *** [parse] Error 2
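
For readers unfamiliar with Errno 32: a broken pipe is raised when a
process writes to a pipe whose reading end has already gone away. The
thread does not show what newf is connected to, so the following is only
a minimal sketch under the assumption that the writer feeds a child
process. The helper name write_to_child and the child commands are
hypothetical, and the sketch uses Python 3 syntax (where the error
surfaces as BrokenPipeError) rather than the Python 2.6 of the original
build.

```python
import subprocess
import sys


def write_to_child(data, child_code):
    """Write bytes to a child Python process; return False on a broken pipe.

    child_code is a hypothetical one-line Python program standing in for
    whatever consumer sits at the other end of the pipe.
    """
    proc = subprocess.Popen([sys.executable, "-c", child_code],
                            stdin=subprocess.PIPE)
    ok = True
    try:
        proc.stdin.write(data)
        proc.stdin.flush()
    except BrokenPipeError:
        # The reader exited before consuming everything we wrote.
        ok = False
    finally:
        try:
            proc.stdin.close()
        except BrokenPipeError:
            # close() flushes again, so it can hit the same dead pipe.
            pass
        proc.wait()
    return ok


if __name__ == "__main__":
    # A child that exits without reading: the large write cannot complete.
    print(write_to_child(b"x" * (1 << 20), "pass"))
    # A child that drains its input: the write succeeds.
    print(write_to_child(b"hello", "import sys; sys.stdin.buffer.read()"))
```

The practical point is that the traceback names the writer, but the root
cause is usually in whichever process stopped reading.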

OK that's fixed now. Chris already checked in the code. Our build
worked fine. We need to do a few more tweaks and then we can post a
(super) early test image. Give us until early this coming week.

  -Sean

___
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community


Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread David Reyes Samblas Martinez
Are you uploading these changes to git? Can I take a look?

David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable  embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!






Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Laszlo KREKACS
On Fri, Oct 30, 2009 at 4:22 PM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

By the way, is there any plan to implement image rendering?

If so, is there a time estimate?

Best regards,
 Laszlo



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread David Reyes Samblas Martinez
Just one thing I noticed: in all the faulty articles, the title starts with
the ~ symbol.
Regards,
David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable  embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!






Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread David Reyes Samblas Martinez
2009/10/30 Laszlo KREKACS laszlo.krekacs.l...@gmail.com:
 On Fri, Oct 30, 2009 at 4:22 PM, David Reyes Samblas Martinez
 da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

 Btw is there any plan to implement images rendering?

 If so, any time estimation?

 Best regards,
  Laszlo


Some kind of renderer has already been implemented, because the keyboard
and the erase-history dialog are images. Am I wrong?



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 11:22 PM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

Yes. The latest commit fixes it. Have a look here:

  http://github.com/wikireader/wikireader

Sean



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 11:29 PM, Laszlo KREKACS
laszlo.krekacs.l...@gmail.com wrote:
 On Fri, Oct 30, 2009 at 4:22 PM, David Reyes Samblas Martinez
 da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

 Btw is there any plan to implement images rendering?

Math (images) is on our roadmap, hopefully before the end of this
year. The screen is only 1-bit, so anything else would look kinda
funny.

 -Sean



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Sean Moss-Pultz
On Sat, Oct 31, 2009 at 2:46 AM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Just one thing I noticed: in all the faulty articles, the title starts with
 the ~ symbol.

David

No, that's not a problem. That character gets removed in a later build
stage. We had to add it because of an integer conversion issue with
SQLite. It was automatically converting article titles like 1984 into
integers (not strings) and storing them in the database.

SQLite, BTW, claims this is a feature.

Sean
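
The conversion described above matches SQLite's documented type affinity
rules: text that looks like a number is stored as a number when the
column has numeric affinity. The thread does not show the actual
WikiReader schema, so the table below (articles, with a title column
declared NUMERIC) and the helper name stored_types are hypothetical. This
is only a sketch of the behaviour and of why a ~ prefix sidesteps it.

```python
import sqlite3


def stored_types(titles):
    """Insert titles as text and report how SQLite actually stored them."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: a NUMERIC column converts number-like text.
    conn.execute("CREATE TABLE articles (title NUMERIC)")
    conn.executemany("INSERT INTO articles (title) VALUES (?)",
                     [(t,) for t in titles])
    rows = conn.execute(
        "SELECT title, typeof(title) FROM articles ORDER BY rowid"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for value, kind in stored_types(["1984", "~1984"]):
        print(repr(value), kind)
```

With a prefix that is not part of a valid number, the title stays text
and can be stripped again in a later stage, which is the workaround the
message describes.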



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread Sean Moss-Pultz
David

We're working on exactly the same thing now :-)

I'll ask Chris to email the list once we get past it. I think the
problem is the mixture of different encodings (Latin-1 and
UTF-8) in the Spanish Wikipedia and the way our code handles it.
For some reason Python's print (at times) wants to default to ASCII,
even after we explicitly tell it to use UTF-8.

  -Sean
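
One common way to cope with input that mixes UTF-8 and Latin-1, and with
output streams that fall back to ASCII, is to handle bytes explicitly at
both ends instead of relying on a default encoding. The sketch below is
not the fix that was checked in (the thread does not show it). The helper
names decode_article and write_line are hypothetical, and the code uses
Python 3 syntax rather than the Python 2.6 of the original build.

```python
import io


def decode_article(raw):
    """Decode bytes as UTF-8, falling back to Latin-1 if that fails.

    Latin-1 maps every byte to a character, so the fallback never raises;
    it may produce wrong characters if the data is in some other encoding.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def write_line(binary_stream, text):
    """Encode explicitly so the output never depends on a default codec."""
    binary_stream.write((text + "\n").encode("utf-8"))


if __name__ == "__main__":
    out = io.BytesIO()
    write_line(out, decode_article(b"Espa\xc3\xb1a"))  # UTF-8 input
    write_line(out, decode_article(b"Espa\xf1a"))      # Latin-1 input
    print(out.getvalue())
```

Decoding once at the input boundary and encoding once at the output
boundary also avoids editing site.py to change the interpreter-wide
default.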


On Fri, Oct 30, 2009 at 4:50 AM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:

 I have relaunched the process in the (faint) hope that it was a
 temporary fault, but if anyone has a clue, it would be helpful.

 BTW, I am documenting this whole process to write a step-by-step howto on
 putting Wikipedia in other languages on the WikiReader.


Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread David Reyes Samblas Martinez
Great! :) Good to see you are working on this! Please count on me for
any testing that needs doing. I will try to take a look at the code myself
to kill the bug, but I have neither the time nor the expertise, so no promises :P
David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable  embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!






Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread Nelson Castillo
On Thu, Oct 29, 2009 at 6:54 PM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Great! :) Good to see you are working on this! Please count on me for
 any testing that needs doing. I will try to take a look at the code myself
 to kill the bug, but I have neither the time nor the expertise, so no promises :P

I haven't seen the code, but if you don't feel like fixing it now, you
can add a try/except around the block that processes each page, so that
you have a wiki to play with while the error is being fixed.
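
A minimal sketch of that suggestion, assuming the parser loops over
(title, text) pairs and calls one function per page. The names
process_all and process_page are hypothetical and do not come from
ArticleParser.py.

```python
def process_all(pages, process_page):
    """Process every page, skipping the ones that raise.

    Returns the number of pages processed and a list of
    (title, error description) pairs for the pages that failed.
    """
    done = 0
    failed = []
    for title, text in pages:
        try:
            process_page(title, text)
            done += 1
        except Exception as exc:
            # Record the failure and carry on with the next page.
            failed.append((title, repr(exc)))
    return done, failed


if __name__ == "__main__":
    def demo_page(title, text):
        if not text:
            raise ValueError("empty article")

    count, errors = process_all([("A", "x"), ("B", ""), ("C", "y")], demo_page)
    print(count, errors)
```

This only helps when a failure is specific to one page. If the output
pipe itself is gone, every later write fails too, and the list of failed
titles will show that quickly.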



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 7:54 AM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Great! :) Good to see you are working on this! Please count on me for
 any testing that needs doing. I will try to take a look at the code myself
 to kill the bug, but I have neither the time nor the expertise, so no promises :P

We'll get it working. Just give us a bit of time. And it would be
super helpful if you could help test / QA. Thanks a lot for the offer!

  -Sean



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 7:58 AM, Nelson Castillo
arhu...@freaks-unidos.net wrote:
 On Thu, Oct 29, 2009 at 6:54 PM, David Reyes Samblas Martinez
 da...@tuxbrain.com wrote:
 Great! :) Good to see you are working on this! Please count on me for
 any testing that needs doing. I will try to take a look at the code myself
 to kill the bug, but I have neither the time nor the expertise, so no promises :P

 I haven't seen the code, but if you don't feel like fixing it now, you
 can add a try/except around the block that processes each page, so that
 you have a wiki to play with while the error is being fixed.

Yeah, we're trying exactly that, Nelson. It's just a long process to
render all this stuff. We actually have 9 quad-core systems running in
parallel now, each with at least 6 GB of RAM :-)

  -Sean
