The text encoding is read from an email.message.Message that is created
when the mail is fetched from the server, before the data is sent to the
base adapter's parse function. (By the way, I recently posted new versions
of the adapter to the issue page, after the issue was marked as fixed.)
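
To illustrate reading the declared encoding, here is a minimal sketch using
the standard library email package. The sample message and the
declared_charset helper are hypothetical and are not part of the adapter;
the sketch only assumes that the raw message text is available as a string.

```python
import email

# Hypothetical sample message; a real one would come from the IMAP server.
RAW = (
    "From: alice@example.com\r\n"
    "Subject: test\r\n"
    "MIME-Version: 1.0\r\n"
    "Content-Type: text/plain; charset=\"iso-8859-1\"\r\n"
    "\r\n"
    "body\r\n"
)


def declared_charset(raw_message, default="utf-8"):
    """Return the charset declared in Content-Type, or `default` if none."""
    message = email.message_from_string(raw_message)
    # get_content_charset() returns None when no charset parameter is present.
    return message.get_content_charset() or default


print(declared_charset(RAW))
```

For multipart messages the top-level part usually declares no charset, so
each part would have to be inspected separately (for example with
Message.get_charsets()).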

In IMAPAdapter I am converting the complete RFC822 message payload to a
unicode instance, using the charset declared in the message's envelope
(or "utf-8" by default). This way the unicode error no longer occurs,
and there is no need to change the sqlhtml.py module.
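
The conversion can be sketched roughly as below. This is not the adapter's
actual code: decode_payload is a hypothetical helper, and falling back to
the default when the declared charset name is unknown to Python is my own
assumption about how such a case could be handled.

```python
import codecs


def decode_payload(raw, charset=None, default="utf-8", errors="replace"):
    """Decode raw message bytes to text with the declared charset.

    Uses `default` when no charset is declared or when the declared
    name is not known to Python's codec registry.
    """
    if raw is None:
        return u""
    name = charset or default
    try:
        codecs.lookup(name)
    except LookupError:
        # Unknown or misspelled charset name in the message.
        name = default
    # errors="replace" substitutes undecodable bytes instead of raising.
    return raw.decode(name, errors)


print(decode_payload(b"caf\xe9", "iso-8859-1"))
```

With errors="replace" the result is always a unicode object, so a later
conversion to utf-8 cannot fail on bytes that were invalid in the source.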

I did not edit the layout encoding; I am just using the scaffolding
app to test the email queries. The message rows passed to sqltables
might contain HTML. Is it possible that these parts are producing the
unicode errors?

On Jan 19, 12:55, Massimo Di Pierro <massimo.dipie...@gmail.com>
wrote:
> Is the page's HTML header declaring the utf8 encoding, or are you using a
> layout that uses a different encoding?
>
> On Jan 19, 7:21 am, Alan Etkin <spame...@gmail.com> wrote:
>
> > I found that the Unicode errors are caused by incompatible
> > encodings when web2py tries to read the raw message and render the
> > data for browser output. I solved it by encoding the RFC822 raw text
> > before parsing the response data as Rows. I am still not sure whether
> > this is the correct way of processing the response text so that it can
> > be sent safely (without misread characters) to the user interface.
> > Anyway, it seems to work well, although I have not tested it intensively.
>
> > On Jan 18, 18:40, Alan Etkin <spame...@gmail.com> wrote:
>
> > > I am trying to generate sqltables from the experimental IMAPAdapter
> > > select output, but for some rows an exception is raised in the
> > > sqlhtml module. It has to do with unicode and charsets when sqlhtml
> > > processes the rows object returned by the adapter's select method, but
> > > I cannot find a proper way of solving it. Here is the error trace:
>
> > > Traceback (most recent call last):
> > >   File "/home/alan/web2py-hg/gluon/restricted.py", line 204, in restricted
> > >     exec ccode in environment
> > >   File "/home/alan/web2py-hg/applications/queries/views/default/index.html", line 126, in <module>
> > >   File "/home/alan/web2py-hg/gluon/globals.py", line 181, in write
> > >     self.body.write(xmlescape(data))
> > >   File "/home/alan/web2py-hg/gluon/html.py", line 114, in xmlescape
> > >     return data.xml()
> > >   File "/home/alan/web2py-hg/gluon/dal.py", line 7442, in xml
> > >     return sqlhtml.SQLTABLE(self).xml()
> > >   File "/home/alan/web2py-hg/gluon/sqlhtml.py", line 2197, in __init__
> > >     ur = unicode(r, 'utf8')
> > >   File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
> > >     return codecs.utf_8_decode(input, errors, True)
> > > UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1227-1229: invalid data
>
> > > I found a workaround that avoids the exception, but I doubt it is the
> > > correct fix, because it just prevents web2py from creating the unicode
> > > object and uses the raw input instead.
>
> > > This is the workaround: (gluon/sqlhtml.py Line 2196)
>
> > >                     try:
> > >                         ur = unicode(r, 'utf-8')
> > >                     except UnicodeDecodeError, e:
> > >                         ur = r
>
> > > It replaces this line:
> > >                     ur = unicode(r, 'utf8')
>
> > > When creating the Row objects in the adapter, I have to handle
> > > different encodings depending on the message. What would be an
> > > appropriate way of encoding the data before creating the Row objects,
> > > so that unicode errors can be avoided?
>
> > > This is the adapter method I am using to convert the parsed input of
> > > each text field:
>
> > >     def encode_text(self, text, charset, errors="replace"):
> > >         """ convert text for mail to unicode"""
> > >         if text is None:
> > >             text = ""
> > >         else:
> > >             if charset is not None:
> > >                 text = unicode(text, charset, errors)
> > >             else:
> > >                 text = unicode(text, "utf-8", errors)
> > >         return text
>
> > > Thanks
>
> > > I am using the latest hg source version (1.99.4) with Python 2.6.5 on a
> > > Mandriva GNU/Linux machine.
>
>
