I am trying to generate SQLTABLEs from the output of the experimental
IMAPAdapter's select, but for some rows an exception is raised in the
sqlhtml module. It has to do with unicode and charsets when sqlhtml
processes the Rows object returned by the adapter's select method, but
I cannot find a proper way to solve it. Here is the error trace:

Traceback (most recent call last):
  File "/home/alan/web2py-hg/gluon/restricted.py", line 204, in restricted
    exec ccode in environment
  File "/home/alan/web2py-hg/applications/queries/views/default/index.html", line 126, in <module>
  File "/home/alan/web2py-hg/gluon/globals.py", line 181, in write
    self.body.write(xmlescape(data))
  File "/home/alan/web2py-hg/gluon/html.py", line 114, in xmlescape
    return data.xml()
  File "/home/alan/web2py-hg/gluon/dal.py", line 7442, in xml
    return sqlhtml.SQLTABLE(self).xml()
  File "/home/alan/web2py-hg/gluon/sqlhtml.py", line 2197, in __init__
    ur = unicode(r, 'utf8')
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1227-1229: invalid data

I found a workaround that avoids the exception, but I doubt it is the
correct fix, because it just prevents web2py from creating the unicode
object and uses the raw input instead.

This is the workaround (gluon/sqlhtml.py, line 2196):

                    try:
                        ur = unicode(r, 'utf-8')
                    except UnicodeDecodeError, e:
                        ur = r

It replaces this line:
                    ur = unicode(r, 'utf8')
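
A possible variation on that workaround, shown as a standalone sketch
rather than as a patch to sqlhtml.py: decode with the "replace" error
handler, so that invalid bytes become U+FFFD and the result is always a
unicode object instead of the raw input. The helper name
to_unicode_lenient is hypothetical and not part of web2py.

```python
def to_unicode_lenient(raw, encoding="utf-8"):
    """Decode a byte string, substituting U+FFFD for undecodable bytes.

    Hypothetical helper for illustration; not a web2py function.
    """
    # "replace" never raises UnicodeDecodeError; bad bytes become U+FFFD.
    return raw.decode(encoding, "replace")


good = b"caf\xc3\xa9"     # valid UTF-8
bad = b"caf\xe9 latte"    # 0xe9 is Latin-1 here and invalid as UTF-8

decoded_good = to_unicode_lenient(good)
decoded_bad = to_unicode_lenient(bad)
```

The trade-off is that the offending characters are lost in the output,
so this hides the problem in the table rather than fixing the data.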

When creating the Row objects in the adapter, I have to handle
different encodings depending on the message. What would be an
appropriate way of encoding the data before creating the Row objects,
so that unicode errors can be avoided?

This is the adapter method I am using to store the parsed input for
each text field:

    def encode_text(self, text, charset, errors="replace"):
        """Convert mail text to unicode."""
        if text is None:
            text = ""
        else:
            if charset is not None:
                text = unicode(text, charset, errors)
            else:
                text = unicode(text, "utf-8", errors)
        return text
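
One approach I could try, sketched below, assumes that the failure
comes from byte strings reaching SQLTABLE in a charset other than
UTF-8. The traceback suggests this, but I have not confirmed it. The
idea is to decode each field with the message's charset and then
re-encode it as UTF-8 before it is stored, so that a later
unicode(r, 'utf8') cannot fail. The name normalize_to_utf8 is
hypothetical and is not part of the adapter.

```python
def normalize_to_utf8(raw, charset=None):
    """Return a UTF-8 encoded byte string for raw bytes in `charset`.

    Hypothetical helper for illustration; not part of the IMAPAdapter.
    """
    if raw is None:
        return b""
    try:
        # Decode with the declared charset, defaulting to UTF-8.
        text = raw.decode(charset or "utf-8", "replace")
    except LookupError:
        # The declared charset name is unknown to Python's codecs;
        # fall back to UTF-8 with replacement characters.
        text = raw.decode("utf-8", "replace")
    # Re-encode so every stored value is valid UTF-8.
    return text.encode("utf-8")


latin1_field = normalize_to_utf8(b"caf\xe9", "iso-8859-1")
unknown_field = normalize_to_utf8(b"abc", "x-unknown-charset")
```

Whatever comes out of this function can always be decoded as UTF-8,
at the cost of replacement characters where the declared charset was
wrong or missing.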

Thanks

I am using the latest hg source version (1.99.4) with Python 2.6.5 on a
Mandriva GNU/Linux machine.
