[sqlalchemy] Re: Multiple encodings in my database

jason kirtland Fri, 27 Jun 2008 09:21:06 -0700

My understanding is that MySQL works a little differently here.  The
column-level character sets are storage encodings only.  All data to and
from the database is encoded in the database connection's configured
encoding.  That can either be left as-is or converted to Unicode for you.
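
To illustrate the difference between a storage encoding and a connection
encoding, here is a minimal standard-library sketch (Python 3, no database
involved). It only simulates the transcoding step described above; the
variable names and the two charsets are assumptions picked for illustration.

```python
# Simulation only: no MySQL server is contacted here.
text = "Grüße"

# Assumed column charset (storage encoding) and connection charset.
column_charset = "latin-1"
connection_charset = "utf-8"

# The bytes as they would be stored for this column.
stored_bytes = text.encode(column_charset)

# On the way out, the value is re-encoded into the connection charset,
# so the client never sees the storage encoding directly.
wire_bytes = stored_bytes.decode(column_charset).encode(connection_charset)

# Decoding with the connection charset recovers the original text.
decoded_ok = wire_bytes.decode(connection_charset)

# Decoding with the column charset instead produces mojibake.
decoded_wrong = wire_bytes.decode(column_charset)
```

The point of the sketch is that client code should decode using the
connection's encoding, whatever charset the individual columns declare.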


Bobby Impollonia wrote:
> If I am using the mysql-specific Column constructs with the charset
> option, will things be automatically encoded/ decoded by SA using that
> charset? Or is the charset option only used for Create Table?
> 
> On Thu, Jun 26, 2008 at 7:20 PM, Michael Bayer <[EMAIL PROTECTED]> wrote:
>> first of all, the stack trace suggests you have not set the "encoding"
>> parameter on create_engine() as it's still using UTF-8.
>>
>> If you mean that a single database column may have different encodings
>> in different rows, you want to do your own encoding/decoding with
>> "encoding errors" set to something liberal like "ignore".  You also
>> need to use your own custom type, as below:
>>
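>> from sqlalchemy import types
>>
>> class MyEncodedType(types.TypeDecorator):
>>        # store as a plain (byte) string column
>>        impl = types.String
>>
>>        def process_bind_param(self, value, dialect):
>>                # encode outgoing unicode before it reaches the DBAPI
>>                if value is None:
>>                        return None
>>                assert isinstance(value, unicode)
>>                return value.encode('latin-1')
>>
>>        def process_result_value(self, value, dialect):
>>                # decode incoming bytes; 'ignore' skips undecodable bytes
>>                if value is None:
>>                        return None
>>                return value.decode('latin-1', 'ignore')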
>>
>> then use MyEncodedType() as the type for all your columns which
>> contain random encoding.   No convert_unicode setting should be used
>> on your engine as this type replaces that usage.
>>
>>
>>
>> On Jun 26, 2008, at 6:55 PM, Hermann Himmelbauer wrote:
>>
>>> Hi,
>>> I'm trying to access a database via SA, which contains varchars with
>>> different, arbitrary encodings. Most of them are ascii or ISO-8859-2
>>> encoded; however, many are windows-1252 encoded and there are also
>>> some other weird ones.
>>>
>>> In my engine setup, I set the encoding to latin1 and set
>>> convert_unicode to True, as my application requires the database
>>> values in unicode format.
>>>
>>> If SA now tries to retrieve such a key, the following traceback
>>> occurs:
>>>
>>> ------------------
>>>  File "/home/dusty/prog/python_modules/sqlalchemy/engine/base.py", line 1605, in _get_col
>>>    return processor(row[index])
>>>  File "/home/dusty/prog/python_modules/sqlalchemy/databases/maxdb.py", line 112, in process
>>>    return value.decode(dialect.encoding)
>>>  File "/local/home/dusty/python/Python-2.4.4/lib/python2.4/encodings/utf_8.py", line 16, in decode
>>>    return codecs.utf_8_decode(input, errors, True)
>>> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data
>>> -----------------
>>>
>>> What can I do? It's not so important that all characters are correctly
>>> displayed, but it's vital that such improper encodings do not crash my
>>> application. Perhaps there's some "universal" encoding that is able
>>> to deal with such problems?
>>>
>>> Best Regards,
>>> Hermann
>>>
>>> --
>>> [EMAIL PROTECTED]
>>> GPG key ID: 299893C7 (on keyservers)
>>> FP: 0124 2584 8809 EF2A DBF9  4902 64B4 D16B 2998 93C7
>>>
>>
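
The lenient decoding discussed in the quoted messages above can be sketched
with the standard library alone. This is a Python 3 illustration, not code
from the thread: the helper name `decode_or_fallback` and the sample strings
are hypothetical. It relies on the fact that latin-1 maps every possible
byte value to a character, so decoding with it never raises.

```python
def decode_or_fallback(raw, preferred="utf-8", fallback="latin-1"):
    """Decode bytes strictly with `preferred`; on failure use `fallback`.

    With the default fallback of latin-1 the second decode cannot fail,
    although the resulting characters may not be the intended ones.
    """
    if raw is None:
        return None
    try:
        return raw.decode(preferred)
    except UnicodeDecodeError:
        return raw.decode(fallback)


# The same text encoded two different ways, as might occur row by row.
sample_utf8 = "Grüße".encode("utf-8")
sample_cp1252 = "Grüße".encode("cp1252")

from_utf8 = decode_or_fallback(sample_utf8)
from_cp1252 = decode_or_fallback(sample_cp1252)

# Alternative: keep one codec but relax its error handling.
replaced = sample_cp1252.decode("utf-8", "replace")  # inserts U+FFFD markers
ignored = sample_cp1252.decode("utf-8", "ignore")    # drops the bad bytes
```

Either approach keeps a stray byte from raising UnicodeDecodeError. The
trade-off is that "ignore" silently loses data, while a latin-1 fallback
keeps every byte but may show the wrong character for it.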


