My understanding is that MySQL works a little differently here. The column-level character sets are storage encodings only. All data going to and from the database is encoded in the database connection's configured encoding. That data can then either be left as-is or converted to Unicode for you.
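To make that concrete, here is a rough toy model in plain Python, with no database involved. The three functions are hypothetical stand-ins I made up for the illustration; they are not SQLAlchemy or MySQLdb calls, and the transcoding step reflects my understanding of the server's behavior rather than anything I have verified against every MySQL version.

```python
# Toy model only: these helpers are hypothetical and merely imitate the
# behavior described above. They are not part of SQLAlchemy or any driver.

def server_store(text, column_charset):
    """Bytes as the server would keep them for a column with this charset."""
    return text.encode(column_charset)


def server_send(stored, column_charset, connection_charset):
    """Transcode stored bytes into the connection's encoding before sending."""
    return stored.decode(column_charset).encode(connection_charset)


def client_receive(wire_bytes, connection_charset, convert_unicode=True):
    """Leave the bytes as-is, or decode them using the connection encoding."""
    if convert_unicode:
        return wire_bytes.decode(connection_charset)
    return wire_bytes


text = u"caf\u00e9"
connection_charset = "utf-8"  # assumed connection setting for this sketch

results = {}
for column_charset in ("latin-1", "utf-8"):
    stored = server_store(text, column_charset)
    wire = server_send(stored, column_charset, connection_charset)
    results[column_charset] = (stored, wire, client_receive(wire, connection_charset))
```

In this model the stored bytes differ per column charset, but the bytes on the wire and the decoded value are the same for both columns, which is why the column charset would not matter to the client.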
Bobby Impollonia wrote:
> If I am using the mysql-specific Column constructs with the charset
> option, will things be automatically encoded/decoded by SA using that
> charset? Or is the charset option only used for Create Table?
>
> On Thu, Jun 26, 2008 at 7:20 PM, Michael Bayer <[EMAIL PROTECTED]> wrote:
>> first of all, the stack trace suggests you have not set the "encoding"
>> parameter on create_engine() as it's still using UTF-8.
>>
>> If you mean that a single database column may have different encodings
>> in different rows, you want to do your own encoding/decoding with
>> "encoding errors" set to something liberal like "ignore". You also
>> need to use your own custom type, as below:
>>
>> from sqlalchemy import types
>>
>> class MyEncodedType(types.TypeDecorator):
>>     # String is only reachable through the imported "types" module
>>     impl = types.String
>>
>>     def process_bind_param(self, value, dialect):
>>         # outgoing: unicode object -> latin-1 bytes
>>         assert isinstance(value, unicode)
>>         return value.encode('latin-1')
>>
>>     def process_result_value(self, value, dialect):
>>         # incoming: bytes -> unicode, with the liberal 'ignore' handler
>>         return value.decode('latin-1', 'ignore')
>>
>> then use MyEncodedType() as the type for all your columns which
>> contain random encoding. No convert_unicode setting should be used
>> on your engine as this type replaces that usage.
>>
>> On Jun 26, 2008, at 6:55 PM, Hermann Himmelbauer wrote:
>>
>>> Hi,
>>> I'm trying to access a database via SA, which contains varchars with
>>> different, arbitrary encodings. Most of them are ascii or ISO-8859-2
>>> encoded, however, many are windows-1252 encoded and there are also
>>> some other weird ones.
>>>
>>> In my engine setup, I set the encoding to latin1 and set
>>> convert_unicode to True, as my application requires the database
>>> values in unicode format.
>>>
>>> If SA now tries to retrieve such a key, the following traceback
>>> occurs:
>>>
>>> ------------------
>>> File "/home/dusty/prog/python_modules/sqlalchemy/engine/base.py", line 1605, in _get_col
>>>     return processor(row[index])
>>> File "/home/dusty/prog/python_modules/sqlalchemy/databases/maxdb.py", line 112, in process
>>>     return value.decode(dialect.encoding)
>>> File "/local/home/dusty/python/Python-2.4.4/lib/python2.4/encodings/utf_8.py", line 16, in decode
>>>     return codecs.utf_8_decode(input, errors, True)
>>> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data
>>> -----------------
>>>
>>> What can I do? It's not so important that all characters are correctly
>>> displayed, but it's vital that such improper encodings do not crash my
>>> application. Perhaps, there's some "universal" encoding that is able
>>> to deal with such problems?
>>>
>>> Best Regards,
>>> Hermann
>>>
>>> --
>>> [EMAIL PROTECTED]
>>> GPG key ID: 299893C7 (on keyservers)
>>> FP: 0124 2584 8809 EF2A DBF9 4902 64B4 D16B 2998 93C7