Re: [sqlalchemy] Char encoding..
Simple enough Blob is a binary type, not a character type. You need to put binary data in there, so in python 2x that's str, ie encode it. Direct Unicode support in sqla is via the string/Unicode types. Sent from my iPhone On Nov 29, 2010, at 8:43 PM, Warwick Prince warwi...@mushroomsys.com wrote: Hi Michael Thanks for your thoughts and comments to date. I can replicate the problem with ease, so perhaps this will help; # -*- coding: utf-8 -*- e = create_engine('mysql+mysqlconnector://user:passw...@127.0.0.1/testdb?use_unicode=0', encoding='utf8', echo=False) m = MetaData(e) t = Table('test_table', m, autoload=True) #test_table is; Table('test_table', MetaData(Engine(mysql+mysqlconnector://user:passw...@127.0.0.1/testdb?use_unicode=0)), Column(u'ID', INTEGER(display_width=11), table=test_table, primary_key=True, nullable=False), Column(u'SourceType', VARCHAR(length=10), table=test_table), Column(u'SourceID', VARCHAR(length=128), table=test_table), Column(u'Date', DATE(), table=test_table), Column(u'Time', TIME(timezone=False), table=test_table), Column(u'UserID', VARCHAR(length=10), table=test_table), Column(u'Note', BLOB(length=None), table=test_table), Column(u'Division', VARCHAR(length=3), table=test_table), schema=None) # Set some row data in a dict columns = dict(ID=1, SourceType='TEST', SourceID='WAP', Note=u'Aligot\xe9') # The Note column is set to a unicode value for a French word with accents. Column type is BLOB # insert it t.insert(values=columns).execute() get this; Traceback (most recent call last): File interactive input, line 1, in module File C:\Python26\lib\site-packages\sqlalchemy\sql\expression.py, line 1217, in execute return e._execute_clauseelement(self, multiparams, params) File C:\Python26\lib\site-packages\sqlalchemy\engine\base.py, line 1722, in _execute_clauseelement return connection._execute_clauseelement(elem, multiparams, params) File C:\Python26\lib\site-packages\sqlalchemy\engine\base.py, line 1235, in _execute_clauseelement parameters=params File C:\Python26\lib\site-packages\sqlalchemy\engine\base.py, line 1343, in __create_execution_context connection=self, **kwargs) File C:\Python26\lib\site-packages\sqlalchemy\engine\default.py, line 384, in __init__ self.parameters = self.__convert_compiled_params(self.compiled_parameters) File C:\Python26\lib\site-packages\sqlalchemy\engine\default.py, line 513, in __convert_compiled_params param[key] = processors[key](compiled_params[key]) File C:\Python26\lib\site-packages\sqlalchemy\types.py, line 1209, in process return DBAPIBinary(value) UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 6: ordinal not in range(128) It appears to be in the processing of the Binary type that something is going wrong. Further testing showed something interesting. I changed around the data above and set the unicode value to the VARCHAR column SourceID. That worked.. Therefore, the issue is related to storing a unicode value into a BLOB. Surely I can store anything in a BLOB, or am I missing something? Cheers Warwick Warwick Prince Managing Director mobile: +61 411 026 992 skype: warwickprince phone: +61 7 3102 3730 fax: +61 7 3319 6734 web: www.mushroomsys.com On 30/11/2010, at 1:29 AM, Michael Bayer wrote: we've got unicode round trips down very well for years now with plenty of tests, so would need a specific series of steps to reproduce what you're doing here. Note that the recommended connect string for MySQL + Mysqldb looks like mysql://scott:ti...@localhost/test?charset=utf8use_unicode=0 . On Nov 29, 2010, at 2:37 AM, Warwick Prince wrote: Hi All I thought I had Character Encoding licked, but I've hit something I can't work through. Any help appreciated. I have a legacy non SQL database that I read legacy data from (using cool Python code that emulates the old ISDB binary comms) and it reads a str which has Foreign language chars in it. (French for example). So, firstly, I have myStr = ''Aligot\xc3\xa9 which when printed is Aligoté. So far so good. I then convert that to unicode by myUnicode = unicode(myStr, 'utf-8', errors='ignore') and get u'Aligot\xe9'. This printed is also Aligoté, therefore all is good. I have a MySQL database, InnoDB table, charset utf-8. I set up my values in a dict called setValues with all the columns and their respective unicode'd values ready to go I then do a table.insert(values=setValues).execute() and get this error. Traceback (most recent call last): File C:\Documents and Settings\wprince\Desktop\PY CODE DEVELOPMENT\CESyncSQL\TEST_Sync.py, line 148, in SYNC_IT SyncFunction(ceDB, session, meta) File C:\Documents and Settings\wprince\Desktop\PY CODE DEVELOPMENT\CESyncSQL\TEST_Sync.py, line 840, in SYNC_VarietiesOUT
Re: [sqlalchemy] Char encoding..
we've got unicode round trips down very well for years now with plenty of tests, so would need a specific series of steps to reproduce what you're doing here. Note that the recommended connect string for MySQL + Mysqldb looks like mysql://scott:ti...@localhost/test?charset=utf8use_unicode=0 . On Nov 29, 2010, at 2:37 AM, Warwick Prince wrote: Hi All I thought I had Character Encoding licked, but I've hit something I can't work through. Any help appreciated. I have a legacy non SQL database that I read legacy data from (using cool Python code that emulates the old ISDB binary comms) and it reads a str which has Foreign language chars in it. (French for example). So, firstly, I have myStr = ''Aligot\xc3\xa9 which when printed is Aligoté. So far so good. I then convert that to unicode by myUnicode = unicode(myStr, 'utf-8', errors='ignore') and get u'Aligot\xe9'. This printed is also Aligoté, therefore all is good. I have a MySQL database, InnoDB table, charset utf-8. I set up my values in a dict called setValues with all the columns and their respective unicode'd values ready to go I then do a table.insert(values=setValues).execute() and get this error. Traceback (most recent call last): File C:\Documents and Settings\wprince\Desktop\PY CODE DEVELOPMENT\CESyncSQL\TEST_Sync.py, line 148, in SYNC_IT SyncFunction(ceDB, session, meta) File C:\Documents and Settings\wprince\Desktop\PY CODE DEVELOPMENT\CESyncSQL\TEST_Sync.py, line 840, in SYNC_VarietiesOUT DAPDB_SetColumns(meta, 'varieties', {'DescriptiveText':self.CEUnicode(tVarieties.ceVarietyText.value), 'FlavourText':self.CEUnicode(tVarieties.ceFlavourText.value), 'ImageURL':imageURL}, Variety=variety) File C:\Python26\lib\DAPDBHelpers.py, line 323, in DAPDB_SetColumns table.insert(values=setColumns).execute() File C:\Python26\lib\site-packages\sqlalchemy\sql\expression.py, line 1217, in execute return e._execute_clauseelement(self, multiparams, params) File C:\Python26\lib\site-packages\sqlalchemy\engine\base.py, line 1722, in _execute_clauseelement return connection._execute_clauseelement(elem, multiparams, params) UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 4: ordinal not in range(128) I know what the error means, I just don't know why I'm getting it. The offending u'\xe9' character is in the DescriptiveText column. DAPDB_SetColumns is a simple wrapper around an update/insert that builds up the table.insert(values=setColumns).execute() you see. This is what setColumns looks like; {'ImageURL': '', 'DescriptiveText': u'Carm\xe9n\xe8re is a red wine grape variety originally from Bordeaux, France. Having lost favor in France, the largest area planted with this variety is in now Chile. It only survived, due to growers believing it was Merlot. The vines were imported into Chil', 'FlavourText': u'Carmenere is a full bodied red wine with approachable tannins and a combination of sweet berry fruit, savory pepper, smoke, tar, with a slight leafy character.\n', 'Variety': u'Carmenere'} 'Variety' is the primary key BTW. What gives? It feels like SQLA is encoding/decoding somewhere it shouldn't.. Cheers Warwick -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalch...@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalch...@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Re: [sqlalchemy] Char encoding..
Hi Michael Thanks for your thoughts and comments to date. I can replicate the problem with ease, so perhaps this will help; # -*- coding: utf-8 -*- e = create_engine('mysql+mysqlconnector://user:passw...@127.0.0.1/testdb?use_unicode=0', encoding='utf8', echo=False) m = MetaData(e) t = Table('test_table', m, autoload=True) #test_table is; Table('test_table', MetaData(Engine(mysql+mysqlconnector://user:passw...@127.0.0.1/testdb?use_unicode=0)), Column(u'ID', INTEGER(display_width=11), table=test_table, primary_key=True, nullable=False), Column(u'SourceType', VARCHAR(length=10), table=test_table), Column(u'SourceID', VARCHAR(length=128), table=test_table), Column(u'Date', DATE(), table=test_table), Column(u'Time', TIME(timezone=False), table=test_table), Column(u'UserID', VARCHAR(length=10), table=test_table), Column(u'Note', BLOB(length=None), table=test_table), Column(u'Division', VARCHAR(length=3), table=test_table), schema=None) # Set some row data in a dict columns = dict(ID=1, SourceType='TEST', SourceID='WAP', Note=u'Aligot\xe9') # The Note column is set to a unicode value for a French word with accents. Column type is BLOB # insert it t.insert(values=columns).execute() get this; Traceback (most recent call last): File interactive input, line 1, in module File C:\Python26\lib\site-packages\sqlalchemy\sql\expression.py, line 1217, in execute return e._execute_clauseelement(self, multiparams, params) File C:\Python26\lib\site-packages\sqlalchemy\engine\base.py, line 1722, in _execute_clauseelement return connection._execute_clauseelement(elem, multiparams, params) File C:\Python26\lib\site-packages\sqlalchemy\engine\base.py, line 1235, in _execute_clauseelement parameters=params File C:\Python26\lib\site-packages\sqlalchemy\engine\base.py, line 1343, in __create_execution_context connection=self, **kwargs) File C:\Python26\lib\site-packages\sqlalchemy\engine\default.py, line 384, in __init__ self.parameters = self.__convert_compiled_params(self.compiled_parameters) File C:\Python26\lib\site-packages\sqlalchemy\engine\default.py, line 513, in __convert_compiled_params param[key] = processors[key](compiled_params[key]) File C:\Python26\lib\site-packages\sqlalchemy\types.py, line 1209, in process return DBAPIBinary(value) UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 6: ordinal not in range(128) It appears to be in the processing of the Binary type that something is going wrong. Further testing showed something interesting. I changed around the data above and set the unicode value to the VARCHAR column SourceID. That worked.. Therefore, the issue is related to storing a unicode value into a BLOB. Surely I can store anything in a BLOB, or am I missing something? Cheers Warwick Warwick Prince Managing Director mobile: +61 411 026 992 skype: warwickprince phone: +61 7 3102 3730 fax: +61 7 3319 6734 web: www.mushroomsys.com On 30/11/2010, at 1:29 AM, Michael Bayer wrote: we've got unicode round trips down very well for years now with plenty of tests, so would need a specific series of steps to reproduce what you're doing here. Note that the recommended connect string for MySQL + Mysqldb looks like mysql://scott:ti...@localhost/test?charset=utf8use_unicode=0 . On Nov 29, 2010, at 2:37 AM, Warwick Prince wrote: Hi All I thought I had Character Encoding licked, but I've hit something I can't work through. Any help appreciated. I have a legacy non SQL database that I read legacy data from (using cool Python code that emulates the old ISDB binary comms) and it reads a str which has Foreign language chars in it. (French for example). So, firstly, I have myStr = ''Aligot\xc3\xa9 which when printed is Aligoté. So far so good. I then convert that to unicode by myUnicode = unicode(myStr, 'utf-8', errors='ignore') and get u'Aligot\xe9'. This printed is also Aligoté, therefore all is good. I have a MySQL database, InnoDB table, charset utf-8. I set up my values in a dict called setValues with all the columns and their respective unicode'd values ready to go I then do a table.insert(values=setValues).execute() and get this error. Traceback (most recent call last): File C:\Documents and Settings\wprince\Desktop\PY CODE DEVELOPMENT\CESyncSQL\TEST_Sync.py, line 148, in SYNC_IT SyncFunction(ceDB, session, meta) File C:\Documents and Settings\wprince\Desktop\PY CODE DEVELOPMENT\CESyncSQL\TEST_Sync.py, line 840, in SYNC_VarietiesOUT DAPDB_SetColumns(meta, 'varieties', {'DescriptiveText':self.CEUnicode(tVarieties.ceVarietyText.value), 'FlavourText':self.CEUnicode(tVarieties.ceFlavourText.value), 'ImageURL':imageURL}, Variety=variety) File C:\Python26\lib\DAPDBHelpers.py, line 323, in DAPDB_SetColumns table.insert(values=setColumns).execute() File C:\Python26\lib\site-packages\sqlalchemy\sql\expression.py, line