[sqlalchemy] Re: Again MySQL/unicode - roundtrip failed
On 12/15/06, Stefan Meretz [EMAIL PROTECTED] wrote:
> On 2006-12-13 22:38, Shannon -jj Behrens wrote:
> > My memory is that MySQLdb recently changed a bunch of stuff and that it was a simple logic bug.
>
> You mean that the entire logic is simply reversed? That would explain why reading works (from Mike's mail):
>
>     def convert_result_value(self, value, dialect):
>         if value is not None and not isinstance(value, unicode):
>             return value.decode(dialect.encoding)
>         else:
>             return value
>
> If value is already unicode, it is simply handed over (the else part). Writing, however, goes wrong, because MySQLdb (wrongly) expects a unicode object but gets a UTF-8 encoded string (the if part):
>
>     def convert_bind_param(self, value, dialect):
>         if value is not None and isinstance(value, unicode):
>             return value.encode(dialect.encoding)
>         else:
>             return value
>
> Am I right?

I can't say with certainty exactly how the code is broken. If you can write a simple, stand-alone test to prove your point, that would indeed be a useful addition to the bug. Remember to use the MySQL client and the hex function to see what's *actually* stored in the database.

> > Here's the bug I filed:
> > http://sourceforge.net/tracker/index.php?func=detail&aid=1592353&group_id=22307&atid=374932
>
> If I am right, I would add a note to the bug you already filed.

Best Regards,
-jj

--
http://jjinux.blogspot.com/

You received this message because you are subscribed to the Google Groups sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
[sqlalchemy] Re: Again MySQL/unicode - roundtrip failed
My memory is that MySQLdb recently changed a bunch of stuff and that it was a simple logic bug. Here's the bug I filed:

http://sourceforge.net/tracker/index.php?func=detail&aid=1592353&group_id=22307&atid=374932

On 12/13/06, Michael Bayer [EMAIL PROTECTED] wrote:
> in fact its almost definitely a bug in mysqldb - mysqldb should be detecting unicode instances on the bind parameter side and encoding based on that (otherwise doing nothing), and decoding into unicode instances at the result set level. if it did that, then it would not conflict with the Unicode type on SQLAlchemy's side.
>
> here is the source to the SA unicode type:
>
>     class Unicode(TypeDecorator):
>         impl = String
>
>         def convert_bind_param(self, value, dialect):
>             if value is not None and isinstance(value, unicode):
>                 return value.encode(dialect.encoding)
>             else:
>                 return value
>
>         def convert_result_value(self, value, dialect):
>             if value is not None and not isinstance(value, unicode):
>                 return value.decode(dialect.encoding)
>             else:
>                 return value
>
> as evidence of this, the above Unicode type works completely fine with pysqlite, which also accepts unicode bind params and returns all string values as unicodes.

--
http://jjinux.blogspot.com/
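For readers following along on Python 3, the two hooks quoted above can be sketched as plain functions. This is only an illustrative port, not SQLAlchemy's code: the names `to_bind_param` and `to_result_value` and the `encoding` argument are assumptions made here, with `str` and `bytes` standing in for Python 2's `unicode` and `str`.

```python
def to_bind_param(value, encoding="utf-8"):
    """Encode text on its way to the driver; pass everything else through."""
    if value is not None and isinstance(value, str):
        return value.encode(encoding)
    return value


def to_result_value(value, encoding="utf-8"):
    """Decode bytes coming back from the driver; pass text through untouched.

    The Python 2 original decoded anything that was not unicode; this sketch
    narrows that to bytes so that numbers and other types are left alone.
    """
    if value is not None and isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Driver that hands back raw bytes: the type decodes them.
assert to_result_value(to_bind_param("test \xe7")) == "test \xe7"

# Driver that already hands back text (as pysqlite does): passed through.
assert to_result_value("test \xe7") == "test \xe7"
```

Under these assumptions the type only misbehaves when the driver applies a second, different conversion to the same value, which is the conflict described in the quoted message.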
[sqlalchemy] Re: convert_unicode=True results in double encoding
On 11/12/06, Michael Bayer [EMAIL PROTECTED] wrote:
> since create_engine deals with class constructors, i went with this approach:
>
>     def get_cls_kwargs(cls):
>         """return the full set of legal kwargs for the given cls"""
>         kw = []
>         for c in cls.__mro__:
>             cons = c.__init__
>             if hasattr(cons, 'func_code'):
>                 for vn in cons.func_code.co_varnames:
>                     if vn != 'self':
>                         kw.append(vn)
>         return kw
>
> so now you get these luxurious TypeErrors if you send any combination of invalid arguments:

Thanks! You'll probably never hear about this again, but I bet this will save many people hours of frustration. :)

Best Regards,
-jj

--
http://jjinux.blogspot.com/
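The same idea can be sketched on Python 3, where `func_code` no longer exists. The following is a rough equivalent using `inspect.signature`, not the code that went into SQLAlchemy; `check_kwargs`, `Pool`, and `Dialect` are hypothetical names used only for the demonstration.

```python
import inspect


def get_cls_kwargs(cls):
    """Collect constructor argument names across a class's MRO."""
    names = []
    for c in cls.__mro__:
        init = c.__dict__.get("__init__")
        if not inspect.isfunction(init):
            continue  # skip object.__init__ and other C-level constructors
        for name, param in inspect.signature(init).parameters.items():
            if name == "self":
                continue
            if param.kind in (inspect.Parameter.VAR_POSITIONAL,
                              inspect.Parameter.VAR_KEYWORD):
                continue  # *args / **kwargs are not named options
            if name not in names:
                names.append(name)
    return names


def check_kwargs(kwargs, *classes):
    """Raise TypeError if kwargs holds a name none of the classes accept."""
    legal = set()
    for cls in classes:
        legal.update(get_cls_kwargs(cls))
    unknown = sorted(set(kwargs) - legal)
    if unknown:
        raise TypeError("Invalid argument(s): " + ", ".join(unknown))


# Hypothetical stand-ins for the classes an engine factory would build.
class Pool:
    def __init__(self, pool_size=5):
        self.pool_size = pool_size


class Dialect:
    def __init__(self, convert_unicode=False, encoding="utf-8"):
        self.convert_unicode = convert_unicode
        self.encoding = encoding


check_kwargs({"encoding": "utf-8", "pool_size": 10}, Pool, Dialect)  # accepted
try:
    check_kwargs({"use_unicode": True}, Pool, Dialect)
except TypeError as exc:
    print(exc)
```

Unlike `co_varnames`, which also lists a constructor's local variables, reading the signature reports only the declared parameters.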
[sqlalchemy] Re: convert_unicode=True results in double encoding
The following results in correct data going into and coming out of the database, but the data in the database itself looks double encoded:

    import MySQLdb

    connection = MySQLdb.connect(host="fmapp03", user="foxmarks",
                                 passwd='ChunkyBacon', db="users")
    cursor = connection.cursor()
    cursor.execute("""
        INSERT INTO users
        VALUES (12345678, 'jjtest1234', '[EMAIL PROTECTED]', 'pass',
                %s, 'asdf', 'N/A', 'N/A', 0, NOW(), NOW())
    """, ('\xc3\xa7',))
    cursor.execute("SELECT * FROM users WHERE id = 12345678")
    row = cursor.fetchone()
    print `row`
    connection.commit()

The following also results in correct data going into and out of the database, and this time the data in the database itself is not double encoded:

    import MySQLdb

    connection = MySQLdb.connect(host="fmapp03", user="foxmarks",
                                 passwd='ChunkyBacon', db="users",
                                 charset='utf8')
    cursor = connection.cursor()
    cursor.execute("""
        INSERT INTO users
        VALUES (12345678, 'jjtest1234', '[EMAIL PROTECTED]', 'pass',
                %s, 'asdf', 'N/A', 'N/A', 0, NOW(), NOW())
    """, (u'\xe7',))
    cursor.execute("SELECT * FROM users WHERE id = 12345678")
    row = cursor.fetchone()
    print `row`
    connection.commit()

It looks like a lot of this stuff has changed in the version of MySQLdb I'm using, 1.2.1p2. If you don't let MySQLdb take care of encoding and decoding, it ends up double encoding things in the database. This must be a bug in MySQLdb. The clear way to work around the bug is to let the driver take care of encoding and decoding instead of SQLAlchemy.

Yuck,
-jj
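One way the first script could round-trip correctly and still leave double-encoded data behind can be simulated without a database. The sketch below is written for Python 3 and rests on an assumption not confirmed in this thread: that the connection charset was Latin-1 while the column stored UTF-8, so the server re-encoded every incoming byte as if it were a Latin-1 character. The two helper functions are hypothetical stand-ins for that server-side conversion.

```python
def store_via_latin1_connection(payload: bytes) -> bytes:
    """Assumed server behavior: read the bytes as Latin-1 text, store as UTF-8."""
    return payload.decode("latin-1").encode("utf-8")


def fetch_via_latin1_connection(stored: bytes) -> bytes:
    """Assumed inverse on read: convert stored UTF-8 text back to Latin-1 bytes."""
    return stored.decode("utf-8").encode("latin-1")


text = "\xe7"                       # c with cedilla
payload = text.encode("utf-8")      # what the application sends: C3 A7
stored = store_via_latin1_connection(payload)
fetched = fetch_via_latin1_connection(stored)

print(payload.hex().upper())           # C3A7
print(stored.hex().upper())            # C383C2A7
print(fetched.decode("utf-8") == text) # True
```

If this is what happened, the application never notices because the read path undoes the write path exactly, while `SELECT HEX(...)` in the MySQL client would show four bytes where two are expected.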
[sqlalchemy] Re: convert_unicode=True results in double encoding
On 11/7/06, Michael Bayer [EMAIL PROTECTED] wrote:
> yeah, or use introspection to consume the args, i thought of that too. i guess we can do that. of course i dont like having to go there but i guess not a big deal.
>
> as far as kwargs collisions, yeah, thats a potential issue too. but the number of dialects/pools is not *that* varied, theyre generally pretty conservative with the kwargs. if we broke out create_engine() to take in dialect, pool, etc., well id probably just create a new function for that first off so create_engine() can just remain...im not sure if i want to force users to be that exposed to the details as people might get a little intimidated by all that.

By the way, as if to prove my point, I had another problem. I was trying to pass a create_args keyword argument per the example at http://www.sqlalchemy.org/docs/dbengine.myt#dbengine_establishing_custom. SQLAlchemy didn't complain. My code didn't even complain when I used a cedilla (ç). My code crashed when I used Japanese. It turns out that MySQLdb was defaulting to Latin-1 or something like that, and SQLAlchemy was ignoring my create_args keyword argument.

It turns out that the example is wrong (should I file a bug?). The documentation above it is correct; the correct keyword argument is named connect_args.

Best Regards,
-jj

--
http://jjinux.blogspot.com/
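Why a cedilla slipped through while Japanese crashed can be shown with the codecs alone. This is a small Python 3 sketch, assuming (as the message suspects) that Latin-1 was the encoding in effect; `can_encode` is a helper name invented for the illustration.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


samples = {
    "cedilla": "\xe7",                  # c with cedilla, inside Latin-1
    "japanese": "\u65e5\u672c\u8a9e",   # three kanji, outside Latin-1
}

for label, text in samples.items():
    print(label, can_encode(text, "latin-1"), can_encode(text, "utf-8"))
# cedilla True True
# japanese False True
```

Latin-1 covers only code points up to U+00FF, so Western European test data can hide a wrong charset setting until the first character outside that range arrives.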
[sqlalchemy] Re: convert_unicode=True results in double encoding
On 11/3/06, Shannon -jj Behrens [EMAIL PROTECTED] wrote:
> I'm using convert_unicode=True. Everything is fine as long as I'm the one reading and writing the data. However, if I look at what's actually being stored in the database, it's like the data has been encoded twice. If I switch to use_unicode=True, which I believe is MySQL specific, things work just fine and what's being stored in the database looks correct.
>
> I started looking through the SQLAlchemy code, and I came across this:
>
>     def convert_bind_param(self, value, dialect):
>         if not dialect.convert_unicode or value is None or not isinstance(value, unicode):
>             return value
>         else:
>             return value.encode(dialect.encoding)
>
>     def convert_result_value(self, value, dialect):
>         if not dialect.convert_unicode or value is None or isinstance(value, unicode):
>             return value
>         else:
>             return value.decode(dialect.encoding)
>
> The logic looks backwards. It says, "If it's not a unicode object, return it. Otherwise, encode it." Later, "If it is a unicode object, return it. Otherwise decode it."
>
> Am I correct that this is backwards? If so, this is going to be *painful* to update all the databases out there!

Ok, MySQLdb doesn't have a mailing list, so I can't ask there. Here are some things I've learned:

Changing from convert_unicode=True to use_unicode=True doesn't do what you'd expect. SQLAlchemy is passing keyword arguments all over the place, and use_unicode actually gets ignored.

<minor rant>I personally think that you should be strict *somewhere* when you're passing around keyword arguments. I've been bitten in this way too many times. Unknown keyword arguments should result in exceptions.</minor rant>

Anyway, I'm still a bit worried about that code above, like I said. However, here's what's even scarier.
If I use the following code:

    import MySQLdb

    for use_unicode in (True, False):
        connection = MySQLdb.connect(host="localhost", user="user",
                                     passwd='dataase', db="users",
                                     use_unicode=use_unicode)
        cursor = connection.cursor()
        cursor.execute("select firstName from users where username='test'")
        row = cursor.fetchone()
        print "use_unicode:%s %r" % (use_unicode, row)

I get:

    use_unicode:True (u'test \xc3\xa7',)
    use_unicode:False ('test \xc3\xa7',)

Notice the result is the same, but one has a unicode object and the other doesn't. Notice that it's \xc3\xa7 each time? It shouldn't be. Consider:

    >>> s = 'test \xc3\xa7'
    >>> s.decode('utf-8')
    u'test \xe7'

*It's creating a unicode object without actually doing any decoding!*

This is somewhere low level. Like I said, this is lower level than SQLAlchemy, but I don't have anywhere else to turn.

    SQLAlchemy: 0.2.8
    MySQLdb: 1.36.2.4
    mysql client and server: 5.0.22
    Ubuntu: 6.0.6

Help!
-jj

--
http://jjinux.blogspot.com/
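The value u'test \xc3\xa7' reported above is also what results when UTF-8 bytes are decoded as Latin-1, since Latin-1 maps each byte to the code point of the same number. The Python 3 sketch below shows that reading, along with a way to recover the intended text. Whether MySQLdb actually decoded as Latin-1 here is an assumption, and `repair` is a hypothetical helper.

```python
raw = b"test \xc3\xa7"                   # bytes as they arrive from the server

decoded_utf8 = raw.decode("utf-8")       # the intended text
decoded_latin1 = raw.decode("latin-1")   # one code point per byte

print(ascii(decoded_utf8))    # 'test \xe7'
print(ascii(decoded_latin1))  # 'test \xc3\xa7'


def repair(mis_decoded):
    """Undo a Latin-1 decode of UTF-8 data (only valid under that assumption)."""
    return mis_decoded.encode("latin-1").decode("utf-8")


print(repair(decoded_latin1) == decoded_utf8)  # True
```

Seen this way, the driver may well have decoded the bytes, just with a different charset than the one the data was written in, which looks identical to no decoding at all when only the repr is inspected.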