[sqlalchemy] Re: Again MySQL/unicode - roundtrip failed

2006-12-15 Thread Shannon -jj Behrens

On 12/15/06, Stefan Meretz [EMAIL PROTECTED] wrote:
 On 2006-12-13 22:38, Shannon -jj Behrens wrote:
  My memory is that MySQLdb recently changed a bunch of stuff and that
  it was a simple logic bug.

 You mean that the entire logic is simply reversed?

 That would explain why reading works (from Mike's mail):

  def convert_result_value(self, value, dialect):
      if value is not None and not isinstance(value, unicode):
          return value.decode(dialect.encoding)
      else:
          return value

 If value is already unicode, then the value is simply handed over
 (else-part).

 However, writing goes wrong, because MySQLdb (wrongly) expects a
 unicode object but gets a UTF-8 encoded string (if-part):

  def convert_bind_param(self, value, dialect):
      if value is not None and isinstance(value, unicode):
          return value.encode(dialect.encoding)
      else:
          return value

 Am I right?

I can't say with certainty exactly how the code is broken.  If you can
write a simple, stand-alone test to prove your point, that would
indeed be a useful addition to the bug.  Remember to use the MySQL
client and the hex function to see what's *actually* stored in the
database.
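
As a rough aid for reading the HEX() output mentioned above, here is a minimal sketch in present-day Python 3 (where `bytes` plays the role of the Python 2 `str` discussed in this thread). The helper name `classify_hex` is hypothetical, and the "double-encoded" pattern assumes the extra step went through Latin-1, which is only one possible cause.

```python
def classify_hex(hex_value, expected_text):
    """Compare the output of MySQL's HEX(column) with the text we meant to store."""
    stored = bytes.fromhex(hex_value)
    utf8 = expected_text.encode("utf-8")
    # Assumption: the double encoding treated the UTF-8 bytes as Latin-1
    # characters and then encoded those characters as UTF-8 again.
    double = utf8.decode("latin-1").encode("utf-8")
    if stored == utf8:
        return "utf-8"
    if stored == double:
        return "double-encoded"
    return "unknown"


print(classify_hex("C3A7", "\u00e7"))      # a cedilla stored as plain UTF-8
print(classify_hex("C383C2A7", "\u00e7"))  # the same cedilla encoded twice
```

A stand-alone test for the bug report could run `SELECT HEX(column)` in the mysql client after each write and feed the result to a check like this one.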

  Here's the bug I filed:
  http://sourceforge.net/tracker/index.php?func=detail&aid=1592353&group_id=22307&atid=374932

 If I am right, I would add a note to the bug you already filed.

Best Regards,
-jj

-- 
http://jjinux.blogspot.com/

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Again MySQL/unicode - roundtrip failed

2006-12-13 Thread Shannon -jj Behrens

My memory is that MySQLdb recently changed a bunch of stuff and that
it was a simple logic bug.  Here's the bug I filed:

http://sourceforge.net/tracker/index.php?func=detail&aid=1592353&group_id=22307&atid=374932

On 12/13/06, Michael Bayer [EMAIL PROTECTED] wrote:

 in fact it's almost definitely a bug in mysqldb - mysqldb should be
 detecting unicode instances on the bind parameter side and
 encoding based on that (otherwise doing nothing), and decoding into
 unicode instances at the result set level.  if it did that, then it
 would not conflict with the Unicode type on SQLAlchemy's side.  here is
 the source to the SA unicode type:

 class Unicode(TypeDecorator):
     impl = String
     def convert_bind_param(self, value, dialect):
         if value is not None and isinstance(value, unicode):
             return value.encode(dialect.encoding)
         else:
             return value
     def convert_result_value(self, value, dialect):
         if value is not None and not isinstance(value, unicode):
             return value.decode(dialect.encoding)
         else:
             return value

 as evidence of this, the above Unicode type works completely fine with
 pysqlite, which also accepts unicode bind params and returns all string
 values as unicodes.
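
The driver behaviour described above (text in, text out, with the driver handling the byte encoding) can be observed with the standard-library `sqlite3` module, the descendant of pysqlite. This is only a sketch under Python 3, where `str` corresponds to the `unicode` type in this thread; the table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT)")

# Bind a text value directly; no manual encode() before handing it to the driver.
conn.execute("INSERT INTO users VALUES (?, ?)", (1, "test \u00e7"))

# Read back both the value and the bytes actually stored.
first_name, stored_hex = conn.execute(
    "SELECT first_name, hex(first_name) FROM users WHERE id = 1"
).fetchone()
print(repr(first_name), stored_hex)
conn.close()
```

The stored bytes end in C3A7, a single UTF-8 encoding of the cedilla, and the value comes back as text without any decode step on the application side.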


 



-- 
http://jjinux.blogspot.com/




[sqlalchemy] Re: convert_unicode=True results in double encoding

2006-11-13 Thread Shannon -jj Behrens

On 11/12/06, Michael Bayer [EMAIL PROTECTED] wrote:
 since create_engine deals with class constructors, i went with this
 approach:

 def get_cls_kwargs(cls):
     """return the full set of legal kwargs for the given cls"""
     kw = []
     for c in cls.__mro__:
         cons = c.__init__
         if hasattr(cons, 'func_code'):
             for vn in cons.func_code.co_varnames:
                 if vn != 'self':
                     kw.append(vn)
     return kw

 so now you get these luxurious TypeErrors if you send any combination
 of invalid arguments:

Thanks!  You'll probably never hear about this again, but I bet this
will save many people hours of frustration. :)
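
For readers on current Python, the approach quoted above might look roughly like the sketch below. It swaps `func_code.co_varnames` for `inspect.signature`, so it is not the code SQLAlchemy uses; `check_kwargs`, `ExampleDialect`, and `ExamplePool` are hypothetical names for illustration.

```python
import inspect


def get_cls_kwargs(cls):
    """Collect the keyword names accepted by __init__ anywhere in cls's MRO."""
    names = set()
    for klass in cls.__mro__:
        init = klass.__dict__.get("__init__")
        if not inspect.isfunction(init):
            continue  # skip object.__init__ and other non-Python constructors
        for name, param in inspect.signature(init).parameters.items():
            if name != "self" and param.kind in (
                inspect.Parameter.POSITIONAL_OR_KEYWORD,
                inspect.Parameter.KEYWORD_ONLY,
            ):
                names.add(name)
    return names


def check_kwargs(kwargs, *classes):
    """Raise TypeError for any keyword that none of the classes accepts."""
    legal = set()
    for cls in classes:
        legal |= get_cls_kwargs(cls)
    unknown = sorted(set(kwargs) - legal)
    if unknown:
        raise TypeError("invalid keyword argument(s): %s" % ", ".join(unknown))


# Hypothetical stand-ins for a dialect class and a pool class.
class ExampleDialect:
    def __init__(self, encoding="utf-8", convert_unicode=False):
        self.encoding = encoding
        self.convert_unicode = convert_unicode


class ExamplePool:
    def __init__(self, pool_size=5):
        self.pool_size = pool_size


# Known keywords pass silently; an unknown one is rejected.
check_kwargs({"convert_unicode": True, "pool_size": 10}, ExampleDialect, ExamplePool)
try:
    check_kwargs({"use_unicode": True}, ExampleDialect, ExamplePool)
except TypeError as exc:
    print("rejected:", exc)
```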

Best Regards,
-jj

-- 
http://jjinux.blogspot.com/




[sqlalchemy] Re: convert_unicode=True results in double encoding

2006-11-07 Thread Shannon -jj Behrens

The following results in correct data going into and coming out of the
database, but the data in the database itself looks double encoded:

import MySQLdb


connection = MySQLdb.connect(host="fmapp03", user="foxmarks",
                             passwd='ChunkyBacon', db="users")
cursor = connection.cursor()
cursor.execute("""
    INSERT INTO users
    VALUES (12345678, 'jjtest1234', '[EMAIL PROTECTED]', 'pass', %s,
            'asdf', 'N/A', 'N/A', 0, NOW(), NOW())
""", ('\xc3\xa7',))
cursor.execute("SELECT * FROM users WHERE id = 12345678")
row = cursor.fetchone()
print `row`
connection.commit()

The following also results in correct data going into and out of the
database, and the data in the database itself is not double encoded:

import MySQLdb


connection = MySQLdb.connect(host="fmapp03", user="foxmarks",
                             passwd='ChunkyBacon', db="users",
                             charset='utf8')
cursor = connection.cursor()
cursor.execute("""
    INSERT INTO users
    VALUES (12345678, 'jjtest1234', '[EMAIL PROTECTED]', 'pass', %s,
            'asdf', 'N/A', 'N/A', 0, NOW(), NOW())
""", (u'\xe7',))
cursor.execute("SELECT * FROM users WHERE id = 12345678")
row = cursor.fetchone()
print `row`
connection.commit()

It looks like for the version of MySQLdb I'm using, 1.2.1p2, a lot of
this stuff has changed. If you don't let MySQLdb take care of encoding
and decoding, it ends up double encoding things in the database. This
must be a bug in MySQLdb. The clear way to work around the bug is to
let the driver take care of encoding and decoding instead of
SQLAlchemy.
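
One mechanism that would produce exactly these symptoms can be simulated in Python 3 without a database. It assumes the connection charset is left at Latin-1 while the column is stored as UTF-8; that is a guess consistent with the behaviour above, not something verified against MySQLdb's source, and the function names are hypothetical.

```python
def server_store(client_bytes, connection_charset):
    """Simulate the server converting client bytes into a UTF-8 column."""
    return client_bytes.decode(connection_charset).encode("utf-8")


def server_fetch(stored_bytes, connection_charset):
    """Simulate the server converting the UTF-8 column back for the client."""
    return stored_bytes.decode("utf-8").encode(connection_charset)


text = "\u00e7"
app_bytes = text.encode("utf-8")  # the application (or SQLAlchemy) encodes first

# Connection left at Latin-1: the bytes are re-encoded on the way in.
stored_wrong = server_store(app_bytes, "latin-1")
round_trip_wrong = server_fetch(stored_wrong, "latin-1").decode("utf-8")

# Connection declared as UTF-8: the bytes are stored unchanged.
stored_right = server_store(app_bytes, "utf-8")
round_trip_right = server_fetch(stored_right, "utf-8").decode("utf-8")

print(stored_wrong.hex(), round_trip_wrong == text)
print(stored_right.hex(), round_trip_right == text)
```

Both round trips return the original text, which is why the problem only shows up when another client looks at the stored bytes.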

Yuck,
-jj




[sqlalchemy] Re: convert_unicode=True results in double encoding

2006-11-07 Thread Shannon -jj Behrens

On 11/7/06, Michael Bayer [EMAIL PROTECTED] wrote:
 yeah, or use introspection to consume the args, i thought of that too.
 i guess we can do that.  of course i dont like having to go there but i
 guess not a big deal.

 as far as kwargs collisions, yeah, thats a potential issue too.  but
 the number of dialects/pools is not *that* varied, theyre generally
 pretty conservative with the kwargs.  if we broke out create_engine()
 to take in dialect, pool, etc., well id probably just create a new
 function for that first off so create_engine() can just remain...im not
 sure if i want to force users to be that exposed to the details as
 people might get a little intimidated by all that.

By the way, as if to prove my point, I had another problem.  I was
trying to pass a create_args keyword argument per the example
http://www.sqlalchemy.org/docs/dbengine.myt#dbengine_establishing_custom.
 SQLAlchemy didn't complain.  My code didn't even complain when I used
a cedilla (ç).  My code crashed when I used Japanese.  It turns out
that MySQLdb was defaulting to Latin-1 or something like that.
SQLAlchemy was ignoring my create_args keyword argument.  It turns out
that that example is wrong (should I file a bug?).  The documentation
above it is correct; the correct keyword argument is named
connect_args.
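
The cedilla-versus-Japanese difference is what one would expect if bind parameters end up encoded as Latin-1: the cedilla exists in that character set and Japanese characters do not. A minimal Python 3 sketch of just that encoding step (the sample strings are arbitrary, and that the driver performs this step is an assumption):

```python
samples = {"cedilla": "\u00e7", "japanese": "\u65e5\u672c\u8a9e"}

results = {}
for label, text in samples.items():
    try:
        text.encode("latin-1")  # what a Latin-1 connection would need to do
        results[label] = "ok"
    except UnicodeEncodeError:
        results[label] = "UnicodeEncodeError"

print(results)
```

This is why a test that only uses Western European characters can pass while the connection charset is still wrong.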

Best Regards,
-jj

-- 
http://jjinux.blogspot.com/




[sqlalchemy] Re: convert_unicode=True results in double encoding

2006-11-03 Thread Shannon -jj Behrens

On 11/3/06, Shannon -jj Behrens [EMAIL PROTECTED] wrote:
 I'm using convert_unicode=True.  Everything is fine as long as I'm the
 one reading and writing the data.  However, if I look at what's
 actually being stored in the database, it's like the data has been
 encoded twice.  If I switch to use_unicode=True, which I believe is
 MySQL specific, things work just fine and what's being stored in the
 database looks correct.

 I started looking through the SQLAlchemy code, and I came across this:

 def convert_bind_param(self, value, dialect):
     if (not dialect.convert_unicode or value is None
             or not isinstance(value, unicode)):
         return value
     else:
         return value.encode(dialect.encoding)
 def convert_result_value(self, value, dialect):
     if (not dialect.convert_unicode or value is None
             or isinstance(value, unicode)):
         return value
     else:
         return value.decode(dialect.encoding)

 The logic looks backwards.  It says, "If it's not a unicode object,
 return it.  Otherwise, encode it."  Later, "If it is a unicode object,
 return it.  Otherwise decode it."

 Am I correct that this is backwards?  If so, this is going to be
 *painful* to update all the databases out there!
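
One way to check the direction of the quoted logic is to port it and try it on both kinds of value. The sketch below is a Python 3 rendering (`str` for the old `unicode`, `bytes` for the old `str`) with the `dialect.convert_unicode` flag dropped and a module-level `ENCODING` standing in for `dialect.encoding`; those simplifications are assumptions made for illustration.

```python
ENCODING = "utf-8"  # stands in for dialect.encoding


def convert_bind_param(value):
    """Text goes to the driver as bytes; anything else passes through."""
    if value is None or not isinstance(value, str):
        return value
    return value.encode(ENCODING)


def convert_result_value(value):
    """Bytes from the driver become text; anything else passes through."""
    if value is None or not isinstance(value, bytes):
        return value
    return value.decode(ENCODING)


print(convert_bind_param("\u00e7"), convert_bind_param(b"\xc3\xa7"))
print(ascii(convert_result_value(b"\xc3\xa7")), ascii(convert_result_value("\u00e7")))
```

Read this way, the first branch of each function is an early return for values that need no conversion, so text is encoded once on the way in and bytes are decoded once on the way out. Note that the result-side guard here tests for `bytes` rather than "not text", a small change from the quoted code.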

Ok, MySQLdb doesn't have a mailing list, so I can't ask there.  Here
are some things I've learned:

Changing from convert_unicode=True to use_unicode=True doesn't do what
you'd expect.  SQLAlchemy is passing keyword arguments all over the
place, and use_unicode actually gets ignored.  <minor rant>I
personally think that you should be strict *somewhere* when you're
passing around keyword arguments.  I've been bitten in this way too
many times.  Unknown keyword arguments should result in
exceptions.</minor rant>

Anyway, as I said, I'm still a bit worried about the code above.
However, here's what's even scarier.  If I use the following code:

import MySQLdb


for use_unicode in (True, False):
    connection = MySQLdb.connect(host="localhost", user="user",
                                 passwd='dataase', db="users",
                                 use_unicode=use_unicode)
    cursor = connection.cursor()
    cursor.execute("select firstName from users where username='test'")
    row = cursor.fetchone()
    print "use_unicode:%s %r" % (use_unicode, row)

I get

use_unicode:True (u'test \xc3\xa7',)
use_unicode:False ('test \xc3\xa7',)

Notice the result is the same, but one has a unicode object and the
other doesn't.  Notice that it's \xc3\xa7 each time?  It shouldn't be.
 Consider:

>>> s = 'test \xc3\xa7'
>>> s.decode('utf-8')
u'test \xe7'

*It's creating a unicode object without actually doing any decoding!*
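
The u'test \xc3\xa7' value is what a Latin-1 decode of those bytes would give, since Latin-1 maps every byte to the code point with the same number. Whether MySQLdb actually decoded as Latin-1, or the stored data was already double encoded, is not established here; the Python 3 sketch below only shows the two decodings side by side.

```python
raw = b"test \xc3\xa7"  # the bytes shown in the use_unicode=False row

as_utf8 = raw.decode("utf-8")      # one character for the cedilla
as_latin1 = raw.decode("latin-1")  # one character per byte

print(ascii(as_utf8))
print(ascii(as_latin1))
```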

This is somewhere low level.  Like I said, this is lower level than
SQLAlchemy, but I don't have anywhere else to turn.

SQLAlchemy: 0.2.8
MySQLdb: 1.36.2.4
mysql client and server: 5.0.22
Ubuntu: 6.06

Help!
-jj

-- 
http://jjinux.blogspot.com/
