thanks!! you just confirmed my empirical observations, which puts me very much at ease :) for versions, 1.2.2 mysqldb, and v 5.0.67 and 6.0.7 (alpha) mysql (community ed.)
thank again. On Dec 7, 8:52 am, Michael Bayer <[EMAIL PROTECTED]> wrote: > you should also be on MySQLdb 1.2.2. Using the Unicode type in > conjunction with charset=utf8&use_unicode=0 and always passing Python > unicode (u'') objects is the general recipe for unicode with MySQL. > All this means is that SQLA sends utf-8-encoded strings to MySQLdb, > MySQLdb does not try to encode them itself and makes MySQL aware the > data should be considered as utf-8. I'm not sure what version of > MySQL you're on or how older versions of that might get in the way. > > On Dec 6, 2008, at 1:26 PM, n00b wrote: > > > > > thanks for the quick reply. i kept trying with it and no have reached > > the utter state of confusion. > > the specification of Unicode versus String in the table def's coupled > > with actual str representation > > has my totally confused. here's a quick script, have a look at the > > mysql table itself to see character > > display: > > > #!/usr/bin/env python > > # -*- coding: utf-8 -*- > > > import os, sys > > import unicodedata > > > from sqlalchemy import * > > from sqlalchemy.orm import * > > > #set db > > import MySQLdb > > db = MySQLdb.connect(host='localhost', user='root', passwd='', > > db='xxx', use_unicode=True, charset='utf8') > > cur = db.cursor() > > cur.execute('SET NAMES utf8') > > cur.execute('SET CHARACTER SET utf8') > > cur.execute('SET character_set_connection=utf8') > > cur.execute('SET character_set_server=utf8') > > cur.execute('''SHOW VARIABLES LIKE 'char%'; ''') > > print cur.fetchall() > > > utf_repr = '\xc3\xab' > > hex_repr = '\xeb' > > > mysql_url = 'mysql://root:@localhost/xxx' > > connect_args = {'charset':'utf8', 'use_unicode':'0'} > > engine = create_engine(mysql_url, connect_args=connect_args) > > metadata = MetaData() > > > test_table = Table('encoding_test', metadata, > > Column(u'id', Integer, primary_key=True), > > Column(u'unicode', Integer), > > Column(u'u_hex', Unicode(10)), > > Column(u'u_utf', Unicode(10)), > > Column(u'u_str', Unicode(10)), > > Column(u's_hex', String(10)), > > Column(u's_utf', String(10)), > > Column(u's_str', String(10)) > > ) > > > class EncodingTest(object): pass > > > mapper(EncodingTest, test_table) > > > metadata.create_all(engine) > > Session = sessionmaker(bind=engine) > > > session = Session() > > et = EncodingTest() > > et.unicode = 1 > > et.u_str = u'ë' > > et.u_hex = u'\xeb' > > et.u_utf = u'\xc3\xab' > > et.s_str = u'ë' > > et.s_hex = u'\xeb' > > et.s_utf = u'\xc3\xab' > > session.add(et) > > session.commit() > > et = EncodingTest() > > et.unicode = 0 > > et.u_str = 'ë' > > et.u_hex = '\xeb' > > et.u_utf = '\xc3\xab' > > et.s_str = 'ë' > > et.s_hex = '\xeb' > > et.s_utf = '\xc3\xab' > > session.add(et) > > session.commit() > > session.close() > > > session = Session() > > results = session.query(EncodingTest).all() > > for result in results: > > print result.unicode > > print repr(result.u_hex), repr(result.u_utf), repr(result.u_str) > > print repr(result.s_hex), repr(result.s_utf), repr(result.s_str) > > print > > > in addition, i don't seem to be able to run the mysql settings (# set > > db) from SA. > > any insights are greatly appreciated. btw, the use_unciode, either in > > MySQLdb or SA, > > doesn't seem to have any effect on results. > > > thx > > > On Dec 5, 3:25 pm, Michael Bayer <[EMAIL PROTECTED]> wrote: > >> I'm not sure of the mechanics of what you're experiencing, but make > >> sure you use charset=utf8&use_unicode=0 with MySQL. > > >> On Dec 5, 2008, at 4:17 PM, n00b wrote: > > >>> greetings, > > >>> SA (0.5.0rc1) keeps returning utf hex in stead of utf-8 and in the > >>> process driving me batty. all the mysql setup is fine, the chars > >>> look > >>> good and are umlauting to goethe's delight. moreover, insert and > >>> select are working perfectly with the MySQLdb api on three different > >>> *nix systems, two servers, ... it works. > > >>> where things fall apart is on the retrieval side of SA; inserts are > >>> fine (using the config_args = {'charset':'utf8'} dict in the > >>> create_engine call). > > >>> for example, ë, the latin small letter e with diaeresis, is stored > >>> in > >>> mysql hex as C3 AB; using the MySQldb client, this is exactly what i > >>> get back: '\xc3\xab' (in the # -*- coding: UTF-8 -*- environment) no > >>> further codecs work required. SA, on the other hand, hands me back > >>> the > >>> utf-hex representation, '\xeb'. > > >>> there must be some setting that i'm missing that'll give the > >>> appropriate utf-8 representation at the SA (api) level. any ideas, > >>> suggestions? > > >>> thx > > >>> yes, i could do '\xeb'.encode('utf8) but it's not an option. we got > >>> too much data to deal with and MySQLdb is working perfectly well > >>> without the extra step. thx. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~----------~----~----~----~------~----~------~--~---