You should also be on MySQLdb 1.2.2.  The general recipe for unicode
with MySQL is to use the Unicode type together with
charset=utf8&use_unicode=0, and to always pass Python unicode (u'')
objects.  All this means is that SQLA sends utf-8-encoded strings to
MySQLdb; MySQLdb does not try to encode them itself, and it tells MySQL
that the data should be treated as utf-8.  I'm not sure what version of
MySQL you're on, or how older versions might get in the way.
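
To make the two representations concrete, here is a minimal sketch with no
database involved. It assumes the URL query-string form of the settings
(host and database name are taken from the script quoted below), and it only
illustrates how the unicode object u'\xeb' relates to the utf-8 bytes
'\xc3\xab'; it is not a description of what any particular driver version
returns.

```python
# -*- coding: utf-8 -*-
# Sketch only: runs on Python 2.6+ or 3.3+, no database required.

# Assumed URL form of the recipe; this string would be passed to
# create_engine(). Host and database name come from the quoted script.
mysql_url = 'mysql://root:@localhost/xxx?charset=utf8&use_unicode=0'

# A unicode object holds code points: e-with-diaeresis is U+00EB.
e_umlaut = u'\xeb'

# Encoding it to utf-8 yields the two bytes MySQL stores (C3 AB).
utf8_bytes = e_umlaut.encode('utf-8')
assert utf8_bytes == b'\xc3\xab'
assert utf8_bytes.decode('utf-8') == e_umlaut

# u'\xc3\xab' is a different value: two code points (U+00C3, U+00AB),
# which encode to four utf-8 bytes rather than two.
mistaken = u'\xc3\xab'
mistaken_bytes = mistaken.encode('utf-8')
assert len(mistaken) == 2
assert mistaken_bytes == b'\xc3\x83\xc2\xab'
```

So u'\xeb' coming back from a Unicode column and '\xc3\xab' coming back
from raw MySQLdb are the same character, seen once as a unicode object and
once as utf-8 bytes.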

On Dec 6, 2008, at 1:26 PM, n00b wrote:

>
> thanks for the quick reply. i kept trying with it and now have reached
> the utter state of confusion.
> the specification of Unicode versus String in the table def's coupled
> with actual str representation
> has me totally confused. here's a quick script, have a look at the
> mysql table itself to see character
> display:
>
> #!/usr/bin/env python
> # -*- coding: utf-8 -*-
>
> import os, sys
> import unicodedata
>
> from sqlalchemy import *
> from sqlalchemy.orm import *
>
> #set db
> import MySQLdb
> db = MySQLdb.connect(host='localhost', user='root', passwd='',
> db='xxx', use_unicode=True, charset='utf8')
> cur = db.cursor()
> cur.execute('SET NAMES utf8')
> cur.execute('SET CHARACTER SET utf8')
> cur.execute('SET character_set_connection=utf8')
> cur.execute('SET character_set_server=utf8')
> cur.execute('''SHOW VARIABLES LIKE 'char%'; ''')
> print cur.fetchall()
>
> utf_repr = '\xc3\xab'
> hex_repr = '\xeb'
>
> mysql_url = 'mysql://root:@localhost/xxx'
> # use_unicode must be the integer 0: the string '0' is truthy in Python
> connect_args = {'charset':'utf8', 'use_unicode':0}
> engine = create_engine(mysql_url, connect_args=connect_args)
> metadata = MetaData()
>
>
> test_table = Table('encoding_test', metadata,
>    Column(u'id', Integer, primary_key=True),
>    Column(u'unicode', Integer),
>    Column(u'u_hex', Unicode(10)),
>    Column(u'u_utf', Unicode(10)),
>    Column(u'u_str', Unicode(10)),
>    Column(u's_hex', String(10)),
>    Column(u's_utf', String(10)),
>    Column(u's_str', String(10))
> )
>
> class EncodingTest(object): pass
>
> mapper(EncodingTest, test_table)
>
> metadata.create_all(engine)
> Session = sessionmaker(bind=engine)
>
> session = Session()
> et = EncodingTest()
> et.unicode = 1
> et.u_str = u'ë'
> et.u_hex = u'\xeb'
> et.u_utf = u'\xc3\xab'
> et.s_str = u'ë'
> et.s_hex = u'\xeb'
> et.s_utf = u'\xc3\xab'
> session.add(et)
> session.commit()
> et = EncodingTest()
> et.unicode = 0
> et.u_str = 'ë'
> et.u_hex = '\xeb'
> et.u_utf = '\xc3\xab'
> et.s_str = 'ë'
> et.s_hex = '\xeb'
> et.s_utf = '\xc3\xab'
> session.add(et)
> session.commit()
> session.close()
>
> session = Session()
> results = session.query(EncodingTest).all()
> for result in results:
>    print result.unicode
>    print repr(result.u_hex), repr(result.u_utf), repr(result.u_str)
>    print repr(result.s_hex), repr(result.s_utf), repr(result.s_str)
>    print
>
> in addition, i don't seem to be able to run the mysql settings (# set
> db) from SA.
> any insights are greatly appreciated. btw, the use_unicode, either in
> MySQLdb or SA,
> doesn't seem to have any effect on results.
>
> thx
>
> On Dec 5, 3:25 pm, Michael Bayer <[EMAIL PROTECTED]> wrote:
>> I'm not sure of the mechanics of what you're experiencing, but make
>> sure you use charset=utf8&use_unicode=0 with MySQL.
>>
>> On Dec 5, 2008, at 4:17 PM, n00b wrote:
>>
>>
>>
>>> greetings,
>>
>>> SA (0.5.0rc1) keeps returning utf hex instead of utf-8 and in the
>>> process driving me batty.  all the mysql setup is fine, the chars look
>>> good and are umlauting to goethe's delight. moreover, insert and
>>> select are working perfectly with the MySQLdb api on three different
>>> *nix systems, two servers, ... it works.
>>
>>> where things fall apart is on the retrieval side of SA; inserts are
>>> fine (using the config_args = {'charset':'utf8'} dict in the
>>> create_engine call).
>>
>>> for example, ë, the latin small letter e with diaeresis, is stored in
>>> mysql hex as C3 AB; using the MySQLdb client, this is exactly what i
>>> get back: '\xc3\xab' (in the # -*- coding: UTF-8 -*- environment) no
>>> further codecs work required. SA, on the other hand, hands me back the
>>> utf-hex representation, '\xeb'.
>>
>>> there must be some setting that i'm missing that'll give the
>>> appropriate utf-8 representation at the SA (api) level. any ideas,
>>> suggestions?
>>
>>> thx
>>
>>> yes, i could do  '\xeb'.encode('utf8') but it's not an option. we got
>>> too much data to deal with and MySQLdb is working perfectly well
>>> without the extra step. thx.
> >

