Hello, Django users and developers. For the past several days I have been trying to handle non-ASCII strings (such as Japanese text) from a MySQL database, on version 0.95 "post-magic-removal". Django loads string values into memory as raw byte strings and saves them the same way. So string data that I can see directly in the database as u'\u3042\u3044\u3046\u3048\u304a' (this is a "hiragana" sequence, like "ABC..." in English) appears after loading in Django like this: '\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a'
This happens because the bytes are a UTF-8 sequence that is never decoded. I want to treat the value as a unicode string, using "unicode()" or "decode()":

    >>> '\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a'.decode('utf-8')
    u'\u3042\u3044\u3046\u3048\u304a'

But Django loads the value directly into the attributes of the "models" object and treats it as a "CharField" string; it doesn't take care of string encoding. If possible, I want to propose something like a "UTF8StringField" for UTF-8 strings. It would decode the raw byte sequence as "utf-8", hold the value as a "unicode" string internally, and save it to the database as a "utf-8" byte sequence. I made a very simple patch to do this. Maybe I haven't read and understood every part of Django yet, so please excuse me. ;-) But if this feature is not implemented yet, please use this patch. Thanks.
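
To make the proposed conversion concrete, here is a minimal standalone sketch of the decode-on-load and encode-on-save idea. It is not the attached patch and not an existing Django API: the class name "UTF8StringField" is only the name suggested in this post, and the method names "to_python" and "to_db" are assumptions chosen for illustration. The sketch is written to run on a current Python, where byte strings are the "bytes" type.

```python
# Hypothetical sketch only: UTF8StringField, to_python and to_db are
# illustrative names, not part of Django and not the attached patch.

class UTF8StringField(object):
    """Converts between raw UTF-8 bytes (database side) and unicode (Python side)."""

    encoding = "utf-8"

    def to_python(self, raw):
        """Decode a value loaded from the database into a unicode string."""
        if raw is None:
            return None
        if isinstance(raw, bytes):
            return raw.decode(self.encoding)
        # Already a unicode string: leave it unchanged.
        return raw

    def to_db(self, value):
        """Encode a unicode string into UTF-8 bytes before saving."""
        if value is None:
            return None
        if isinstance(value, bytes):
            # Already raw bytes: assume they are UTF-8 and pass them through.
            return value
        return value.encode(self.encoding)


field = UTF8StringField()
raw = b"\xe3\x81\x82\xe3\x81\x84\xe3\x81\x86\xe3\x81\x88\xe3\x81\x8a"
text = field.to_python(raw)          # the hiragana sequence as unicode
round_trip = field.to_db(text)       # back to the original UTF-8 bytes
```

With this shape, model code would only ever see unicode strings, and the byte representation would stay confined to the database boundary.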