On Mon, May 24, 2010 at 8:27 AM, Scott Gould <zinck...@gmail.com> wrote:
> > My database and all of its tables are UTF8 encoded with UTF8 collation
> > (DEFAULT CHARSET=utf8;)
> > The data I am inputting is unicode
> > (u'Save up to 25% on your online order of select HP LaserJet\x92s')
> > <type 'unicode'>
> >
> > But when I try to save this data I get an error
> > Incorrect string value: '\\xC2\\x92s' for column 'title' at row 1
> >
> > I assume I am missing something, but not sure what I am missing.
>
> Your string is a unicode string (u'...') but you have UTF-8 encoded
> text inside it.

No, that is just how Python's repr displays a unicode string. The value shown is a valid unicode string containing the character \x92 (U+0092). That character is encoded to UTF-8 as \xC2\x92 for storage in the database, and the database is reporting an error with that UTF-8 encoded value, likely because the table actually has a non-utf8 charset that has no mapping for U+0092.

> Unicode is not UTF-8; UTF-8 is a way to represent
> unicode in ASCII. You should be able to fix it by either casting that
> string to str(),

Casting to str() would raise a UnicodeEncodeError, because the unicode character \x92 cannot be encoded in ASCII:

>>> u
u'LaserJet\x92'
>>> type(u)
<type 'unicode'>
>>> str(u)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\x92' in position 8: ordinal not in range(128)

> or by having "real" unicode inside it (difficult to
> say which is better without knowing how you're obtaining that string
> to begin with).

It is real unicode as it is, though rather odd (U+0092 is a C1 control code, "private use two").

Karen
--
http://tracey.org/kmt/
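
For anyone who wants to check the encoding steps described above, here is a minimal sketch that should run on both Python 2.7 and 3.x. The helper name `can_encode` is made up for this illustration; it is not part of Python or Django. The snippet only shows what Python does with the string and says nothing about how the MySQL table is actually configured.

```python
# The tail of the title from the original report, as a unicode string.
title = u'LaserJet\x92s'

# Index 8 holds a single code point, U+0092.
code_point = ord(title[8])

# Encoding to UTF-8 turns that one code point into two bytes, C2 92.
utf8_bytes = title.encode('utf-8')


def can_encode(text, codec):
    """Return True if every character in text has a mapping in codec.

    Hypothetical helper for this example only.
    """
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(hex(code_point))
print(repr(utf8_bytes))
print(can_encode(title, 'utf-8'))
print(can_encode(title, 'ascii'))
```

The ASCII failure here is the same one that str(u) runs into on Python 2. The \xC2\x92 pair in the MySQL error message matches the two UTF-8 bytes produced for U+0092.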