Re: Unicode Error when Saving Django Model
Thanks. It was actually a combination of issues. The database was UTF8; I should have added to my original post that I could manually insert and retrieve UTF8 data.

The data we are pulling (we are migrating one system to a new one built on Django) is a bit of a nest of encoding issues, so things that may look like UTF8 may not be. I think my attempts to encode this data as UTF8 started the problem.

Thanks for the help and the general heads-up on encoding and Unicode with Django. I have read about it, but I understand it better each time I encounter a problem with it.

--Jim
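One way the stray \x92 could have crept in during a migration like this, sketched under assumptions: if the legacy system stored Windows-1252 (cp1252) bytes and they were decoded as ISO-8859-1, the cp1252 right single quotation mark (byte 0x92) turns into the control character U+0092. That the source data was cp1252 is a guess, not something the thread confirms, and `repair_c1` is a hypothetical helper name. The sketch uses Python 3 syntax, where `str` plays the role of Python 2's `unicode`.

```python
def repair_c1(text):
    """Re-read C1 control characters (U+0080..U+009F) as cp1252 bytes.

    Assumption: the text was cp1252 that got decoded as ISO-8859-1.
    """
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                # Byte is undefined in cp1252; leave the character untouched.
                pass
        out.append(ch)
    return "".join(out)


title = "select HP LaserJet\x92s"
fixed = repair_c1(title)
print(fixed)  # select HP LaserJet’s
```

Under that assumption the title reads "LaserJet’s" with a typographic apostrophe (U+2019). A repair like this is only safe on data known to have gone through that particular wrong decode; applied blindly it would alter legitimate control characters.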
Re: Unicode Error when Saving Django Model
Point taken, three times.

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django-us...@googlegroups.com.
To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: Unicode Error when Saving Django Model
On Mon, May 24, 2010 at 8:27 AM, Scott Gould wrote:

> > My database and all of its tables are UTF8 encoded with UTF8 collation
> > (DEFAULT CHARSET=utf8;)
> > The data I am inputting is unicode
> > (u'Save up to 25% on your online order of select HP LaserJet\x92s')
> >
> > But when I try to save this data I get an error
> > Incorrect string value: '\\xC2\\x92s' for column 'title' at row 1
> >
> > I assume I am missing something, but not sure what I am missing.
>
> Your string is a unicode string (u'...') but you have UTF-8 encoded
> text inside it.

No, that is just the way Python displays the repr of a unicode string. The value shown is a valid unicode string with the character \x92 in it. It is encoded to utf-8 as \xC2\x92 for storage in the database, and the database is reporting an error with that utf8-encoded value, likely because the table actually has a non-utf8 charset that has no mapping for unicode U+0092.

> Unicode is not UTF-8; UTF-8 is a way to represent
> unicode in ASCII. You should be able to fix it by either casting that
> string to str(),

Casting to str() would raise a UnicodeEncodeError, because the unicode character \x92 cannot be encoded in ASCII:

    >>> u
    u'LaserJet\x92'
    >>> type(u)
    <type 'unicode'>
    >>> str(u)
    Traceback (most recent call last):
    File "", line 1, in
    UnicodeEncodeError: 'ascii' codec can't encode character u'\x92' in position 8: ordinal not in range(128)

> or by having "real" unicode inside it (difficult to
> say which is better without knowing how you're obtaining that string
> to begin with).

It is real unicode as it is, though rather odd (it's a C1 control character, named "private use two").

Karen
--
http://tracey.org/kmt/
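The two claims above (U+0092 becomes the bytes C2 92 in UTF-8, and it cannot be encoded as ASCII) can be checked with a short sketch. It is written in Python 3 syntax, which is an assumption relative to the Python 2.5 session in the thread: `str` here corresponds to Python 2's `unicode`, and an explicit `.encode("ascii")` stands in for the implicit ASCII encoding that `str(u)` performed in Python 2.

```python
s = "LaserJet\x92"  # a single code point, U+0092, at index 8

# Encoding to UTF-8 turns U+0092 into the two bytes C2 92,
# which is what appears in the MySQL error message.
utf8_bytes = s.encode("utf-8")
print(utf8_bytes)  # b'LaserJet\xc2\x92'

# ASCII only covers code points below 128, so this encoding fails.
try:
    s.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc
    print(exc)
```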
Re: Unicode Error when Saving Django Model
On Sun, May 23, 2010 at 10:10 PM, vjimw wrote:

> I have been reading up on Unicode with Python and Django and I think I
> have my code set to use UTF8 data when saving or updating an object
> but I get an error on model.save()
>
> My database and all of its tables are UTF8 encoded with UTF8 collation
> (DEFAULT CHARSET=utf8;)
> The data I am inputting is unicode
> (u'Save up to 25% on your online order of select HP LaserJet\x92s')
>
> But when I try to save this data I get an error
> Incorrect string value: '\\xC2\\x92s' for column 'title' at row 1

This error implies that your MySQL table is not set up the way you think it is, with a charset of utf8. Given a table that actually has a utf8 charset:

    k...@lbox:~/software/web/playground$ mysql -p Play2
    Enter password:
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A

    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 5852
    Server version: 5.0.67-0ubuntu6.1 (Ubuntu)

    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

    mysql> show create table ttt_tag;
    +---------+------------------------------------------------------+
    | Table   | Create Table                                         |
    +---------+------------------------------------------------------+
    | ttt_tag | CREATE TABLE `ttt_tag` (
      `id` int(11) NOT NULL auto_increment,
      `name` varchar(88) NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 |
    +---------+------------------------------------------------------+
    1 row in set (0.00 sec)

I can create an object in Django using the odd unicode character your string includes (though I'm not sure what it is supposed to be -- based on its placement I'd guess it is supposed to be a registered trademark symbol, but that's not what you actually have):

    k...@lbox:~/software/web/playground$ python manage.py shell
    Python 2.5.2 (r252:60911, Jan 20 2010, 23:16:55)
    [GCC 4.3.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> from ttt.models import Tag
    >>> t = Tag.objects.create(name=u'HP LaserJet\x92s')
    >>> print t
    HP LaserJet s
    >>> quit()

So that works, though the character does not print as anything useful.

If I change the table to have a charset of latin1 (MySQL's default):

    mysql> drop table ttt_tag;
    Query OK, 0 rows affected (0.00 sec)
    mysql> create table ttt_tag (id int(11) not null auto_increment, name
    varchar(88) not null, primary key (id)) engine=myisam default charset
    latin1;
    Query OK, 0 rows affected (0.01 sec)

I can then recreate the error you report:

    >>> t = Tag.objects.create(name=u'HP LaserJet\x92s')
    Traceback (most recent call last):
    File "", line 1, in
    [snipped]
    File "/usr/lib/python2.5/warnings.py", line 102, in warn_explicit
    raise message
    Warning: Incorrect string value: '\xC2\x92s' for column 'name' at row 1

So I think one problem is that your table is not actually set up the way you think it is.

Another may be that your data is not really correct either. What you are showing that you have in your data is this character:

http://www.fileformat.info/info/unicode/char/0092/index.htm

and I suspect what you really want is either of these:

http://www.fileformat.info/info/unicode/char/2122/index.htm
http://www.fileformat.info/info/unicode/char/00ae/index.htm

Either of these would display better than what you have:

    >>> u1 = u'LaserJet\u2122'
    >>> print u1
    LaserJet™
    >>> u2 = u'LaserJet\xae'
    >>> print u2
    LaserJet®

Karen
--
http://tracey.org/kmt/
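The table check above can also be scripted. The following is a minimal sketch in Python 3 syntax, not anything posted in the thread: `default_charset` is a hypothetical helper that pulls the DEFAULT CHARSET clause out of the text returned by SHOW CREATE TABLE, and the sample string is abbreviated from the output shown earlier (the `(...)` stands in for the column definitions).

```python
import re


def default_charset(create_table_sql):
    """Return the table's DEFAULT CHARSET in lower case, or None if absent."""
    match = re.search(
        r"DEFAULT\s+CHARSET\s*=?\s*(\w+)", create_table_sql, re.IGNORECASE
    )
    return match.group(1).lower() if match else None


# Abbreviated sample of SHOW CREATE TABLE output.
sample = (
    "CREATE TABLE `ttt_tag` (...) "
    "ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=utf8"
)
print(default_charset(sample))  # utf8
```

In a Django project the statement text could presumably be fetched with `django.db.connection.cursor()` and `cursor.execute("SHOW CREATE TABLE ...")`, taking the second column of the returned row. Note that this only reads the table default; an individual column can carry its own character set, so a column such as `title` is worth checking in the full output as well.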
Re: Unicode Error when Saving Django Model
> My database and all of its tables are UTF8 encoded with UTF8 collation
> (DEFAULT CHARSET=utf8;)
> The data I am inputting is unicode
> (u'Save up to 25% on your online order of select HP LaserJet\x92s')
>
> But when I try to save this data I get an error
> Incorrect string value: '\\xC2\\x92s' for column 'title' at row 1
>
> I assume I am missing something, but not sure what I am missing.

Your string is a unicode string (u'...') but you have UTF-8 encoded text inside it. Unicode is not UTF-8; UTF-8 is a way to represent unicode in ASCII. You should be able to fix it by either casting that string to str(), or by having "real" unicode inside it (difficult to say which is better without knowing how you're obtaining that string to begin with).

Regards
Scott
Unicode Error when Saving Django Model
I have been reading up on Unicode with Python and Django and I think I have my code set to use UTF8 data when saving or updating an object, but I get an error on model.save()

My database and all of its tables are UTF8 encoded with UTF8 collation
(DEFAULT CHARSET=utf8;)

The data I am inputting is unicode
(u'Save up to 25% on your online order of select HP LaserJet\x92s')

But when I try to save this data I get an error
Incorrect string value: '\\xC2\\x92s' for column 'title' at row 1

I assume I am missing something, but not sure what I am missing.

Thanks!