On Tue, Nov 17, 2009 at 3:14 PM, Tom Evans <tevans...@googlemail.com> wrote:
> Hi all
>
> I'm encountering a difficult-to-solve unicode problem whilst saving data
> to the database. Worst of all, any attempt to reduce it to a simple test
> case, or to reproduce it in the console, fails(!). This is on django 1.0.
>
> The process encountering the error is a simple daemon, run from a
> management command [1]. The process looks up a task [2] to run and
> executes it. After the task has finished executing, it updates the
> generated_content member on the model, either to contain any pertinent
> error messages if there was a failure, or to store rendered HTML if the
> task was successful.
>
> The problem occurs when the generated HTML contains particular unicode
> characters (in this case, right single quotation mark, \u2019), which for
> some reason prompts django or MySQLdb to decide to convert it to ascii.
> The unicode HTML comes from rendering a django template; here's the
> snippet that generates the HTML:
>
>     cdict = { ... } # left out; template renders correctly, so not important
>     ctxt = Context(cdict)
>     from django.template import loader
>     content = loader.render_to_string('the_template.html',
>                                       context_instance=ctxt)
>     self.task.generated_content = content
>
> This code is called from MigrationTask::execute() (this is in the
> working PerformMigration class) and is the last thing that happens
> before we call save() on the modified instance. Apart from
> generated_content, the only other thing that changes on this model as a
> result of this code is the status attribute.
>
> When we do call save(), the following traceback is produced:
>
> Traceback (most recent call last):
>   File "/usr/local/www/django/ssosp/externals/identity_provider/tasks/management/commands/taskrunner.py", line 44, in handle
>     task.execute()
>   File "/usr/local/www/django/ssosp/externals/identity_provider/tasks/models.py", line 39, in execute
>     self.save()
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/base.py", line 307, in save
>     self.save_base(force_insert=force_insert, force_update=force_update)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/base.py", line 358, in save_base
>     rows = manager.filter(pk=pk_val)._update(values)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/query.py", line 429, in _update
>     return query.execute_sql(None)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/sql/subqueries.py", line 117, in execute_sql
>     cursor = super(UpdateQuery, self).execute_sql(result_type)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/sql/query.py", line 1700, in execute_sql
>     cursor.execute(sql, params)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/backends/mysql/base.py", line 83, in execute
>     return self.cursor.execute(query, args)
>   File "/usr/local/lib/python2.5/site-packages/MySQLdb/cursors.py", line 151, in execute
>     query = query % db.literal(args)
>   File "/usr/local/lib/python2.5/site-packages/MySQLdb/connections.py", line 247, in literal
>     return self.escape(o, self.encoders)
>   File "/usr/local/lib/python2.5/site-packages/MySQLdb/connections.py", line 180, in string_literal
>     return db.string_literal(obj)
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 1182: ordinal not in range(128)
>
> If I set a breakpoint where we generate the content, print out
> repr(content), copy and paste that into a django python shell and assign
> it to a task's generated_content property, it saves correctly.
>
> If I manually change content to u'\u2019' inside the debugger, it also
> saves correctly. It also works correctly for u'\u2019'*2048, just in
> case the size of the string matters.
>
> The database and all tables are set to UTF-8 in mysql. My locale is
> correctly set up in both cases (en_GB.UTF-8). I'm very confused as to why
> it is attempting to convert it to ascii :/
>
> Any hints/tips greatly appreciated.
>
> Cheers
>
> Tom
>
> [1] http://pastebin.com/m9e23563
> [2] http://pastebin.com/m564e1cd7

This appears to be some sort of issue between MySQLdb and django's
templating system.

This code generates the UnicodeEncodeError shown above when the model is
saved:

    from django.template import loader
    content = loader.render_to_string('the_template.html',
                                      context_instance=ctxt)
    self.task.generated_content = content

This code does not:

    from django.template import loader
    content = loader.render_to_string('the_template.html',
                                      context_instance=ctxt)
    self.task.generated_content = content[:]

i.e., taking a copy of the rendered string avoids mysql barfing on it as
input. Bonus points if someone can explain what on earth django is doing
to a unicode string that stops mysql treating it as a unicode string!
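One possible explanation, offered as an assumption rather than a confirmed diagnosis: the rendered value may be an instance of a unicode subclass rather than plain unicode, and the database layer may pick its encoder by exact type, falling back to a byte-string path that implies ASCII. Slicing a string subclass returns the base type, which would explain why content[:] behaves differently. The sketch below only models that idea in modern Python; SafeText, ENCODERS, encode_text, encode_fallback and literal are all hypothetical names, not the real Django or MySQLdb code.

```python
# Illustrative model only: SafeText and the encoder table are hypothetical
# stand-ins, not the actual Django or MySQLdb implementation.

class SafeText(str):
    """Stand-in for a string subclass that a template render might return."""


def encode_text(value):
    # Encoder registered for the plain text type; UTF-8 can represent U+2019.
    return value.encode("utf-8")


def encode_fallback(value):
    # Fallback used when no encoder matches the type; assumes ASCII-only input.
    return value.encode("ascii")


# Encoders keyed by exact type, so a subclass does not match its base class.
ENCODERS = {str: encode_text}


def literal(value):
    """Pick an encoder by exact type, falling back when none is registered."""
    encoder = ENCODERS.get(type(value), encode_fallback)
    return encoder(value)


rendered = SafeText("\u2019")
copied = rendered[:]  # slicing a str subclass yields a plain str

print(type(rendered).__name__)  # SafeText
print(type(copied).__name__)    # str
print(literal(copied))          # b'\xe2\x80\x99'

try:
    literal(rendered)
except UnicodeEncodeError as exc:
    print("fallback encoder failed:", exc.reason)
```

Within this model, registering the subclass explicitly (ENCODERS[SafeText] = ENCODERS[str]) makes both values encode the same way. Whether the real stack behaves like this would need checking against the installed Django and MySQLdb versions.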
Here is a test case:

    class Content(models.Model):
        content = models.TextField()

>>> from tasks.models import Content
>>> from django.template import loader
>>> c = loader.render_to_string('uni.html')
>>> m=Content(content=c)
>>> m.save()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/base.py", line 307, in save
    self.save_base(force_insert=force_insert, force_update=force_update)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/base.py", line 379, in save_base
    result = manager._insert(values, return_id=update_pk)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/manager.py", line 138, in _insert
    return insert_query(self.model, values, **kwargs)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/query.py", line 888, in insert_query
    return query.execute_sql(return_id)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/sql/subqueries.py", line 308, in execute_sql
    cursor = super(InsertQuery, self).execute_sql(None)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/sql/query.py", line 1700, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/backends/mysql/base.py", line 83, in execute
    return self.cursor.execute(query, args)
  File "/usr/local/lib/python2.5/site-packages/MySQLdb/cursors.py", line 151, in execute
    query = query % db.literal(args)
  File "/usr/local/lib/python2.5/site-packages/MySQLdb/connections.py", line 247, in literal
    return self.escape(o, self.encoders)
  File "/usr/local/lib/python2.5/site-packages/MySQLdb/connections.py", line 180, in string_literal
    return db.string_literal(obj)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128)

The template 'uni.html' consists of a single character, \u2019, encoded
in UTF-8. It looks like this in od -x:

    $ cat uni.html | od -x
    0000000 80e2 0a99
    0000004

Cheers

Tom
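For anyone checking the od -x dump of uni.html: the words 80e2 0a99 are consistent with the bytes e2 80 99 0a, i.e. U+2019 in UTF-8 followed by a newline, if od is grouping bytes into 16-bit little-endian words. The host byte order is an assumption here, and od_x_words is a hypothetical helper written only to illustrate the grouping.

```python
# U+2019 (right single quotation mark) plus the trailing newline in the file.
data = "\u2019\n".encode("utf-8")
print(data.hex())  # e280990a


def od_x_words(raw):
    """Group bytes into 16-bit little-endian words, hex formatted.

    Hypothetical helper mimicking how `od -x` would display bytes on a
    little-endian host (an assumption about the machine in question).
    """
    words = []
    for i in range(0, len(raw), 2):
        # Pad a trailing odd byte with a zero byte.
        pair = raw[i:i + 2].ljust(2, b"\x00")
        words.append("%04x" % int.from_bytes(pair, "little"))
    return words


print(od_x_words(data))  # ['80e2', '0a99']
```

So the file on disk looks like valid UTF-8 for the single character, which suggests the template file itself is not the source of the error.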