On Tue, Nov 17, 2009 at 3:14 PM, Tom Evans <tevans...@googlemail.com> wrote:

> Hi all
>
> I'm encountering a difficult-to-solve unicode problem whilst saving data to
> the database. Worst of all, any attempt to reduce it to a simple test case,
> or reproduce it in the console, fails(!). This is on django 1.0.
>
> The process encountering the error is a simple daemon, run from a
> management command [1]. The process looks up a task [2] to run and executes
> it. After the task has finished executing, it updates the generated_content
> member on the model, either to contain any pertinent error messages if there
> was a failure, or to store rendered HTML if the task was successful.
>
> The problem occurs when the generated HTML contains particular unicode
> characters (in this case, right single quotation mark, \u2019), which for
> some reason prompts django or MySQLdb to try to encode it as ASCII.
> The unicode HTML comes from rendering a django template; here's the snippet
> that generates the HTML:
>
>       cdict = { ... } # left out; template renders correctly, so not important
>       ctxt = Context(cdict)
>       from django.template import loader
>       content = loader.render_to_string('the_template.html',
>                                         context_instance=ctxt)
>       self.task.generated_content = content
>
> This code is called from MigrationTask::execute() - this is in the
> (working) PerformMigration class - and is the last thing that happens before
> we call save() on the modified instance. Apart from the generated_content,
> the only other thing that changes on this model as a result of this code is
> the status attribute.
>
> When we do call save(), the following traceback is produced:
>
> Traceback (most recent call last):
>   File "/usr/local/www/django/ssosp/externals/identity_provider/tasks/management/commands/taskrunner.py", line 44, in handle
>     task.execute()
>   File "/usr/local/www/django/ssosp/externals/identity_provider/tasks/models.py", line 39, in execute
>     self.save()
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/base.py", line 307, in save
>     self.save_base(force_insert=force_insert, force_update=force_update)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/base.py", line 358, in save_base
>     rows = manager.filter(pk=pk_val)._update(values)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/query.py", line 429, in _update
>     return query.execute_sql(None)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/sql/subqueries.py", line 117, in execute_sql
>     cursor = super(UpdateQuery, self).execute_sql(result_type)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/sql/query.py", line 1700, in execute_sql
>     cursor.execute(sql, params)
>   File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/backends/mysql/base.py", line 83, in execute
>     return self.cursor.execute(query, args)
>   File "/usr/local/lib/python2.5/site-packages/MySQLdb/cursors.py", line 151, in execute
>     query = query % db.literal(args)
>   File "/usr/local/lib/python2.5/site-packages/MySQLdb/connections.py", line 247, in literal
>     return self.escape(o, self.encoders)
>   File "/usr/local/lib/python2.5/site-packages/MySQLdb/connections.py", line 180, in string_literal
>     return db.string_literal(obj)
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 1182: ordinal not in range(128)
>
> If I set a breakpoint where we generate the content, print out
> repr(content), copy and paste that into a django python shell and assign it
> to a task's generated_content property, it saves correctly.
>
> If I manually change content to u'\u2019' inside the debugger, it also
> saves correctly. It also works correctly for u'\u2019'*2048, just in case
> the size of the string matters.
>
> The database and all tables are set to UTF-8 in mysql. My locale is
> correctly set up in both cases (en_GB.UTF-8). I'm very confused as to why it
> is attempting to convert it to ascii :/
>
> Any hints/tips greatly appreciated.
>
> Cheers
>
> Tom
>
> [1] http://pastebin.com/m9e23563
> [2] http://pastebin.com/m564e1cd7
>
>
This appears to be some sort of issue between MySQLdb and django's
templating system.

This code generates the UnicodeEncodeError as shown above, when the model is
saved:

      from django.template import loader
      content = loader.render_to_string('the_template.html',
                                        context_instance=ctxt)
      self.task.generated_content = content

This code does not:

      from django.template import loader
      content = loader.render_to_string('the_template.html',
                                        context_instance=ctxt)
      self.task.generated_content = content[:]

I.e., taking a copy of the rendered string avoids MySQLdb barfing on it as
input (the traceback ends in MySQLdb's string_literal, not in the mysql
server). Bonus points if someone can explain what on earth it is that
django is doing to a unicode string that stops MySQLdb treating it as a
unicode string!
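
One thing content[:] could plausibly change is the exact type of the object.
The sketch below is only a guess at the mechanism, not something I have
confirmed in django or MySQLdb: it assumes the rendered value is an instance
of a unicode subclass, and that the encoder is picked by exact type. SafeText,
ENCODERS and literal() are hypothetical stand-ins, written for a current
Python rather than 2.5.

```python
# Hypothetical stand-ins: SafeText plays the role of a text subclass that a
# template engine might return; ENCODERS plays the role of a converter table
# keyed by exact type. Neither is taken from Django or MySQLdb.
text_type = type(u"")  # str on Python 3


class SafeText(text_type):
    """A text subclass carrying no extra data, only a different type."""


def encode_utf8(value):
    return value.encode("utf-8")


def encode_fallback(value):
    # Assumed fallback for unknown types: forces ASCII, like the failing call.
    return value.encode("ascii")


ENCODERS = {text_type: encode_utf8}


def literal(value):
    # Exact-type lookup: a subclass instance does not match its base class key.
    encoder = ENCODERS.get(type(value), encode_fallback)
    return encoder(value)


rendered = SafeText(u"\u2019")
copied = rendered[:]  # a full slice of a subclass instance is a plain text object

print(type(rendered).__name__, type(copied).__name__)
print(repr(literal(copied)))
try:
    literal(rendered)
except UnicodeEncodeError as exc:
    print("fallback failed:", exc)
```

If that guess is anywhere near right, comparing type(content) with
type(content[:]) in the debugger should show two different types.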

Here is a test case:

from django.db import models

class Content(models.Model):
    content = models.TextField()

>>> from tasks.models import Content
>>> from django.template import loader
>>> c = loader.render_to_string('uni.html')
>>> m = Content(content=c)
>>> m.save()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/base.py", line 307, in save
    self.save_base(force_insert=force_insert, force_update=force_update)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/base.py", line 379, in save_base
    result = manager._insert(values, return_id=update_pk)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/manager.py", line 138, in _insert
    return insert_query(self.model, values, **kwargs)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/query.py", line 888, in insert_query
    return query.execute_sql(return_id)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/sql/subqueries.py", line 308, in execute_sql
    cursor = super(InsertQuery, self).execute_sql(None)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/models/sql/query.py", line 1700, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/www/django/ssosp/root/lib/python2.5/site-packages/django/db/backends/mysql/base.py", line 83, in execute
    return self.cursor.execute(query, args)
  File "/usr/local/lib/python2.5/site-packages/MySQLdb/cursors.py", line 151, in execute
    query = query % db.literal(args)
  File "/usr/local/lib/python2.5/site-packages/MySQLdb/connections.py", line 247, in literal
    return self.escape(o, self.encoders)
  File "/usr/local/lib/python2.5/site-packages/MySQLdb/connections.py", line 180, in string_literal
    return db.string_literal(obj)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128)
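
For reference, the error itself is just what Python raises whenever this
character goes through the ascii codec, with no django or MySQLdb involved.
A minimal sketch (written for a current Python; the variable names are mine)
that also shows the reported position is simply the index of the offending
character in the string:

```python
quote = u"\u2019"  # RIGHT SINGLE QUOTATION MARK

errors = []
for text in (quote, u"abc" + quote):
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        # exc.start is the index of the first character that could not be encoded.
        errors.append((exc.encoding, exc.start, exc.reason))
        print(exc)

print(errors)
```

So "position 0" here and "position 1182" in the daemon's traceback only say
where the first \u2019 sits in the respective strings.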

The template 'uni.html' consists of a single character, \u2019 encoded in
UTF-8, followed by a newline. It looks like this in od -x:

$ cat uni.html | od -x
0000000      80e2    0a99
0000004
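
To double-check the dump, here is a small sketch that turns the two words
back into text. It assumes the machine is little-endian, so that od -x shows
each pair of bytes swapped; the variable names are mine.

```python
import struct

# The two 16-bit words printed by `od -x`, as shown in the dump.
words = [0x80E2, 0x0A99]

# od -x prints each word in host byte order; on a little-endian machine the
# low byte of each word is the byte that comes first in the file.
raw = struct.pack("<2H", *words)
text = raw.decode("utf-8")

print(repr(raw))
print(repr(text))
```

Under that assumption the file is the bytes e2 80 99 0a, i.e. the UTF-8
encoding of \u2019 plus the trailing newline.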

Cheers

Tom


