Hello,

Django 1.5 beta 1 contains a regression for users who install Django or their 
projects under non-ASCII paths: https://code.djangoproject.com/ticket/19357 
Unfortunately, the patch isn't going to be trivial. I'd like to have some 
feedback before making changes.


In order to add compatibility with Python 3, the first step was to remove all 
uses of the `u"..."` syntax, add `from __future__ import unicode_literals` in 
many modules, and use the `b"…"` syntax in the rare cases where a bytestring is 
really needed.

Unfortunately, after enabling unicode_literals, under Python 2, Django attempts 
to concatenate bytestrings and unicode, for instance:

    # django/utils/translation/trans_real.py, line 154
    apppath = os.path.join(os.path.dirname(app.__file__), 'locale')

This pattern occurs in several areas of Django: fixtures, static files, 
templates, translations, and so on. It's also very common in tests.

In the example above, when the first argument only contains ASCII characters, 
it's silently converted to unicode. When it contains non-ASCII characters, the 
implicit conversion fails with a UnicodeDecodeError. This explains why the 
problem wasn't detected earlier.
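
To make the failure mode concrete, here is a minimal sketch of the implicit coercion that Python 2 applies when a bytestring meets a unicode string: the bytestring is decoded with the default (ASCII) codec. The helper name `py2_style_join` and the sample paths are hypothetical, and the decoding is written out explicitly so that the snippet behaves the same way under Python 2 and Python 3.

```python
import posixpath


def py2_style_join(byte_dir, text_name):
    """Emulate Python 2 joining a bytestring path with a unicode component.

    Python 2 decodes the bytestring with the ASCII codec behind the scenes
    before concatenating; here that step is spelled out.
    """
    return posixpath.join(byte_dir.decode('ascii'), text_name)


# ASCII-only directory: the decoding succeeds and nobody notices.
ascii_result = py2_style_join(b'/srv/project', u'locale')

# Non-ASCII directory (UTF-8 encoded on disk): the same call blows up.
try:
    py2_style_join(u'/srv/caf\xe9'.encode('utf-8'), u'locale')
    non_ascii_failed = False
except UnicodeDecodeError:
    non_ascii_failed = True
```

Only installations under a non-ASCII path ever reach the second branch, which is consistent with the regression going unnoticed until now.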


Fundamentally, the unicode_literals patch had the side effect of switching the 
internal representation of filesystem paths from str to unicode under Python 2 
in many modules. (Under Python 3, everything is working fine!)

Since rolling back that patch isn't possible, I see three options.

1) Restore Django 1.4's behavior and switch filesystem path handling back to 
str. That means using native strings (str objects) under Python 2 and 3, like 
Python itself does. The example above would become:

    apppath = os.path.join(os.path.dirname(app.__file__), str('locale'))

That's what I've started doing on the ticket. This is (probably) the most 
backwards-compatible solution. The patch is large, but not that large compared 
to the unicode_literals patch itself. When we added support for Python 3 with a 
single codebase, we knew we'd have to use this pattern wherever we needed a 
native string.

2) Keep filesystem path handling in unicode. In general, it's good practice 
to work in unicode and convert at the edges ("unicode sandwich"). But in this 
case, it also means deviating from Python's behavior. This would be a major 
change in Django's APIs, and one whose consequences haven't been well 
anticipated. I haven't explored this solution.

3) Document this as a known limitation of Django 1.5, and postpone the fix to 
Django 1.6.
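
As a rough illustration of how options 1 and 2 differ, the sketch below writes both conversions as small helpers. The names `to_native_path`, `to_text_path`, `locale_dir_native` and `locale_dir_text` are hypothetical, not existing Django APIs, and using `sys.getfilesystemencoding()` as the codec is an assumption rather than a decision made on the ticket.

```python
import os
import sys

text_type = type(u'')  # unicode on Python 2, str on Python 3
is_py2 = sys.version_info[0] == 2
# Assumption: paths are encoded with the filesystem encoding.
fs_encoding = sys.getfilesystemencoding() or 'utf-8'


def to_native_path(path):
    """Option 1: return a native string (bytes on Python 2, text on Python 3)."""
    if is_py2 and isinstance(path, text_type):
        return path.encode(fs_encoding)
    if not is_py2 and isinstance(path, bytes):
        return path.decode(fs_encoding)
    return path


def to_text_path(path):
    """Option 2: always return text, decoding bytestrings at the edge."""
    if isinstance(path, bytes):
        return path.decode(fs_encoding)
    return path


def locale_dir_native(module_file):
    # Every component is a native string, so nothing is coerced implicitly.
    base = os.path.dirname(to_native_path(module_file))
    return os.path.join(base, to_native_path(u'locale'))


def locale_dir_text(module_file):
    # Every component is text; the only decoding happens in to_text_path().
    base = os.path.dirname(to_text_path(module_file))
    return os.path.join(base, u'locale')
```

Under Python 3 both helpers return the same `str`. Under Python 2 the first returns a bytestring and the second returns unicode, which is exactly the API difference between the two options.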


How do you think we should move forward?

Best regards,

-- 
Aymeric.

