On Sun, Dec 2, 2012 at 2:43 AM, Aymeric Augustin <
aymeric.augus...@polytechnique.org> wrote:

> Hello,
>
> Django 1.5 beta 1 contains a regression for users who install Django or
> their projects under non-ASCII paths:
> https://code.djangoproject.com/ticket/19357 Unfortunately, the patch
> isn't going to be trivial. I'd like to have some feedback before making
> changes.
>
>
> In order to add compatibility with Python 3, the first step was to remove
> all uses of the `u"..."` syntax, add `from __future__ import
> unicode_literals` in many modules, and use the `b"…"` syntax in the rare
> cases where a bytestring is really needed.
>
> Unfortunately, after enabling unicode_literals, under Python 2, Django
> attempts to concatenate bytestrings and unicode, for instance:
>
>     # django/utils/translation/trans_real.py, line 154
>     apppath = os.path.join(os.path.dirname(app.__file__), 'locale')
>
> This pattern occurs in several areas of Django: fixtures, static files,
> templates and translations, etc. It's also very common in tests.
>
> In the example above, when the first argument only contains ASCII
> characters, it's silently converted to unicode. This explains why the
> problem wasn't detected earlier.
>
>
> Fundamentally, the unicode_literals patch had the side effect of switching
> the internal representation of filesystem paths from str to unicode under
> Python 2 in many modules. (Under Python 3, everything is working fine!)
>
> Since rolling back that patch isn't possible, I see three options.
>
> 1) Restore Django 1.4's behavior and switch filesystem paths handling back
> to str. That means using native strings (str objects) under Python 2 and 3,
> like Python itself does. The example above would become:
>
>     apppath = os.path.join(os.path.dirname(app.__file__), str('locale'))
>
> That's what I've started doing on the ticket. This is (probably) the most
> backwards-compatible solution. The patch is large — but not that large
> compared to the unicode_literals patch itself… When we added support for
> Python 3 with a single codebase, we knew we'd have to use this pattern
> wherever we needed a native string.
>
> 2) Keep filesystem paths handling in unicode. In general, it's a good
> practice to work in unicode and convert at the edges ("unicode sandwich").
> But in this case, it also means deviating from Python's behavior. This
> would be a major change in Django's APIs, and one whose consequences
> haven't been well anticipated. I haven't explored this solution.
>
> 3) Document this as a known limitation of Django 1.5, and postpone the fix
> to Django 1.6.
>
> (3) sounds like a non-starter to me.

I've had a look at the patches for (1) and (2), and to me, they look like
mirror images of the same patch -- it's just a matter of whether we convert
everything to bytes or unicode when we have the opportunity.
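
To make the "mirror image" point concrete, here is a rough sketch, not taken
from either patch. It is written for Python 3, so the Python 2 implicit
coercion is only emulated by an explicit ASCII decode. The helper names
(py2_style_join, join_as_text, join_as_bytes) are hypothetical, and posixpath
is used in place of os.path so the result does not depend on the platform.

```python
import posixpath
import sys


def py2_style_join(dirname_bytes, name_text):
    """Emulate Python 2 mixing a bytestring path with a unicode literal.

    Python 2 decodes the bytestring with the ASCII codec before joining,
    so this works for ASCII-only paths and raises UnicodeDecodeError
    for anything else.
    """
    return posixpath.join(dirname_bytes.decode("ascii"), name_text)


def join_as_text(dirname_bytes, name_text, encoding=None):
    """Option (2), sketched: decode explicitly and stay in unicode."""
    encoding = encoding or sys.getfilesystemencoding()
    return posixpath.join(dirname_bytes.decode(encoding), name_text)


def join_as_bytes(dirname_bytes, name_text, encoding=None):
    """Option (1) as it would look under Python 2, where the native
    str type is a bytestring: encode the literal, keep the path as bytes.
    """
    encoding = encoding or sys.getfilesystemencoding()
    return posixpath.join(dirname_bytes, name_text.encode(encoding))
```

With an ASCII-only directory all three helpers agree on the result. With a
directory such as "/srv/café" encoded as UTF-8, the first one fails on the
implicit ASCII decode, while the other two succeed and differ only in the
type they hand back.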

My immediate reaction is that (2) -- keeping everything in unicode until it
doesn't need to be -- looks like the better long term solution, but I'll
also admit that this is based purely upon history and an inspection of the
patch. In particular, I'm not completely up to speed with the Python 3
implications. In the notes for approach 2, you say that this approach would
mean "deviating from Python's behavior" -- can you summarise what the
expected Python behaviour here is (especially for Python 3, but summarising
Python 2 wouldn't hurt either)?

Russ %-)
