My preference is for option 2: convert file system paths to unicode and use
unicode internally as much as possible. This is consistent with what we
have been doing/recommending for years, even if it is at odds with Python's
default for 2.X. See for example:

https://code.djangoproject.com/ticket/9579

fixed by:

https://github.com/django/django/commit/dfa90aec1b

which converts the file system paths returned by Python to unicode for
future use.

#9579 was essentially the same problem we are again facing in #19357, only
the unicode literals change has made it way more widespread. I think we
should approach #19357 in a consistent fashion with recommendations we have
made ever since adding unicode support to Django: convert to/from unicode
at the edges, use unicode internally. True, this is somewhat at odds with
Python's 2.X behavior/default, but we found Python's behavior in 2.X to be
unworkable for practical unicode support so I'm not particularly concerned
with "going against" the way Python 2.X does things...Python 2.X is broken
here, in my opinion.
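
As a rough illustration of "convert at the edges", here is a minimal sketch.
The helper names path_to_text and text_to_path_bytes are hypothetical, made up
for this example; they are not Django APIs and not what the commit above
does verbatim. The sketch assumes sys.getfilesystemencoding() reports an
encoding that can actually decode the paths on the system.

```python
import sys


def path_to_text(path, encoding=None):
    """Decode a byte-string file system path to unicode (hypothetical helper).

    Paths that are already unicode are returned unchanged.
    """
    if isinstance(path, bytes):
        # Assumption: fall back to UTF-8 if Python reports no encoding.
        encoding = encoding or sys.getfilesystemencoding() or "utf-8"
        return path.decode(encoding)
    return path


def text_to_path_bytes(path, encoding=None):
    """Encode a unicode path back to bytes when leaving the application."""
    if isinstance(path, bytes):
        return path
    encoding = encoding or sys.getfilesystemencoding() or "utf-8"
    return path.encode(encoding)
```

Code in the middle would then only ever see the result of path_to_text(), and
would call text_to_path_bytes() only when handing a path to something that
requires bytes.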

It is true this approach may introduce regressions for people who do not
have their environment locale properly configured such that
sys.getfilesystemencoding() returns a value that can be used to decode their
file system paths. But these systems are already broken and it is just an
accident that they have not run afoul of the problem yet. We have been
noting the need for proper system configuration for years:

https://code.djangoproject.com/ticket/11030#comment:5
https://code.djangoproject.com/ticket/9696#comment:10
https://code.djangoproject.com/ticket/13550#comment:3
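
For anyone wanting to check their own setup, something along these lines may
help. It is only a sketch: can_decode_path is a hypothetical name invented
for this example, and the sample bytes are made up. It reports whether a
given byte-string path can be decoded with the encoding Python reports for
the file system.

```python
import sys


def can_decode_path(raw_path, encoding=None):
    """Return True if raw_path (bytes) decodes with the given encoding.

    When no encoding is passed, use the one Python reports for the file
    system; the "ascii" fallback is an assumption for the case where
    Python reports none.
    """
    encoding = encoding or sys.getfilesystemencoding() or "ascii"
    try:
        raw_path.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True
```

A path with non-ASCII bytes fails this check under an ASCII-only encoding and
passes once the locale is set so that the reported encoding matches the
bytes actually on disk (UTF-8, for example).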

The current documentation on this need, however, is still buried (though at
least it is no longer in the mod_python doc only) and makes it sound like
this is only necessary in some rare cases:

https://docs.djangoproject.com/en/dev/howto/deployment/wsgi/modwsgi/#if-you-get-a-unicodeencodeerror

That note should be moved to somewhere more prominent
(https://docs.djangoproject.com/en/dev/howto/deployment/ ?) and rewritten,
since the problem will now be more widespread and will not only crop up
during uploads of files with non-ASCII characters in their names. The
release notes should likely have an item noting this, assuming we go with
option 2 to fix it.

Karen
