My preference is for option 2: convert file system paths to unicode and use unicode internally as much as possible. This is consistent with what we have been doing/recommending for years, even if it is at odds with Python's default for 2.X. See for example:
https://code.djangoproject.com/ticket/9579
fixed by:
https://github.com/django/django/commit/dfa90aec1b
which converts the file system paths returned by Python to unicode for future use.

#9579 was essentially the same problem we are facing again in #19357, only the unicode literals change has made it far more widespread. I think we should approach #19357 consistently with the recommendations we have made ever since adding unicode support to Django: convert to/from unicode at the edges, use unicode internally. True, this is somewhat at odds with Python's 2.X behavior/default, but we found Python's behavior in 2.X to be unworkable for practical unicode support, so I'm not particularly concerned with "going against" the way Python 2.X does things... Python 2.X is broken here, in my opinion.

It is true this approach may introduce regressions for people who do not have their environment locale properly configured such that sys.getfilesystemencoding() returns a value that can be used to decode their file system paths. But these systems are already broken, and it is just an accident that they have not run afoul of the problem yet. We have been noting the need for proper system configuration for years:

https://code.djangoproject.com/ticket/11030#comment:5
https://code.djangoproject.com/ticket/9696#comment:10
https://code.djangoproject.com/ticket/13550#comment:3

The current documentation on this need, however, is still buried (though at least it is no longer in the mod_python doc only) and makes it sound like this is only necessary in some rare cases:

https://docs.djangoproject.com/en/dev/howto/deployment/wsgi/modwsgi/#if-you-get-a-unicodeencodeerror

That note should be moved somewhere more prominent ( https://docs.djangoproject.com/en/dev/howto/deployment/ ?) and rewritten, since the problem will now be more widespread and will not only crop up during uploads of files with non-ASCII characters in their names.
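To make "convert at the edges" concrete, here is a rough sketch of the kind of helper I have in mind. It is not the code from the commit above; the function names are made up for illustration, and falling back to UTF-8 when Python reports no file system encoding is my own assumption rather than anything we have agreed on.

```python
import sys


def path_to_unicode(path):
    """Hypothetical helper: decode a byte-string path coming in from the OS.

    Text paths are returned unchanged, so calling this twice is harmless.
    """
    if isinstance(path, bytes):
        # Assumption: fall back to UTF-8 if no file system encoding is reported.
        encoding = sys.getfilesystemencoding() or 'utf-8'
        return path.decode(encoding)
    return path


def path_to_bytes(path):
    """Hypothetical helper: encode a text path just before handing it to the OS."""
    if isinstance(path, bytes):
        return path
    # Same assumed fallback as in path_to_unicode.
    encoding = sys.getfilesystemencoding() or 'utf-8'
    return path.encode(encoding)
```

On a system whose locale is misconfigured, the decode step is exactly where a path containing non-ASCII characters would raise UnicodeDecodeError, which is the regression risk described above.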
Likely the release notes should have an item noting this, assuming we go with option 2 to fix it.

Karen