New submission from STINNER Victor <victor.stin...@haypocalc.com>:

While working on #9425 (support non-ASCII characters in the Python directory name with an ASCII locale), I wrote a patch for distutils.file_util: set encoding to utf-8 and errors to surrogateescape. See the patch with comments at: http://codereview.appspot.com/1874048/patch/1/9
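
To illustrate what that combination does, here is a minimal Python 3 sketch. It is not the patch itself: the helper names read_text() and write_text() and the file names are hypothetical, chosen only for this example. It shows that a file opened with encoding="utf-8" and errors="surrogateescape" can be read and written back without losing bytes that are not valid UTF-8.

```python
import os
import tempfile


def read_text(path):
    # Hypothetical helper: read a file as UTF-8. Undecodable bytes are
    # mapped to lone surrogates (U+DC80..U+DCFF) instead of raising.
    with open(path, encoding="utf-8", errors="surrogateescape") as f:
        return f.read()


def write_text(path, text):
    # Hypothetical helper: write text as UTF-8. Lone surrogates produced
    # by surrogateescape are turned back into the original bytes.
    with open(path, "w", encoding="utf-8", errors="surrogateescape") as f:
        f.write(text)


# Valid UTF-8 ("café ") followed by one byte that is invalid in UTF-8.
raw = b"caf\xc3\xa9 \xff"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "MANIFEST")
    dst = os.path.join(tmp, "MANIFEST.copy")
    with open(src, "wb") as f:
        f.write(raw)

    text = read_text(src)      # the \xff byte becomes '\udcff'
    write_text(dst, text)      # '\udcff' becomes the \xff byte again

    with open(dst, "rb") as f:
        round_tripped = f.read()

print(ascii(text))
print(round_tripped == raw)
```

Without errors="surrogateescape", the read would raise UnicodeDecodeError on the invalid byte.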
(The patch is not enough: *all* functions reading files should be patched as well.)

I discussed this with tarek, who told me that it is documented that distutils files have to be utf-8. I didn't find the documentation.

I checked read_manifest() in the sdist command: in both Python2 and Python3, it uses the open(name) syntax. This means that Python2 uses the binary API (bytes), whereas Python3 uses the text API (unicode characters) and relies on the open() (TextIOWrapper) heuristic to *guess* the file encoding.

I think that it would be better to specify the encoding in Python3, and maybe use the text API in Python2.

Anyway, before going further (working on patches), I would like the approval of the distutils maintainer(s).

----------
assignee: tarek
components: Distutils, Distutils2, Unicode
messages: 113552
nosy: haypo, merwok, tarek
priority: normal
severity: normal
status: open
title: distutils: set encoding to utf-8 for input and output files
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9561>
_______________________________________