New submission from STINNER Victor <victor.stin...@haypocalc.com>:

While working on #9425 (support non-ascii characters in python directory name 
with ascii locale), I wrote a patch for distutils.file_util(): set encoding to 
utf-8 and errors to surrogateescape. See the patch with comments at:
http://codereview.appspot.com/1874048/patch/1/9

(the patch is not enough, it should also patch *all* functions reading files)

I discussed this with tarek, who told me that it is documented that distutils 
files have to be utf-8. I didn't find the documentation. I checked read_manifest() 
in sdist command: in Python2 and Python3, it uses open(name) syntax. It means 
that Python2 uses the binary API (bytes), whereas Python3 uses the text API 
(unicode characters) and Python3 relies on open() (TextIOWrapper) heuristic to 
*guess* the file encoding.

I think that it will be better to specify the encoding in Python3, and maybe 
use the text API in Python2.
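
To illustrate the idea, here is a minimal sketch (not the actual patch) of 
reading and writing a file with an explicit encoding in Python 3. The helper 
names read_text_utf8() and write_text_utf8() are hypothetical and are not part 
of distutils. The sketch assumes that bytes which are not valid utf-8 should 
survive a read/write round trip, which is what errors="surrogateescape" 
provides.

```python
import os
import tempfile


def read_text_utf8(path):
    """Hypothetical helper: read a file as utf-8 instead of relying on the
    default encoding that open() picks when no encoding is given."""
    with open(path, "r", encoding="utf-8", errors="surrogateescape") as f:
        return f.read()


def write_text_utf8(path, text):
    """Hypothetical helper: write text as utf-8; surrogate escapes produced
    by read_text_utf8() are turned back into the original bytes."""
    with open(path, "w", encoding="utf-8", errors="surrogateescape") as f:
        f.write(text)


# Create a file holding valid utf-8 ("caf\xc3\xa9") followed by a byte
# (0xff) that can never appear in utf-8.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(b"caf\xc3\xa9 \xff")

text = read_text_utf8(path)      # 0xff is decoded as the lone surrogate U+DCFF
write_text_utf8(path, text)      # U+DCFF is encoded back to the byte 0xff

with open(path, "rb") as f:
    roundtrip = f.read()
os.remove(path)
```

With the encoding given explicitly, the result does not depend on the locale 
in effect when the file is opened.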

Anyway, before going further (working on patches), I would like the approval 
of the distutils maintainer(s).

----------
assignee: tarek
components: Distutils, Distutils2, Unicode
messages: 113552
nosy: haypo, merwok, tarek
priority: normal
severity: normal
status: open
title: distutils: set encoding to utf-8 for input and output files
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9561>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
