[issue9598] untabify.py fails on files that contain non-ascii characters

Alexander Belopolsky Fri, 03 Sep 2010 15:50:38 -0700

Alexander Belopolsky <belopol...@users.sourceforge.net> added the comment:


> If untabify fails because a file has an incorrect encoding, is it really
> a problem in untabify? This is a developer’s tool, so getting a
> traceback here seems okay to me.

I disagree.  I think we should use this opportunity to clarify the preferred 
encoding for C language source files in Python and make untabify produce a 
meaningful diagnostic in case of encoding errors.
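
A minimal sketch of what such a diagnostic could look like, assuming the file is read as bytes and decoded explicitly. The function name `untabify_text` and its return convention are hypothetical and are not taken from the actual untabify.py script:

```python
def untabify_text(data, tabsize=8, encoding="utf-8"):
    """Expand tabs in raw file contents (hypothetical helper).

    Returns (new_text, None) on success, or (None, message) when the
    bytes cannot be decoded with the given encoding.
    """
    try:
        text = data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte; count the
        # newlines before it to report a 1-based line number.
        line = data.count(b"\n", 0, err.start) + 1
        message = "line %d: byte 0x%02x is not valid %s" % (
            line, data[err.start], encoding)
        return None, message
    return text.expandtabs(tabsize), None
```

A caller could print the message together with the file name and skip the file instead of letting the traceback propagate.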

As a matter of policy, I see two possibilities:

1. Restrict C sources to 7-bit ASCII.  (A pedantic reading of the ANSI C standard 
would probably suggest an even more restricted character set, but practically, I 
don't think 7-bit ASCII in C comments is likely to cause problems for any tools.)

2. Require UTF-8 encoding for non-ASCII characters.  Given that this is the 
default for Python source code, it is likely that tools that are used for 
Python development can handle UTF-8.

My vote is for #1.  Display of non-ASCII characters is still not universally 
supported and they are likely to be clobbered when diffs are copied in e-mails 
etc.
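
For illustration, option #1 could be enforced with a small check like the one below. This is only a sketch under the assumption that the source file is read as raw bytes; `find_non_ascii` is a hypothetical name, not an existing tool:

```python
def find_non_ascii(data):
    """Return (line, column, byte) for every byte outside 7-bit ASCII.

    Line and column numbers are 1-based; columns count bytes, not
    characters, since the encoding of the file is not assumed.
    """
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), 1):
        for col, byte in enumerate(line, 1):
            if byte > 0x7F:
                hits.append((lineno, col, byte))
    return hits
```

An empty list means the file already satisfies the 7-bit ASCII restriction.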

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9598>
_______________________________________