Ray.Allen <ysj....@gmail.com> added the comment:

This is the problem with the tabnanny module: it always reads the .py source 
file as platform-dependent encoded text, that is, it opens the file with the 
builtin open() function and no encoding argument. It never parses the encoding 
cookie at the beginning of the source file! So if a Python source file contains 
characters that the platform-dependent encoding cannot decode, tabnanny fails 
when checking that source file. This affects not only heapq.py but several 
other standard library modules as well.
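
For reference, this is roughly what the current code does (my paraphrase, 
assuming "path" names the file being checked, not the exact tabnanny source):

    import tokenize

    # Open in text mode with no encoding argument, so the platform
    # default encoding is used and the coding cookie is ignored.
    f = open(path)
    tokens = tokenize.generate_tokens(f.readline)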

That platform-dependent encoding is determined in the following order:
1. os.device_encoding(fd)
2. locale.getpreferredencoding()
3. 'ascii'
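
The following sketch shows how I understand that default is picked when open() 
gets no encoding argument (my own approximation of the text-mode behaviour, not 
code taken from the stdlib):

    import locale
    import os

    def default_text_encoding(fd):
        # 1. the device encoding of the descriptor (e.g. a console code page)
        enc = os.device_encoding(fd)
        if enc is None:
            # 2. the locale's preferred encoding
            enc = locale.getpreferredencoding(False)
        if enc is None:
            # 3. last-resort fallback
            enc = 'ascii'
        return enc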

I wonder why tabnanny works this way. Is this the intended behaviour?  On my 
platform, if I use tabnanny to check a source file that contains some Chinese 
characters and is encoded in 'gbk', a UnicodeDecodeError is raised.
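
For example (hypothetical file name, just to illustrate the failure):

    import tabnanny

    # gbk_example.py is assumed to start with '# -*- coding: gbk -*-'
    # and to contain Chinese characters. On a platform whose default
    # encoding cannot decode those bytes, the check dies with a
    # UnicodeDecodeError instead of reporting whitespace problems.
    tabnanny.check('gbk_example.py')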

If this is not the intended behaviour, then I guess that to fix this problem 
we have to change the way tabnanny reads the source file, just like the Python 
compiler does: first open the file in "rb" mode, then detect the encoding with 
tokenize.detect_encoding(), then reopen the source file in text mode with the 
detected encoding.
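
A minimal sketch of what I have in mind (the helper name is mine, not an 
existing API):

    import tokenize

    def open_python_source(path):
        # First pass: read raw bytes and let tokenize find the BOM or
        # the coding cookie, the same way the compiler would.
        with open(path, 'rb') as f:
            encoding, _ = tokenize.detect_encoding(f.readline)
        # Second pass: reopen in text mode with the detected encoding.
        return open(path, encoding=encoding)

tabnanny.check() could then tokenize the file object returned by such a helper 
instead of a plain open(path).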

----------
nosy: +ysj.ray

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8774>
_______________________________________