Martin v. Löwis <mar...@v.loewis.de> added the comment:

>> 7-zip encodes "à" (U+00e0) as 0x85 (1 byte), and "é" (U+00e9) as 0x82 (1
>> byte). I don't know this encoding.
>
> That's an old DOS code page used in Europe: CP850
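The byte values quoted above can be checked with Python's built-in codecs. What follows is a minimal sketch, not part of the original comment: the helper name `encoded_bytes` and the sample string are made up for illustration, and it only assumes the standard `cp850`, `cp1252` and `utf-8` codecs that ship with CPython.

```python
# Illustrative check only; `encoded_bytes` and SAMPLE are hypothetical names.
SAMPLE = "\u00e0\u00e9"  # "à" (U+00E0) followed by "é" (U+00E9)


def encoded_bytes(text, encoding):
    """Return the bytes of `text` in `encoding` as space-separated hex."""
    return " ".join(f"{byte:02x}" for byte in text.encode(encoding))


if __name__ == "__main__":
    # Compare the DOS/OEM code page, the Windows ANSI code page, and UTF-8.
    for name in ("cp850", "cp1252", "utf-8"):
        print(f"{name:7} {encoded_bytes(SAMPLE, name)}")
```

Under cp850 the two characters come out as the single bytes 0x85 and 0x82, matching what was observed from 7-zip; cp1252 and UTF-8 produce different byte sequences for the same text, so a file name can only be decoded correctly if the reader knows which encoding the writer used.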
There is a good chance that they use it because it is the OEM code page
on the system.

In any case, I think that both cp850 and cp1252 are inherently incorrect
for tarfiles (despite these tools using them). tar is a POSIX thing, and
these encodings have nothing to do with POSIX. So using UTF-8 is a
reasonable choice, IMO. The other reasonable choice would be ASCII.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8784>
_______________________________________