Martin v. Löwis <mar...@v.loewis.de> added the comment:

>> 7-zip encodes "à" (U+00e0) as 0x85 (1 byte), and "é" (U+00e9) as 0x82 (1 
>> byte). I don't know this encoding.
>
> That's an old DOS code page used in Europe: CP850

There is a good chance that they use it because it is the OEM code page 
on the system.
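
The byte values quoted above can be checked against Python's built-in codecs. This is only a small sketch for illustration; the names `samples` and `encoded` are made up here and are not part of the tracker discussion.

```python
# Encode the two characters from the report with the code pages under
# discussion and with UTF-8, so the byte values can be compared side by side.
samples = ["\u00e0", "\u00e9"]  # "à" and "é"
codecs_to_try = ("cp850", "cp1252", "utf-8")

encoded = {
    ch: {name: ch.encode(name) for name in codecs_to_try}
    for ch in samples
}

for ch, by_codec in encoded.items():
    for name, raw in by_codec.items():
        # Print the code point, the codec name, and the resulting bytes in hex.
        print(f"U+{ord(ch):04X} {name:7} {raw.hex()}")
```

With cp850 both characters come out as the single bytes reported for 7-zip, which is consistent with the OEM code page explanation.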

In any case, I think that both cp850 and cp1252 are inherently incorrect 
for tarfiles (despite these tools using them). tar is a POSIX thing, and 
these encodings have nothing to do with POSIX.

So using UTF-8 is a reasonable choice, IMO. The other reasonable choice 
would be ASCII.
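
As a rough illustration of what choosing UTF-8 means in practice, the sketch below writes a member with a non-ASCII name into an in-memory archive using the `encoding` argument of `tarfile.open`, then reads it back with two different encodings. The file name and payload are invented for the example, and this is not the patch discussed in the issue.

```python
import io
import tarfile

name = "caf\u00e9.txt"   # hypothetical member name containing "é"
payload = b"hello"

buf = io.BytesIO()
# In GNU format a short name is stored directly in the 512-byte header,
# encoded with the given `encoding`.
with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                  encoding="utf-8") as tar:
    info = tarfile.TarInfo(name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# The first header block should contain the UTF-8 bytes of the name.
stored_as_utf8 = name.encode("utf-8") in buf.getvalue()[:512]

# Reading with the same encoding recovers the original name.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r", encoding="utf-8") as tar:
    names_utf8 = tar.getnames()

# Reading with a DOS code page decodes the same bytes to different characters.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r", encoding="cp850") as tar:
    names_cp850 = tar.getnames()

print(stored_as_utf8, names_utf8, names_cp850)
```

The mismatch in the last step is the interoperability problem in miniature: the archive itself carries no hint about which encoding the writer used for the header.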

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8784>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
