Steven Bethard wrote:
On Wed, Jan 28, 2009 at 10:29 AM, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
Notice that the determination of the specific encoding used is fairly
elaborate:
- if IO is to a terminal, Python tries to determine the encoding of
the terminal. This is mostly relevant for Windows (which uses,
by default, the "OEM code page" in the terminal).
- if IO is to a file, Python tries to guess the "common" encoding
for the system. On Unix, it queries the locale, and falls back
to "ascii" if no locale is set. On Windows, it uses the "ANSI
code page". On OSX, it uses the "system encoding".
- if IO is binary, (clearly) no encoding is used. Network IO is
always binary.
- for file names, yet different algorithms apply. On Windows, it
uses the Unicode API, so no need for an encoding. On Unix, it
(again) uses the locale encoding. On OSX, it uses UTF-8.
(Just to be clear: this applies to the first argument of open(),
not to the resulting file object.)
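For anyone who wants to check these cases on their own machine, here is a minimal sketch, assuming Python 3. The helper name describe_encodings is made up for illustration; the calls it uses (sys.stdout.encoding, locale.getpreferredencoding, sys.getfilesystemencoding) are standard-library ones. The values reported depend on the platform, the Python version, and the environment, so no particular output should be expected.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python reports for each kind of IO.

    Hypothetical helper for illustration only; not part of the stdlib.
    """
    return {
        # Terminal IO: encoding attached to the standard streams. This may be
        # None if a stream is missing or was replaced by another object.
        "stdout": getattr(sys.stdout, "encoding", None),
        "stdin": getattr(sys.stdin, "encoding", None),
        # File IO: the locale-derived encoding that text-mode open() typically
        # falls back to when no encoding= argument is given.
        "file_default": locale.getpreferredencoding(False),
        # File names: encoding used to convert str file names to bytes.
        "file_names": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for kind, encoding in describe_encodings().items():
        print(f"{kind}: {encoding}")
```

Running it once in a terminal and once with output redirected to a file is a quick way to see whether the terminal and file cases differ on a given system.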
This is a very helpful explanation. Is it in the docs somewhere, or if it
isn't, could it be?
Here is the current entry on encodings in the Library Reference, under
built-in types, file objects:
file.encoding
The encoding that this file uses. When strings are written to a file,
they will be converted to byte strings using this encoding. In addition,
when the file is connected to a terminal, the attribute gives the
encoding that the terminal is likely to use (that information might be
incorrect if the user has misconfigured the terminal). The attribute is
read-only and may not be present on all file-like objects. It may also
be None, in which case the file uses the system default encoding for
converting strings.
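To make the attribute concrete, here is a small sketch, assuming Python 3, where text-mode file objects carry an encoding attribute and binary-mode ones do not. The entry quoted above describes the older file type, so the behaviour around None differs. The function name file_encodings and the file name demo.txt are made up for illustration.

```python
import os
import tempfile


def file_encodings():
    """Write text with an explicit encoding, then inspect both file modes.

    Hypothetical helper for illustration only.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")
        # Text mode: strings are converted to bytes using the given encoding.
        with open(path, "w", encoding="utf-8") as text_file:
            text_file.write("L\u00f6wis")
            results["text"] = text_file.encoding
        # Binary mode: no conversion takes place, so the file object has no
        # encoding attribute; getattr() supplies None instead.
        with open(path, "rb") as binary_file:
            results["binary"] = getattr(binary_file, "encoding", None)
            results["raw_bytes"] = binary_file.read()
    return results


if __name__ == "__main__":
    print(file_encodings())
```

The bytes read back in binary mode show the effect of the encoding: the single character "ö" is stored as two bytes in UTF-8.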
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev