On 04/11/2016 04:43 PM, Victor Stinner wrote:
On 11 Apr 2016 11:11 PM, "Ethan Furman" wrote:

So my concern in such a case is what happens if we pass this SE
string somewhere else: a UTF-8 file, or over a socket, or into a
database? Does this have issues that we wouldn't face if we just used bytes?

"SE strings" have been returned by os.listdir(str), os.walk(str),
os.getenv(str), sys.argv[int], ... since Python 3.3. Nothing new under
the sun.

So when we pass a bytes object in, Python (on posix) converts that to a string using surrogateescape, gets back strings from the os, and encodes them back to bytes, again using surrogateescape?
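
A minimal sketch of the round trip being asked about, using plain bytes/str methods rather than any os call. It only shows what the surrogateescape handler does to an undecodable byte; it does not claim to show what Python does internally when given a bytes argument. The file name is made up for illustration, and UTF-8 is assumed as the encoding.

```python
# Hypothetical file name containing a byte (0xE9) that is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# Decoding with surrogateescape maps the undecodable byte 0xE9
# to the lone surrogate U+DCE9 instead of raising.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into
# the original byte, so the round trip is lossless.
back = text.encode("utf-8", errors="surrogateescape")

print(ascii(text))
print(back == raw)
```

The round trip only works if the same error handler is used in both directions.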


Trying to encode a surrogate to ascii, latin1 or utf8 raises an encoding
error.

latin1? I thought latin1 had a code point for 0-255, so how could using it raise an encoding error?
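
A quick sketch that can be run to check the latin1 case, assuming the "SE string" holds a lone surrogate produced by surrogateescape. The helper name strict_encode_fails is hypothetical. The point it illustrates: the escaped byte is stored as a code point in the range U+DC80..U+DCFF, which is outside the 0-255 range that latin1 can encode.

```python
def strict_encode_fails(text, encoding):
    """Return True if encoding with the default (strict) handler raises."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False

# The byte 0xE9 on its own is not valid UTF-8, so it becomes U+DCE9.
se = b"\xe9".decode("utf-8", errors="surrogateescape")
print(hex(ord(se)))

# Strict encoding of the lone surrogate fails for all three codecs.
results = {enc: strict_encode_fails(se, enc)
           for enc in ("ascii", "latin-1", "utf-8")}
print(results)

# With the surrogateescape handler, the original byte comes back.
escaped = se.encode("latin-1", errors="surrogateescape")
print(escaped)
```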

--
~Ethan~
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
