Thanks for the clarity, Steve, a couple questions/thoughts:

> The choices are:
>
> * don't represent them at all (remove bytes API)

Would the bytes API be removed on *nix also?

> * convert and drop characters not in the (legacy) active code page
> * convert and fail on characters not in the (legacy) active code page

"Failure is not an option" -- These two seem like a plain old bad idea.

> * convert and fail on invalid surrogate pairs

where would an invalid surrogate pair come from? never from a file system
API call, yes?

> * represent them as UTF-16-LE in bytes (with embedded '\0' everywhere)

would this be doing anything -- or just keeping whatever the Windows API
takes/returns? i.e. exactly what is done on *nix?

> The fifth option is the best for round-tripping within Windows APIs.

How is it better? only performance (i.e. no encoding/decoding required) --
or would it be more reliable as well?

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
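
P.S. To make the UTF-16-LE and surrogate points concrete, here is a minimal
sketch using only the standard str/bytes codecs. The file name is made up,
and this only shows what the codecs do; it says nothing about what the
Windows file system APIs actually return.

```python
# Illustration only: "abc.txt" is a hypothetical file name.
name = "abc.txt"

# UTF-16-LE puts a NUL byte after every ASCII character.
utf16 = name.encode("utf-16-le")
print(utf16)

# A lone (unpaired) surrogate is not valid UTF-16, so the strict
# error handler refuses to encode it.
lone = "\ud800"
try:
    lone.encode("utf-16-le")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
print("strict encode succeeded:", strict_ok)

# The "surrogatepass" error handler lets the lone surrogate through,
# so the same 16-bit unit survives a round trip unchanged.
raw = lone.encode("utf-16-le", "surrogatepass")
round_tripped = raw.decode("utf-16-le", "surrogatepass")
print(raw, round_tripped == lone)
```

So "convert and fail on invalid surrogate pairs" would correspond to the
strict case, and the fifth option to passing the raw 16-bit units through
untouched, if I am reading the choices right.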
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/