2009/6/29 Antoine Pitrou <solip...@pitrou.net>:
> As for a bytes version of sys.argv and os.environ, you're welcome to propose a
> patch (this would be a separate issue on the aforementioned issue tracker).

But please be aware that such a proposal would have to consider:

1. That on Windows, the native form is the character version, and the
   bytes version would have to address all the same sorts of encoding
   issues that the OP is complaining about in the character versions. [1]

2. How to write portable, robust code (given that choosing argv vs
   argv_bytes based on sys.platform is unlikely to count as a good
   option...)

3. Why defining your own argv_bytes as

       argv_bytes = [a.encode("iso-8859-1", "surrogateescape") for a in sys.argv]

   is insufficient (excluding issues with bugs, which will be fixed
   regardless) for the occasional cases where it's needed.

Before writing the proposal, the OP should probably review the
extensive discussions which can be found in the python-dev archives.
It would be wrong for people reading this thread to think that the
implemented approach is in any sense a "quick fix" - it's certainly a
compromise (and no-one likes all aspects of any compromise!) but it's
one made after a lot of input from people with widely differing
requirements.

Paul.

[1] And my understanding, from the PEP, is that even on POSIX, the
argv and environ data is intended to be character data, even though
the native C APIs expose a byte-oriented interface. So conceptually,
character format is "correct" on POSIX as well... (But I don't write
code for POSIX systems, so I'll leave it to the POSIX users to debate
this point further.)
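
To make point 3 above concrete, here is a minimal sketch of the round trip, assuming a Python 3 with the "surrogateescape" error handler. The helper name to_bytes is hypothetical, and defaulting to sys.getfilesystemencoding() is my assumption about which encoding the interpreter used when decoding; the one-liner above only recovers the original bytes if the encoding passed to encode() matches the one used for decoding.

```python
import sys


def to_bytes(args, encoding=None):
    """Hypothetical helper: turn str arguments back into bytes.

    Bytes that could not be decoded are represented as lone surrogates
    (U+DC80..U+DCFF) by the "surrogateescape" handler; encoding with the
    same handler and the same encoding restores them.
    """
    if encoding is None:
        # Assumption: this is the encoding used to decode sys.argv.
        encoding = sys.getfilesystemencoding()
    return [a.encode(encoding, "surrogateescape") for a in args]


# Simulate an argument that is not valid UTF-8 (Latin-1 encoded "cafe"
# with an acute accent on the final letter).
raw = b"caf\xe9"
decoded = raw.decode("utf-8", "surrogateescape")

# Same encoding on both sides: the original bytes come back.
assert to_bytes([decoded], "utf-8") == [raw]

# Different encoding on the way out: a validly decoded character does
# not map back to the bytes it came from.
utf8_raw = b"\xc3\xa9"
assert to_bytes([utf8_raw.decode("utf-8")], "iso-8859-1") != [utf8_raw]

if __name__ == "__main__":
    argv_bytes = to_bytes(sys.argv)
    print(argv_bytes)
```

Later Python releases (3.2 and up) provide os.fsencode(), which does this encoding step for a single string using the filesystem encoding, so the helper is only needed where that function is unavailable or a different encoding has to be forced.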