New submission from Bill Fenner <fen...@gmail.com>:

In Python 2.5, shlex handled unicode input fine:

Python 2.5.1 (r251:54863, Jun 15 2008, 18:24:51)
[GCC 4.3.0 20080428 (Red Hat 4.3.0-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> shlex.split( u'Hello, World!' )
['Hello,', 'World!']

In Python 2.6, shlex turns unicode input into UCS-4 output, thus utterly confusing execl:

Python 2.6 (r26:66714, Jun 8 2009, 16:07:29)
[GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> shlex.split( u'Hello, World' )
['H\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00,\x00\x00\x00', '\x00\x00\x00W\x00\x00\x00o\x00\x00\x00r\x00\x00\x00l\x00\x00\x00d\x00\x00\x00']

Even weirder, the two return strings have different byte order (see 'H\x00\x00\x00' vs. '\x00\x00\x00W'!)

----------
components: Library (Lib)
messages: 93074
nosy: fenner
severity: normal
status: open
title: shlex.split() converts unicode input to UCS-4 output with varying byte order
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6988>
_______________________________________
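
A possible reading of the "varying byte order", offered as a sketch and not as a confirmed diagnosis: if the input is treated as its raw little-endian UCS-4 bytes and split on the single space byte, the three NUL bytes that belong to the space character end up at the front of the second token. The snippet below only emulates that with a plain str.split() on a modern Python; the helper name split_raw_ucs4 is hypothetical, and the assumption that shlex in 2.6 sees the raw internal buffer is not established by the report above.

```python
def split_raw_ucs4(text):
    """Emulate splitting the raw UCS-4 (little-endian) bytes of `text` on spaces.

    Hypothetical helper for illustration only; it does not call shlex.
    """
    # Encode to 4-byte little-endian code units, then view each byte as one
    # character (latin-1 maps bytes 0-255 one-to-one to code points).
    raw = text.encode('utf-32-le').decode('latin-1')
    # Splitting on the single space byte leaves the space's own three
    # padding NUL bytes attached to the start of the next token.
    return raw.split(' ')


tokens = split_raw_ucs4(u'Hello, World')
print(tokens)
```

Under that assumption both tokens share the same little-endian byte order; the second one merely starts with the leftover padding of the space character.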