New submission from Bill Fenner <fen...@gmail.com>:

In Python 2.5, shlex handled unicode input fine:

Python 2.5.1 (r251:54863, Jun 15 2008, 18:24:51) 
[GCC 4.3.0 20080428 (Red Hat 4.3.0-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> shlex.split( u'Hello, World!' )
['Hello,', 'World!']

In Python 2.6, shlex.split() turns unicode input into raw UCS-4 byte
strings, which utterly confuses execl:

Python 2.6 (r26:66714, Jun  8 2009, 16:07:29)
[GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> shlex.split( u'Hello, World' )
['H\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00,\x00\x00\x00',
'\x00\x00\x00W\x00\x00\x00o\x00\x00\x00r\x00\x00\x00l\x00\x00\x00d\x00\x00\x00']

Even weirder, the two returned strings appear to have different byte
orders (compare 'H\x00\x00\x00' with '\x00\x00\x00W').
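
One possible reading of the 2.6 output, offered as an assumption rather
than a confirmed diagnosis: the tokens look like the raw little-endian
UCS-4 buffer of the unicode object, split one byte at a time on the space
byte (0x20). Under that assumption, the three NUL bytes that pad the space
character end up at the front of the second token, which would make it
look byte-swapped even though the byte order never changes. The sketch
below only imitates that behaviour with an explicit encode and a byte
split; it does not call shlex, and the names text, raw and tokens are
illustrative.

```python
# Assumption: the 2.6 result is the little-endian UCS-4 buffer of the
# unicode object, tokenized byte by byte. This imitates that by hand.
text = u'Hello, World'
raw = text.encode('utf-32-le')  # 4 bytes per character, no BOM

# Splitting on the single space byte (0x20) leaves the space's three
# padding NUL bytes at the start of the second token.
tokens = raw.split(b' ')
for token in tokens:
    print(repr(token))

# The first token decodes cleanly; the second decodes only after the
# three stray NUL bytes are dropped.
first = tokens[0].decode('utf-32-le')
second = tokens[1][3:].decode('utf-32-le')
print(first, second)
```

If this reading is right, both tokens share one byte order, and the
leading NULs in the second string are leftovers from the separator.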

----------
components: Library (Lib)
messages: 93074
nosy: fenner
severity: normal
status: open
title: shlex.split() converts unicode input to UCS-4 output with varying byte order
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6988>
_______________________________________