Ezio Melotti <ezio.melo...@gmail.com> added the comment:

Mixing byte and unicode strings should always be avoided, because the implicit 
coercion to unicode works only if the byte string contains only ASCII, and 
fails otherwise.
Several modules -- including shutil, glob, and os.path -- have APIs that work 
with both byte and unicode strings, but fail when you mix the two:
>>> os.path.join('א', 'א')  # both byte strings -- works
'\xd7\x90/\xd7\x90'
>>> os.path.join(u'א', u'א')  # both unicode -- works
u'\u05d0/\u05d0'
>>> os.path.join('a', u'א')  # mixed, ASCII-only byte string -- works
u'a/\u05d0'

>>> os.path.join(u'א', 'א')  # mixed, non-ASCII -- fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 1: ordinal not in range(128)
>>> os.path.join('א', u'א')  # mixed, non-ASCII -- fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 0: ordinal not in range(128)
>>>
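
One way to avoid the mix is to decode every byte string explicitly before it reaches the path functions. The following is only a minimal sketch: to_text and join_text are hypothetical helpers (they are not part of os.path), and UTF-8 is an assumed encoding that has to match where the bytes actually came from. posixpath is used directly so that the separator is '/' on every platform.

```python
# -*- coding: utf-8 -*-
import posixpath


def to_text(value, encoding="utf-8"):
    """Return value as a unicode string, decoding byte strings explicitly.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_text(*parts):
    """Join path components after converting each one to unicode."""
    return posixpath.join(*[to_text(part) for part in parts])


alef_bytes = b"\xd7\x90"  # UTF-8 encoding of U+05D0 (the byte string 'א')
alef_text = u"\u05d0"     # the same character as a unicode string

# Both components are unicode by the time join() sees them, so no
# implicit ASCII coercion takes place.
joined = join_text(alef_text, alef_bytes)
```

With this approach the decoding happens in one visible place, so a wrong encoding fails there with a clear error instead of deep inside posixpath.join.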

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11741>
_______________________________________