Ezio Melotti <ezio.melo...@gmail.com> added the comment:

Mixing byte and unicode strings should always be avoided, because the implicit coercion to unicode works only if the byte string contains only ASCII, and fails otherwise.

Several modules -- including shutil, glob, and os.path -- have APIs that work with both byte and unicode strings, but fail when you mix the two:

>>> os.path.join('א', 'א')  # both byte strings -- works
'\xd7\x90/\xd7\x90'
>>> os.path.join(u'א', u'א')  # both unicode -- works
u'\u05d0/\u05d0'
>>> os.path.join('a', u'א')  # mixed, ASCII-only byte string -- works
u'a/\u05d0'
>>> os.path.join(u'א', 'א')  # mixed, non-ASCII -- fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 1: ordinal not in range(128)
>>> os.path.join('א', u'א')  # mixed, non-ASCII -- fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 0: ordinal not in range(128)
>>>

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11741>
_______________________________________
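
One way to avoid the mixed case in the session above is to decode byte strings explicitly before joining, rather than relying on the implicit ASCII coercion. The following is only a minimal sketch: it assumes the byte strings are UTF-8 encoded, and `to_text` is a hypothetical helper introduced for illustration, not part of os.path or any other module mentioned in this comment.

```python
import os.path


def to_text(component, encoding='utf-8'):
    """Return component as a unicode string.

    Byte strings are decoded explicitly with the given encoding
    (UTF-8 is an assumption here); unicode strings pass through unchanged.
    """
    if isinstance(component, bytes):
        return component.decode(encoding)
    return component


# Mixed input: one unicode component and one non-ASCII byte string.
components = [u'\u05d0', b'\xd7\x90']

# Normalize every component to unicode first, then join.
joined = os.path.join(*[to_text(c) for c in components])
```

With both components decoded up front, the join behaves like the "both unicode" case and no implicit coercion takes place.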