New submission from Vinay Sajip:

The attached file failing.tar.gz contains a path with UTF-8-encoded Unicode characters. This causes extractall() to fail, but only when the destination path is a unicode object: joining it with the member name, which is a byte string, triggers an implicit str->unicode conversion using the ASCII codec.
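The conversion described above can be reproduced without the tarball. The following is only a minimal sketch, not code from tarfile: the member name and destination path are made up for illustration (the real name inside failing.tar.gz is not shown in this report), and join_like_py27 is a hypothetical helper that spells out the ASCII decode which Python 2.7 performs implicitly when a unicode object and a byte string are concatenated.

```python
# Hypothetical values: the real member name in failing.tar.gz is not shown
# in the report, so an illustrative one is used here.
member_name = u"caf\xe9/file.txt".encode("utf-8")  # UTF-8 byte string, like tarinfo.name on 2.7
dest = u"/tmp/workdir"                             # Unicode destination path


def join_like_py27(dest, name_bytes):
    """Spell out the implicit ASCII decode done by u'...' + '...' on Python 2.7."""
    return dest + u"/" + name_bytes.decode("ascii")


try:
    join_like_py27(dest, member_name)
    failed = False
except UnicodeDecodeError as exc:
    # Same exception type as in the traceback: the 0xc3 lead byte is not ASCII.
    failed = True
    print(exc)

# Decoding explicitly with the encoding the name was stored in succeeds.
joined = dest + u"/" + member_name.decode("utf-8")
```

The explicit decode at the end only shows that the bytes themselves are valid UTF-8; whether tarfile should decode names this way is a separate question that this sketch does not answer.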
Test script:

import shutil, tarfile, tempfile

tf = tarfile.open('failing.tar.gz', 'r:gz')
workdir = tempfile.mkdtemp()

try:
    # N.B. ensure dest path is Unicode to trigger the failure
    tf.extractall(unicode(workdir))
finally:
    shutil.rmtree(workdir)

Result:

$ python untar.py
Traceback (most recent call last):
  File "untar.py", line 8, in <module>
    tf.extractall(unicode(workdir))
  File "/usr/lib/python2.7/tarfile.py", line 2046, in extractall
    self.extract(tarinfo, path)
  File "/usr/lib/python2.7/tarfile.py", line 2083, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
  File "/usr/lib/python2.7/posixpath.py", line 71, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 44: ordinal not in range(128)

----------
components: Library (Lib), Unicode
messages: 181631
nosy: ezio.melotti, vinay.sajip
priority: normal
severity: normal
status: open
title: tarfile extract fails when Unicode in pathname
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17153>
_______________________________________