Daniel Hillier <daniel.hill...@gmail.com> added the comment:

zipfile decodes filenames using cp437 or unicode (UTF-8) and encodes them using ascii or unicode, so when writing it prefers unicode over cp437 for any non-ascii filename. Is that preference intentional?
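To make the write-side behaviour concrete, here is a minimal round-trip sketch. It assumes an unpatched CPython, where I'd expect a non-ascii name to be stored as unicode with general purpose bit 11 set; `utf8_flag_set` and `UTF8_FLAG` are names made up for this illustration and are not part of zipfile.

```python
import io
import zipfile

# General purpose bit 11 (Language encoding flag, EFS); hypothetical constant name.
UTF8_FLAG = 0x800


def utf8_flag_set(fname):
    """Write one member named fname to an in-memory zip and report
    whether bit 11 is set on the entry when it is read back."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as zipfp:
        zipfp.writestr(fname, b'sample')

    with zipfile.ZipFile(buf) as zipfp:
        zinfo = zipfp.infolist()[0]
        return bool(zinfo.flag_bits & UTF8_FLAG)


# An ascii-only name versus a name that cp437 could also represent.
print(utf8_flag_set('plain.txt'))
print(utf8_flag_set('test_cp437_é'))
```

If the second call reports the flag as set even though 'é' is representable in cp437, that is the unicode-over-cp437 preference in question.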
Is the bug you're seeing related to using zipfile to open and rewrite old zips, and then not being able to open the rewritten files in an old program that doesn't support the unicode flag?

We could address this in two ways:
- Change ZipInfo._encodeFilenameFlags() to always encode to cp437 if possible
- Add a flag to write filenames in cp437 or unicode, otherwise keep the current behaviour of ascii or unicode

I guess the choice will depend on whether preferring unicode rather than cp437 is intentional and whether writing filenames in cp437 will break anything (it shouldn't break anything according to Appendix D of https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT).

Here's a test for your current patch (I'd probably put it alongside OtherTests.test_read_after_write_unicode_filenames, as this test was adapted from that one):

class OtherTests(unittest.TestCase):
    ...

    def test_read_after_write_cp437_filenames(self):
        fname = 'test_cp437_é'
        with zipfile.ZipFile(TESTFN2, 'w') as zipfp:
            zipfp.writestr(fname, b'sample')

        with zipfile.ZipFile(TESTFN2) as zipfp:
            zinfo = zipfp.infolist()[0]
            # Ensure general purpose bit 11 (Language encoding flag
            # (EFS)) is unset to indicate the filename is not unicode
            self.assertFalse(zinfo.flag_bits & 0x800)
            self.assertEqual(zipfp.read(fname), b'sample')

----------
nosy: +dhillier

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40172>
_______________________________________