[Bug 580961] Re: unzip fails to deal correctly with filename encodings
@Launchpad Janitor in #81: the proposed fix actually BREAKS the fix from @frol's PPA.

$ wget http://ru.archive.ubuntu.com/ubuntu/pool/main/u/unzip/unzip_6.0-4ubuntu1_amd64.deb
$ sudo dpkg -i unzip_6.0-4ubuntu1_amd64.deb
$ unzip -l russian.zip
008 - Russian/?? ?? ?? ??, ?? ??.txt

Oops, it's broken!

$ wget http://ppa.launchpad.net/frol/zip-i18n/ubuntu/pool/main/u/unzip/unzip_6.0-4ppa3_amd64.deb
$ sudo dpkg -i unzip_6.0-4ppa3_amd64.deb
$ unzip -l russian.zip
008 - Russian/Съешь ещё этих мягких французских булок, да выпей же чаю.txt

Fixed, we got it back!

The attached archive was made with the 7z GUI for Windows, run under Wine.

P.S. 'libnatspec' was installed previously and never removed.
P.S.2 In File Roller, it's broken both ways.

** Attachment added: Archive sample with Russian pangram as filename.
   https://bugs.launchpad.net/debian/+source/unzip/+bug/580961/+attachment/1814823/+files/russian.zip

--
You received this bug notification because you are a member of Ubuntu Desktop Bugs, which is a subscriber of a duplicate bug (34667).
https://bugs.launchpad.net/bugs/580961

Title:
  unzip fails to deal correctly with filename encodings

--
desktop-bugs mailing list
desktop-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/desktop-bugs
[Bug 580961] Re: unzip fails to deal correctly with filename encodings
I contacted the author of the latest patches (built into the PPA by @frol) about the reported flaw with Hebrew, and possibly with languages other than Russian. He said he had only solved the problem for himself, published the solution on his favorite local Linux/FOSS site so it would not be lost, and does not want to work on it any further. He also said he had emailed the original author of 'libnatspec' and of the zip/unzip patches that use it, offering his contribution for inclusion. The email was never answered.

So this path looks like a dead end, at least if we want a flexible worldwide solution.

From my perspective, the problem lies in clinging to a piece of legacy code that tries to be extremely portable at the cost of architectural layering and of using libraries and system-specific facilities. I see the solution in abandoning InfoZIP altogether and writing a gcc/*nix-only zip/unzip in one of two ways:

1) A library for the zipfile format (say, 'libzipfile'), designed to modern coding standards, with all reasonable hooks and callbacks for stream and piece-wise processing, NOT doing any data transformations except for ZIP record structures, and calling zlib for the (de-)compression itself. On top of this library, modern, well-designed command-line tools that rely on Linux/*NIX-specific libraries for things like charset conversion, command-line options etc., written for clarity and flexibility.

2) Stopgap scripts in Python, wrappers around its zipfile and iconv modules, kept minimal to the requirements of GUIs like File Roller and its KDE counterpart.
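To make the Python stopgap idea a bit more concrete, here is a minimal sketch of the listing half only, assuming Python 3. The standard library has no iconv module, so the sketch uses Python's built-in codecs instead. The function names recode_name and list_names are hypothetical, and cp866 is only an assumed default for archives created on Russian Windows/DOS systems; other locales would need a different codepage. It relies on the fact that Python 3's zipfile decodes names as cp437 unless the UTF-8 flag (general-purpose bit 11) is set.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def recode_name(name, flag_bits, legacy_encoding="cp866"):
    """Re-decode a member name that zipfile decoded as cp437.

    legacy_encoding is an assumption about the system that wrote the
    archive (cp866 here, for Russian DOS/Windows); it is not detected.
    """
    if flag_bits & UTF8_FLAG:
        return name  # zipfile already decoded it from UTF-8
    try:
        raw = name.encode("cp437")  # recover the original bytes
        return raw.decode(legacy_encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        return name  # leave the name untouched rather than guess


def list_names(archive, legacy_encoding="cp866"):
    """Return the member names of archive (a path or file object), recoded."""
    with zipfile.ZipFile(archive) as zf:
        return [
            recode_name(info.filename, info.flag_bits, legacy_encoding)
            for info in zf.infolist()
        ]
```

A GUI wrapper could call list_names("russian.zip") and show the result instead of parsing the output of unzip -l. Extraction and archive creation would need the same treatment and are not covered here.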
[Bug 580961] Re: unzip fails to deal correctly with filename encodings
@Shimi Chen, thank you for testing, but this way of reporting is counterproductive, since it is not reproducible. Please supply minimal live examples of:

* good and bad Hebrew archives
* descriptions of how you constructed them. Scripts for automated testing are especially welcome.
* the files you attempted to compress, packed with a known-good archiver (I see 7z is good)
* the files extracted after the bad compression, also packed with a known-good archiver

Here is the test data I prepared for many languages, along with a test automation script. If something is incomplete for Hebrew (additional letter elements, like Latin diacritics?), please add it.

With reproducible tests at hand, I will try to contact the patch authors, or, if this fails, we should hand this problem to some highly international team (at a university?).

** Attachment added: Test data suite and automation script.
   https://bugs.launchpad.net/debian/+source/unzip/+bug/580961/+attachment/1803433/+files/non-English_file_names.7z
[Bug 580961] Re: unzip fails to deal correctly with filename encodings
@Brian: unzip is only half of the solution (I have not tested it personally); zip will remain broken, writing archives that are not readable on other systems.

@Aron: Done, but I have heard that the Debian maintainer for ZIP doesn't want to accept a solution that won't ultimately be upstreamed, and the InfoZIP team does not accept Linux-specific patches, since their goal is ultimate portability.
[Bug 580961] Re: unzip fails to deal correctly with filename encodings
There is, in fact, a ready working solution right on this site:

1) Add two PPAs: https://launchpad.net/~frol/+archive/zip-i18n/ (the patches I mentioned in #61, with the same publication reference) and https://launchpad.net/~r0lf/+archive/ppa (libnatspec, its dependency)
2) apt-get update; apt-get install zip unzip

SOLVED! I just tested this: it works both ways, read and write.

So, with the bug fixed technically, the only remaining step is to fix it socially: come to an agreement about including libnatspec and the patched InfoZIP in Ubuntu/Debian, or reach a competent voted decision to implement codepage conversion inside InfoZIP in agreement with its developers, so the patches can ultimately be upstreamed.

P.S. Sorry for overlooking the ready solution in the first place; I discovered it while doing a last-minute duplicate check before creating my own PPA.
[Bug 580961] Re: unzip fails to deal correctly with filename encodings
The bug is FIXED by the Russian AltLinux community and a humble developer who improved their patch so that it applies to the then-newest (2010-11) InfoZIP sources, but published it only on a country-local website, in Russian. Here is the Google-translated publication with all links and instructions: http://goo.gl/paR5i

I tested the patches; they work for me on Ubuntu 10.11 amd64. I wrote to the topic starter and am waiting for a reply; if there is none, I will try to contact the ZIP maintainer in Debian.

Shame on the Russian-speaking people who were too lazy to search the Web and instead barked at the actual developers: it took me a few hours to find the ready solution and test multiple ways of applying it, 'alien' (failed, worked only for reading) and a source build.