> On Sun, Mar 02, 2014 at 12:55:34 +0100, Bastien ROUCARIES wrote:
>> Hi,
>>
>> newer tar (1.27.1)has a serious regression compared to 1.27.
>
>
> On Sat, Mar 22, 2014 at 10:33:56 +0100, Bastien ROUCARIES wrote:
>> Yes they should be indeed treated literally. I use shell completion
>> and it is pretty natural to get the name as is.
>
> Can you give us more information about how tar is actually being called
> by lintian?  (I found Debian bug #740199, but it doesn't give any
> details about the actual tar command being used by the test or in what
> manner it's failing.)
>
Hi,

(Please ensure I am CC'ed on replies as I am not subscribed to this list)

For the given tests, lintian does not build the tarball causing this
issue.  So it would "only" call tar in the following ways:

  dpkg-deb --fsys-tarfile <deb-file> | tar tvf - | \
     sort -k 6 | gzip --best -c > index.gz
  dpkg-deb --fsys-tarfile <deb-file> | tar --numeric-owner -tvf - | \
     sort -k 6 | gzip --best -c > index-ids.gz
  dpkg-deb --fsys-tarfile <deb-file> | tar xf - -C unpacked

The tool building the relevant tar file (known as the "data.tar") would
be dpkg-deb.  I am not entirely sure, but I think [1] reflects the tar
command-line it uses, which would be:

  execlp(TAR, "tar", "-cf", "-", "--format=gnu", "--null", "-T", "-",
         "--no-recursion", NULL);

I guess that is where the (missing) "-T" appears that caused us
headaches.

> Are you sure there was a change in behavior of the test between tar
> 1.27 and 1.27.1?

Yes.  We experienced a behaviour change in 1.27 and we adapted the code
and tests accordingly (e.g. [2] and [3]).  Now we are seeing it change
again with 1.27.1.  For reference, tar 1.27 landed in Debian unstable on
2013-10-15.

> As mentioned earlier in this thread, the unquoting
> behavior of file names mentioned in a -T file did change between those
> two versions, but the behavior for filenames specified on the command
> line shouldn't have changed...  (And even for -T, the behavior in
> 1.27.1 should match the behavior of earlier versions other than 1.27
> itself.)
>

For reference, the test in [2] has 3 files created via (Makefile
syntax):

  echo foo > debian/tmp/usr/share/doc/filenames/bokm<E5>l
  echo foo > debian/tmp/usr/share/doc/filenames/bokm\\<E5>l
  echo foo > debian/tmp/usr/share/doc/filenames/bokm\\\\<E5>l

Here <E5> is less's way of showing an unrepresentable character (lintian
uses "?" instead); the character is an å in (I think) ISO-8859-1.
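
To make the names produced by the recipe lines above concrete, here is a
small Python sketch that applies POSIX shell backslash handling to the
three recipe words.  It assumes that make passes the backslashes through
to the shell unchanged and that å is the single byte 0xE5 (ISO-8859-1).
The names recipe_words, shell_word and tar_input are made up for this
illustration; they are not part of lintian, dpkg or tar.  The sketch says
nothing about what tar itself does with the names.

```python
import shlex

# The three file-name words as they appear in the Makefile recipe.
# Assumption: make hands the backslashes to the shell unchanged.
recipe_words = [r"bokmål", r"bokm\\ål", r"bokm\\\\ål"]


def shell_word(word):
    """Apply POSIX shell backslash handling to a single unquoted word."""
    (result,) = shlex.split(word)
    return result


# Names the shell would create on the file system.
on_disk = [shell_word(word) for word in recipe_words]

for name in on_disk:
    # Length in bytes, assuming ISO-8859-1 (å is one byte).
    print(len(name.encode("iso-8859-1")), name)

# NUL-terminated name list, i.e. the input format read by "tar --null -T -".
tar_input = b"".join(name.encode("iso-8859-1") + b"\0" for name in on_disk)
```

Under these assumptions the loop prints the lengths 6, 7 and 8, one per
name.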

Accordingly, when I saw the test output change and compared it with what
the test was doing, I thought the test had been wrong all this time and
tar had finally fixed it.  Namely, I would expect the above to create the
following 3 files on the file-system:

  O1 bokmål   (6 chars long)
  O2 bokm\ål  (7 chars long)
  O3 bokm\\ål (8 chars long)

If I understand the situation correctly, then these 3 files are passed
to tar (with -T --null) via dpkg, causing them to be "unquoted".  This
reduces the number of unique names to (per [4]):

  T1 bokmål   (6 chars long, T1 is O1 in the tarball)
  T2 bokmål   (6 chars long, should have been O2, but in tar it is a
               hardlink of T1)
  T3 bokm\ål  (7 chars long, T3 is O3 in the tarball)

T1, T2 and T3 are (as I understand you and other people on this mailing
list) the correct, expected behaviour when the tarball is built as:

  tar -cf - --format=gnu --null -T - --no-recursion

and if I want the paths named O1, O2 and O3 to appear in the tarball
(rather than T1, T2 and T3), I need to either pass --no-unquote (not an
option here) or add one more level of quoting?

~Niels

[1] http://anonscm.debian.org/gitweb/?p=dpkg/dpkg.git;a=blob;f=dpkg-deb/build.c;h=2871234a09ed31bcc83865cdf4f0f71a504f0d62;hb=db9051cc21519459b7552f5d04d2465386d0b772#l602

[2] http://anonscm.debian.org/gitweb/?p=lintian/lintian.git;a=commitdiff;h=80f4060ec58262f2cd261fe9677f8de71fb411dd;hp=135f2c1f4ec6556a6e84a5eb533ee0429c7bfc5b

[3] http://anonscm.debian.org/gitweb/?p=lintian/lintian.git;a=commitdiff;h=952d64558a25df127f19d08875b4f8996cb4e6e3

[4] http://lists.gnu.org/archive/html/bug-tar/2014-03/msg00018.html