Louis-Philippe Véronneau wrote...

> Lintian runs file (more precisely, `file --no-pad --print0 --print0 --`)
> to get the "file_type" value of files [1].

Very exhausting debugging¹ revealed that lintian has that file_type set to
the empty string. Obviously this causes ...

> Then, for all the files in "/usr/share/man/", it verifies .gz files are indeed
> gz-compressed with this perl regex match [2]:
> 
> if ($item->file_type !~ m/gzip compressed data/)

... this check to fail, and things go downhill from there.

> I built the test files Lintian uses for the autopkgtest and when
> I run file 1:5.43-3 on it, I do get an output that should match that regex:
> 
> ---------------------------------------------
> foo@bar:/tmp/foo/usr/share/man/man1# file -v
> > file-5.43
> > magic file from /etc/magic:/usr/share/misc/magic
> 
> foo@bar:/tmp/foo/usr/share/man/man1# file "鳥の詩.1.gz"
> > \351\263\245\343\201\256\350\251\251.1.gz: gzip compressed data, max 
> > compression, from Unix, original size modulo 2^32 145
> ---------------------------------------------

It took a few hours to realize that locale settings can ruin your day. I
could reproduce this only with LC_ALL=C, and then bisecting led to:

    commit f448f3e5c37de8c285ac14b032b2bdcea82fc08b
    Author: Christos Zoulas <chris...@zoulas.com>
    Date:   Sat May 28 01:04:57 2022 +0000

        PR/351: CathyKMeow: octalify unprintable characters in filenames unless raw.

> My perl-foo is pretty bad, but I guess we should be trying to espace or 
> sanitize that value?

The problem is that since that change, file(1) no longer returns the file
name as it was given on the command line. add_file_types therefore assigns
the file type to the wrong file, while the real one gets nothing at all.
.oO (from __rant__ import "use strict; use warning;")
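
To illustrate the mismatch, here is a minimal Python sketch. It is not
Lintian's actual Perl code: `assign_types` and the sample data are made up
for illustration, and only the escaped name and the type string are taken
from the transcript quoted above.

```python
# Hypothetical stand-in for add_file_types: map the names printed by
# file(1) back to the indexed files. Not Lintian's real implementation.

def assign_types(index_names, file_output):
    """file_output is a list of (name_as_printed, file_type) pairs."""
    types = {name: "" for name in index_names}  # every entry starts empty
    stray = {}
    for printed_name, file_type in file_output:
        if printed_name in types:
            types[printed_name] = file_type
        else:
            # The printed name matches no indexed file, so the type
            # ends up attached to a name that does not exist.
            stray[printed_name] = file_type
    return types, stray


real_name = "鳥の詩.1.gz"
# The name as file prints it under LC_ALL=C after the octalify change.
printed = r"\351\263\245\343\201\256\350\251\251.1.gz"
gzip_type = ("gzip compressed data, max compression, from Unix, "
             "original size modulo 2^32 145")

types, stray = assign_types([real_name], [(printed, gzip_type)])
print(repr(types[real_name]))  # the real file is left with an empty type
print(list(stray))             # the escaped name received the type instead
```

With an empty file_type on the real entry, the "gzip compressed data" match
on it cannot succeed.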

Next I'll try

--- a/lib/Lintian/Index/FileTypes.pm
+++ b/lib/Lintian/Index/FileTypes.pm
@@ -98,6 +98,7 @@ sub add_file_types {
             }
 
             while (defined(my $path = shift @lines)) {
+                $path =~ s/\\([0-7]{3})/chr(oct($1))/eg;
 
                 my $type = shift @lines;

but testing will take a while¹. I wouldn't be surprised if this
fails due to some UTF-8 vs. binary string mismatch.
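
The decoding I have in mind, and the kind of mismatch I am worried about,
as a small Python sketch. This is not the Perl above: `unescape_octal` is a
made-up helper, and it assumes a literal backslash never needs special
treatment in the escaped output, which I have not checked.

```python
import re

# Backslash followed by three octal digits, limited to values 0..255.
OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def unescape_octal(name: bytes) -> bytes:
    """Turn each \\ooo escape back into the single byte it stands for."""
    return OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), name)


escaped = rb"\351\263\245\343\201\256\350\251\251.1.gz"
raw = unescape_octal(escaped)

# Treated as bytes and decoded as UTF-8, the original name comes back.
as_utf8 = raw.decode("utf-8")

# Treated as one character per byte, the result is a different string,
# so a comparison against the decoded name would not match.
as_chars = "".join(chr(b) for b in raw)

print(as_utf8 == "鳥の詩.1.gz")
print(as_chars == "鳥の詩.1.gz")
```

Whether the unescaped name matches therefore depends on whether the other
side of the comparison holds bytes or decoded characters.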

Aside from that, the issue has become more pressing now that file 1:5.44-1
is in unstable and is blocked by this bug from migrating to testing. Which
is why I spent several hours¹ on this.

    Christoph

¹ <rant>As I am in a way responsible for this situation, I agree I am
  also obliged to help resolve it. However, can you please provide a
  faster way to reproduce the failing test? Currently I am rebuilding
  and running autopkgtest, with each iteration taking 50 minutes.</rant>
