On Wed, Jan 12, 2022 at 06:06:45PM -0500, Aaron M. Ucko wrote:
> Long story short, update-dpkg-list's batched bidirectional pipe usage wound
> up deadlocking on a system with 20,000+ provided virtual packages, mostly
> from installed librust-*-dev packages.

what do you mean by "deadlocking"? some kind of lockup, i presume. what
exactly happens? roughly how long is the script running before it happens?


BTW, update-dpkg-list runs fine on my 32GB Phenom II 1090T with over 38,000
packages known (over 8700 installed) - most actual packages from debian's own
repos, not an enormous set of virtual pkgs.  It has been doing so since I
built it ~ 10 years ago.

# wc -l /var/lib/dlocate/dpkg-list
38886 /var/lib/dlocate/dpkg-list

# grep '^.i' /var/lib/dlocate/dpkg-list | wc -l
8784

> Could you please take a look?

more info would be useful. at what stage does it lock up? is any output
produced in /var/lib/dlocate/dpkg-list?  Do you get an email from cron?


The only thing I can think of right now is to try editing
/usr/share/dlocate/update-dpkg-list and, on line 61, change:

    xargs -0r apt-cache show

to limit xargs to running apt-cache with only 1000 or 5000 package names at a
time, e.g.:

    xargs -0r -n 1000 apt-cache show


I don't really think it will make any difference, apart from slowing it down
a bit, but it's worth trying...just in case apt-cache can't handle so many
package names on one command line, or maybe it doesn't like so many virtual
packages.
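
For reference, here is a small sketch of what -n changes. It uses echo as a stand-in for 'apt-cache show', and the names a..e are placeholders, not real package names:

```shell
# echo stands in for 'apt-cache show'; a..e are placeholder package names.

# Without -n, xargs packs as many names as fit into one command line:
printf '%s\0' a b c d e | xargs -0r echo
# prints one line: a b c d e

# With -n 2, xargs starts a new command for every 2 names:
printf '%s\0' a b c d e | xargs -0r -n 2 echo
# prints three lines: "a b", "c d", "e"
```

With -n 1000 the script would simply run apt-cache once per 1000 names instead of once per full command line.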

Also try adding some print statements before and after each "section" of code
in the script (i.e. before each open() and after each close(), and before
and after each loop) to pinpoint which part of the script is failing on your
system.  Add '$|=1;' (without quotes) near the top of the script, before
any print statements, to ensure that STDOUT is unbuffered.

I'd do that myself, but there's not much point since it doesn't happen on any
of my systems.  If you're not comfortable making changes to perl code, I've
attached a version of update-dpkg-list with all these changes.


You'll have to do the next suggestion yourself:

Also worth trying is generating a list of your 20K+ package names and feeding
that into 'xargs apt-cache show' from bash, to test whether the problem is in
apt-cache or in perl's IPC::Open2.  The update-dpkg-list script doesn't do
anything unusual, it just runs 'apt-cache show' and collects its output into
an array.
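
A rough sketch of that test, assuming the same selection rule the script uses for its @unknown list (version or architecture shown as '<none>' in 'dpkg -l' output). The function names list_unknown and run_show and the --run switch are made up for this example:

```shell
#!/bin/bash
# Sketch only: list_unknown, run_show and --run are hypothetical names.

# Read 'dpkg -l "*"' output on stdin and print the names of packages whose
# version or architecture is "<none>", with any ":arch" suffix stripped.
list_unknown() {
  awk '/^[uihrp][ncHUFWti]/ && ($3 == "<none>" || $4 == "<none>") {
         sub(/:.*/, "", $2); print $2
       }'
}

# Read names (one per line) on stdin and hand them to the given command via
# xargs, NUL-separated, the same way update-dpkg-list does.
run_show() {
  tr '\n' '\0' | xargs -0r "$@"
}

# Only touch dpkg and apt-cache when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  names=$(mktemp)
  dpkg -l '*' | list_unknown > "$names"
  echo "$(wc -l < "$names") names to look up"
  run_show apt-cache show < "$names" > /dev/null
  echo "xargs exit status: $?"
  rm -f "$names"
fi
```

Saved as a script and run with --run, this reads the names from a file, so nothing is writing to xargs and reading from apt-cache over two pipes at once. If it finishes, apt-cache itself copes with the names; if it hangs too, the problem isn't in the perl side.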

BTW, the package list doesn't really need to be NUL separated. Using newline
(option "-d '\n'" with xargs instead of `-0`) or even spaces (xargs default)
as the separator for package names is fine, as package names don't and can't
contain whitespace.
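
A quick illustration of the three separators, again with echo standing in for 'apt-cache show' and a, b, c as placeholder names:

```shell
# a, b, c are placeholder package names; echo stands in for apt-cache show.

# NUL-separated, as the script does now:
printf '%s\0' a b c | xargs -0r echo

# newline-separated (-d is a GNU xargs option):
printf 'a\nb\nc\n' | xargs -r -d '\n' echo

# whitespace-separated (the xargs default):
printf 'a b c\n' | xargs -r echo
```

All three run 'echo a b c'.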

craig
#!/usr/bin/perl

use strict;
use warnings;
use IPC::Open2;
use File::Basename;

$|=1;

my $program = basename($0);

my %packages = ();

my $myarch = qx(dpkg --print-architecture);
chomp $myarch;

# pre-declare subroutines (see below for implementation)
use subs qw(parse_pkg);

print "Getting details from dpkg...\n";
# get details for all packages known by dpkg
open(DPKG,'-|','dpkg -l "*"');
while(<DPKG>) {
  next unless (m/^[uihrp][ncHUFWti]/);
  chomp;

  my ($status,$pkg,$version,$arch,$desc) = split /\s+/,$_,5;
  $pkg =~ s/:.*//;

  $packages{"$pkg"}->{$arch}->{'status'} = sprintf('%-3s',$status);
  $packages{"$pkg"}->{$arch}->{'version'} = $version;
  $packages{$pkg}->{$arch}->{'desc'} = $desc;
}
close(DPKG);
print "done.\n";

# now get missing details for uninstalled packages
$/='';
my $pkgfiles='/var/lib/dpkg/status /var/lib/dpkg/available';
my $fields= join(',',qw(Package Description Architecture Version));

print "Getting missing details from grep-dctrl...\n";
open(DCTRL,'-|',"grep-dctrl -e . -s $fields $pkgfiles");
while(<DCTRL>) {
  parse_pkg('DCTRL',$_);
};
close(DCTRL);
print "done.\n";


# as a last resort, try to get missing details from apt-cache show
# if arch or version is '<none>'
my @unknown = ();
foreach my $pkg (keys %packages) {
  foreach my $arch (keys %{ $packages{$pkg} } ) {

    push @unknown, $pkg if (
            $arch eq '<none>'
         || $packages{$pkg}->{$arch}->{'version'} eq '<none>'
        );
  };
};

print "Running xargs apt-cache show...\n";
if (@unknown) {
  # apt-cache doesn't read package names from stdin, so we use xargs to make
  # sure we never exceed the system's command line length limit.
  #my $pid = open2(\*ACS, \*XARGS, 'xargs -0r apt-cache show');
  my $pid = open2(\*ACS, \*XARGS, 'xargs -0r -n 1000 apt-cache show');

  print XARGS join("\0",@unknown);
  close(XARGS);

  while (<ACS>) {
    parse_pkg('ACS',$_);
  };
  close(ACS);
};
print "done.\n";

my $dlist = '/var/lib/dlocate/dpkg-list';

print "Writing to $dlist...\n";
open(DPKGLIST,'>', "$dlist.new")
  or die "$program: couldn't open $dlist.new for write: $!\n";
foreach (sort keys %packages) {
  foreach my $arch (sort keys %{ $packages{$_} } ) {
    next if ($arch eq '<none>');
    my $pkg = ($arch =~ m/^($myarch|all)$/io) ? $_ : "$_:$arch";

    printf DPKGLIST "%s\t%s\t%s:%s\t%s\n",
    #printf DPKGLIST "%s\t%s\t%s\t%s\t%s\n",
      $packages{$_}->{$arch}->{'status'},
      $pkg,
      $packages{$_}->{$arch}->{'version'}, $arch,
      $packages{$_}->{$arch}->{'desc'};

  };
};
close(DPKGLIST);
rename("$dlist.new", $dlist);
print "done.\n";


###
### subroutines
###

sub parse_pkg {
  my $calltype = shift;
  my ($pkg,$desc,$status,$version,$arch) =
    ('','(no description available)','un ','','');

  # split package details by newline
  foreach (split /\n/,$_) {
    next unless (m/^(Package|Description(?:-..)?|Architecture|Version):/o);

    my ($field, $val) = split /: /,$_,2;
    if ($field eq 'Package') {
      $pkg     = $val ;
    } elsif ($field =~ m/Description(?:-..)?/io) {
      $desc    = $val;
    } elsif ($field eq 'Version') {
      $version = $val;
    } elsif ($field eq 'Architecture') {
      $arch    = $val;
    };
  };

  #$desc = "$calltype $desc";

  return unless ($pkg && $arch);
  return if ($arch ne $myarch && !defined($packages{$pkg}->{$arch}));

  $packages{$pkg}->{$arch}->{'desc'} = $desc;

  if (!defined($packages{$pkg}->{$arch}->{'status'})) {
    $packages{$pkg}->{$arch}->{'status'}  =  'un ';
  };

  if (    ! defined($packages{$pkg}->{$arch}->{'version'})
       || $packages{$pkg}->{$arch}->{'version'} eq '<none>'
     ) {
    $packages{$pkg}->{$arch}->{'version'} =  $version;
  };

  if (!defined($packages{$pkg}->{$arch})) {
    $packages{$pkg}->{$arch}->{'arch'}    =  $arch;
  };
};
