Package: libdpkg-perl Version: 1.16.9 Severity: minor Hi,
Iterating over the fields in a Dpkg::Control object (e.g. keys %{$dctrl}) has an O(n^2) performance. This is apparent in the code, where FIRSTKEY and NEXTKEY loops over the "in-order" list (hinting performance of at least O(n*m) code): """ sub FIRSTKEY { my $self = shift; my $parent = $self->[1]; foreach (@{$parent->{'in_order'}}) { return $_ if exists $self->[0]->{lc($_)}; } } sub NEXTKEY { my ($self, $last) = @_; my $parent = $self->[1]; my $found = 0; foreach (@{$parent->{'in_order'}}) { if ($found) { return $_ if exists $self->[0]->{lc($_)}; } else { $found = 1 if $_ eq $last; } } return undef; } """ The attached script can be used to reproduce the behaviour. In my case, I had to run it with: $ perl perf-iter.pl 400 50 $ perl perf-iter.pl 800 50 $ perl perf-iter.pl 1600 50 to see it and obtain relable results (ymmv). The O(n^2) behaviour is visible in that whenever the number of fields double the rate will be reduced by a factor 4. I tried comparing it to the Perl built-in hashref, but even for 10000 fields I did not get reliable results. My guess that the Perl hash is faster than O(n^2) - likely O(n), but honestly I don't know. The bug is of minor severity, because it takes an "obscence" number of fields before the O(n^2) behaviour degrades the performance visibly in the benchmark[1]. ~Niels [1] Over 100 fields AFAICT.
#!/usr/bin/perl # # $ perl perf-iter.pl 400 50 # Benchmark: timing 50 iterations of dpkg-control... # dpkg-control: 1 wallclock secs ( 0.50 usr + 0.00 sys = 0.50 CPU) @ 100.00/s (n=50) # Rate dpkg-control # dpkg-control 100.0/s -- # $ perl perf-iter.pl 800 50 # Benchmark: timing 50 iterations of dpkg-control... # dpkg-control: 2 wallclock secs ( 1.94 usr + 0.00 sys = 1.94 CPU) @ 25.77/s (n=50) # Rate dpkg-control # dpkg-control 25.8/s -- # $ perl perf-iter.pl 1600 50 # Benchmark: timing 50 iterations of dpkg-control... # dpkg-control: 8 wallclock secs ( 7.70 usr + 0.00 sys = 7.70 CPU) @ 6.49/s (n=50) # Rate dpkg-control # dpkg-control 6.49/s -- # use strict; use warnings; use Benchmark qw(cmpthese timethese); use Dpkg::Control; my ($fields, $count) = @ARGV; $fields //= 2000; $count//= 10; my $dctrl = Dpkg::Control->new; my $hashref = {}; for (my $i = 0; $i < $fields ; $i++) { my $fname = "Field-$i"; $dctrl->{$fname} = "Value $i"; $hashref->{$fname} = "Value $i"; } my $bench = timethese ($count, { # 'hashref' => sub { # foreach my $f (keys %{$hashref}) { # } # }, 'dpkg-control' => sub { foreach my $f (keys %{$dctrl}) { } }, }); cmpthese $bench; exit 0;