Package: libdpkg-perl
Version: 1.16.2
Severity: normal
Tags: patch

When I run a Perl script that repeatedly creates unreferenced Dpkg::Control objects, the perl process (as shown in "top") consumes memory without limit.

A one-line sample:

  perl -MDpkg::Control -e 'Dpkg::Control->new while 1'

I would expect a script like this to have a constant memory usage, as the Dpkg::Control objects are garbage-collected soon after being created. What I find, though is that after running for thirty seconds, perl has consumed over 100 MB of memory.

By contrast, the same test using Dpkg::Index consumes a constant 6 MB.

This problem effectively means that a process can't operate on a large number of Dpkg::Control objects sequentially. I discovered this when I wrote a program that iterated over all the packages in every current Ubuntu release and my system ran out of memory.

The cause of the problem appears to the a circular reference between a Dpkg::Control::Hash and its contained tied hash. I've attached a patch that explicitly breaks this loop when a Dpkg::Control::Hash is destroyed, following the advice in perlobj(1). This appears to solve the memory leak and to pass "debian/rules check".

--
Ben Harris, University of Cambridge Computing Service.

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 3.2.0-2-686-pae (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/bash

Versions of packages libdpkg-perl depends on:
ii  dpkg              1.16.2
ii  libtimedate-perl  1.2000-1
ii  perl              5.14.2-9

Versions of packages libdpkg-perl recommends:
ii  bzip2     1.0.6-1
ii  xz-utils  5.1.1alpha+20110809-3

Versions of packages libdpkg-perl suggests:
ii  binutils        2.22-6
ii  debian-keyring  2012.02.22
ii  gnupg           1.4.12-4
ii  gpgv            1.4.12-4
ii  patch           2.6.1-3

-- no debconf information
--- /usr/share/perl5/Dpkg/Control/Hash.pm	2012-03-19 08:16:14.000000000 +0000
+++ /home/bjh21/dmerge/Dpkg/Control/Hash.pm	2012-04-16 15:07:50.733028819 +0100
@@ -119,6 +119,16 @@
     return $self;
 }
 
+# There is naturally a circular reference between the tied hash and its
+# containing object.  Happily, the extra layer of scalar reference can
+# be used to detect the destruction of the object and break the loop so
+# that everything gets garbage-collected.
+
+sub DESTROY {
+    my ($self) = @_;
+    delete $$self->{'fields'};
+}
+
 =item $c->set_options($option, %opts)
 
 Changes the value of one or more options.
@@ -392,9 +402,10 @@
 }
 
 # $self->[0] is the real hash
-# $self->[1] is an array containing the ordered list of keys
-# $self->[2] is an hash describing the relative importance of each field
-# (used to sort the output).
+# $self->[1] is a reference to the hash contained by the parent object.
+# This reference bypasses the top-level scalar reference of a
+# Dpkg::Control::Hash, hence ensuring that that reference gets DESTROYed
+# properly.
 
 # Dpkg::Control::Hash->new($parent)
 #
@@ -412,7 +423,7 @@
     my ($class, $parent) = @_;
     die "Parent object must be Dpkg::Control::Hash"
         if not $parent->isa("Dpkg::Control::Hash");
-    return bless [ {}, $parent ], $class;
+    return bless [ {}, $$parent ], $class;
 }
 
 sub FETCH {
@@ -427,7 +438,7 @@
     my $parent = $self->[1];
     $key = lc($key);
     if (not exists $self->[0]->{$key}) {
-	push @{$$parent->{'in_order'}}, field_capitalize($key);
+	push @{$parent->{'in_order'}}, field_capitalize($key);
     }
     $self->[0]->{$key} = $value;
 }
@@ -441,7 +452,7 @@
 sub DELETE {
     my ($self, $key) = @_;
     my $parent = $self->[1];
-    my $in_order = $$parent->{'in_order'};
+    my $in_order = $parent->{'in_order'};
     $key = lc($key);
     if (exists $self->[0]->{$key}) {
 	delete $self->[0]->{$key};
@@ -455,7 +466,7 @@
 sub FIRSTKEY {
     my $self = shift;
     my $parent = $self->[1];
-    foreach (@{$$parent->{'in_order'}}) {
+    foreach (@{$parent->{'in_order'}}) {
 	return $_ if exists $self->[0]->{lc($_)};
     }
 }
@@ -464,7 +475,7 @@
     my ($self, $last) = @_;
     my $parent = $self->[1];
     my $found = 0;
-    foreach (@{$$parent->{'in_order'}}) {
+    foreach (@{$parent->{'in_order'}}) {
 	if ($found) {
 	    return $_ if exists $self->[0]->{lc($_)};
 	} else {

Reply via email to