Package: libgdbm6
Version: 1.18.1-4
Severity: normal
Tags: upstream patch

Dear Maintainer,

We are using gdbm through Python and discovered that on Debian buster our
project started to leak file descriptors. After investigating the issue we
concluded that it is caused by a bug in the libgdbm6 library.

To reproduce, run the following commands in a debian:buster Docker image:

apt update && apt install -y python3 python3-gdbm lsof
cat >/tmp/repro.py <<EOL
#!/usr/bin/python3
import dbm.gnu  # IMPORTANT
import os
import time
import subprocess


def main():
    db_file = "/tmp/ti-db-hanging.db"

    # call reorganize() multiple times
    for _ in range(5):
        db = dbm.open(os.path.abspath(db_file), "cf")
        db.reorganize()
        time.sleep(1)
        db.close()

    time.sleep(5)

    # check if we still hold any file handles for db_file even after closing it
    pid = os.getpid()
    x = subprocess.run(f"lsof -p {pid} | grep {db_file}", shell=True,
                       stdout=subprocess.PIPE)
    # grep exits with 1 when nothing matches, which is the expected result
    assert x.returncode in (0, 1)
    res = x.stdout.decode()
    if not res:
        print("check open files: ok")
    else:
        print(res)


if __name__ == "__main__":
    main()
EOL
python3 /tmp/repro.py

At the end of the script, lsof should not report any open file handles for
the database file, but it shows 5 of them.
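
Where lsof is not available, the same check can be approximated by reading
/proc directly. The following is only a sketch under stated assumptions: it
is Linux-only, the helper name references_to is made up for illustration, and
it simply matches the path as a substring in the descriptor targets and in
the memory map listing.

```python
import os
import tempfile


def references_to(path, pid="self"):
    """List descriptors and mappings of process `pid` that mention `path`."""
    found = []
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        if path in target:
            found.append(f"fd {name} -> {target}")
    # Memory mappings are listed separately from descriptors.
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            if path in line:
                found.append("map " + line.strip())
    return found


if __name__ == "__main__":
    # Scratch file used only to show the helper working.
    fd, path = tempfile.mkstemp()
    print(references_to(path))   # one "fd ..." entry while the file is open
    os.close(fd)
    os.unlink(path)
    print(references_to(path))   # nothing left after closing
```

In the reproducer, calling references_to(db_file) after the last db.close()
would play the role of the lsof | grep pipeline.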

Short summary of the issue:
When gdbm reorganize is called from Python, it ends up calling gdbm_recover
in recover.c. This creates a new temporary file, into which the contents of
the current file are copied (after some processing). Once this is done, the
new file is renamed over the old one. When mmap is used, the mapping for the
temporary file is only removed in error situations. As a result, the process
keeps handles on DELETED files when it runs reorganize more than once in its
lifetime.
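
The underlying effect can be seen in isolation, without gdbm. The sketch
below is Linux-only and purely illustrative: the function names and the
scratch file are assumptions, not part of gdbm. It maps a file, closes the
descriptor, deletes the file, and then looks in /proc/self/maps to see
whether the process still references it.

```python
import mmap
import os
import tempfile


def mapping_lines(path):
    """Return the lines of /proc/self/maps that mention `path` (Linux only)."""
    with open("/proc/self/maps") as maps:
        return [line.rstrip() for line in maps if path in line]


def demo():
    # Scratch file standing in for the temporary database file.
    fd, path = tempfile.mkstemp(prefix="gdbm-map-demo-")
    os.write(fd, b"x" * mmap.PAGESIZE)

    # Map the file, then close our descriptor and delete the file.
    # (CPython's mmap object holds its own duplicate of the descriptor.)
    region = mmap.mmap(fd, mmap.PAGESIZE)
    os.close(fd)
    os.unlink(path)

    # The mapping still pins the deleted file.
    while_mapped = mapping_lines(path)

    # Only dropping the mapping releases it.
    region.close()
    after_unmap = mapping_lines(path)
    return while_mapped, after_unmap


if __name__ == "__main__":
    before, after = demo()
    print("while mapped:", before)
    print("after unmap: ", after)
```

As long as the mapping exists, the kernel lists the file as deleted but still
in use; the entry disappears only once the region is unmapped, which is the
step that is missing on the success path in recover.c.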

To fix the issue I prepared the following patch for gdbm:

When mmap is used, the memory mapping of the temporary file is not removed
in the recover function. This can leave open file descriptors to deleted
files if the recover function is called multiple times.
--- a/src/recover.c
+++ b/src/recover.c
@@ -168,15 +168,20 @@
    dbf->bucket_changed    = new_dbf->bucket_changed;
    dbf->second_changed    = new_dbf->second_changed;
 
-   free (new_dbf->name);
-   free (new_dbf);
-   
  #if HAVE_MMAP
    /* Re-initialize mapping if required */
    if (dbf->memory_mapping)
      _gdbm_mapped_init (dbf);
+
+   /* remove the old memory mapping to the temporary file name */
+   if (new_dbf->mapped_region){
+     _gdbm_mapped_unmap(new_dbf);
+  }
  #endif
 
+   free (new_dbf->name);
+   free (new_dbf);
+
    /* Make sure the new database is all on disk. */
    gdbm_file_sync (dbf);
 



-- System Information:
Debian Release: 10.7
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.4.0-66-generic (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: unable to detect

Versions of packages libgdbm6 depends on:
ii  libc6  2.28-10

libgdbm6 recommends no packages.

Versions of packages libgdbm6 suggests:
pn  gdbm-l10n  <none>

-- no debconf information
