Package: libgdbm6
Version: 1.18.1-4
Severity: normal
Tags: upstream patch

Dear Maintainer,

We are using gdbm through Python and discovered that on Debian buster our project started to leak file descriptors. After investigating the issue, we came to the conclusion that it is caused by a bug in the libgdbm6 library.

To reproduce, run the following commands in a debian:buster Docker image:

apt install -y python3 python3-gdbm lsof
cat >/tmp/repro.py <<EOL
#!/usr/bin/python3
import dbm.gnu  # IMPORTANT
import os
import time
import subprocess


def main():
    db_file = "/tmp/ti-db-hanging.db"

    # call reorganize() multiple times
    for _ in range(5):
        db = dbm.open(os.path.abspath(db_file), "cf")
        db.reorganize()
        time.sleep(1)
        db.close()
        time.sleep(5)

    # check if we still hold any file handles for the db_file even after closing it
    pid = os.getpid()
    x = subprocess.run(f"lsof -p {pid} | grep {db_file}", shell=True, stdout=subprocess.PIPE)
    # grep exits with 1 when nothing matches, which is the expected result without the leak
    assert x.returncode in (0, 1)
    res = x.stdout.decode()
    if not res:
        print("check open files: ok")
    else:
        print(res)


if __name__ == "__main__":
    main()
EOL
python3 /tmp/repro.py

At the end of the script, lsof should not report any open file handles for the database file, but it shows 5 of them.

Short summary of the issue:

When gdbm reorganize is called from Python, it ends up calling gdbm_recover in recover.c. This creates a new temporary file into which the contents of the current file are copied (after some processing). Once that is done, the new file is renamed to the old one. When mmap is used, the mapping for the temporary file is only removed in error situations. This results in file handles held on DELETED files when the process runs reorganize more than once in its lifetime.

To fix the issue I prepared the following patch on gdbm:

In case mmap is used, the memory mapping in the recover function is not removed. This can leave open file descriptors to deleted files if the recover function is called multiple times.

--- a/src/recover.c
+++ b/src/recover.c
@@ -168,15 +168,20 @@
   dbf->bucket_changed = new_dbf->bucket_changed;
   dbf->second_changed = new_dbf->second_changed;

-  free (new_dbf->name);
-  free (new_dbf);
-
 #if HAVE_MMAP
   /* Re-initialize mapping if required */
   if (dbf->memory_mapping)
     _gdbm_mapped_init (dbf);
+
+  /* remove the old memory mapping to the temporary file name */
+  if (new_dbf->mapped_region){
+    _gdbm_mapped_unmap(new_dbf);
+  }
 #endif

+  free (new_dbf->name);
+  free (new_dbf);
+
   /* Make sure the new database is all on disk. */
   gdbm_file_sync (dbf);


-- System Information:
Debian Release: 10.7
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.4.0-66-generic (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: unable to detect

Versions of packages libgdbm6 depends on:
ii  libc6  2.28-10

libgdbm6 recommends no packages.

Versions of packages libgdbm6 suggests:
pn  gdbm-l10n  <none>

-- no debconf information
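
As a side note on the mechanism described in the summary: the general effect (a memory mapping keeping a deleted file referenced until it is unmapped) can be sketched independently of gdbm. The snippet below is only an illustration under assumptions, not gdbm's code: it is Linux-only because it reads /proc/self/maps, the scratch file is created with tempfile, and the helper names mappings_for and demo are made up for this example.

```python
import mmap
import os
import tempfile


def mappings_for(path):
    """Return the lines of /proc/self/maps that mention `path` (Linux only)."""
    with open("/proc/self/maps") as maps:
        return [line.rstrip() for line in maps if path in line]


def demo():
    # Create a one-page scratch file; the name is arbitrary and only for illustration.
    fd, path = tempfile.mkstemp(suffix=".db")
    path = os.path.realpath(path)
    os.write(fd, b"\0" * mmap.PAGESIZE)

    # Map the file, then close our descriptor and remove the file name.
    region = mmap.mmap(fd, mmap.PAGESIZE)
    os.close(fd)
    os.unlink(path)

    # The file is gone from the directory, but the process still references it.
    before = mappings_for(path)

    # Unmapping releases the reference, analogous to the unmap call the patch adds.
    region.close()
    after = mappings_for(path)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("while mapped:", before)
    print("after unmap: ", after)
```

While the region is mapped, the maps entry for the scratch file carries a "(deleted)" suffix; once the mapping is closed, the entry disappears. This is the same kind of leftover reference that lsof reports for the reorganized database.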