I think I can confirm the bug is due to the flag I mentioned in the report.

Using the `berkeleydb` package, I wrote the following Python 3 script, migrate.py:


#!/usr/bin/env python3

import os
import sys
from berkeleydb import db as bdb


def get_paths(db_dir):
    db_dir = os.path.realpath(db_dir, strict=True)
    old_db_file = os.path.join(db_dir, 'references.db')
    new_db_file = old_db_file+'.new'
    old_db_backup_file = old_db_file+'.old'
    return (db_dir, old_db_file, new_db_file, old_db_backup_file)


def create_new_db(filename):
    new_db = bdb.DB()
    new_db.set_flags(bdb.DB_DUPSORT)
    new_db.open(filename, 'references', bdb.DB_BTREE, bdb.DB_CREATE)
    return new_db


def open_old_db(filename):
    old_db = bdb.DB()
    old_db.open(filename, 'references')
    return old_db


def copy_old_to_new(old_db, new_db):
    for (key, value) in old_db.items():
        new_db.put(key, value)


def print_usage(file, exit_code):
    print('Usage: migrate.py db_dir', file=file)
    sys.exit(exit_code)


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print_usage(sys.stderr, 1)
    elif sys.argv[1] in ['-h', '--help']:
        print_usage(sys.stdout, 0)

    (db_dir, old_db_file, new_db_file,
     old_db_backup_file) = get_paths(sys.argv[1])

    # refuse to run if a backup from a previous run would be overwritten
    if os.path.exists(old_db_backup_file):
        print(
            f'Error: cannot start as a references.db.old file already exists in {db_dir}',
            file=sys.stderr)
        sys.exit(1)

    new_db = create_new_db(new_db_file)
    old_db = open_old_db(old_db_file)
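    copy_old_to_new(old_db, new_db)

    # close both handles so everything is flushed to disk before renaming
    old_db.close()
    new_db.close()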

    # store the old references.db file as references.db.old as a backup
    os.rename(old_db_file, old_db_backup_file)
    os.rename(new_db_file, old_db_file)


This replaces the references.db file with a copy that has the same contents
but is created with the DUPSORT flag instead of DUP. If I pass the old file and
the new one to `db_dump -p`, the output is identical except for the new
`dupsort=1` line. Their stats (`db_stat`) also appear to be the same.
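
The comparison of the two `db_dump -p` outputs can also be scripted. The
following is only a minimal sketch: it assumes the dump starts with
`name=value` header lines terminated by `HEADER=END`, and the two dump excerpts
are hypothetical samples I made up to illustrate this, not real output from my
repository. The function names are mine.

```python
def parse_dump_header(dump_text):
    """Collect the name=value lines that precede the first HEADER=END."""
    header = {}
    for line in dump_text.splitlines():
        if line.strip() == 'HEADER=END':
            break
        name, sep, value = line.partition('=')
        if sep:
            header[name.strip()] = value.strip()
    return header


def header_differences(old_text, new_text):
    """Map each header field that differs to its (old, new) values."""
    old = parse_dump_header(old_text)
    new = parse_dump_header(new_text)
    return {name: (old.get(name), new.get(name))
            for name in sorted(old.keys() | new.keys())
            if old.get(name) != new.get(name)}


# Hypothetical dump excerpts, for illustration only.
OLD_DUMP = """VERSION=3
format=print
type=btree
duplicates=1
db_pagesize=4096
HEADER=END
 somekey
 somevalue
DATA=END
"""
NEW_DUMP = OLD_DUMP.replace('duplicates=1\n', 'duplicates=1\ndupsort=1\n')

print(header_differences(OLD_DUMP, NEW_DUMP))
```

Only the first header is parsed, so a dump containing several databases would
need the function to be called once per section.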

Turning the dupsort flag on using the above script in a references.db file
created with reprepro v5.4.2, and then including two source packages which
share the same upstream package, does indeed yield the DB_KEYEXIST error I
reported. So that flag does appear to be what causes this issue.
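
To make the difference between the two flags concrete, here is a toy model in
plain Python. It is not Berkeley DB and not what reprepro does internally; it
only mimics my understanding that a database with unsorted duplicates (DUP)
accepts the same key/data pair twice, while one with sorted duplicates
(DUPSORT) rejects an identical pair. The class, the exception and the key and
value used are all hypothetical, and I am assuming that the second source
package ends up storing a pair that is already present.

```python
class KeyExistError(Exception):
    """Stand-in for Berkeley DB's DB_KEYEXIST in this toy model."""


class ToyDupStore:
    """Very small model of a key -> list of duplicate values store."""

    def __init__(self, sorted_dups):
        self.sorted_dups = sorted_dups
        self.records = {}

    def put(self, key, value):
        values = self.records.setdefault(key, [])
        if self.sorted_dups:
            # sorted duplicates: an identical key/value pair is refused
            if value in values:
                raise KeyExistError(f'{key!r}/{value!r} already stored')
            values.append(value)
            values.sort()
        else:
            # unsorted duplicates: the value is simply appended
            values.append(value)


# Hypothetical key and value, for illustration only.
KEY, VALUE = b'some-key', b'some-value'

dup_store = ToyDupStore(sorted_dups=False)
dup_store.put(KEY, VALUE)
dup_store.put(KEY, VALUE)   # accepted, the pair is now stored twice

dupsort_store = ToyDupStore(sorted_dups=True)
dupsort_store.put(KEY, VALUE)
try:
    dupsort_store.put(KEY, VALUE)
    rejected = False
except KeyExistError:
    rejected = True         # the identical pair is refused

print(len(dup_store.records[KEY]), rejected)
```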

Note that the script, as written above, breaks a v5.4.2 references.db; it
does not fix one created with v5.3.1! To fix the latter, one needs to replace


def create_new_db(filename):
    new_db = bdb.DB()
    new_db.set_flags(bdb.DB_DUPSORT) # FLAG HERE
    new_db.open(filename, 'references', bdb.DB_BTREE, bdb.DB_CREATE)
    return new_db


with


def create_new_db(filename):
    new_db = bdb.DB()
    new_db.set_flags(bdb.DB_DUP) # FLAG HERE
    new_db.open(filename, 'references', bdb.DB_BTREE, bdb.DB_CREATE)
    return new_db


so that the database file is created with the DUP flag used by v5.4.2 instead
of DUPSORT. I checked that, after changing the flag, the old database is able
to store source packages with the same *.orig.tar.gz/.xz file without showing
undesired behavior (as far as I can tell).

Thus it looks like rewriting the references.db file with the DUP flag instead
of DUPSORT upon first opening of the database, after an upgrade of reprepro,
may be sufficient to fix this bug.
