Hi everybody, Following Ludwig's and Mark's suggestions on how to perform a database dump in LDIF format from dbscan, I have come up with a strategy. I'm talking about ticket #47567: https://pagure.io/389-ds-base/issue/47567
I'd like to discuss this strategy and get some feedback. The general idea is: - We are cursing though id2entry.db printing entries in order. - Parents should be printed before children. - Hence if for some entry parenitd > entryid, we have to print its parent first (out of order) and track that we did so. - In order to do the above, we don't need to move the db cursor. We can just go fetch something from a random place in the db and the cursor will remain in its place, so we can continue from where we left off after we're done with printing a parent. - We also need to construct and print the DN of each entry using its RDN and the DN of its father. - Let's use something like a hash map to pair entryids with dns (only for entries that has children), e.g. map[1] = "dc=example,dc=com", map[4] = "ou=People,dc=example,dc=com", etc. I'll present the algorithm that I came up with in python-like pseudo-code. First, the following function constructs the entry's DN and updates the hash map if needed. We can know whether an entry is a parent or not, by the presence of the numSubordinates attribute. # assumes that parentid already exists in the map function display_entry(e, map): if not e.parentid: e.dn = e.rdn else: e.dn = e.rdn + map[e.parentid] if isparent(e): map[e.entryid] = e.dn print_to_ldif_format(e) Then, the main loop: map = new(hashmap) printed_in_advance = [] for e in entries: if e.entryid in printed_in_advance: continue # skip this, we have already displayed it if e.parentid < e.entryid: display_entry(e, map) else: # we need to display parents before children list_of_parents = [] p = e while p.parentid > p.entryid: p = get_from_db(key = p.parentid) list_of_parents.append(p) # see note below (*) for p in reversed(list_of_parents): display_entry(p, map) printed_in_advance.append(p.entryid) * here we store the whole entry in the list (aka its data) and not just its id, in order to avoid fetching it from the database again As a map, we can use a B+ tree implementation from libsds. I would like to know how the above idea sounds to you before going ahead and implementing it. Thank you! Ilias
_______________________________________________ 389-devel mailing list -- 389-devel@lists.fedoraproject.org To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org