On 08/04/2017 02:08 PM, Ilias Stamatis wrote:
Okay, now that I have read and understood dbscan's code, I have a few
more questions.
2017-08-03 10:10 GMT+03:00 Ludwig Krispenz <lkris...@redhat.com
<mailto:lkris...@redhat.com>>:
Hi, now that I know the context, here are some more comments.
If the purpose is to create a useful ldif file, which could
eventually be used for import, then formatting an entry correctly
is not enough. The order of entries matters: parents need to come
before children. We already handle this in db2ldif or replication
total update.
That said, whenever you write an entry you have always already seen
the parent, so you could stack the dn with the parentid and create the
dn without using the entryrdn index.
You do not even need to keep track of all the entry rdns/dns: only
the ones with children will be needed later, and the presence of
"numsubordinates" identifies a parent.
Is it guaranteed that parents are going to appear before children in
id2entry.db?
No. That's what I said before: it is possible that parentid > entryid.
It happens if an entry is moved by modrdn to another subtree.
If so, here's what could probably work:
- Start reading entries from id2entry sequentially.
- For each entry, if it has a numSubordinates attribute, it is a
parent of other entries, so we can store its ID - DN pair in a
hash map.
- For entries that have a parentid, whose parent's DN we therefore
need to figure out, we just look up hashmap[parentid].
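The steps above can be sketched roughly as follows. This is only an illustration in Python, not dbscan's C code: the Entry tuple, its field names and the build_dns function are hypothetical, and the sketch assumes every parent is read before its children and that a suffix entry carries its full DN as its rdn.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    """Hypothetical, simplified view of one id2entry record."""
    entryid: int
    parentid: Optional[int]  # None for a suffix entry (assumption)
    rdn: str
    numsubordinates: int     # 0 when the attribute is absent


def build_dns(entries: Iterable[Entry]) -> Iterator[Tuple[int, str]]:
    """Yield (entryid, dn) pairs while reading entries sequentially."""
    parent_dns: Dict[int, str] = {}  # the ID -> DN hash map, parents only
    for e in entries:
        if e.parentid is None:
            dn = e.rdn
        else:
            # Raises KeyError if the parent has not been seen yet.
            dn = f"{e.rdn},{parent_dns[e.parentid]}"
        if e.numsubordinates > 0:
            # Only entries with children are needed later.
            parent_dns[e.entryid] = dn
        yield e.entryid, dn
```

Leaf entries never enter the map, so its size is bounded by the number of parents rather than by the number of entries.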
To make it even more efficient (only if really needed, because it
will make things more complicated) we can somehow store the value of
numSubordinates with each parent in the map as well. Every
time a parentid is looked up in the map we can decrease the value of
numSubordinates by 1. When it becomes 0, it means there are no more
children of this ID, so we can safely remove it from the map.
However, I don't know if we would really need this last thing. In a
100 million entry db how many parents would we expect to have
approximately?
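The numSubordinates countdown could look something like the sketch below. Again this is hypothetical Python, not existing code: Entry and resolve_with_eviction are made-up names, and it assumes that numSubordinates counts exactly the direct children present in the file and that parents are read before their children.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    """Hypothetical, simplified view of one id2entry record."""
    entryid: int
    parentid: Optional[int]  # None for a suffix entry (assumption)
    rdn: str
    numsubordinates: int     # 0 when the attribute is absent


def resolve_with_eviction(
    entries: Iterable[Entry],
) -> Tuple[List[Tuple[int, str]], Dict[int, list]]:
    """Return the (entryid, dn) pairs and whatever is left in the map."""
    parents: Dict[int, list] = {}  # entryid -> [dn, children still expected]
    dns: List[Tuple[int, str]] = []
    for e in entries:
        if e.parentid is None:
            dn = e.rdn
        else:
            slot = parents[e.parentid]
            dn = f"{e.rdn},{slot[0]}"
            slot[1] -= 1
            if slot[1] == 0:
                # Last child seen: the parent's DN is no longer needed.
                del parents[e.parentid]
        if e.numsubordinates > 0:
            parents[e.entryid] = [dn, e.numsubordinates]
        dns.append((e.entryid, dn))
    return dns, parents
```

If the counts are accurate the map is empty once the last entry has been read; anything left over would point to a parent whose children were not all found.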
Also, do we have a hash map implemented somewhere?
If parents are not guaranteed to appear before children in
id2entry.db, then we would have to alter the above strategy.
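One possible alteration, sketched only as an idea and not as what db2ldif actually does: entries whose parent has not been seen yet are parked in a second map keyed by parentid and written out as soon as that parent shows up. Entry and build_dns_any_order are hypothetical names, and a suffix entry is assumed to carry its full DN as its rdn.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    """Hypothetical, simplified view of one id2entry record."""
    entryid: int
    parentid: Optional[int]  # None for a suffix entry (assumption)
    rdn: str
    numsubordinates: int     # 0 when the attribute is absent


def build_dns_any_order(entries: Iterable[Entry]) -> List[Tuple[int, str]]:
    """Return (entryid, dn) pairs with every parent ahead of its children."""
    known: Dict[int, str] = {}                         # parent entryid -> dn
    waiting: Dict[int, List[Entry]] = defaultdict(list)  # parentid -> parked children
    out: List[Tuple[int, str]] = []
    for e in entries:
        stack = [e]
        while stack:
            cur = stack.pop()
            if cur.parentid is None:
                dn = cur.rdn
            elif cur.parentid in known:
                dn = f"{cur.rdn},{known[cur.parentid]}"
            else:
                # Parent not read yet (e.g. moved by modrdn): park the entry.
                waiting[cur.parentid].append(cur)
                continue
            if cur.numsubordinates > 0:
                known[cur.entryid] = dn
            out.append((cur.entryid, dn))
            # Release any children that were parked waiting for this entry.
            stack.extend(reversed(waiting.pop(cur.entryid, [])))
    return out
```

The output order also has parents before children, which is the ordering an ldif import needs. The cost is that parked entries have to be held in memory until their parent is read.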
Thanks!
_______________________________________________
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric
Shander