Ok, thanks for the update.  
 
>  
> On Aug 4, 2017 at 08:08, Ilias Stamatis <stamatis.ili...@gmail.com> wrote:
>  
>  
>  
> Okay, now that I have read and understood dbscan's code, I have a few more 
> questions.
>  
>  
>
>  
> 2017-08-03 10:10 GMT+03:00 Ludwig Krispenz <lkris...@redhat.com>:
>      
> > Hi, now that I know the context, here are some more comments.
> >  
> > If the purpose is to create a useful ldif file, which could eventually be 
> > used for import, then formatting an entry correctly is not enough. Order of 
> > entries matters: parents need to come before children. We already handle 
> > this in db2ldif or replication total update.
> > That said, whenever you write an entry you will already have seen its parent, 
> > so you could stack the dn with the parentid and create the dn without using 
> > the entryrdn index.
> > You do not even need to keep track of all the entry rdns/dns; only the ones 
> > with children will be needed later, and the presence of "numsubordinates" 
> > identifies a parent.
> >    
>
>  
> Is it guaranteed that parents are going to appear before children in 
> id2entry.db?
>  
>  
> If so, here's what could probably work:
>  
>  
> - Start reading entries from id2entry sequentially.
>  
> - For each entry, if it has a numSubordinates attribute it means it is a 
> parent of other entries, so we can store its ID - DN pair in a hash map.
>  
> - For entries that have a parentid, where we need to figure out the 
> parent's DN, we just look up hashmap[parentid].
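
A minimal Python sketch of the steps above, only to make the idea concrete. It assumes parents really do appear before children in id2entry.db (the open question in this thread), and the record layout (dicts with "id", "rdn", "parentid", "numsubordinates") is hypothetical, not dbscan's actual data structures.

```python
def build_dns(entries):
    """Build full DNs in a single sequential pass.

    `entries` is an iterable of dicts in id2entry order. The keys
    ("id", "rdn", "parentid", "numsubordinates") are a hypothetical
    layout chosen for this sketch.
    """
    parent_dns = {}   # entry ID -> DN, kept only for entries with children
    result = []
    for entry in entries:
        parent_id = entry.get("parentid")
        if parent_id is None:
            # No parentid: treat the entry as the suffix; its RDN is its DN.
            dn = entry["rdn"]
        else:
            # Raises KeyError if the parent has not been seen yet, i.e. if
            # the "parents come first" assumption does not hold.
            dn = entry["rdn"] + "," + parent_dns[parent_id]
        if entry.get("numsubordinates", 0) > 0:
            # Only parents need to be remembered for later lookups.
            parent_dns[entry["id"]] = dn
        result.append((entry["id"], dn))
    return result


sample = [
    {"id": 1, "rdn": "dc=example,dc=com", "numsubordinates": 1},
    {"id": 2, "rdn": "ou=People", "parentid": 1, "numsubordinates": 2},
    {"id": 3, "rdn": "uid=alice", "parentid": 2},
    {"id": 4, "rdn": "uid=bob", "parentid": 2},
]
dns = build_dns(sample)
```

If the ordering assumption turns out to be false, the KeyError is where a fallback (for example an entryrdn lookup or a deferred queue) would have to go.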
>  
>  
> To make it even more efficient (only if really needed, though, because it 
> will make things more complicated) we can also store the value of 
> numSubordinates with each parent in the map. Every time a parentid is looked 
> up in the map we can decrease the stored value by 1. When it becomes 0, it 
> means there are no more children of this ID so we can safely remove it from 
> the map.
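
A rough Python sketch of this counting variant, under two assumptions that would need checking: parents precede children in id2entry.db, and numSubordinates equals the number of direct children actually present in the file. The record layout is hypothetical, as is the function name.

```python
def build_dns_evicting(entries):
    """Single pass that drops a parent once all its children were seen.

    Returns (list of (id, dn), peak map size, leftover map). The dict
    keys used for entries are a hypothetical layout for this sketch.
    """
    pending = {}   # parent ID -> [DN, children still expected]
    result = []
    peak = 0
    for entry in entries:
        parent_id = entry.get("parentid")
        if parent_id is None:
            dn = entry["rdn"]              # suffix entry
        else:
            slot = pending[parent_id]      # KeyError if parent not seen yet
            dn = entry["rdn"] + "," + slot[0]
            slot[1] -= 1                   # one child fewer to wait for
            if slot[1] == 0:
                del pending[parent_id]     # no more children: evict
        children = entry.get("numsubordinates", 0)
        if children > 0:
            pending[entry["id"]] = [dn, children]
        peak = max(peak, len(pending))
        result.append((entry["id"], dn))
    return result, peak, pending


sample = [
    {"id": 1, "rdn": "dc=example,dc=com", "numsubordinates": 1},
    {"id": 2, "rdn": "ou=People", "parentid": 1, "numsubordinates": 2},
    {"id": 3, "rdn": "uid=alice", "parentid": 2},
    {"id": 4, "rdn": "uid=bob", "parentid": 2},
]
dns, peak, leftover = build_dns_evicting(sample)
```

With eviction the map only holds parents that are still waiting for children, so its size depends on the shape of the tree rather than on the total number of parents. Anything left in the map at the end would indicate that the stored counts and the entries actually read disagree.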
>  
>  
> However, I don't know if we would really need this last thing. In a 100 
> million entry db how many parents would we expect to have approximately?
>  
>  
> Also, do we have a hash map implemented somewhere?
>  
>  
> If parents are not guaranteed to appear before children in id2entry.db, then 
> we would have to alter the above strategy.
>    
>
>  
> Thanks!
>  
>
> _______________________________________________
> 389-devel mailing list -- 389-devel@lists.fedoraproject.org
> To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
