[Index] SubLevel removal

Emmanuel Lécharny Thu, 12 Apr 2012 06:02:20 -0700

Hi guys,

I'm currently working on the SubLevelIndex removal. This is slightlymore complex than the OneLevel index, and there are some choice to bemade in this area.

Basically, the idea is that we will first search the entry which is theparent, then start fetching all the descendant from this point.


Considering we have such a tree :

cn=A
  cn=B
  cn=C
    cn=D
      cn=G
    cn=E
  cn=F
    cn=H

then a search for entries using a SUBTREE scope and a base DN 'cn=C,cn=A' will return the following set of entries (the filter isobjectClass=*, to keep it simple) :

{
  cn=C,cn=A
  cn=D,cn=C,cn=A
  cn=G,cn=D,cn=C,cn=A
  cn=E,cn=C,cn=A
}

Note that the same search with a ONE scope would return cn=D,cn=C,cn=Aand cn=E,cn=C,cn=A, excluding the cn=C,cn=A entry, which is the base forthis search).


Now, we have two ways to get those entries :

- depth-first traversal. We fetch every entry, and if it has somechildren, then we go down one level, and so on recursively, until wehave exhausted all the descendant. The consequence is that the entrieswill be returned in this order :


  cn=C,cn=A
  cn=D,cn=C,cn=A
  cn=G,cn=D,cn=C,cn=A
  cn=E,cn=C,cn=A

- Breadth-first traversal. This time, we exhaust all the children forthe current level, and then we go down one level, read all the entries,etc. We never go down one level if all the entries at the same levelhave not been processed. The entries will be returned in this order :


  cn=C,cn=A
  cn=D,cn=C,cn=A
  cn=E,cn=C,cn=A
  cn=G,cn=D,cn=C,cn=A

The problem with the breadth-first approach is that it puts way morepressure on the memory, as we have to keep in memory all the entriesthat have children. With the depth-first approach, we just proceed witha new cursor when we have at least one children, so we will have as manycursors in memory as the tree's depth (so if we have a 10 levels tree,we will keep 10 cursors max). OTOH, the order might be a bit strange(even if there is no guarantee whatsoever that the server will returnthe entry in any given order).

IMO, the depth-first approach is the one which offers the best balancebetween performance and memory consumption. Also the return order is notthat critical assuming that we have implemented the Sort control (whichis not yet part of our implementation).


Any better idea, or any comments ?

Thanks !

--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

[Index] SubLevel removal

Reply via email to