Re: Update about subtree problems

Howard Chu Tue, 20 Jul 2010 00:30:00 -0700

Emmanuel Lecharny wrote:

Basically, sol.1 and sol.2 are both painful, but there is no perfect
solution anyway.


I thought all day about which implementation is best. Yesterday I was
convinced that solution 2 was better, but the cost of evaluating each
entry every time a client does a search is quite high (let's say up to
20% of the search cost, for a simple SS). That will not slow down the
server that much, as most of the time is consumed in the network layer
right now, but this won't last.

Now I think that the current solution may be a not-so-bad compromise,
assuming we don't do many move operations across APs on normal entries.

What bugs me here is that last year, for the opposite reason, the DN was
removed from the entries stored in the master table, just to be able to
do O(1) move operations. This is quite paradoxical! I still think that
move operations are very rare, and that storing the DN in the entries
is a net gain for most operations, except for a move...

From another perspective, at least in the OpenLDAP case, O(1) rename
operations were only one of the benefits of the latter approach. The other
was the huge improvement in scalability of the DN index, which was quite
bloated otherwise. When you see the opportunity to get more than one benefit
from a particular approach, then it becomes a more compelling choice...

A side note:
after running some perf tests on the evaluator and applying some
improvements, I can tell that, depending on the number of subentries an
entry depends on, the cost of this evaluation can go up to 50% of
the cost of the search itself (not counting the network layer). For
instance, evaluating a subtreeSpecification with a min and a max, no
chop, can be done up to 1 000 000 times per second on a 3-level DN (this
all depends on the DN size).
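
For reference, the kind of check being timed above can be sketched as
follows. This is only an illustrative sketch in Python, not the ApacheDS
evaluator: the function names are made up, the DN parsing is naive (no
escaped commas), the base is assumed to be already resolved to an absolute
DN, and chop (specificExclusions) is ignored, as in the "no chop" case.

```python
def parse_dn(dn):
    """Split a DN into RDNs, most specific first (naive: no escaped commas)."""
    return [rdn.strip().lower() for rdn in dn.split(",") if rdn.strip()]


def in_subtree(entry_dn, base_dn, minimum=0, maximum=None):
    """Return True if entry_dn lies under base_dn at a depth between
    minimum and maximum (inclusive), counted in RDNs below the base.

    Hypothetical helper for illustration only.
    """
    entry = parse_dn(entry_dn)
    base = parse_dn(base_dn)
    depth = len(entry) - len(base)
    if depth < 0:
        return False
    # The entry must end with exactly the base's RDNs.
    if base and entry[-len(base):] != base:
        return False
    if depth < minimum:
        return False
    if maximum is not None and depth > maximum:
        return False
    return True
```

The cost grows with the number of RDNs compared, which is consistent with
the remark that the figures depend on the DN size.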

IMO, the considerations here are the same as for the O(1) rename. I.e., when
you remove the entryDN from the entry in the DB, you have to calculate the DN
on the fly, and it certainly is a frequently referenced datum. You make this
cheap by caching the entryDN in memory, and it's very clear when a cached DN
must be invalidated - most of the time the cached value will not change. You
have potentially increased the cost of a read operation, but in practice the
cost is zero, while the savings on the write side is significant.
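
A rough sketch of that idea, in Python for brevity. This is not OpenLDAP
or ApacheDS code: the class, its fields and its method names are all
hypothetical. It assumes each stored entry keeps only its RDN and a parent
id, that the full DN is computed by walking up the parents, and that the
result is cached until a move or rename makes it stale.

```python
class Directory:
    """Toy store: entries hold an RDN and a parent id, never a full DN."""

    def __init__(self):
        self._rdn = {}        # entry id -> RDN string
        self._parent = {}     # entry id -> parent id (None for the root)
        self._children = {}   # entry id -> set of child ids
        self._dn_cache = {}   # entry id -> cached full DN

    def add(self, eid, rdn, parent=None):
        self._rdn[eid] = rdn
        self._parent[eid] = parent
        self._children[eid] = set()
        if parent is not None:
            self._children[parent].add(eid)

    def dn(self, eid):
        """Return the full DN, computing and caching it on a miss."""
        cached = self._dn_cache.get(eid)
        if cached is not None:
            return cached
        parent = self._parent[eid]
        if parent is None:
            full = self._rdn[eid]
        else:
            full = self._rdn[eid] + "," + self.dn(parent)
        self._dn_cache[eid] = full
        return full

    def move(self, eid, new_parent, new_rdn=None):
        """Move and/or rename: only this entry's stored record changes."""
        old_parent = self._parent[eid]
        if old_parent is not None:
            self._children[old_parent].discard(eid)
        self._children[new_parent].add(eid)
        self._parent[eid] = new_parent
        if new_rdn is not None:
            self._rdn[eid] = new_rdn
        # The only cached DNs that can be stale are those of the moved
        # entry and its descendants.
        self._invalidate(eid)

    def _invalidate(self, eid):
        self._dn_cache.pop(eid, None)
        for child in self._children[eid]:
            self._invalidate(child)
```

In this sketch the stored data touched by a move is a single record, and
the invalidation only drops in-memory values that are recomputed lazily on
the next read.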

Last but not least: the current implementation is really incomplete. The
Move, Rename and MoveAndRename operations are not correctly handled,
with many entries not being updated. I'm going to fix them. I have also
created a branch to play with subtrees without breaking the trunk. I'm
not sure I will continue to work on this branch if a decision is made
to keep the first solution.


--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
