On 2/24/11 3:01 AM, Alex Karasulu wrote:
On Wed, Feb 23, 2011 at 12:14 PM, Emmanuel Lecharny<elecha...@gmail.com>  wrote:
Hi guys,

those last 2 days I did some cleanup in the Rdn and Ava classes, in th
espirit of Dn cleanup. Here is a summary of what has been done, roughly :

- injection of the SchemaManager in both classes' constructors
- removing of the compareTo method
- made some methods private
- made the classes final
- removed the Comparable interface from the equation

One important modification is the last one : it makes no sense to have a Rdn
be comparable. First, how do we compare a cn and a jpegPhoto ? Second, how
do we compare two RDN which attributeType does not have an equality matching
rule ?

Of course, this has an impact in the way the backend works, as the Rdn index
needs to be able to do ordered comparison between Rdn, as this index is a
BTree. What I did is that I replaced the Rdn.compareTo(Rdn) method by a
direct String comparison in Jdbm between the Rdn's normalized name. That
does work.

It made me think that maybe using a hashed index for Rdn is probably a
better idea, because then we won't need this comparison to be done (the
equals method would be enough) and also because it would be faster (finding
an element in a Hash table is an O(1) operation - at least, theorically -
when looking in a BTree is an O(log2(N)))

thoughts on this last point?
I just replied to Stefan's post first but I thought about this after
my reply. Here's what came to mind:

(1) Let's presume we back the one and sublevel indices with the RDN index.
(2) Do we need to sort either of these three indices when they are
coupled this way or all independent?

We don't need to really sort them, we just need to be able to walk the
index producing candidates for one level and sublevel relationships.
We could probably still do this and get away without using a sorted
B+Tree.

The thing is we cannot apply certain assertions any longer with a hash
like using the>=.<=, and substring assertions.
DN syntax and the ATs that use it does not provide a Substring or Ordering MR. Does not make a lot of sense, IMO (comparing DN for equality is ok, trying to order them is just a vain thing)

But this is not
something performed on these indices anyway. You do these with respect
to an attribute type which that might be indexed and do not use these
operators on one level, or sub level inquiries.

So yeah I think this can be done and may have big performance
benefits. But it's a big shift.
This is an interesting side effect of my question :) I haven't expected such a wide modification proposal when I asked about using Hash instead of BTree, but this is were it's start to become interesting ! Definitively to be investigated, and hopefully Stefan has already started to dig in this direction with his Hashbase implementation.
Fun project too.
Sure is !


--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Reply via email to