On 6/24/11 10:04 AM, Alex Karasulu wrote:
On Fri, Jun 24, 2011 at 11:00 AM, Emmanuel Lécharny
<elecha...@apache.org>  wrote:
On 6/24/11 9:51 AM, Alex Karasulu wrote:
The reverse index has no duplicate keys. The only way to get a
duplicate key in the reverse index is if the same entry (i.e. 37)
contained the same value ('foo') for the same (sn) attribute. And this
we know is not possible. So the lookups against the reverse table will
be faster.
I was thinking about something a bit different: as soon as you have
grabbed the list of entry IDs from the first index, looking into the
other indexes will also return a list of entry IDs. Checking whether
those IDs are valid candidates can then be done in one shot: intersect
the two sets (they are ordered, so it's an O(n) operation) and just get
the matching entries.

Compared to the current processing (i.e., accessing the reverse index for
*each* candidate), this will be way faster, IMO.
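
To make the intersection idea concrete, here is a minimal sketch in Java.
It is not taken from the ApacheDS code base: the class name
SortedIdIntersection, the method intersect, and the use of plain long[]
arrays for the entry IDs are all assumptions made for illustration. It
also assumes each index hands back its entry IDs in ascending order with
no duplicates.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ApacheDS: intersects two ascending
// lists of entry IDs in a single merge-style pass.
final class SortedIdIntersection {

    // Returns the IDs present in both inputs, in ascending order.
    // Both arrays are assumed to be sorted ascending without duplicates.
    static List<Long> intersect(long[] first, long[] second) {
        List<Long> matches = new ArrayList<>();
        int i = 0;
        int j = 0;

        // Advance whichever cursor points at the smaller ID; when both
        // point at the same ID, that entry is a candidate in both indexes.
        while (i < first.length && j < second.length) {
            if (first[i] < second[j]) {
                i++;
            } else if (first[i] > second[j]) {
                j++;
            } else {
                matches.add(first[i]);
                i++;
                j++;
            }
        }
        return matches;
    }
}
```

Calling intersect(new long[]{12, 37, 41, 58}, new long[]{5, 37, 58, 90})
yields the IDs 37 and 58. Each cursor only moves forward, so the cost is
linear in the combined length of the two lists, with no per-candidate
lookup.
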
This is a VERY interesting idea. Maybe we should create a separate
thread for this and dive deeper into it. I think you've got something
here.

Have a look at
https://cwiki.apache.org/confluence/display/DIRxSRVx11/Index+and+IndexEntry,
where I added some paragraphs explaining this idea. We can comment on this
page.
Nice pictures - what did you use for that? Reading further ...
yEd
Also, if you're doing this in a branch, and hence we're not yet
committed to the approach, can you please do this on a separate page so
you don't alter the existing documentation?
Right now, I have just updated the existing doco with some extended
explanation of the existing code. I can move the suggestions to another
page, sure.

--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
