On Fri, Jun 24, 2011 at 11:00 AM, Emmanuel Lécharny
<elecha...@apache.org> wrote:
> On 6/24/11 9:51 AM, Alex Karasulu wrote:
>>
>>>> The reverse index has no duplicate keys. The only way to get a
>>>> duplicate key in the reverse index is if the same entry (i.e. 37)
>>>> contained the same value ('foo') for the same (sn) attribute. And this
>>>> we know is not possible. So the lookups against the reverse table will
>>>> be faster.
>>>
>>> I was thinking about something a bit different: as soon as you have
>>> grabbed the list of entry IDs from the first index, looking into the
>>> other indexes will also return a list of entry IDs. Checking whether
>>> those IDs are valid candidates can then be done in one shot: do the
>>> intersection of the two sets (they are ordered, so it's an O(n)
>>> operation) and just get the matching entries.
>>>
>>> Compared to the current processing (i.e., accessing the reverse index
>>> for *each* candidate), this will be way faster, IMO.
>>
>> This is a VERY interesting idea. Maybe we should create a separate
>> thread for this and dive deeper into it. I think you've got something
>> here.
>>
> Have a look at
> https://cwiki.apache.org/confluence/display/DIRxSRVx11/Index+and+IndexEntry,
> where I added some paragraphs explaining this idea. We can comment on this
> page.
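
To make the intersection idea quoted above concrete, here is a minimal
sketch in Python. It assumes each index lookup returns a sorted,
duplicate-free list of entry IDs. The function name, the variable names
and the sample IDs are all made up for illustration; none of this is the
server's actual API.

```python
def intersect_sorted(a, b):
    """Return the IDs present in both sorted, duplicate-free lists.

    Walks both lists once with two cursors, so the cost is
    O(len(a) + len(b)) comparisons and no per-candidate lookup.
    """
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Same ID in both lists: it is a valid candidate.
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # a[i] cannot appear later in b, so skip it.
            i += 1
        else:
            # b[j] cannot appear later in a, so skip it.
            j += 1
    return result


# Hypothetical ID lists, as if returned by two different attribute indexes.
sn_ids = [3, 12, 37, 41, 90]
cn_ids = [12, 37, 38, 90, 120]
candidates = intersect_sorted(sn_ids, cn_ids)
print(candidates)
```

The point of the sketch is only that one linear pass over two ordered
lists replaces one reverse-index access per candidate.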

Nice pictures - what did you use for that? Reading further ...

Also, if you're doing this in a branch, and so we're not yet committed
to the approach, can you please do this on a separate page so you
don't alter the existing documentation?

Thanks,
Alex
