This might be even better in conjunction with moving away from BitSet
to some sort of interface like DocNrSkipper... that way you would
never have to combine the filters into a single BitSet.
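
A minimal sketch of that idea, under stated assumptions: `DocSkipper`, `BitSetSkipper` and `MultiSkipper` are hypothetical names made up for illustration and are not the actual DocNrSkipper signature. Each segment keeps its own `java.util.BitSet`, and a composite answers "next matching doc at or after N" by delegating to the right segment, so no index-wide BitSet is ever built.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical skipping interface; NOT the actual DocNrSkipper signature. */
interface DocSkipper {
    /** Returns the first matching doc number >= target, or -1 if there is none. */
    int next(int target);
}

/** Adapts one segment's cached BitSet to the skipper interface. */
final class BitSetSkipper implements DocSkipper {
    private final BitSet bits;

    BitSetSkipper(BitSet bits) { this.bits = bits; }

    public int next(int target) {
        return bits.nextSetBit(Math.max(target, 0));
    }
}

/** Presents per-segment skippers as one index-wide skipper without merging their bits. */
final class MultiSkipper implements DocSkipper {
    private final List<DocSkipper> subs;
    private final int[] starts; // starts[i] = doc base of segment i; starts[n] = total maxDoc

    MultiSkipper(List<DocSkipper> subs, int[] maxDocs) {
        this.subs = subs;
        this.starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    public int next(int target) {
        if (target < 0) target = 0;
        for (int i = 0; i < subs.size(); i++) {
            if (target >= starts[i + 1]) continue;        // target lies past this segment
            int local = Math.max(target - starts[i], 0);  // translate to segment-local numbering
            int hit = subs.get(i).next(local);
            if (hit >= 0 && starts[i] + hit < starts[i + 1]) {
                return starts[i] + hit;                   // translate back to index-wide numbering
            }
        }
        return -1;
    }
}
```

The linear scan over segments is only for brevity; a binary search over `starts` would be the obvious refinement when there are many segments.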


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server

On 7/7/06, robert engels <[EMAIL PROTECTED]> wrote:
I implemented it and it works great. I didn't worry about the
deletions since by the time a filter is used the deleted documents
are already removed by the query. The only problem that arose out of
this was for things like the ConstantScoreQuery (which uses a filter)
- I needed to modify this query to ignore deleted documents.

Now I have incremental cached filters - the query performance is
going through the roof.



On Jul 7, 2006, at 2:47 PM, Chris Hostetter wrote:

>
> I'm no segments/MultiReader expert, but your idea sounds good to me ...
> it seems like it would certainly work in the "new segments" situation.
>
> One thing I don't see you mention is dealing with deletions ... I'm not
> sure if deleting documents causes the version number of an IndexReader
> to change or not (if it does, your job is easy), but even if it doesn't
> I'm guessing you could say that if hasDeletions() returns true, you have
> to assume you need to invalidate your cached bits (worst-case scenario,
> you are invalidating the cache as often as it is now).
>
>
> : Date: Fri, 7 Jul 2006 00:32:54 -0500
> : From: robert engels <[EMAIL PROTECTED]>
> : Reply-To: java-dev@lucene.apache.org
> : To: Lucene-Dev <java-dev@lucene.apache.org>
> : Subject: MultiSegmentQueryFilter enhancement for interactive indexes?
> :
> : I thought of a possible enhancement - before I go down the road, I am
> : looking for some input from the community.
> :
> : Currently, the QueryFilter caches the bits based upon the IndexReader.
> :
> : The problem with this is small incremental changes to the index
> : invalidate the cache.
> :
> : What if instead the filter determined that the underlying IndexReader
> : was a MultiReader and then maintained a bitset for each reader,
> : combining them in bits() when requested? The filter could check if
> : any of the underlying readers were different (removed or added)
> : and then just create a new bitset for that reader. With the new
> : non-bitset filter implementations this could be even more memory
> : efficient, since the bitsets would not need to be combined into a
> : single bitset.
> :
> : With the previous work on "reopen" so that segments are reused, this
> : would allow filters to be far more useful in a highly interactive
> : environment.
> :
> : What do you think?
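
The per-reader caching proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation discussed in the thread: `Segment` and `PerSegmentFilterCache` are hypothetical stand-ins for a MultiReader's sub-readers and the modified filter, the `compute` function stands in for running the filter's query against one segment, and deletions are ignored.

```java
import java.util.BitSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for one sub-reader (segment) of a MultiReader. */
final class Segment {
    final int maxDoc; // number of document slots in this segment

    Segment(int maxDoc) { this.maxDoc = maxDoc; }
}

/** Caches one BitSet per segment and recomputes bits only for segments it has not seen. */
final class PerSegmentFilterCache {
    private Map<Segment, BitSet> cache = new IdentityHashMap<>();
    private final Function<Segment, BitSet> compute; // assumed: evaluates the filter on one segment
    int computations = 0;                            // counts cache misses, for demonstration

    PerSegmentFilterCache(Function<Segment, BitSet> compute) { this.compute = compute; }

    /** Combines the per-segment bits into one BitSet in index-wide doc numbering. */
    BitSet bits(List<Segment> segments) {
        Map<Segment, BitSet> next = new IdentityHashMap<>();
        BitSet combined = new BitSet();
        int docBase = 0;
        for (Segment seg : segments) {
            BitSet segBits = cache.get(seg);
            if (segBits == null) {               // new segment: compute its bits once
                segBits = compute.apply(seg);
                computations++;
            }
            next.put(seg, segBits);
            for (int i = segBits.nextSetBit(0); i >= 0; i = segBits.nextSetBit(i + 1)) {
                combined.set(docBase + i);       // shift segment-local doc number by the doc base
            }
            docBase += seg.maxDoc;
        }
        cache = next;                            // entries for removed segments are dropped here
        return combined;
    }
}
```

Segments are keyed by identity, which assumes that a reopen hands back the same reader objects for unchanged segments.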
