This might be even better in conjunction with moving away from BitSet to some sort of interface like DocNrSkipper... that way you would never have to combine the filters into a single BitSet.
-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server

On 7/7/06, robert engels <[EMAIL PROTECTED]> wrote:
I implemented it and it works great. I didn't worry about the deletions, since by the time a filter is used the deleted documents have already been removed by the query.

The only problem that arose was with things like ConstantScoreQuery (which uses a filter): I needed to modify this query to ignore deleted documents.

Now I have incremental cached filters, and query performance is going through the roof.

On Jul 7, 2006, at 2:47 PM, Chris Hostetter wrote:

> I'm no segments/MultiReader expert, but your idea sounds good to me ... it
> seems like it would certainly work in the "new segments" situation.
>
> One thing I don't see you mention is dealing with deletions ... I'm not
> sure if deleting documents causes the version number of an IndexReader to
> change or not (if it does, your job is easy), but even if it doesn't, I'm
> guessing you could say that if hasDeletions() returns true, you have to
> assume you need to invalidate your cached bits (worst-case scenario, you
> are invalidating the cache as often as it is now).
>
> : Date: Fri, 7 Jul 2006 00:32:54 -0500
> : From: robert engels <[EMAIL PROTECTED]>
> : Reply-To: java-dev@lucene.apache.org
> : To: Lucene-Dev <java-dev@lucene.apache.org>
> : Subject: MultiSegmentQueryFilter enhancement for interactive indexes?
> :
> : I thought of a possible enhancement - before I go down the road, I am
> : looking for some input from the community.
> :
> : Currently, the QueryFilter caches the bits based upon the IndexReader.
> :
> : The problem with this is that small incremental changes to the index
> : invalidate the cache.
> :
> : What if instead the filter determined that the underlying IndexReader
> : was a MultiReader and then maintained a bitset for each reader,
> : combining them in bits() when requested? The filter could check if
> : any of the underlying readers were different (removed or added)
> : and then just create a new bitset for that reader. With the new
> : non-bitset filter implementations this could be even more memory
> : efficient, since the bitsets would not need to be combined into a
> : single bitset.
> :
> : With the previous work on "reopen" so that segments are reused, this
> : would allow filters to be far more useful in a highly interactive
> : environment.
> :
> : What do you think?
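
For readers following the thread, the per-segment caching idea in the original proposal might look roughly like the sketch below. It is only an illustration under stated assumptions, not the implementation described above and not Lucene API: `Segment`, `PerSegmentFilterCache`, and `computeBits` are hypothetical names, a segment with an unchanged name is assumed to have unchanged contents, and deletions are ignored entirely.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for one sub-reader of a MultiReader; not a Lucene class. */
final class Segment {
    final String name;  // identity used as the cache key (assumed stable while unchanged)
    final int maxDoc;   // number of document slots in this segment

    Segment(String name, int maxDoc) {
        this.name = name;
        this.maxDoc = maxDoc;
    }
}

/** Keeps one cached BitSet per segment and combines them only when asked. */
final class PerSegmentFilterCache {
    private final Function<Segment, BitSet> computeBits; // the expensive per-segment work
    private Map<String, BitSet> cache = new HashMap<>();
    int computations = 0; // exposed only so the example can show cache hits

    PerSegmentFilterCache(Function<Segment, BitSet> computeBits) {
        this.computeBits = computeBits;
    }

    /** Returns bits for the whole index, computing only segments not seen before. */
    BitSet bits(List<Segment> segments) {
        Map<String, BitSet> next = new HashMap<>();
        BitSet combined = new BitSet();
        int docBase = 0; // offset of the current segment in the combined doc id space
        for (Segment seg : segments) {
            BitSet segBits = cache.get(seg.name);
            if (segBits == null) {
                segBits = computeBits.apply(seg); // new segment: compute once
                computations++;
            }
            next.put(seg.name, segBits);
            // Copy the segment-local bits into the combined set, shifted by docBase.
            for (int i = segBits.nextSetBit(0); i >= 0; i = segBits.nextSetBit(i + 1)) {
                combined.set(docBase + i);
            }
            docBase += seg.maxDoc;
        }
        cache = next; // entries for segments that disappeared are dropped here
        return combined;
    }
}
```

Calling `bits()` again after one segment has been added should recompute only that segment and reuse the rest. With an interface that skips through documents instead of a `BitSet`, as suggested at the top of the thread, the combining loop could presumably be avoided altogether.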