Thanks for the detailed response, Hoss. That's the sort of in-depth golden nugget I'd like to see in a copy of LIA 2 when it becomes available...

I've wanted to use Filter to cache some of my Term Queries, as it looked faster for straight Term Query searches, but Solr's DocSet interface abstraction is more useful. HashDocSet will probably cover 90% of what I cache.

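Index DBs will typically be in the 1-3 million document range, but they hold mail spread over 1-6K users, so caching lots of BitSets for that number of users is not practical! (At 3 million documents a single BitSet is about 375 KB, so one per user across 6K users would be over 2 GB.)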

I ended up writing a DocSetFilter: it builds a DocSet (a la Solr) from the BitSet and caches that, then converts it back to a BitSet during Filter.bits(). Not the best solution, but the typical hit size is small, so the iteration is fast.
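
The round trip described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual DocSetFilter: SmallDocSet is a hypothetical name, it stores hits in a sorted int array rather than using Solr's HashDocSet hashing, and it depends only on java.util.BitSet so that it stands alone without Lucene or Solr on the classpath.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical compact doc-id set: a sorted int[] instead of a full BitSet. */
final class SmallDocSet {
    private final int[] docs; // sorted ascending, no duplicates

    private SmallDocSet(int[] docs) {
        this.docs = docs;
    }

    /** Copies the set bits of a BitSet into a sorted int array (the form that gets cached). */
    static SmallDocSet fromBitSet(BitSet bits) {
        int[] docs = new int[bits.cardinality()];
        int i = 0;
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            docs[i++] = doc;
        }
        return new SmallDocSet(docs);
    }

    /** Rebuilds a BitSet from the cached ids; the loop runs once per hit. */
    BitSet toBitSet() {
        int capacity = docs.length == 0 ? 0 : docs[docs.length - 1] + 1;
        BitSet bits = new BitSet(capacity);
        for (int doc : docs) {
            bits.set(doc);
        }
        return bits;
    }

    int size() {
        return docs.length;
    }

    /** Membership test by binary search over the sorted ids. */
    boolean contains(int doc) {
        return Arrays.binarySearch(docs, doc) >= 0;
    }
}
```

With a small hit count the cached form costs four bytes per hit, while the rebuilt BitSet is still sized by the highest document id, which is why the conversion is only attractive when hits are sparse.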

Thanks eks dev for the info about Lucene-584 - that looks like an interesting set of patches.

Antony

Chris Hostetter wrote:
It's kind of an apples/oranges comparison ... in the examples you gave
below, one is executing an arbitrary query (which could be anything), the
other is doing a simple TermEnumeration.

Assuming that Query is a TermQuery, the Filter is theoretically going to be
faster because it doesn't have to compute any scores ... generally speaking,
a Filter will always be a little faster than a functionally equivalent
Query for the purposes of building up a simple BitSet of matching
documents, because the Query involves the score calculations ... but the
Query is generally more usable.

The Query can also be more efficient in other ways, because the
HitCollector doesn't *have* to build a BitSet; it can deal with the
results in whatever way it wants (whereas a Filter always generates a
BitSet).

Solr goes the HitCollector route for a few reasons:
  1) allows us to use the DocSet abstraction, which allows other
     performance benefits over straight BitSets
  2) allows us to have simpler code that builds DocSets and DocLists
     (DocLists know about scores, sorting, and pagination) in a single
     pass when scores or sorting are requested.
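
To illustrate the difference between the two routes described above, here is a minimal sketch. It is not Solr's or Lucene's code: DocCollector is a hypothetical stand-in for a hit-collecting callback, the other class names are made up for the example, and the matches are fed in by hand rather than produced by a real search.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in for a hit-collecting callback. */
interface DocCollector {
    void collect(int doc, float score);
}

/** Filter-style result: always materializes a BitSet and discards scores. */
final class BitSetCollector implements DocCollector {
    final BitSet bits = new BitSet();

    public void collect(int doc, float score) {
        bits.set(doc);
    }
}

/** Collector-style result: keeps ids and scores together in a single pass. */
final class ScoredListCollector implements DocCollector {
    final List<Integer> docs = new ArrayList<>();
    final List<Float> scores = new ArrayList<>();

    public void collect(int doc, float score) {
        docs.add(doc);
        scores.add(score);
    }
}

final class CollectorDemo {
    /** Feeds the same hand-written matches to whichever collector is supplied. */
    static void search(int[] matches, float[] scores, DocCollector collector) {
        for (int i = 0; i < matches.length; i++) {
            collector.collect(matches[i], scores[i]);
        }
    }
}
```

The caller chooses the result representation by choosing the collector, which is the flexibility a BitSet-only interface does not offer.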


