Thanks for the detailed response, Hoss. That's the sort of in-depth golden nugget
I'd like to see in a copy of LIA 2 when it becomes available...
I've wanted to use Filter to cache certain of my Term Queries, as it looked
faster for straight Term Query searches, but Solr's DocSet interface abstraction
is more useful. HashDocSet will probably satisfy 90% of my cache.
Index DBs will typically be in the 1-3 million document range, but this is mail
spread over 1-6K users, so caching lots of BitSets for that number of
users is not practical!
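To put a rough number on that, here is a back-of-the-envelope sketch. It assumes one dense bit per document, stored in 64-bit words, and one cached BitSet per user; the class and method names are made up for illustration.

```java
/** Rough memory estimate for caching one dense BitSet per user (illustrative only). */
final class BitSetCacheEstimate {

    /** Bytes for a dense bit set covering maxDoc documents, rounded up to whole 64-bit words. */
    static long bytesPerBitSet(long maxDoc) {
        long words = (maxDoc + 63) / 64;
        return words * 8;
    }

    /** Total bytes if every user gets exactly one cached bit set. */
    static long totalBytes(long maxDoc, long users) {
        return bytesPerBitSet(maxDoc) * users;
    }

    public static void main(String[] args) {
        long maxDoc = 3_000_000L; // upper end of the index size mentioned above
        long users = 6_000L;      // upper end of the user count mentioned above
        System.out.println(bytesPerBitSet(maxDoc) + " bytes per BitSet");
        System.out.println(totalBytes(maxDoc, users) + " bytes for all users");
    }
}
```

At the top of both ranges that is 375,000 bytes per BitSet and 2,250,000,000 bytes for all users, and that is with only a single cached filter per user.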
I ended up creating a DocSetFilter that builds a DocSet (a la Solr) from the
BitSet and caches that instead. I then convert it back to a BitSet during
Filter.bits(). Not the best solution, but the typical hit size is small, so
the iteration is fast.
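The round trip looks roughly like the sketch below. This is not my actual DocSetFilter and it does not use Solr's DocSet classes; CompactDocSet is a hypothetical stand-in that keeps only the matching doc ids in an int array, which is enough to show why the conversion cost follows the hit count rather than the index size.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a small cached doc set: stores only the matching doc ids. */
final class CompactDocSet {
    private final int[] docs; // doc ids in ascending order

    private CompactDocSet(int[] docs) {
        this.docs = docs;
    }

    /** Build the compact form by walking the set bits of a BitSet. */
    static CompactDocSet fromBitSet(BitSet bits) {
        int[] docs = new int[bits.cardinality()];
        int i = 0;
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            docs[i++] = doc;
        }
        return new CompactDocSet(docs);
    }

    /** Number of matching documents. */
    int size() {
        return docs.length;
    }

    /** Expand back to a BitSet; the loop runs once per hit, not once per document. */
    BitSet toBitSet(int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : docs) {
            bits.set(doc);
        }
        return bits;
    }
}
```

The cached object costs about 4 bytes per hit, so it only wins while the hit count stays well below maxDoc / 32. Allocating the fresh BitSet on the way back is still proportional to the index size, which is part of why this is not the best solution.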
Thanks eks dev for the info about Lucene-584 - that looks like an interesting
set of patches.
Antony
Chris Hostetter wrote:
it's kind of an Apples/Oranges comparison .. in the examples you gave
below, one is executing an arbitrary query (which could be anything), the
other is doing a simple TermEnumeration.
Assuming that Query is a TermQuery, the Filter is theoretically going to be
faster because it doesn't have to compute any Scores ... generally speaking
a Filter will always be a little faster than a functionally equivalent
Query for the purposes of building up a simple BitSet of matching
documents because the Query involves the score calculations ... but the
Query is generally more usable.
The Query can also be more efficient in other ways, because the
HitCollector doesn't *have* to build a BitSet, it can deal with the
results in whatever way it wants (whereas a Filter always generates a
BitSet).
Solr goes the HitCollector route for a few reasons:
1) allows us to use the DocSet abstraction which allows other
performance benefits over straight BitSets
2) allows us to have simpler code that builds DocSets and DocLists
(DocLists know about scores, sorting, and pagination) in a single
pass when scores or sorting are requested.
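As an illustration of the single-pass idea in point 2, here is a minimal sketch. It is not Solr's code: SetAndTopNCollector and its methods are hypothetical, and the only thing borrowed from Lucene is the shape of the collect(doc, score) callback. Each hit is recorded once in the full set of matches and, in the same call, offered to a small heap that keeps the best N by score.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical single-pass collector: gathers every matching doc id plus the top-N hits by score. */
final class SetAndTopNCollector {

    /** One scored hit. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int topN;
    private final List<Integer> allDocs = new ArrayList<>();
    // Min-heap on score, so the weakest of the current top-N sits at the head.
    private final PriorityQueue<Hit> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));

    SetAndTopNCollector(int topN) {
        this.topN = topN;
    }

    /** Called once per matching document. */
    void collect(int doc, float score) {
        allDocs.add(doc); // the "doc set" side: every match is kept
        if (topN <= 0) {
            return;
        }
        if (heap.size() < topN) {
            heap.add(new Hit(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll(); // evict the weakest hit
            heap.add(new Hit(doc, score));
        }
    }

    /** All matching doc ids, in collection order. */
    List<Integer> matchingDocs() {
        return allDocs;
    }

    /** The retained hits, best score first. */
    List<Hit> topHits() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        return hits;
    }
}
```

A Filter-style API would need one pass to build the BitSet and a second scored pass to rank the hits; here both results come out of the same stream of collect() calls.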