I remember the benefits from Terms.intersect being pretty huge. Rather
than simple ping-pong, the whole monster gets handed off directly to
the codec's term dictionary implementation. For the default terms
dictionary using blocktree, this saves time seeking to terms you don't
care about (because
Thanks for the feedback Robert. This approach sounds like a better path to
follow. I'll explore it. I agree that we should provide default behavior
that is overall best for our users, and not for one specific use-case such
as Amazon search :).
Mike- TermInSetQuery used to use seekExact, and now
Besides not being able to use the bloom filter, seekCeil is also just more
costly than seekExact since it is essentially both .seekExact and .next in
a single operation.
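To make the cost difference concrete, here is a toy sketch. It is not Lucene's TermsEnum: a TreeSet stands in for the term dictionary, and the class name SeekSketch and its Status enum are made up for illustration. It only models the semantics described above, where an exact seek answers yes or no, while a ceiling seek must also position on the next larger term when the target is missing.

```java
import java.util.List;
import java.util.TreeSet;

/** Toy cursor over a sorted term set. Hypothetical, not Lucene's TermsEnum. */
public class SeekSketch {
    enum Status { FOUND, NOT_FOUND, END }

    private final TreeSet<String> terms;
    private String current; // term the cursor is positioned on, or null

    SeekSketch(TreeSet<String> terms) {
        this.terms = terms;
    }

    /** Exact lookup: a yes/no answer, with no need to locate a neighbour. */
    boolean seekExact(String target) {
        boolean present = terms.contains(target);
        current = present ? target : null;
        return present;
    }

    /** Ceiling lookup: on a miss it also has to find the next larger term. */
    Status seekCeil(String target) {
        current = terms.ceiling(target);
        if (current == null) {
            return Status.END;
        }
        return current.equals(target) ? Status.FOUND : Status.NOT_FOUND;
    }

    /** Term the cursor currently sits on, or null if unpositioned. */
    String term() {
        return current;
    }

    public static void main(String[] args) {
        SeekSketch cursor = new SeekSketch(new TreeSet<>(List.of("apple", "kiwi", "pear")));
        System.out.println(cursor.seekExact("banana")); // miss, cursor left unpositioned
        System.out.println(cursor.seekCeil("banana") + " at " + cursor.term());
    }
}
```

In this model the extra work in seekCeil is the ceiling search on a miss. Whether that cost matters in a real codec depends on the terms dictionary implementation, which the sketch does not try to capture.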
Are either of the two approaches using the intersect method of TermsEnum?
It might be faster if the number of terms is over
Thanks Patrick. I tend to agree with you for the default behavior. Bloom
filter usage seems like a bit of a less-common case on the surface at least
(e.g., it's expected behavior for query terms to not be present in a given
segment with enough frequency to justify the additional codec layer). A
The better solution is to use Terms.intersect. Then the postings
format can do the right thing. But this query doesn't use
Terms.intersect today, instead doing ping-ponging itself.
That's the problem.
We must *not* tune our algorithms for Amazon's search, but instead for
what is best for users
Hi Greg
IMO seekCeil is still the better solution for the default postings
format, as the ping-pong skipping could potentially save time on
traversing the FST.
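For readers unfamiliar with the ping-pong skipping, a minimal sketch follows. It is a toy model under stated assumptions, not the TermInSetQuery code: a TreeSet plays the role of the term dictionary, TreeSet.ceiling plays the role of seekCeil, and the class name PingPongSketch is hypothetical. Both sides are assumed to be sorted in the same order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy model only: a TreeSet stands in for a segment's term dictionary. */
public class PingPongSketch {

    /** Returns the query terms present in the dictionary. Query terms must be sorted. */
    static List<String> intersect(TreeSet<String> dictionary, List<String> sortedQueryTerms) {
        List<String> matches = new ArrayList<>();
        int i = 0;
        while (i < sortedQueryTerms.size()) {
            String wanted = sortedQueryTerms.get(i);
            // Stand-in for seekCeil: smallest dictionary term >= wanted, or null at the end.
            String found = dictionary.ceiling(wanted);
            if (found == null) {
                break; // dictionary exhausted, so no later query term can match
            }
            if (found.equals(wanted)) {
                matches.add(wanted);
                i++;
            } else {
                // Ping-pong: the dictionary jumped ahead, so skip every query term
                // that sorts before the term it landed on.
                i++;
                while (i < sortedQueryTerms.size()
                        && sortedQueryTerms.get(i).compareTo(found) < 0) {
                    i++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeSet<String> dictionary = new TreeSet<>(List.of("apple", "kiwi", "pear"));
        List<String> query = List.of("banana", "cherry", "kiwi", "zebra");
        System.out.println(intersect(dictionary, query)); // prints [kiwi]
    }
}
```

In the example, the seek for "banana" lands on "kiwi", which lets the loop skip "cherry" without a second dictionary lookup. With exact seeks, each of the four query terms would need its own lookup.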
I can see that when a bloom filter is used, seekExact might be
better, but I'm not sure whether there is a
Hi folks-
Back in GH#12156 (https://github.com/apache/lucene/pull/12156), we rewrote
TermInSetQuery to extend MultiTermQuery. With this change, TermInSetQuery
can now leverage the various "rewrite methods" available to MultiTermQuery,
allowing users to customize the query evaluation strategy