Sorry for the double post, but I think I can clarify the problem a little
more.
We want to execute:
query: A | B | C | D
filter: null
However, C and D cause TooManyClauses, so instead we execute:
query: A | B
filter: C | D
My understanding is that Lucene will apply the Filter (C | D) first,
limiting the result set, then apply the Query (A | B). Is this correct?
If so, the end result is essentially the query: (A | B) & (C | D)
Is there any way I can achieve (A | B | C | D) without putting the entire
query into a filter (which is too slow)?
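
To make the difference concrete, here is a minimal sketch in plain Java (no
Lucene involved) that models each clause as the set of document IDs it matches.
The class name, the helpers and the IDs are all made up for illustration, and
it assumes my understanding above is right, i.e. that a filter can only narrow
what the query matches:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class FilterVsUnion {
    // Hypothetical helper: build a sorted set of document IDs.
    static Set<Integer> ids(Integer... xs) {
        return new TreeSet<>(Arrays.asList(xs));
    }

    // Documents matching either clause (what SHOULD | SHOULD gives).
    static Set<Integer> union(Set<Integer> x, Set<Integer> y) {
        Set<Integer> out = new TreeSet<>(x);
        out.addAll(y);
        return out;
    }

    // Documents matching both sides (what query + filter gives).
    static Set<Integer> intersect(Set<Integer> x, Set<Integer> y) {
        Set<Integer> out = new TreeSet<>(x);
        out.retainAll(y);
        return out;
    }

    public static void main(String[] args) {
        Set<Integer> a = ids(1, 2), b = ids(2, 3), c = ids(3, 4), d = ids(5);

        Set<Integer> query = union(a, b);   // A | B
        Set<Integer> filter = union(c, d);  // C | D

        // What we currently get: (A | B) & (C | D)
        System.out.println(intersect(query, filter)); // prints [3]
        // What we actually want: A | B | C | D
        System.out.println(union(query, filter));     // prints [1, 2, 3, 4, 5]
    }
}
```

With these made-up IDs, the query-plus-filter approach returns only document 3,
while the query we want would return all five documents.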
Shaun
On Thu, Oct 15, 2009 at 5:14 PM, Shaun Senecal <[email protected]> wrote:
> I know this has been discussed at great length, but I still have not found
> a satisfactory solution and I am hoping someone on the list has some
> ideas...
>
> We have a large index (4M+ Documents) with a handful of Fields. We need to
> perform PrefixQueries on multiple fields. The problem is that when the
> Query gets rewritten, certain fields expand to too many terms and we end up
> with TooManyClauses (I know, I know, read the FAQ). The solution so far has
> been to extract the bits of the query which cause TooManyClauses to be
> thrown and make them filters:
>
> for every field to be searched {
>     try {
>         PrefixQuery(term).rewrite();
>
>         // important: otherwise 0 results can be returned when >0 should be
>         if (resulting BooleanQuery contains at least 1 clause)
>             add the rewritten query to a BooleanQuery (using SHOULD)
>     } catch (TooManyClauses) {
>         PrefixFilter(term)
>         add the filter to a BooleanFilter (using SHOULD)
>     }
> }
>
>
> Up to Lucene 2.4, this has been working out for us. However, in Lucene 2.9
> this breaks since rewrite() now returns a ConstantScoreQuery. I changed the
> code to automatically make the entire query a filter if TooManyClauses is
> ever caught, but this had massive performance implications. It seems to
> have doubled our average query execution time!
>
> Is there a solution to this? Is there a way I can know that a
> ConstantScoreQuery will match at least 1 term (if not, I don't want to add it
> to the BooleanQuery)? Does 2.9 support new features that would aid in this
> area?
>