Hi Uwe,

I see, that makes sense. Thanks very much for the info. Sorry about giving you the wrong info, Carsten.

-sujit

On Apr 15, 2013, at 1:06 PM, Uwe Schindler wrote:

> Hi,
>
> -----Original Message-----
>> From: Sujit Pal [mailto:sujitatgt...@gmail.com] On Behalf Of SUJIT PAL
>> Sent: Monday, April 15, 2013 9:43 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: Statically store sub-collections for search (faceted search?)
>>
>> Hi Uwe,
>>
>> Thanks for the info, I was under the impression that it didn't... I got this
>> info (that filters don't have a limit because they are not scoring) from a
>> document like the one below. I can't say this is the exact doc because it's
>> been a while since I saw it, though.
>>
>> http://searchhub.org/2009/06/08/bringing-the-highlighter-back-to-wildcard-queries-in-solr-14/
>>
>> """
>> As a response to this performance pitfall on very large indices’s (and the
>> infamous TooManyClauses exception), new queries were developed that
>> relied on a new Query class called ConstantScoreQuery.
>> ConstantScoreQuerys accept a filter of matching documents and then score
>> with a constant value equal to the boost. Depending on the qualities of your
>> index, this method can be faster than the Boolean expansion method, and
>> more importantly, does not suffer from TooManyClauses exceptions. Rather
>> than matching and scoring n BooleanQuery clauses (potentially thousands of
>> clauses), a single filter is enumerated and then traveled for scoring. On the
>> other hand, constructing and scoring with a BooleanQuery containing a few
>> clauses is likely to be much faster than constructing and traveling a Filter.
>> """
>
> This is true, but you misunderstood it: this is about MultiTermQuery (the
> superclass of WildcardQuery, FuzzyQuery, and the range queries). Those queries
> are not native Lucene queries, so they rewrite to basic/native queries.
> In earlier Lucene versions, wildcards were always rewritten to BooleanQueries
> with many TermQueries (one for each term that matches the wildcard), leading
> to the problem with too many terms. This is still the case, but only within
> certain limits (this mode is only used if the wildcard expands to few terms).
> Those BooleanQueries are then used with ConstantScoreQuery(Query).
>
> The above text talks about another mode (which is used for many terms today):
> *No* BooleanQuery is built at all. Instead, the documents of all matching
> terms are marked in a BitSet, and this BitSet is used with a Filter to
> construct a different Query type: ConstantScoreQuery(Filter). The BooleanQuery
> max clause count does not apply, because no BooleanQuery is involved in the
> whole process. If you use ConstantScoreQuery(BooleanQuery), the limit still
> applies, but not for ConstantScoreQuery(internalWildcardFilter).
>
> Uwe
>
>> On Apr 15, 2013, at 1:04 AM, Uwe Schindler wrote:
>>
>>> The limit also applies for filters. If you have a list of terms ORed
>>> together, the fastest way is not to use a BooleanQuery at all, but instead
>>> a TermsFilter (which has no limits).
>>>
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>>
>>>> -----Original Message-----
>>>> From: Carsten Schnober [mailto:schno...@ids-mannheim.de]
>>>> Sent: Monday, April 15, 2013 9:53 AM
>>>> To: java-user@lucene.apache.org
>>>> Subject: Re: Statically store sub-collections for search (faceted search?)
>>>>
>>>> On 12.04.2013 20:08, SUJIT PAL wrote:
>>>>> Hi Carsten,
>>>>>
>>>>> Why not use your idea of the BooleanQuery but wrap it in a Filter instead?
>>>>> Since you are not doing any scoring (only filtering), the max boolean
>>>>> clauses limit should not apply to a filter.
>>>>
>>>> Hi Sujit,
>>>> thanks for your suggestion! I wasn't aware that the max clause limit
>>>> does not apply to a BooleanQuery wrapped in a filter.
>>>> I suppose the ideal way would be to use a BooleanFilter but not a
>>>> QueryWrapperFilter, right?
>>>>
>>>> However, I am also not sure how to apply a filter in my use case
>>>> because I perform a SpanQuery. Although SpanQuery#getSpans() does
>>>> take a Bits object as an argument (acceptDocs), I haven't been able
>>>> to figure out how to generate this Bits object correctly from a Filter
>>>> object.
>>>>
>>>> Best,
>>>> Carsten
>>>>
>>>> --
>>>> Institut für Deutsche Sprache | http://www.ids-mannheim.de
>>>> Projekt KorAP | http://korap.ids-mannheim.de
>>>> Tel. +49-(0)621-43740789 | schno...@ids-mannheim.de
>>>> Korpusanalyseplattform der nächsten Generation
>>>> Next Generation Corpus Analysis Platform
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
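
For readers of the archive: the two rewrite modes Uwe describes can be sketched in plain Java without Lucene on the classpath. This is only a toy illustration, not Lucene's implementation. The class WildcardRewriteSketch, its method names, and the TreeMap-based "index" are all hypothetical. The limit of 1024 is Lucene's default BooleanQuery max clause count, and the BitSet idea is taken from Uwe's message; everything else is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.TreeMap;

// Toy model of the two wildcard rewrite modes discussed in this thread.
// Hypothetical names throughout; this is NOT Lucene code.
class WildcardRewriteSketch {

    // Mirrors Lucene's default BooleanQuery max clause count.
    static final int MAX_CLAUSE_COUNT = 1024;

    // Collect every indexed term that starts with the given prefix.
    // The toy index maps a term to the sorted IDs of documents containing it.
    static List<String> expandPrefix(TreeMap<String, int[]> index, String prefix) {
        List<String> terms = new ArrayList<>();
        for (String term : index.tailMap(prefix, true).keySet()) {
            if (!term.startsWith(prefix)) {
                break; // terms sharing a prefix are contiguous in sorted order
            }
            terms.add(term);
        }
        return terms;
    }

    // Mode 1: one clause per matching term. Fails once the expansion exceeds
    // the clause limit, like the TooManyClauses exception in the thread.
    static List<String> rewriteToBooleanClauses(List<String> terms) {
        if (terms.size() > MAX_CLAUSE_COUNT) {
            throw new IllegalStateException("too many clauses: " + terms.size());
        }
        return new ArrayList<>(terms);
    }

    // Mode 2: no clauses at all. Mark the documents of every matching term
    // in a BitSet; the number of terms is irrelevant to the result size.
    static BitSet rewriteToBitSet(TreeMap<String, int[]> index, List<String> terms, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (String term : terms) {
            int[] docs = index.get(term);
            if (docs == null) {
                continue; // term not in the index
            }
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> index = new TreeMap<>();
        for (int i = 0; i < 2000; i++) {
            index.put(String.format("term%04d", i), new int[] { i % 50 });
        }
        List<String> terms = expandPrefix(index, "term");
        System.out.println("expanded terms: " + terms.size());
        try {
            rewriteToBooleanClauses(terms);
        } catch (IllegalStateException e) {
            System.out.println("boolean rewrite failed: " + e.getMessage());
        }
        BitSet bits = rewriteToBitSet(index, terms, 100);
        System.out.println("matching docs via BitSet: " + bits.cardinality());
    }
}
```

In this sketch the BitSet mode is bounded by the number of documents rather than the number of terms, which is the reason the clause limit never comes into play there. In real Lucene 4.x code, a fixed list of ORed terms would more likely go through TermsFilter, as Uwe suggests.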