Hi Uwe,

I see, that makes sense. Thanks very much for the info. Sorry for giving you the
wrong info, Carsten.

-sujit

On Apr 15, 2013, at 1:06 PM, Uwe Schindler wrote:

> Hi,
> 
> ----Original Message-----
>> From: Sujit Pal [mailto:sujitatgt...@gmail.com] On Behalf Of SUJIT PAL
>> Sent: Monday, April 15, 2013 9:43 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: Statically store sub-collections for search (faceted search?)
>> 
>> Hi Uwe,
>> 
>> Thanks for the info, I was under the impression that it didn't. I got this info
>> (that filters don't have a limit because they are not scoring) from a document
>> like the one below. I can't say this is the exact doc because it's been a while
>> since I saw it, though.
>> 
>> http://searchhub.org/2009/06/08/bringing-the-highlighter-back-to-wildcard-queries-in-solr-14/
>> 
>> """
>> As a response to this performance pitfall on very large indices’s (and the
>> infamous TooManyClauses exception), new queries were developed that
>> relied on a new Query class called ConstantScoreQuery.
>> ConstantScoreQuerys accept a filter of matching documents and then score
>> with a constant value equal to the boost. Depending on the qualities of your
>> index, this method can be faster than the Boolean expansion method, and
>> more importantly, does not suffer from TooManyClauses exceptions. Rather
>> than matching and scoring n BooleanQuery clauses (potentially thousands of
>> clauses), a single filter is enumerated and then traveled for scoring. On the
>> other hand, constructing and scoring with a BooleanQuery containing a few
>> clauses is likely to be much faster than constructing and traveling a Filter.
>> """
> 
> This is true, but you misunderstood it: it is about MultiTermQueries
> (MultiTermQuery is the superclass of WildcardQuery, FuzzyQuery, and the range
> queries). Those queries are not native Lucene queries, so they are rewritten to
> basic/native queries. In earlier Lucene versions, wildcards were always
> rewritten to BooleanQueries with many TermQueries (one for each term that
> matches the wildcard), leading to the problem with too many terms. This is
> still the case, but only within limits: this mode is only used if the wildcard
> expands to few terms. Those BooleanQueries are then wrapped in
> ConstantScoreQuery(Query).
> The above text talks about another mode (which is used for many terms today):
> *No* BooleanQuery is built at all. Instead, the documents of all matching terms
> are marked in a BitSet, and this BitSet is used with a Filter to construct a
> different Query type: ConstantScoreQuery(Filter). The BooleanQuery max clause
> count does not apply, because no BooleanQuery is involved in the whole
> process. If you use ConstantScoreQuery(BooleanQuery), the limit still
> applies, but it does not apply to ConstantScoreQuery(internalWildcardFilter).
> 
> Uwe
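
To make the two rewrite modes described above concrete, here is a toy model in plain Java. It is only a sketch and does not use Lucene: a sorted map of term to document ids stands in for the index, and the class name, method names, the MAX_CLAUSES value, and the sample data are all made up for illustration. The first method mimics expansion into one clause per matching term, which fails once a clause limit is exceeded. The second marks matching documents in a BitSet, so no clause list exists and no limit can apply.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: a sorted map of term -> doc ids stands in for a Lucene index. */
class WildcardRewriteSketch {

    /** Hypothetical stand-in for the BooleanQuery max clause count (tiny on purpose). */
    static final int MAX_CLAUSES = 2;

    /** Mode 1: collect one clause per term matching the prefix; fail once the limit is exceeded. */
    static List<String> expandToClauses(TreeMap<String, int[]> postings, String prefix) {
        List<String> clauses = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, starting at the prefix itself.
        for (String term : postings.tailMap(prefix, true).keySet()) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() == MAX_CLAUSES) {
                throw new IllegalStateException("too many clauses for prefix " + prefix);
            }
            clauses.add(term);
        }
        return clauses;
    }

    /** Mode 2: build no clauses at all; mark every document of every matching term in a bit set. */
    static BitSet markMatchingDocs(TreeMap<String, int[]> postings, String prefix) {
        BitSet docs = new BitSet();
        for (Map.Entry<String, int[]> entry : postings.tailMap(prefix, true).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break;
            }
            for (int doc : entry.getValue()) {
                docs.set(doc);
            }
        }
        return docs;
    }

    /** Made-up sample data: four terms spread over five documents. */
    static TreeMap<String, int[]> sampleIndex() {
        TreeMap<String, int[]> postings = new TreeMap<>();
        postings.put("lucene", new int[] {0, 2});
        postings.put("lucid", new int[] {1});
        postings.put("luck", new int[] {2, 3});
        postings.put("solr", new int[] {4});
        return postings;
    }
}
```

With the sample data, expandToClauses(index, "luc") throws because three terms match and the toy limit is 2, while markMatchingDocs(index, "luc") returns a bit set with documents 0 through 3 set. The real Lucene classes do considerably more (per-segment readers, deleted documents, scoring), so treat this purely as an illustration of why the clause limit never comes into play in the second mode.
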
> 
>> On Apr 15, 2013, at 1:04 AM, Uwe Schindler wrote:
>> 
>>> The limit also applies to filters. If you have a list of terms ORed together,
>>> the fastest way is not to use a BooleanQuery at all, but instead a TermsFilter
>>> (which has no limits).
>>> 
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Carsten Schnober [mailto:schno...@ids-mannheim.de]
>>>> Sent: Monday, April 15, 2013 9:53 AM
>>>> To: java-user@lucene.apache.org
>>>> Subject: Re: Statically store sub-collections for search (faceted
>>>> search?)
>>>> 
>>>> On 12.04.2013 at 20:08, SUJIT PAL wrote:
>>>>> Hi Carsten,
>>>>> 
>>>>> Why not use your idea of the BooleanQuery but wrap it in a Filter instead?
>>>>> Since you are not doing any scoring (only filtering), the max boolean
>>>>> clauses limit should not apply to a filter.
>>>> 
>>>> Hi Sujit,
>>>> thanks for your suggestion! I wasn't aware that the max clause limit
>>>> does not apply to a BooleanQuery wrapped in a filter. I suppose the
>>>> ideal way would be to use a BooleanFilter rather than a QueryWrapperFilter,
>>>> right?
>>>> 
>>>> However, I am also not sure how to apply a filter in my use case
>>>> because I perform a SpanQuery. Although SpanQuery#getSpans() does
>>>> take a Bits object as an argument (acceptDocs), I haven't been able
>>>> to figure out how to generate this Bits object correctly from a Filter
>>>> object.
>>>> 
>>>> Best,
>>>> Carsten
>>>> 
>>>> --
>>>> Institut für Deutsche Sprache | http://www.ids-mannheim.de
>>>> Projekt KorAP                 | http://korap.ids-mannheim.de
>>>> Tel. +49-(0)621-43740789      | schno...@ids-mannheim.de
>>>> Korpusanalyseplattform der nächsten Generation
>>>> Next Generation Corpus Analysis Platform
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org