On 11.03.2013 18:22, Michael McCandless wrote:
> On Mon, Mar 11, 2013 at 9:32 AM, Carsten Schnober
> <schno...@ids-mannheim.de> wrote:
>> On 11.03.2013 13:38, Michael McCandless wrote:
>>> On Mon, Mar 11, 2013 at 7:08 AM, Uwe Schindler <u...@thetaphi.de> wrote:
>>>
>>>> Set the rewrite method to e.g. SCORING_BOOLEAN_QUERY_REWRITE, then this
>>>> should work (after rewrite your query is a BooleanQuery, which supports
>>>> extractTerms()).
>>>
>>> ... as long as you don't exceed the max number of terms allowed by BQ
>>> (1024 by default, but you can raise it).
>>
>> True, I've noticed this meanwhile. Are there any recommendations for
>> this setting where the limit is as large as possible while staying
>> within a reasonable performance? Of course, this is highly subjective,
>> but what's the magnitude here? Will a limit of 1,024,000 typically
>> increase the query time by the factor 1,000 too?
>> Carsten
>
> I think 1024 may already be too high ;)
>
> But really it depends on your situation: test different limits and see.
>
> How much slower a larger query is depends on the specifics of the terms ...
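As a rough illustration of why term extraction alone stays cheap, here is a toy model of the rewrite step in plain Java. It is not Lucene code: the class PrefixExpansionSketch, the method expandPrefix and the field maxClauseCount are hypothetical names chosen for this sketch, and the sorted set merely stands in for the index's term dictionary. The assumption is that rewriting a multi-term query (a prefix query here) into a BooleanQuery amounts to enumerating the matching index terms, one clause per term, and failing once the clause limit is exceeded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: none of these names are Lucene classes or methods.
public class PrefixExpansionSketch {

    // Hypothetical stand-in for the BooleanQuery clause limit (1024 by default).
    static int maxClauseCount = 1024;

    // Collects every dictionary term that starts with the prefix, one "clause"
    // per term, and fails as soon as the limit would be exceeded.
    static List<String> expandPrefix(SortedSet<String> dictionary, String prefix) {
        List<String> clauses = new ArrayList<>();
        // tailSet() starts at the first term >= prefix; in sorted order all
        // terms sharing the prefix are contiguous from there.
        for (String term : dictionary.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // left the block of matching terms
            }
            if (clauses.size() >= maxClauseCount) {
                throw new IllegalStateException(
                        "more than " + maxClauseCount + " matching terms");
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> dictionary = new TreeSet<>(
                Arrays.asList("corpora", "corpus", "query", "term", "terms"));

        // Extraction only walks the matching terms; no documents are scored.
        System.out.println(expandPrefix(dictionary, "ter"));

        // With a limit of 1, the second matching term triggers the failure.
        maxClauseCount = 1;
        try {
            expandPrefix(dictionary, "corp");
        } catch (IllegalStateException e) {
            System.out.println("limit hit: " + e.getMessage());
        }
    }
}
```

In this model the cost of extraction grows with the number of matching terms only, which is why a high limit matters less here than it would for a query that is actually executed.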
For initial testing, I've raised the limit by a factor of 1,000. As Uwe
pointed out, I don't actually execute the query; I only extract the
terms. So far I see no performance problems with thousands of terms,
although I still have to run a systematic evaluation.

Best,
Carsten

--
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP                 | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789      | schno...@ids-mannheim.de
Next Generation Corpus Analysis Platform