On 12.03.2013 at 10:39, Uwe Schindler wrote:
> I would suggest using my example code with the fake query and custom
> rewrite. This does not have the overhead of BooleanQuery and, more importantly,
> you don't need to change the *global* and *static* default in BooleanQuery.
> Otherwise you could introduce a denial-of-service case into your application
> if you execute a wildcard query somewhere else using Boolean rewrite with an
> unlimited number of terms.
Hi Uwe,
many thanks for your code sample! I've made tiny adaptations in
GetTermsRewrite to make the overridden methods match their counterparts
in the superclass (ScoringRewrite). I suppose that your version was not
written for Lucene 4.0, right? It looks like this now:
final class GetTermsRewrite extends ScoringRewrite<TermHolderQuery> {
    @Override
    protected void addClause(TermHolderQuery topLevel, Term term,
            int docCount, float boost, TermContext states) {
        topLevel.add(term);
    }

    @Override
    protected TermHolderQuery getTopLevelQuery() {
        return new TermHolderQuery();
    }

    @Override
    protected void checkMaxClauseCount(int count) throws IOException {
        // TODO Auto-generated method stub
    }
}
I'm not sure what checkMaxClauseCount() is supposed to do, so I left it
empty; apart from that, everything works great. Thanks!
The code I use for calling this:
IndexSearcher searcher = ...;
String regexp = ...;
MultiTermQuery query = new RegexpQuery(new Term("text", regexp));
query.setRewriteMethod(new GetTermsRewrite());
TermHolderQuery queryRewritten = (TermHolderQuery) searcher.rewrite(query);
Set<Term> terms = queryRewritten.getTerms();
There's another thing that is not entirely clear to me: when I call
query.setRewriteMethod(new GetTermsRewrite()), does this really affect
the IndexSearcher, in the sense that IndexSearcher.rewrite() uses the
given rewrite method? It seems to work fine, but I am not sure why it
does or whether it always will.
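To make the question concrete, here is my current guess at the mechanism, as a plain-Java toy model. It does not use Lucene at all: ToyQuery, ToyMultiTermQuery, ToyRewriteMethod, ToyTermHolder and ToySearcher are hypothetical stand-ins I made up, and a list of strings plays the role of the index. The assumption I would like confirmed is that the rewrite method lives on the query itself, and that the searcher merely calls rewrite() on the query until the result stops changing.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Toy stand-ins, NOT Lucene classes: every name below is hypothetical.
interface ToyQuery {
    // Returns a simpler query, or `this` when nothing is left to rewrite.
    ToyQuery rewrite(List<String> indexTerms);
}

interface ToyRewriteMethod {
    ToyQuery rewrite(List<String> indexTerms, ToyMultiTermQuery query);
}

// Plays the role of TermHolderQuery: it only carries the collected terms.
final class ToyTermHolder implements ToyQuery {
    final Set<String> terms = new LinkedHashSet<>();

    @Override
    public ToyQuery rewrite(List<String> indexTerms) {
        return this; // already in its final form
    }
}

// Plays the role of MultiTermQuery: the rewrite strategy is stored on the query.
final class ToyMultiTermQuery implements ToyQuery {
    final Pattern pattern;
    private ToyRewriteMethod rewriteMethod; // must be set before rewrite() is called

    ToyMultiTermQuery(String regexp) {
        this.pattern = Pattern.compile(regexp);
    }

    void setRewriteMethod(ToyRewriteMethod method) {
        this.rewriteMethod = method;
    }

    @Override
    public ToyQuery rewrite(List<String> indexTerms) {
        // The query hands the work to whatever strategy was configured on it.
        return rewriteMethod.rewrite(indexTerms, this);
    }
}

// Plays the role of IndexSearcher: it knows nothing about rewrite methods.
final class ToySearcher {
    private final List<String> indexTerms;

    ToySearcher(List<String> indexTerms) {
        this.indexTerms = indexTerms;
    }

    ToyQuery rewrite(ToyQuery original) {
        ToyQuery query = original;
        // Keep rewriting until the query returns itself.
        for (ToyQuery r = query.rewrite(indexTerms); r != query; r = query.rewrite(indexTerms)) {
            query = r;
        }
        return query;
    }
}

final class RewriteDelegationSketch {
    static Set<String> collectTerms(List<String> indexTerms, String regexp) {
        ToyMultiTermQuery query = new ToyMultiTermQuery(regexp);
        // Counterpart of GetTermsRewrite: collect matching terms instead of building clauses.
        query.setRewriteMethod((terms, q) -> {
            ToyTermHolder holder = new ToyTermHolder();
            for (String term : terms) {
                if (q.pattern.matcher(term).matches()) {
                    holder.terms.add(term);
                }
            }
            return holder;
        });
        ToyTermHolder rewritten = (ToyTermHolder) new ToySearcher(indexTerms).rewrite(query);
        return rewritten.terms;
    }

    public static void main(String[] args) {
        // prints [haus, hund]
        System.out.println(collectTerms(Arrays.asList("haus", "hund", "maus"), "h.*"));
    }
}
```

If the real classes behave like this model, the cast to TermHolderQuery is safe for as long as the query keeps the rewrite method I set on it, because the searcher never picks a rewrite method of its own. Whether that is guaranteed is exactly what I would like to know.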
Best,
Carsten
--
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789 | [email protected]
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform