(I'm using Lucene 4.9.0)

I've been doing some perf testing of MemoryIndex, and have found that it is 
much slower when a BooleanQuery contains a non-required clause, compared to 
when it just contains required clauses.

Most of the time is spent in BooleanScorer, which as far as I can tell is an 
optimization for scoring lots of documents, so it would make sense that it's 
not so good when scoring just a single document.
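
To make the trade-off concrete, here is a toy model in plain Java. It is not Lucene code. It assumes, from a reading of the 4.x BooleanScorer source, that the scorer fills a 2048-slot bucket table up front for every search; the class and method names are made up for illustration. That setup cost is amortized over thousands of documents in a normal index, but with MemoryIndex it is paid in full for a single document.

```java
// Toy model only: NOT Lucene code. It contrasts a windowed, bucket-based scorer
// (setup cost paid once per search) with a plain doc-at-a-time scorer.
final class BucketScoringSketch {

    // Assumed window size, mirroring the 2048-slot table in the 4.x BooleanScorer.
    static final int WINDOW = 1 << 11;

    static final class Bucket {
        float score; // sum of the clause scores for this doc
        int hits;    // number of clauses that matched this doc
    }

    /**
     * Windowed scoring: allocates the whole bucket table before looking at any
     * document, then accumulates clause scores into the doc's bucket.
     * allocations[0] receives the number of Bucket objects created.
     */
    static float scoreWindowed(int docId, float[] clauseScores, int[] allocations) {
        Bucket[] table = new Bucket[WINDOW];
        for (int i = 0; i < WINDOW; i++) {
            table[i] = new Bucket();
        }
        allocations[0] = WINDOW;

        Bucket bucket = table[docId & (WINDOW - 1)];
        for (float s : clauseScores) {
            bucket.score += s;
            bucket.hits++;
        }
        return bucket.score;
    }

    /** Doc-at-a-time scoring: no table, just sum the matching clause scores. */
    static float scoreDocAtATime(float[] clauseScores) {
        float sum = 0f;
        for (float s : clauseScores) {
            sum += s;
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] clauseScores = {0.5f, 0.25f};
        int[] allocations = new int[1];

        float windowed = scoreWindowed(0, clauseScores, allocations);
        float docAtATime = scoreDocAtATime(clauseScores);

        System.out.println("windowed score:     " + windowed);
        System.out.println("doc-at-a-time score: " + docAtATime);
        System.out.println("buckets allocated for one doc: " + allocations[0]);
    }
}
```

Both paths produce the same score; the only difference is the per-search setup work, which is the part that dominates when the index holds exactly one document.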

I found that I'm able to greatly increase performance (non-required clause 
speed on par with required clause speed) by changing the 
acceptsDocsOutOfOrder() method in MemoryIndex's collector to return false 
instead of true, which causes BooleanScorer to not be used.
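
For anyone who wants the same effect without patching MemoryIndex, a rough sketch against the Lucene 4.9 API follows. It goes through createSearcher() with a custom Collector whose acceptsDocsOutOfOrder() returns false. The class name, method name, field name and sample text are illustrative assumptions, and this is a workaround sketch rather than the change described above.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.memory.MemoryIndex;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Scorer;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.util.Version;

// Sketch for Lucene 4.9; names in this class are illustrative, not part of Lucene.
final class InOrderMemoryIndexSearch {

    /** Scores the single document in the index while requesting in-order collection. */
    static float searchInOrder(MemoryIndex index, Query query) throws IOException {
        final float[] score = new float[1]; // stays 0.0f when the query does not match
        index.createSearcher().search(query, new Collector() {
            private Scorer scorer;

            @Override
            public void setScorer(Scorer scorer) {
                this.scorer = scorer;
            }

            @Override
            public void collect(int doc) throws IOException {
                score[0] = scorer.score();
            }

            @Override
            public void setNextReader(AtomicReaderContext context) {
                // Nothing to do: a MemoryIndex has a single segment.
            }

            @Override
            public boolean acceptsDocsOutOfOrder() {
                return false; // the one-line difference from MemoryIndex's own collector
            }
        });
        return score[0];
    }

    public static void main(String[] args) throws IOException {
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_9);
        MemoryIndex index = new MemoryIndex();
        index.addField("body", "quick brown fox", analyzer);

        // One required and one non-required clause, the slow case described above.
        BooleanQuery query = new BooleanQuery();
        query.add(new TermQuery(new Term("body", "quick")), Occur.MUST);
        query.add(new TermQuery(new Term("body", "fox")), Occur.SHOULD);

        System.out.println("matched: " + (searchInOrder(index, query) > 0f));
    }
}
```

Calling searchInOrder(index, query) in place of index.search(query) keeps the stock Lucene jar untouched, at the cost of duplicating a few lines of collector code.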

I also tried Lucene 5.0.0 and found it to be much faster. I think this is 
partly because BooleanScorer is no longer used when optional.size() == 0, 
which happens if there are no document hits. This was changed here: 
http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_5_0_0/lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java?r1=1651551&r2=1652034

I guess I don't really have a question. I just want to make other people aware 
of what I found. Maybe there are other optimizations that can be made to avoid 
using BooleanScorer.

-Michael
