(I'm using Lucene 4.9.0.) I've been doing some perf testing of MemoryIndex and have found that it is much slower when a BooleanQuery contains a non-required clause than when it contains only required clauses.
Most of the time is spent in BooleanScorer. As far as I can tell, BooleanScorer is an optimization for scoring lots of documents, so it makes sense that it does poorly when scoring just a single document.

I was able to greatly improve performance (queries with a non-required clause became about as fast as queries with only required clauses) by changing the acceptsDocsOutOfOrder() method in MemoryIndex's collector to return false instead of true, which keeps BooleanScorer from being used.

I also tried Lucene 5.0.0 and found that it is much faster. I think this is partly because BooleanScorer is not used if optional.size() == 0, which happens if there are no document hits. This was changed here:
http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_5_0_0/lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java?r1=1651551&r2=1652034

I guess I don't really have a question; I just want to make other people aware of what I found. Maybe there are other optimizations that can be made to avoid using BooleanScorer.

-Michael
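
P.S. For anyone who wants the intuition without reading Lucene's source, here is a rough, Lucene-free toy model of the trade-off described above. It is not Lucene's code: the class name, the method names, the work counter, and the window size of 2048 are all assumptions picked for illustration. It only tries to show why a scorer that pays a fixed per-window setup cost can lose to plain doc-at-a-time matching when the index holds a single document.

```java
// Toy model only; none of this is Lucene API.
final class ScoringSketch {
    // Hypothetical window size, chosen for illustration.
    static final int WINDOW = 2048;

    // Counts units of work so the strategies can be compared without timing.
    static long work = 0;

    // Bulk strategy: set up one bucket per slot in each window, then let every
    // clause drop its matching doc IDs into the buckets.
    static int bulkCount(int[][] clauses, int maxDoc) {
        int hits = 0;
        for (int base = 0; base < maxDoc; base += WINDOW) {
            int[] buckets = new int[WINDOW];
            work += WINDOW; // fixed setup cost, paid even for a one-doc index
            for (int[] docs : clauses) {
                for (int doc : docs) {
                    if (doc >= base && doc < base + WINDOW) {
                        work++;
                        if (buckets[doc - base]++ == 0) {
                            hits++; // first clause to match this doc
                        }
                    }
                }
            }
        }
        return hits;
    }

    // Doc-at-a-time strategy: walk doc IDs in order and advance one cursor per
    // clause. Each clause's doc IDs must be sorted ascending and unique.
    static int docAtATimeCount(int[][] clauses, int maxDoc) {
        int[] pos = new int[clauses.length];
        int hits = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            boolean matched = false;
            for (int c = 0; c < clauses.length; c++) {
                work++;
                if (pos[c] < clauses[c].length && clauses[c][pos[c]] == doc) {
                    matched = true;
                    pos[c]++;
                }
            }
            if (matched) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // A one-document "index" (doc ID 0) and two optional clauses,
        // only the first of which matches.
        int[][] clauses = { {0}, {} };

        work = 0;
        int bulkHits = bulkCount(clauses, 1);
        long bulkWork = work;

        work = 0;
        int daatHits = docAtATimeCount(clauses, 1);
        long daatWork = work;

        System.out.println("bulk:          hits=" + bulkHits + " work=" + bulkWork);
        System.out.println("doc-at-a-time: hits=" + daatHits + " work=" + daatWork);
    }
}
```

On the one-document input both strategies report the same hit, but the bulk version's counter is dominated by the window setup. With many documents per window that setup cost would be amortized, which fits the idea that BooleanScorer is tuned for large indexes rather than for MemoryIndex's single document.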