[
https://issues.apache.org/jira/browse/LUCENE-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868916#comment-15868916
]
Trejkaz commented on LUCENE-7260:
---------------------------------
Lucene 5.5 is an additional 50% slower.
{noformat}
dt: 28105
dt: 27394
dt: 27947
dt: 25959
dt: 24957
dt: 27461
dt: 25734
dt: 27295
dt: 26739
dt: 28613
{noformat}
> StandardQueryParser is over 100 times slower in v5 compared to v3
> -----------------------------------------------------------------
>
> Key: LUCENE-7260
> URL: https://issues.apache.org/jira/browse/LUCENE-7260
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/queryparser
> Affects Versions: 5.4.1
> Environment: Java 8u51
> Reporter: Trejkaz
> Labels: performance
>
> The following test code times parsing a large query.
> {code}
> // Lucene 3.x imports are active; the commented-out lines are the 5.x equivalents.
> import org.apache.lucene.analysis.KeywordAnalyzer;
> //import org.apache.lucene.analysis.core.KeywordAnalyzer;
> import org.apache.lucene.queryParser.standard.StandardQueryParser;
> //import org.apache.lucene.queryparser.flexible.standard.StandardQueryParser;
> import org.apache.lucene.search.BooleanQuery;
>
> public class LargeQueryTest {
>     public static void main(String[] args) throws Exception {
>         BooleanQuery.setMaxClauseCount(50_000);
>
>         // Build "id:( 0 OR 1 OR ... OR 49999 )".
>         StringBuilder builder = new StringBuilder(50_000*10);
>         builder.append("id:( ");
>         boolean first = true;
>         for (int i = 0; i < 50_000; i++) {
>             if (first) {
>                 first = false;
>             } else {
>                 builder.append(" OR ");
>             }
>             builder.append(String.valueOf(i));
>         }
>         builder.append(" )");
>         String queryString = builder.toString();
>
>         StandardQueryParser parser2 = new StandardQueryParser(new KeywordAnalyzer());
>
>         // Parse the same query 10 times and print each elapsed time in milliseconds.
>         for (int i = 0; i < 10; i++) {
>             long t0 = System.currentTimeMillis();
>             parser2.parse(queryString, "nope");
>             long t1 = System.currentTimeMillis();
>             System.out.println(t1-t0);
>         }
>     }
> }
> {code}
> For Lucene 3.6.2, the timings settle down to 200~300 ms, with the fastest
> being 207 ms.
> For Lucene 5.4.1, the timings settle down to 20000~30000 ms, with the fastest
> being 22444 ms.
> So at some point, some change made the query parser 100 times slower. I would
> suspect that it has something to do with how the list of children is now
> handled. Every time someone gets the children, it copies the list. Every time
> someone sets the children, it walks through to detach parent references and
> then reattaches them all again.
> If it were me, I would probably make these collections immutable so that I
> didn't have to defensively copy them.
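
The cost of copying a child list on every read, compared with handing out an immutable list, can be illustrated in isolation. This is a minimal sketch and not Lucene's actual code: ChildListSketch, CopyingNode, ImmutableNode and makeChildren are hypothetical names invented for the illustration, and the loop count is arbitrary. It only shows that a getter which copies costs O(n) per call, while a getter which returns a read-only list costs O(1).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ChildListSketch {

    /** Hypothetical node that defensively copies its child list on every read. */
    static final class CopyingNode {
        private final List<String> children;

        CopyingNode(List<String> children) {
            this.children = new ArrayList<>(children);
        }

        /** Returns a fresh copy each time: O(n) per call. */
        List<String> getChildren() {
            return new ArrayList<>(children);
        }
    }

    /** Hypothetical node that copies once and then hands out a read-only list. */
    static final class ImmutableNode {
        private final List<String> children;

        ImmutableNode(List<String> children) {
            // One copy at construction time; callers cannot modify the result.
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
        }

        /** Returns the same read-only instance each time: O(1) per call. */
        List<String> getChildren() {
            return children;
        }
    }

    /** Builds n child labels "0", "1", ..., mirroring the terms in the test query. */
    static List<String> makeChildren(int n) {
        List<String> result = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            result.add(String.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> terms = makeChildren(50_000);
        CopyingNode copying = new CopyingNode(terms);
        ImmutableNode immutable = new ImmutableNode(terms);

        int reads = 2_000; // arbitrary number of getChildren() calls
        long total = 0;    // accumulated so the reads are not optimized away

        long t0 = System.nanoTime();
        for (int i = 0; i < reads; i++) {
            total += copying.getChildren().size();
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < reads; i++) {
            total += immutable.getChildren().size();
        }
        long t2 = System.nanoTime();

        System.out.println("copying getter:   " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("immutable getter: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("elements seen:    " + total);
    }
}
```

Whether this accounts for the whole regression in the parser is not established here; it only demonstrates the per-call cost difference between the two getter styles.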