[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319972#comment-17319972 ]
Adrien Grand commented on LUCENE-9335:
--------------------------------------

bq. I made the following changes, and actually still saw varying benchmark result across runs (randomized queries?)

Indeed, the benchmark randomly picks queries from the tasks file.

bq. Changes in benchUtil.py to not verify counts

Actually, you should be able to do this without modifying the benchmarking code, by configuring the Competition object in your localrun file so that it does not verify counts: {{comp = competition.Competition(verifyCounts=False)}}

bq. When I run luceneutil, I see further errors from verifyScores section of code, which may indicate bugs in my changes:

Indeed, this indicates that the query returns different top hits with your change. If the difference were on the order of one ulp, it could be explained by the fact that the sum may depend on the order in which the clauses' scores are added up. Given the significant score difference, though, there must be a bigger problem. Have you run the tests with this change? That could help figure out where the bug is.

> Add a bulk scorer for disjunctions that does dynamic pruning
> ------------------------------------------------------------
>
>                 Key: LUCENE-9335
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9335
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Lucene often gets benchmarked against other engines, e.g. against Tantivy and
> PISA at [https://tantivy-search.github.io/bench/] or against research
> prototypes in Table 1 of
> [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf].
> Given that top-level disjunctions of term queries are commonly used for
> benchmarking, it would be nice to optimize this case a bit more, I suspect
> that we could make fewer per-document decisions by implementing a BulkScorer
> instead of a Scorer.
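On the one-ulp remark in the comment above: the following is a minimal, illustrative sketch of why summing the same clause scores in a different order can change the result by one ulp. It uses plain Python floats (64-bit doubles) and made-up score values, not Lucene's actual scoring code. The helper names {{sum_in_order}} and {{within_ulps}} are hypothetical and not part of luceneutil.

```python
import math

def sum_in_order(scores):
    """Add scores left to right, as a scorer visiting clauses in sequence would."""
    total = 0.0
    for s in scores:
        total += s
    return total

def within_ulps(a, b, n=1):
    """Hypothetical helper: True if a and b differ by at most n ulps (needs Python 3.9+)."""
    return abs(a - b) <= n * math.ulp(max(abs(a), abs(b)))

# Made-up per-clause scores, for illustration only.
scores = [0.1, 0.2, 0.3]

forward = sum_in_order(scores)             # (0.1 + 0.2) + 0.3
backward = sum_in_order(reversed(scores))  # (0.3 + 0.2) + 0.1

print(forward, backward)               # the two sums are not bit-identical
print(within_ulps(forward, backward))  # but they agree to within one ulp
```

A difference of this size is harmless rounding noise. A difference well beyond a few ulps, as reported here, cannot be explained by summation order and points to a real bug.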