[jira] [Updated] (LUCENE-9335) Add a bulk scorer for disjunctions that does dynamic pruning
[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zach Chen updated LUCENE-9335: -- Attachment: wikimedium.10M.nostopwords.tasks > Add a bulk scorer for disjunctions that does dynamic pruning > > > Key: LUCENE-9335 > URL: https://issues.apache.org/jira/browse/LUCENE-9335 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: wikimedium.10M.nostopwords.tasks > > Time Spent: 2.5h > Remaining Estimate: 0h > > Lucene often gets benchmarked against other engines, e.g. against Tantivy and > PISA at [https://tantivy-search.github.io/bench/] or against research > prototypes in Table 1 of > [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf]. > Given that top-level disjunctions of term queries are commonly used for > benchmarking, it would be nice to optimize this case a bit more, I suspect > that we could make fewer per-document decisions by implementing a BulkScorer > instead of a Scorer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9335) Add a bulk scorer for disjunctions that does dynamic pruning
[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zach Chen updated LUCENE-9335: -- Attachment: wikimedium.10M.nostopwords.tasks.5OrMeds > Add a bulk scorer for disjunctions that does dynamic pruning > > > Key: LUCENE-9335 > URL: https://issues.apache.org/jira/browse/LUCENE-9335 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: wikimedium.10M.nostopwords.tasks, > wikimedium.10M.nostopwords.tasks.5OrMeds > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Lucene often gets benchmarked against other engines, e.g. against Tantivy and > PISA at [https://tantivy-search.github.io/bench/] or against research > prototypes in Table 1 of > [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf]. > Given that top-level disjunctions of term queries are commonly used for > benchmarking, it would be nice to optimize this case a bit more, I suspect > that we could make fewer per-document decisions by implementing a BulkScorer > instead of a Scorer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9335) Add a bulk scorer for disjunctions that does dynamic pruning
[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-9335: - Attachment: MSMarcoPassages.java > Add a bulk scorer for disjunctions that does dynamic pruning > > > Key: LUCENE-9335 > URL: https://issues.apache.org/jira/browse/LUCENE-9335 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: MSMarcoPassages.java, wikimedium.10M.nostopwords.tasks, > wikimedium.10M.nostopwords.tasks.5OrMeds > > Time Spent: 6h 50m > Remaining Estimate: 0h > > Lucene often gets benchmarked against other engines, e.g. against Tantivy and > PISA at [https://tantivy-search.github.io/bench/] or against research > prototypes in Table 1 of > [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf]. > Given that top-level disjunctions of term queries are commonly used for > benchmarking, it would be nice to optimize this case a bit more, I suspect > that we could make fewer per-document decisions by implementing a BulkScorer > instead of a Scorer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9335) Add a bulk scorer for disjunctions that does dynamic pruning
[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zach Chen updated LUCENE-9335: -- Attachment: JFR result for BMM scorer with optimizations May 22.png > Add a bulk scorer for disjunctions that does dynamic pruning > > > Key: LUCENE-9335 > URL: https://issues.apache.org/jira/browse/LUCENE-9335 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: JFR result for BMM scorer with optimizations May 22.png, > MSMarcoPassages.java, wikimedium.10M.nostopwords.tasks, > wikimedium.10M.nostopwords.tasks.5OrMeds > > Time Spent: 6h 50m > Remaining Estimate: 0h > > Lucene often gets benchmarked against other engines, e.g. against Tantivy and > PISA at [https://tantivy-search.github.io/bench/] or against research > prototypes in Table 1 of > [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf]. > Given that top-level disjunctions of term queries are commonly used for > benchmarking, it would be nice to optimize this case a bit more, I suspect > that we could make fewer per-document decisions by implementing a BulkScorer > instead of a Scorer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org