[ https://issues.apache.org/jira/browse/LUCENE-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049941#comment-14049941 ]
Terry Smith commented on LUCENE-5796:
-------------------------------------

Thanks for taking the time to review my patch and comment on the approach.

The reason that I advocated changing FilterScorer and BoostedScorer is to allow some of my custom Query implementations to use a regular BooleanQuery for recall and optionally scoring while taking advantage of the actual Scorers used on a per-document, per-clause basis. This has been working great across quite a few Lucene releases but failed when I upgraded to 4.9 due to the two regressions in behavior for Scorer.getChildren() as described in this ticket.

In this scenario, a BooleanQuery containing two TermQueries (one a miss and the other a hit) returns the following from BooleanWeight.scorer():

* BoostedScorer
** TermScorer (hit)

Calling getChildren() on this returns an empty list because the BoostedScorer just returns in.getChildren() and thus you are unable to navigate to the actual TermScorer in play. This would impact any classes that extend FilterScorer and don't override getChildren().

In other words, the current wiring does make the BoostedScorer transparent but with the disadvantage of hiding the actual scorer that performs the work.

If this is an unsupported workflow, I'm happy to move the discussion over to the user mailing list.

> Scorer.getChildren() can throw or hide a subscorer for some boolean queries
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-5796
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5796
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.9
>            Reporter: Terry Smith
>            Priority: Minor
>         Attachments: LUCENE-5796.patch
>
>
> I've isolated two example boolean queries that don't behave with release 4.9 of Lucene.
> # A BooleanQuery with three SHOULD clauses and a minimumNumberShouldMatch of 2 will throw an ArrayIndexOutOfBoundsException.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 2
>   at __randomizedtesting.SeedInfo.seed([2F79B3DF917D071B:2539E6DBC4DF793C]:0)
>   at org.apache.lucene.search.MinShouldMatchSumScorer.getChildren(MinShouldMatchSumScorer.java:119)
>   at org.apache.lucene.search.TestBooleanQueryVisitSubscorers$ScorerSummarizingCollector.summarizeScorer(TestBooleanQueryVisitSubscorers.java:261)
>   at org.apache.lucene.search.TestBooleanQueryVisitSubscorers$ScorerSummarizingCollector.setScorer(TestBooleanQueryVisitSubscorers.java:238)
>   at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:161)
>   at org.apache.lucene.search.AssertingBulkScorer.score(AssertingBulkScorer.java:64)
>   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
>   at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:94)
>   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:309)
>   at org.apache.lucene.search.TestBooleanQueryVisitSubscorers.testGetChildrenMinShouldMatchSumScorer(TestBooleanQueryVisitSubscorers.java:196)
> {noformat}
> # A BooleanQuery with two should clauses, one of which is a miss for all documents in the current segment will accidentally mask the scorer that was a hit.
>
> Unit tests and patch based on {{branch_4x}} are available and will be attached as soon as this ticket has a number.
> They are immediately available on GitHub on branch [shebiki/bqgetchildren|https://github.com/shebiki/lucene-solr/commits/bqgetchildren] as commit [c64bb6f|https://github.com/shebiki/lucene-solr/commit/c64bb6f2df8f33dd8daafc953d9c27b5cbf29fa3].
>
> I took the liberty of naming the relationship in BoostingScorer.getChildren() {{BOOSTING}}. Suspect someone will offer a better name for this. Here is a summary of the various relationships in play for all Scorer.getChildren() implementations on {{branch_4x}} to help choose.
> || class || relationships ||
> | org.apache.lucene.search.AssertingScorer | SHOULD |
> | org.apache.lucene.search.join.ToParentBlockJoinQuery.BlockJoinScorer | BLOCK_JOIN |
> | org.apache.lucene.search.ConjunctionScorer | MUST |
> | org.apache.lucene.search.ConstantScoreQuery.ConstantScorer | constant |
> | org.apache.lucene.queries.function.BoostedQuery.CustomScorer | CUSTOM |
> | org.apache.lucene.queries.CustomScoreQuery.CustomScorer | CUSTOM |
> | org.apache.lucene.search.DisjunctionScorer | SHOULD |
> | org.apache.lucene.facet.DrillSidewaysScorer.FakeScorer | MUST |
> | org.apache.lucene.search.FilterScorer | calls in.getChildren() |
> | org.apache.lucene.search.ScoreCachingWrappingScorer | CACHED |
> | org.apache.lucene.search.FilteredQuery.LeapFrogScorer | FILTERED |
> | org.apache.lucene.search.MinShouldMatchSumScorer | SHOULD |
> | org.apache.lucene.search.FilteredQuery | FILTERED |
> | org.apache.lucene.search.ReqExclScorer | MUST |
> | org.apache.lucene.search.ReqOptSumScorer | MUST, SHOULD |
> | org.apache.lucene.search.join.ToChildBlockJoinQuery | BLOCK_JOIN |
>
> I also removed FilterScorer.getChildren() to prevent mistakes and force subclasses to provide a correct implementation.

--
This message was sent by Atlassian JIRA (v6.2#6252)
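
For readers following the second regression described in the comment, the masking effect can be reproduced with a small standalone model. This is only a sketch under stated assumptions: the classes below are simplified stand-ins written for illustration and are not Lucene's classes, the names `GetChildrenSketch`, `DelegatingBoostedScorer`, `ExposingBoostedScorer` and `reachableNames` are hypothetical, and the `BOOSTING` relationship name is the one proposed in the ticket rather than an established constant. The second wrapper shows one possible way to expose the wrapped scorer; it is not presented as the contents of the attached patch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Standalone model, NOT Lucene code: every class here is a hypothetical stand-in.
class GetChildrenSketch {

    // Minimal stand-in for a scorer; only child navigation is modelled.
    static abstract class Scorer {
        final String name;
        Scorer(String name) { this.name = name; }
        // Leaf scorers report no children by default.
        Collection<ChildScorer> getChildren() { return Collections.emptyList(); }
    }

    // Pairs a child scorer with the relationship it has to its parent.
    static final class ChildScorer {
        final Scorer child;
        final String relationship;
        ChildScorer(Scorer child, String relationship) {
            this.child = child;
            this.relationship = relationship;
        }
    }

    // Leaf scorer standing in for the TermScorer that produced the hit.
    static final class TermScorer extends Scorer {
        TermScorer() { super("TermScorer"); }
    }

    // Wrapper that forwards getChildren() to the wrapped scorer, as the
    // comment describes. The wrapped leaf has no children, so it is hidden.
    static final class DelegatingBoostedScorer extends Scorer {
        final Scorer in;
        DelegatingBoostedScorer(Scorer in) { super("BoostedScorer"); this.in = in; }
        @Override
        Collection<ChildScorer> getChildren() { return in.getChildren(); }
    }

    // Alternative wrapper that reports the wrapped scorer itself as its child.
    static final class ExposingBoostedScorer extends Scorer {
        final Scorer in;
        ExposingBoostedScorer(Scorer in) { super("BoostedScorer"); this.in = in; }
        @Override
        Collection<ChildScorer> getChildren() {
            return Collections.singletonList(new ChildScorer(in, "BOOSTING"));
        }
    }

    // Depth-first walk collecting the name of every scorer reachable from root.
    static List<String> reachableNames(Scorer root) {
        List<String> names = new ArrayList<>();
        collect(root, names);
        return names;
    }

    private static void collect(Scorer scorer, List<String> out) {
        out.add(scorer.name);
        for (ChildScorer c : scorer.getChildren()) {
            collect(c.child, out);
        }
    }

    public static void main(String[] args) {
        // With delegation the walk never reaches the TermScorer.
        System.out.println(reachableNames(new DelegatingBoostedScorer(new TermScorer())));
        // With the wrapped scorer exposed as a child, the walk reaches it.
        System.out.println(reachableNames(new ExposingBoostedScorer(new TermScorer())));
    }
}
```

In this model a collector that walks the tree from the top-level scorer only sees the TermScorer when the wrapper lists it as a child, which is the navigation the comment says stopped working in 4.9.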