[ https://issues.apache.org/jira/browse/LUCENE-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834132#comment-15834132 ]
ASF subversion and git services commented on LUCENE-7628:
---------------------------------------------------------

Commit 5bdc492c9ca8f866d9827d83a05fbab4b95f5ce9 in lucene-solr's branch refs/heads/master from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5bdc492 ]

LUCENE-7628: Scorer.getChildren() returns only matching Scorers

> Add a getMatchingChildren() method to DisjunctionScorer
> -------------------------------------------------------
>
>                 Key: LUCENE-7628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7628
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Minor
>             Fix For: 6.4
>
>         Attachments: LUCENE-7628.patch, LUCENE-7628.patch
>
>
> This one is a bit convoluted, so bear with me...
> The luwak highlighter works by rewriting queries into their Span-equivalents,
> and then running them with a special Collector. At each matching doc, the
> highlighter gathers all the Spans objects positioned on the current doc and
> collects their positions using the SpanCollection API.
> Some queries can't be translated into Spans. For those queries that generate
> Scorers with ChildScorers, like BooleanQuery, we can call .getChildren() on
> the Scorer and see if any of them are SpanScorers, and for those that aren't
> we can call .getChildren() again and recurse down. For each child scorer, we
> check that it's positioned on the current document, so non-matching
> subscorers can be skipped.
> This all works correctly *except* in the case of a DisjunctionScorer where
> one of the children is a two-phase iterator that has matched its
> approximation, but not its refinement query. A SpanScorer in this situation
> will be correctly positioned on the current document, but its Spans will be
> in an undefined state, meaning the highlighter will either collect incorrect
> hits, or it will throw an Exception and prevent hits being collected from
> other subspans.
> We've tried various ways around this (including forking SpanNearQuery and
> adding a bunch of slow position checks to it that are used only by the
> highlighting code), but it turns out that the simplest fix is to add a new
> method to DisjunctionScorer that only returns the currently matching child
> Scorers. It's a bit of a hack, and it won't be used anywhere else, but it's
> a fairly small and contained hack.
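
The failure case in the description can be illustrated with a small, self-contained Java sketch. It is a toy model and not Lucene code: ToyScorer, positionedLeaves and matchingLeaves are hypothetical names invented for illustration, and the refinementMatches flag is an assumed stand-in for the second phase of a two-phase iterator. It only shows why checking "positioned on the current doc" is not enough, and how restricting the walk to matching children avoids the problem.

```java
import java.util.ArrayList;
import java.util.List;

public class MatchingChildrenSketch {

    /** Hypothetical stand-in for a scorer; this is NOT the Lucene Scorer API. */
    static class ToyScorer {
        final String name;
        final int docId;                 // doc the approximation is positioned on
        final boolean refinementMatches; // assumed result of the refinement phase
        final List<ToyScorer> children = new ArrayList<>();

        ToyScorer(String name, int docId, boolean refinementMatches) {
            this.name = name;
            this.docId = docId;
            this.refinementMatches = refinementMatches;
        }

        ToyScorer add(ToyScorer child) {
            children.add(child);
            return this;
        }
    }

    /** Naive walk: keeps every leaf positioned on doc, ignoring refinement. */
    static List<String> positionedLeaves(ToyScorer scorer, int doc) {
        List<String> out = new ArrayList<>();
        collect(scorer, doc, false, out);
        return out;
    }

    /** Stricter walk: keeps only leaves that are positioned on doc and matched. */
    static List<String> matchingLeaves(ToyScorer scorer, int doc) {
        List<String> out = new ArrayList<>();
        collect(scorer, doc, true, out);
        return out;
    }

    private static void collect(ToyScorer scorer, int doc,
                                boolean requireRefinement, List<String> out) {
        if (scorer.docId != doc) {
            return; // not positioned on this document, skip the whole subtree
        }
        if (requireRefinement && !scorer.refinementMatches) {
            return; // approximation matched but refinement did not
        }
        if (scorer.children.isEmpty()) {
            out.add(scorer.name); // leaf: the highlighter would collect from it
            return;
        }
        for (ToyScorer child : scorer.children) {
            collect(child, doc, requireRefinement, out); // recurse down
        }
    }

    public static void main(String[] args) {
        ToyScorer disjunction = new ToyScorer("or", 7, true)
            .add(new ToyScorer("spanA", 7, true))
            .add(new ToyScorer("spanB", 7, false)) // approximation only
            .add(new ToyScorer("spanC", 9, true)); // on a later document

        System.out.println(positionedLeaves(disjunction, 7)); // [spanA, spanB]
        System.out.println(matchingLeaves(disjunction, 7));   // [spanA]
    }
}
```

In this model spanB plays the role of the SpanScorer whose Spans would be in an undefined state: the naive walk hands it to the highlighter, while the stricter walk does not.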