[ https://issues.apache.org/jira/browse/LUCENE-8477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793928#comment-16793928 ]
Alan Woodward commented on LUCENE-8477:
---------------------------------------

Here's a better patch, using term counting rather than prefix matching - the latter won't work if we have stacked tokens, for example, and this makes things much simpler.

> Improve handling of inner disjunctions in intervals
> ---------------------------------------------------
>
>                 Key: LUCENE-8477
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8477
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8477.patch, LUCENE-8477.patch
>
>
> The current implementation of the disjunction interval produced by
> {{Intervals.or}} is a direct implementation of the OR operator from the Vigna
> paper. This produces minimal intervals, meaning that (a) is preferred over
> (a b), and (b) also over (a b). This has advantages when it comes to
> counting intervals for scoring, but also has drawbacks when it comes to
> matching. For example, a phrase query for ((a OR (a b)) BLOCK (c)) will not
> match the document (a b c), because (a) will be preferred over (a b), and
> (a c) does not match.
> This ticket is to discuss the best way of dealing with disjunctions.
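
The matching problem in the issue description can be reproduced with a small toy model. This is only a sketch in plain Python, not Lucene code and not the approach taken in the attached patch: intervals are modelled as inclusive (start, end) token positions, and the helper names (term, block, union, minimal, or_minimal) are made up for illustration.

```python
# Toy model of interval matching. Not Lucene's implementation; all
# function names here are hypothetical and exist only for illustration.

def term(doc, t):
    """Intervals for a single term: one (pos, pos) per occurrence."""
    return [(i, i) for i, tok in enumerate(doc) if tok == t]

def block(left, right):
    """BLOCK: a left interval immediately followed by a right interval."""
    return sorted({(ls, re)
                   for (ls, le) in left
                   for (rs, re) in right
                   if le + 1 == rs})

def union(*sources):
    """Plain union of all intervals, with no minimization."""
    return sorted({iv for src in sources for iv in src})

def minimal(intervals):
    """Keep only intervals that do not contain another, different interval."""
    return [iv for iv in intervals
            if not any(other != iv
                       and iv[0] <= other[0] and other[1] <= iv[1]
                       for other in intervals)]

def or_minimal(*sources):
    """Disjunction that yields minimal intervals only."""
    return minimal(union(*sources))

doc = ["a", "b", "c"]
a = term(doc, "a")                          # [(0, 0)]
ab = block(term(doc, "a"), term(doc, "b"))  # [(0, 1)]
c = term(doc, "c")                          # [(2, 2)]

# Minimal-interval OR drops (a b) because it contains (a).
print(or_minimal(a, ab))            # [(0, 0)]
# So the block with (c) finds nothing: (a) is not adjacent to (c).
print(block(or_minimal(a, ab), c))  # []
# Keeping every alternative lets (a b) combine with (c).
print(block(union(a, ab), c))       # [(0, 2)]
```

The last line is there only for contrast. It shows that the non-minimal interval (a b) is the one the outer BLOCK needs, and it says nothing about how the patch decides which alternatives to keep.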