[jira] [Commented] (LUCENE-8828) Fix Intervals.unordered() without overlaps
[ https://issues.apache.org/jira/browse/LUCENE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858680#comment-16858680 ]

ASF subversion and git services commented on LUCENE-8828:
---------------------------------------------------------

Commit cd030efa9c91a0da0a0b9d4a4003161bb775ed61 in lucene-solr's branch refs/heads/branch_8x from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cd030ef ]

LUCENE-8828: Make unorderedNoOverlaps a separate IntervalsSource

> Fix Intervals.unordered() without overlaps
> ------------------------------------------
>
>                 Key: LUCENE-8828
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8828
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8828.patch
>
>
> LUCENE-8300 added an option to Intervals.unordered() which would attempt to
> find intervals that contained all of a set of subintervals where none of the
> subintervals overlapped. Unfortunately, this implementation was buggy, and
> could miss documents depending on the order in which the subintervals were
> passed to the factory method.
> After some digging around, I think that it is not in fact possible to
> implement this in anything other than n! time, because of the need to
> minimize the resulting intervals. My proposal is to remove the boolean flag,
> and instead implement an Intervals.unorderedNoOverlaps() method that takes
> only two subsources, and rewrites NO_OVERLAPS(a, b) to OR(ORDERED(a, b),
> ORDERED(b, a)). The usual simplifications will apply here, so NO_OVERLAPS(a,
> a) will end up as ORDERED(a, a).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
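The rewrite proposed in the issue description, NO_OVERLAPS(a, b) to OR(ORDERED(a, b), ORDERED(b, a)), can be illustrated with a small self-contained toy model. This is only a sketch and not Lucene's implementation: the class NoOverlapsSketch, its Interval type and the helper methods are hypothetical names invented for illustration, intervals are plain token-position pairs, and no interval minimization is attempted. It shows why the result no longer depends on the order in which the two subsources are passed.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: all names here are hypothetical and are not part of the Lucene API.
final class NoOverlapsSketch {

    /** A span of token positions, inclusive at both ends. */
    static final class Interval {
        final int start;
        final int end;

        Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + "]";
        }
    }

    /** Single-token intervals at every position where the term occurs. */
    static List<Interval> term(String term, String... tokens) {
        List<Interval> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                out.add(new Interval(i, i));
            }
        }
        return out;
    }

    /** ORDERED(a, b) in this model: an interval from a ends strictly before one from b starts. */
    static List<Interval> ordered(List<Interval> a, List<Interval> b) {
        List<Interval> out = new ArrayList<>();
        for (Interval x : a) {
            for (Interval y : b) {
                if (x.end < y.start) {
                    out.add(new Interval(x.start, y.end));
                }
            }
        }
        return out;
    }

    /** OR(a, b): every interval produced by either source. */
    static List<Interval> or(List<Interval> a, List<Interval> b) {
        List<Interval> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    /** NO_OVERLAPS(a, b) rewritten as OR(ORDERED(a, b), ORDERED(b, a)). */
    static List<Interval> noOverlaps(List<Interval> a, List<Interval> b) {
        return or(ordered(a, b), ordered(b, a));
    }

    public static void main(String[] args) {
        String[] doc = {"w1", "w2", "w3"};
        List<Interval> a = term("w3", doc);
        List<Interval> b = term("w1", doc);
        // ORDERED(a, b) alone finds nothing, because w3 never precedes w1.
        System.out.println("ordered(a, b)    = " + ordered(a, b));
        // The rewrite tries both orders, so argument order does not matter.
        System.out.println("noOverlaps(a, b) = " + noOverlaps(a, b));
        System.out.println("noOverlaps(b, a) = " + noOverlaps(b, a));
    }
}
```

Because the rewrite enumerates both orderings explicitly, it only covers two subsources; extending the same idea to n subsources would mean one ORDERED clause per permutation, which is where the n! cost mentioned in the description comes from.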
[jira] [Commented] (LUCENE-8828) Fix Intervals.unordered() without overlaps
[ https://issues.apache.org/jira/browse/LUCENE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858681#comment-16858681 ]

ASF subversion and git services commented on LUCENE-8828:
---------------------------------------------------------

Commit 67677d995e8fca6214844b0eb814d06138bce0a2 in lucene-solr's branch refs/heads/master from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=67677d9 ]

LUCENE-8828: Make unorderedNoOverlaps a separate IntervalsSource
[jira] [Commented] (LUCENE-8828) Fix Intervals.unordered() without overlaps
[ https://issues.apache.org/jira/browse/LUCENE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855924#comment-16855924 ]

Matt Weber commented on LUCENE-8828:
------------------------------------

[~romseygeek] Sounds good, thanks!
[jira] [Commented] (LUCENE-8828) Fix Intervals.unordered() without overlaps
[ https://issues.apache.org/jira/browse/LUCENE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855907#comment-16855907 ]

Alan Woodward commented on LUCENE-8828:
---------------------------------------

Right, that would match `w2 w3` or `w2 w2` but not `w2` by itself - I'll add a test to ensure that case is covered.
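The case discussed in this reply, NO_OVERLAPS(OR(a, b), a), can be checked with a minimal toy model. The sketch below is not Lucene code: NoOverlapsEdgeCase and its methods are hypothetical names, sources are reduced to lists of token positions, and it assumes that a stands for the term w2 and b for the term w3, which is one reading consistent with the examples given in the reply.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, single-token sources, not the Lucene API.
final class NoOverlapsEdgeCase {

    /** Positions where any of the given terms occurs; models a term or an OR of terms. */
    static List<Integer> anyOf(String[] doc, String... terms) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < doc.length; i++) {
            for (String t : terms) {
                if (doc[i].equals(t)) {
                    out.add(i);
                    break;
                }
            }
        }
        return out;
    }

    /** ORDERED(x, y) for single-token sources: some x position lies strictly before some y position. */
    static boolean ordered(List<Integer> x, List<Integer> y) {
        for (int p : x) {
            for (int q : y) {
                if (p < q) {
                    return true;
                }
            }
        }
        return false;
    }

    /** NO_OVERLAPS(x, y) rewritten as OR(ORDERED(x, y), ORDERED(y, x)). */
    static boolean noOverlaps(List<Integer> x, List<Integer> y) {
        return ordered(x, y) || ordered(y, x);
    }

    /** Evaluates NO_OVERLAPS(OR(w2, w3), w2) against a space-separated document. */
    static boolean matches(String text) {
        String[] doc = text.split(" ");
        return noOverlaps(anyOf(doc, "w2", "w3"), anyOf(doc, "w2"));
    }

    public static void main(String[] args) {
        for (String text : new String[] {"w2 w3", "w2 w2", "w2"}) {
            System.out.println(text + " -> " + matches(text));
        }
    }
}
```

The strict comparison in ordered() is what keeps a lone w2 from matching: the same position can never come strictly before itself, so a single token cannot satisfy both sides of the query.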
[jira] [Commented] (LUCENE-8828) Fix Intervals.unordered() without overlaps
[ https://issues.apache.org/jira/browse/LUCENE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855897#comment-16855897 ]

Matt Weber commented on LUCENE-8828:
------------------------------------

[~romseygeek] Yup, that would work... would this be able to handle something like {{NO_OVERLAPS(OR(a,b), a)}}? I wouldn't want a single token {{a}} to match this.
[jira] [Commented] (LUCENE-8828) Fix Intervals.unordered() without overlaps
[ https://issues.apache.org/jira/browse/LUCENE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855875#comment-16855875 ]

Alan Woodward commented on LUCENE-8828:
---------------------------------------

cc [~mattweber], who asked for this in the first place - does the two-subsource implementation work for your purposes?
[jira] [Commented] (LUCENE-8828) Fix Intervals.unordered() without overlaps
[ https://issues.apache.org/jira/browse/LUCENE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855872#comment-16855872 ]

Alan Woodward commented on LUCENE-8828:
---------------------------------------

In particular, the query described in LUCENE-2861 does not work with the current implementation, but does with the two-subsource rewrite implementation detailed above.