[ 
https://issues.apache.org/jira/browse/LUCENE-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596348#comment-16596348
 ] 

Alan Woodward commented on LUCENE-8196:
---------------------------------------

Hi [~Martin Hermann]

Thanks for the detailed feedback - this is very helpful!

1) As with Spans, one way to fix the issue with OR intervals is to change the 
precedence rules so that longer intervals sort before their prefixes.  I need 
to go back and re-read the paper's proof concerning the OR operator, as it 
would be interesting to see whether this ends up causing problems elsewhere.  
Another option would be to add a separate IntervalsSource with this behaviour, 
perhaps enabled by a parameter on {{Intervals.or()}}.
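
To make the ordering in (1) concrete, here is a minimal sketch in plain Java. It is a toy model only: the {{Interval}} class, the comparator and the {{sorted}} helper are hypothetical names made up for illustration and are not part of the Lucene API or of the attached patches. It assumes a "prefix" means an interval with the same start position and a smaller end position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class IntervalOrderSketch {

    /** Toy interval over token positions; both ends are inclusive. */
    static final class Interval {
        final int start;
        final int end;

        Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + "]";
        }
    }

    /** Earlier start first; on equal starts, the longer interval (larger end) first. */
    static final Comparator<Interval> LONGER_BEFORE_PREFIX = (a, b) ->
        a.start != b.start
            ? Integer.compare(a.start, b.start)
            : Integer.compare(b.end, a.end);

    /** Returns a new list holding the given intervals in the order defined above. */
    static List<Interval> sorted(Interval... intervals) {
        List<Interval> out = new ArrayList<>(Arrays.asList(intervals));
        out.sort(LONGER_BEFORE_PREFIX);
        return out;
    }

    public static void main(String[] args) {
        // "big bad" at positions 3-4, its prefix "big" at position 3, and a later term at 6.
        System.out.println(sorted(new Interval(3, 3), new Interval(6, 6), new Interval(3, 4)));
        // prints [[3,4], [3,3], [6,6]]
    }
}
```

With this ordering the two-position interval is seen before the one-position interval that starts at the same place, which is the behaviour described above. Whether that is safe with respect to the paper's proof is still the open question.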

2) Intervals don't really have the notion of 'slop' that Spans do, but we could 
add the idea of an 'internal slop' to ordered and unordered intervals.  This 
would be measured as the space within an interval not taken up by the component 
intervals.  I think your {{("big bad" OR evil) wolf}} query can already be done 
using {{Intervals.phrase()}}.
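
As a rough illustration of the 'internal slop' measure in (2), the sketch below computes it for a set of component intervals. This is plain Java with a hypothetical {{internalSlop}} helper, not an existing Lucene method. It assumes positions are inclusive and that the components do not overlap, which holds for ordered intervals but may not hold for unordered ones.

```java
final class InternalSlopSketch {

    /**
     * Width of the enclosing interval minus the positions covered by its components.
     * Each component is {start, end} with inclusive positions.
     * Assumes the components are non-overlapping.
     */
    static int internalSlop(int[][] components) {
        if (components.length == 0) {
            throw new IllegalArgumentException("at least one component interval is required");
        }
        int start = Integer.MAX_VALUE;
        int end = Integer.MIN_VALUE;
        int covered = 0;
        for (int[] c : components) {
            start = Math.min(start, c[0]);
            end = Math.max(end, c[1]);
            covered += c[1] - c[0] + 1;
        }
        return (end - start + 1) - covered;
    }

    public static void main(String[] args) {
        // "big bad" at positions 0-1 and "wolf" at position 4: two uncovered positions.
        System.out.println(internalSlop(new int[][] {{0, 1}, {4, 4}})); // prints 2
        // "big bad" at 0-1 directly followed by "wolf" at 2: no uncovered positions.
        System.out.println(internalSlop(new int[][] {{0, 1}, {2, 2}})); // prints 0
    }
}
```

A limit on this value would play the role that slop plays for Spans, without changing how the intervals themselves are minimized.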

3) Spans have the notion of a 'gap' Span, which could be usefully added here.  
This could help to avoid minimization in your CONTAINS query.

> Add IntervalQuery and IntervalsSource to expose minimum interval semantics 
> across term fields
> ---------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8196
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8196
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>             Fix For: 7.4
>
>         Attachments: LUCENE-8196-debug.patch, LUCENE-8196.patch, 
> LUCENE-8196.patch, LUCENE-8196.patch, LUCENE-8196.patch, LUCENE-8196.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This ticket proposes an alternative implementation of the SpanQuery family 
> that uses minimum-interval semantics from 
> [http://vigna.di.unimi.it/ftp/papers/EfficientAlgorithmsMinimalIntervalSemantics.pdf]
>  to implement positional queries across term-based fields.  Rather than using 
> TermQueries to construct the interval operators, as in LUCENE-2878 or the 
> current Spans implementation, we instead use a new IntervalsSource object, 
> which will produce IntervalIterators over a particular segment and field.  
> These are constructed using various static helper methods, and can then be 
> passed to a new IntervalQuery which will return documents that contain one or 
> more intervals so defined.


