[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-07-14 Thread Michael Gibney (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Gibney updated LUCENE-7848:
---
Attachment: LUCENE-7848-delimOnly-token-offset.patch

I think the remaining problem is that WordDelimiterGraphFilter is swallowing 
delimiter-only tokens and leaving a gap even when PRESERVE_ORIGINAL is set. 
[^LUCENE-7848-delimOnly-token-offset.patch] fixes this (and addresses the 
problematic gaps).
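
As a rough illustration of the gap described above, here is a minimal 
plain-Java model (it does not use Lucene, and it is not the attached patch). 
The class DelimGapSketch, its methods, and the "term/+increment" output format 
are all hypothetical names chosen for this sketch. It assumes that a dropped 
token still consumes a position, so the next emitted token carries a larger 
position increment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of a filter that drops delimiter-only tokens. Not Lucene code. */
final class DelimGapSketch {

    /** True if the term is non-empty and contains no letters or digits. */
    static boolean isDelimiterOnly(String term) {
        for (int i = 0; i < term.length(); i++) {
            if (Character.isLetterOrDigit(term.charAt(i))) {
                return false;
            }
        }
        return !term.isEmpty();
    }

    /**
     * Returns each surviving term as "term/+positionIncrement".
     * When keepDelimiterOnly is false, dropped tokens add to the increment
     * of the next surviving token, which is what produces the gap.
     */
    static List<String> filter(List<String> terms, boolean keepDelimiterOnly) {
        List<String> out = new ArrayList<>();
        int pendingIncrement = 1;
        for (String term : terms) {
            if (isDelimiterOnly(term) && !keepDelimiterOnly) {
                pendingIncrement++; // token is dropped, but its position is still consumed
                continue;
            }
            out.add(term + "/+" + pendingIncrement);
            pendingIncrement = 1;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("foo", "-", "bar");
        System.out.println(filter(terms, false)); // [foo/+1, bar/+2]
        System.out.println(filter(terms, true));  // [foo/+1, -/+1, bar/+1]
    }
}
```

In this model, keeping the delimiter-only token means no position is skipped, 
which is one plausible reading of the behavior expected when the original 
token is preserved.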

> QueryBuilder.analyzeGraphPhrase does not handle gaps correctly
> --
>
> Key: LUCENE-7848
> URL: https://issues.apache.org/jira/browse/LUCENE-7848
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 6.5, 6.6
> Reporter: Jim Ferenczi
> Attachments: capture-3.png, LUCENE-7848-branching-spanOr.patch, 
> LUCENE-7848-delimOnly-token-offset.patch, LUCENE-7848.patch, LUCENE-7848.patch
>
>
> Position increments greater than 1 are ignored when the query builder creates 
> a graph phrase query. 
> Instead it should use SpanNearQuery.addGap for pos incr > 1.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-07-12 Thread Michael Gibney (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Gibney updated LUCENE-7848:
---
Attachment: (was: LUCENE-7848-branching-spanOr.patch)

> QueryBuilder.analyzeGraphPhrase does not handle gaps correctly
> --
>
> Key: LUCENE-7848
> URL: https://issues.apache.org/jira/browse/LUCENE-7848
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 6.5, 6.6
> Reporter: Jim Ferenczi
> Attachments: capture-3.png, LUCENE-7848-branching-spanOr.patch, 
> LUCENE-7848.patch, LUCENE-7848.patch
>
>
> Position increments greater than 1 are ignored when the query builder creates 
> a graph phrase query. 
> Instead it should use SpanNearQuery.addGap for pos incr > 1.






[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-07-12 Thread Michael Gibney (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Gibney updated LUCENE-7848:
---
Attachment: LUCENE-7848-branching-spanOr.patch




[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-07-12 Thread Michael Gibney (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Gibney updated LUCENE-7848:
---
Attachment: (was: LUCENE-7848-branching-spanOr.patch)




[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-07-12 Thread Michael Gibney (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Gibney updated LUCENE-7848:
---
Attachment: LUCENE-7848-branching-spanOr.patch

Sorry, updated patch.




[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-07-12 Thread Michael Gibney (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Gibney updated LUCENE-7848:
---
Attachment: LUCENE-7848-branching-spanOr.patch

"Could be a bug somewhere in span queries." -- I think the remaining problem 
here is that only one branch (the shortest) of a SpanOrQuery is evaluated, and 
the "spanOr" is then reported as a match (or not) with the width/positionEnd 
of that shortest branch. The branches of a "spanOr" will routinely differ in 
length when GraphFilters are used, as in the above test. In that case only the 
shorter branch is evaluated. If a longer branch also matches, the subsequent 
tokens sit at a later position than the shorter branch implies, so the 
enclosing "spanNear" sees a larger-than-expected slop and fails to match. 

[^LUCENE-7848-branching-spanOr.patch] adjusts SpanOrQuery to support repeated 
calls to nextStartPosition() that return the same startPosition but different 
endPositions. The subSpan clauses of the "spanOr" are popped off the 
priorityQueue, retained, and restored once the subSpans are exhausted (when 
it's time to move on to the next potential match). Some corresponding changes 
were necessary to make NearSpansOrdered aware of the new "spanOr" behavior, so 
that it conditionally evaluates as many branches of "spanOr" clauses as 
necessary to match (or not) on the full "spanNear".

Other code that calls the modified "spanOr" may also need changes to account 
for its new behavior, but with this patch applied, all the tests in 
TestWordDelimiterGraphFilter pass (including the new testLucene7848()). 
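
The branch-length problem described above can be sketched in plain Java. This 
is only a model of the reasoning, not Lucene's Spans API and not the attached 
patch: the class SpanOrBranchSketch, the Span type, and both matching methods 
are hypothetical. It assumes an ordered "spanNear" with slop 0, where each 
clause offers one or more alternative spans (the branches of a "spanOr").

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of ordered, slop-0 matching over clauses with alternatives. */
final class SpanOrBranchSketch {

    /** Half-open span [start, end) measured in token positions. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Uses only the alternative with the smallest end position for each clause. */
    static boolean matchShortestOnly(List<List<Span>> clauses) {
        int prevEnd = -1;
        for (List<Span> alternatives : clauses) {
            Span shortest = null;
            for (Span s : alternatives) {
                if (shortest == null || s.end < shortest.end) {
                    shortest = s;
                }
            }
            if (shortest == null) {
                return false; // clause has no candidate spans
            }
            if (prevEnd >= 0 && shortest.start != prevEnd) {
                return false; // next clause does not start where the previous one ended
            }
            prevEnd = shortest.end;
        }
        return true;
    }

    /** Tries every alternative of every clause, backtracking when a branch fails. */
    static boolean matchAllBranches(List<List<Span>> clauses, int index, int prevEnd) {
        if (index == clauses.size()) {
            return true;
        }
        for (Span s : clauses.get(index)) {
            boolean adjacent = index == 0 || s.start == prevEnd;
            if (adjacent && matchAllBranches(clauses, index + 1, s.end)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // First clause: two branches starting at 0, ending at 1 and at 2.
        // Second clause: a single span [2, 3).
        List<List<Span>> clauses = Arrays.asList(
                Arrays.asList(new Span(0, 1), new Span(0, 2)),
                Arrays.asList(new Span(2, 3)));
        System.out.println(matchShortestOnly(clauses));       // false
        System.out.println(matchAllBranches(clauses, 0, -1)); // true
    }
}
```

In this model the shortest-only strategy misses the match because the second 
clause starts at position 2 while the shorter branch ends at 1. Trying the 
longer branch as well finds it.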




[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-06-19 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-7848:

Attachment: capture-3.png

Token graph for the input (the same analysis is used for indexing and search).

> QueryBuilder.analyzeGraphPhrase does not handle gaps correctly
> --
>
> Key: LUCENE-7848
> URL: https://issues.apache.org/jira/browse/LUCENE-7848
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 6.5, 6.6
> Reporter: Jim Ferenczi
> Attachments: capture-3.png, LUCENE-7848.patch, LUCENE-7848.patch
>
>
> Position increments greater than 1 are ignored when the query builder creates 
> a graph phrase query. 
> Instead it should use SpanNearQuery.addGap for pos incr > 1.






[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-06-19 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-7848:

Attachment: LUCENE-7848.patch

Here's a test (testLucene7848) that reproduces the behavior observed in Solr. 
To me this should work, right?

I haven't yet compared the emitted token streams against the query -- I have 
to switch context now, but that would be a good place to start figuring out 
what's happening.

> QueryBuilder.analyzeGraphPhrase does not handle gaps correctly
> --
>
> Key: LUCENE-7848
> URL: https://issues.apache.org/jira/browse/LUCENE-7848
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 6.5, 6.6
> Reporter: Jim Ferenczi
> Attachments: LUCENE-7848.patch, LUCENE-7848.patch
>
>
> Position increments greater than 1 are ignored when the query builder creates 
> a graph phrase query. 
> Instead it should use SpanNearQuery.addGap for pos incr > 1.






[jira] [Updated] (LUCENE-7848) QueryBuilder.analyzeGraphPhrase does not handle gaps correctly

2017-06-15 Thread Jim Ferenczi (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Ferenczi updated LUCENE-7848:
-
Attachment: LUCENE-7848.patch

Here is a simple patch that supports gaps in QueryBuilder#createSpanQuery and 
QueryBuilder#analyzeGraphPhrase.
QueryBuilder#createSpanQuery could also handle zero position increments, but 
that's probably another issue.
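
To make the idea concrete, here is a minimal plain-Java model of turning 
position increments into explicit gaps. It is not the attached patch and uses 
no Lucene classes. GapSketch, Token, and the "gap(n)" strings are hypothetical 
stand-ins. In real code the gap entries would presumably correspond to calls 
such as SpanNearQuery.Builder#addGap, as the issue description suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of gap handling when building an ordered span query. */
final class GapSketch {

    /** Hypothetical stand-in for an analyzed token: its text and position increment. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /**
     * Converts tokens into an ordered clause list, emitting "gap(n)" wherever
     * a position increment greater than 1 skips n = increment - 1 positions.
     * Simplifications of this sketch: a gap before the first token is ignored,
     * and zero increments (stacked tokens) are not modeled.
     */
    static List<String> toClauses(List<Token> tokens) {
        List<String> clauses = new ArrayList<>();
        boolean first = true;
        for (Token token : tokens) {
            int skipped = token.positionIncrement - 1;
            if (!first && skipped > 0) {
                clauses.add("gap(" + skipped + ")");
            }
            clauses.add(token.term);
            first = false;
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
                new Token("quick", 1),
                new Token("fox", 2)); // one position was skipped before "fox"
        System.out.println(toClauses(tokens)); // [quick, gap(1), fox]
    }
}
```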


> QueryBuilder.analyzeGraphPhrase does not handle gaps correctly
> --
>
> Key: LUCENE-7848
> URL: https://issues.apache.org/jira/browse/LUCENE-7848
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 6.5, 6.6
> Reporter: Jim Ferenczi
> Attachments: LUCENE-7848.patch
>
>
> Position increments greater than 1 are ignored when the query builder creates 
> a graph phrase query. 
> Instead it should use SpanNearQuery.addGap for pos incr > 1.


