[GitHub] [lucene-solr] dweiss commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals

2020-12-08 Thread GitBox


dweiss commented on a change in pull request #2127:
URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r538796480



##
File path: 
lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java
##
@@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException {
 );
   }
 
+  @Test
+  public void testDegenerateIntervalsWithPositions() throws IOException {
+testDegenerateIntervals(FLD_TEXT_POS);
+  }
+
+  @Test @AwaitsFix(bugUrl = 
"https://issues.apache.org/jira/browse/LUCENE-9634: " +

Review comment:
   I may provide a PR with those query parser changes I made if there's 
interest - they're not that difficult and they make it possible to use 
intervals from plain text queries. I'll get to it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals

2020-12-08 Thread GitBox


dweiss commented on a change in pull request #2127:
URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r538313908



##
File path: 
lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java
##
@@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException {
 );
   }
 
+  @Test
+  public void testDegenerateIntervalsWithPositions() throws IOException {
+testDegenerateIntervals(FLD_TEXT_POS);
+  }
+
+  @Test @AwaitsFix(bugUrl = 
"https://issues.apache.org/jira/browse/LUCENE-9634: " +

Review comment:
   It is extremely useful to capture and drill down in the context of 
another query. Let's say apples nearby oranges. Yes, you can achieve a similar 
thing with other queries but it's pretty useful on its own (because you can 
first inspect the context you're looking at by running the extends query in 
isolation).
   
   I've modified flexible query parser and added those functions as 
prefix-scoped "language". Looks like this:
   https://get.carrotsearch.com/lingo4g/1.12.0-SNAPSHOT/doc/#interval-functions
   
   And combine with the matches highlighter it really shines. It's the best 
when you get multiple overlapping intervals; I don't have an example won this 
computer (I have a day of on home duties) but I can send you one later on - you 
can do some really impressive stuff with intervals!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals

2020-12-08 Thread GitBox


dweiss commented on a change in pull request #2127:
URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r538162712



##
File path: 
lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java
##
@@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException {
 );
   }
 
+  @Test
+  public void testDegenerateIntervalsWithPositions() throws IOException {
+testDegenerateIntervals(FLD_TEXT_POS);
+  }
+
+  @Test @AwaitsFix(bugUrl = 
"https://issues.apache.org/jira/browse/LUCENE-9634: " +

Review comment:
   > I sort of think that just highlighting the original term is the 
correct behaviour?
   
   Hmm... I don't think I agree. When you have a query parser that allows 
intervals then extend becomes a function just like anything else. The intuitive 
user expectation for a query extend(foo 2 2) is to actually highlight the 
matching interval of positions  (well, users think of "words") pointed to by 
that interval. This is particularly important if you're building more complex 
expressions out of these (left/ right/ extend, etc.) and you wish to see 
partial fragments as you're building more focused expressions.
   
   I'm not saying this has to be fixed (neither do I know how it should) but 
it's real feedback from people who use those queries intensively (and my gut 
feeling agrees).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals

2020-12-07 Thread GitBox


dweiss commented on a change in pull request #2127:
URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r537669124



##
File path: 
lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java
##
@@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException {
 );
   }
 
+  @Test
+  public void testDegenerateIntervalsWithPositions() throws IOException {
+testDegenerateIntervals(FLD_TEXT_POS);
+  }
+
+  @Test @AwaitsFix(bugUrl = 
"https://issues.apache.org/jira/browse/LUCENE-9634: " +

Review comment:
   @romseygeek Alan - this may be interesting for you. Is this how it's 
supposed to work with fields with offsets? If you enable this test and compare 
how positions only vs. positions+offsets field behaves you'll see what I mean - 
the stored offsets actually cause an incorrect highlight in this case.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org