[GitHub] [lucene-solr] dweiss commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals
dweiss commented on a change in pull request #2127: URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r538796480 ## File path: lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java ## @@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException { ); } + @Test + public void testDegenerateIntervalsWithPositions() throws IOException { +testDegenerateIntervals(FLD_TEXT_POS); + } + + @Test @AwaitsFix(bugUrl = "https://issues.apache.org/jira/browse/LUCENE-9634: " + Review comment: I may provide a PR with those query parser changes I made if there's interest - they're not that difficult and they make it possible to use intervals from plain text queries. I'll get to it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals
dweiss commented on a change in pull request #2127: URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r538313908 ## File path: lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java ## @@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException { ); } + @Test + public void testDegenerateIntervalsWithPositions() throws IOException { +testDegenerateIntervals(FLD_TEXT_POS); + } + + @Test @AwaitsFix(bugUrl = "https://issues.apache.org/jira/browse/LUCENE-9634: " + Review comment: It is extremely useful to capture and drill down in the context of another query. Let's say apples nearby oranges. Yes, you can achieve a similar thing with other queries but it's pretty useful on its own (because you can first inspect the context you're looking at by running the extends query in isolation). I've modified flexible query parser and added those functions as prefix-scoped "language". Looks like this: https://get.carrotsearch.com/lingo4g/1.12.0-SNAPSHOT/doc/#interval-functions And combine with the matches highlighter it really shines. It's the best when you get multiple overlapping intervals; I don't have an example won this computer (I have a day of on home duties) but I can send you one later on - you can do some really impressive stuff with intervals! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals
dweiss commented on a change in pull request #2127: URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r538162712 ## File path: lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java ## @@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException { ); } + @Test + public void testDegenerateIntervalsWithPositions() throws IOException { +testDegenerateIntervals(FLD_TEXT_POS); + } + + @Test @AwaitsFix(bugUrl = "https://issues.apache.org/jira/browse/LUCENE-9634: " + Review comment: > I sort of think that just highlighting the original term is the correct behaviour? Hmm... I don't think I agree. When you have a query parser that allows intervals then extend becomes a function just like anything else. The intuitive user expectation for a query extend(foo 2 2) is to actually highlight the matching interval of positions (well, users think of "words") pointed to by that interval. This is particularly important if you're building more complex expressions out of these (left/ right/ extend, etc.) and you wish to see partial fragments as you're building more focused expressions. I'm not saying this has to be fixed (neither do I know how it should) but it's real feedback from people who use those queries intensively (and my gut feeling agrees). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2127: LUCENE-9633: Improve match highlighter behavior for degenerate intervals
dweiss commented on a change in pull request #2127: URL: https://github.com/apache/lucene-solr/pull/2127#discussion_r537669124 ## File path: lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchRegionRetriever.java ## @@ -361,6 +374,41 @@ public void testIntervalQueries() throws IOException { ); } + @Test + public void testDegenerateIntervalsWithPositions() throws IOException { +testDegenerateIntervals(FLD_TEXT_POS); + } + + @Test @AwaitsFix(bugUrl = "https://issues.apache.org/jira/browse/LUCENE-9634: " + Review comment: @romseygeek Alan - this may be interesting for you. Is this how it's supposed to work with fields with offsets? If you enable this test and compare how positions only vs. positions+offsets field behaves you'll see what I mean - the stored offsets actually cause an incorrect highlight in this case. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org