dsmiley commented on a change in pull request #1123: LUCENE-9093: Unified 
highlighter with word separator never gives context to the left
URL: https://github.com/apache/lucene-solr/pull/1123#discussion_r361827668
 
 

 ##########
 File path: lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/FieldHighlighter.java
 ##########
 @@ -159,8 +160,9 @@ public Object highlightFieldForDoc(LeafReader reader, int docId, String content)
           break;
         }
         // advance breakIterator
-        passage.setStartOffset(Math.max(this.breakIterator.preceding(start + 1), 0));
-        passage.setEndOffset(Math.min(this.breakIterator.following(start), contentLength));
+        passage.setStartOffset(Math.max(this.breakIterator.preceding(start + 1), lastPassageEnd));
 
 Review comment:
   Oh wait; something occurred to me.  The `breakIterator.preceding` impl doesn't 
intrinsically know that FieldHighlighter is going to call `Math.max(..., 
lastPassageEnd)` on its result.  And I recall you are adding this change here in 
FieldHighlighter because the updated LengthGoalBreakIterator might want to look 
further back to the left, into a zone that might have been part of a previous 
Passage.  Maybe `LengthGoalBreakIterator.preceding` should examine `current()` 
at the start and ensure it doesn't yield a break before that position.  Then 
FieldHighlighter wouldn't need to change.  Without this small proposal, this 
passage will be undersized, because LengthGoalBreakIterator doesn't know that 
FieldHighlighter is going to chop off some of the beginning thanks to that 
`max()`.
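
To make the idea concrete, here is a rough sketch using a plain `java.text.BreakIterator`. It is not the real LengthGoalBreakIterator code: the class `ClampedPreceding`, the method `startWithLookBack`, and the `lookBack` parameter are hypothetical names invented for illustration, and the sketch assumes `current()` still sits at the end of the previous passage when the method is called.

```java
import java.text.BreakIterator;

public class ClampedPreceding {

    /**
     * Hypothetical stand-in for a preceding() that looks further left to meet
     * a length goal, but refuses to return a break before the position the
     * iterator was at on entry.
     *
     * Assumes 0 <= matchStart < text length and lookBack >= 0.
     */
    public static int startWithLookBack(BreakIterator bi, int matchStart, int lookBack) {
        // Assumption: the caller last moved the iterator with following(...)
        // for the previous passage, so current() is that passage's end.
        int floor = bi.current();

        // Look further left than the usual preceding(matchStart + 1) would.
        int target = Math.max(matchStart + 1 - lookBack, 0);
        int candidate = bi.preceding(target); // BreakIterator.DONE (-1) if none

        // Clamp inside the iterator logic instead of in the caller.
        // Note: the iterator itself is left wherever preceding() put it.
        return Math.max(candidate, floor);
    }
}
```

Because the clamp happens where the look-back is computed, that code can see how much of its length goal was lost on the left and could, in principle, make up for it elsewhere. With the `max()` in FieldHighlighter, that information never reaches it.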

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
