Traktormaster commented on a change in pull request #1123: LUCENE-9093: Unified 
highlighter with word separator never gives context to the left
URL: https://github.com/apache/lucene-solr/pull/1123#discussion_r361839121
 
 

 ##########
 File path: lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/FieldHighlighter.java
 ##########
 @@ -159,8 +160,9 @@ public Object highlightFieldForDoc(LeafReader reader, int docId, String content)
           break;
         }
         // advance breakIterator
-        passage.setStartOffset(Math.max(this.breakIterator.preceding(start + 1), 0));
-        passage.setEndOffset(Math.min(this.breakIterator.following(start), contentLength));
+        passage.setStartOffset(Math.max(this.breakIterator.preceding(start + 1), lastPassageEnd));
 
 Review comment:
   > Without this small proposal, the length of this passage will be undersized
   
   That's incorrect. Such a fragment will be undersized either way. The current 
approach splits `fragsize` statically by `fragAlignRatio`. Even if the left 
side cannot use its full share, the unused part is not carried over to the 
right. We would only be moving the point where the left part of `fragsize` is 
truncated.
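
To illustrate the static split, here is a minimal sketch in plain character offsets. It is not the Lucene implementation: the class `FragmentBudgetSketch`, the method `window`, and the parameter `leftLimit` are hypothetical names, break-iterator boundaries are ignored, and it assumes `fragsize` is counted around the match rather than including it.

```java
// Hypothetical sketch, not Lucene code: shows that a clamped left side
// does not hand its unused budget over to the right side.
class FragmentBudgetSketch {

    /** Returns {start, end} of the passage around a match, as character offsets. */
    static int[] window(int matchStart, int matchEnd, int fragsize,
                        double fragAlignRatio, int leftLimit, int contentLength) {
        // The budget is split once, up front, by the alignment ratio.
        int leftBudget = (int) (fragsize * fragAlignRatio);
        int rightBudget = fragsize - leftBudget;

        // The left edge is clamped to leftLimit (0, or the end of the previous passage).
        int start = Math.max(matchStart - leftBudget, leftLimit);
        // The right edge uses only its own share, whatever happened on the left.
        int end = Math.min(matchEnd + rightBudget, contentLength);
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        int[] clampedAtZero = window(120, 125, 100, 0.5, 0, 1000);
        int[] clampedAtPrev = window(120, 125, 100, 0.5, 100, 1000);
        System.out.println(clampedAtZero[0] + ".." + clampedAtZero[1]); // 70..175
        System.out.println(clampedAtPrev[0] + ".." + clampedAtPrev[1]); // 100..175
    }
}
```

In this sketch, changing the clamp from `0` to the end of the previous passage only moves the start offset. The end offset is the same in both calls.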
   
   BTW, in the results these would-be-overlapping fragments get merged into a 
single snippet, so they become one bigger snippet instead of one normal and one 
undersized snippet. The only place we are sure to get an undersized snippet is 
when a match is at the very beginning or the very end of the text.
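
The merging described above can be pictured as a generic interval merge. This is only a sketch under assumptions: `SnippetMergeSketch` and `merge` are hypothetical names, passages are `{start, end}` offset pairs already sorted by start, and passages that merely touch are treated as mergeable. The actual merging in the highlighter may work differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene code: merges overlapping or touching passages.
class SnippetMergeSketch {

    /** Merges {start, end} passages that overlap or touch; input must be sorted by start. */
    static List<int[]> merge(List<int[]> passages) {
        List<int[]> merged = new ArrayList<>();
        for (int[] p : passages) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && p[0] <= last[1]) {
                // Overlapping or adjacent: extend the previous snippet.
                last[1] = Math.max(last[1], p[1]);
            } else {
                // Gap before this passage: start a new snippet (copy to avoid aliasing).
                merged.add(new int[] {p[0], p[1]});
            }
        }
        return merged;
    }
}
```

For example, passages `{50, 100}`, `{100, 175}` and `{400, 500}` would come out as two snippets, `{50, 175}` and `{400, 500}`.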

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
