[GitHub] [lucene-solr] Traktormaster commented on issue #1123: LUCENE-9093: Unified highlighter with word separator never gives context to the left

2019-12-28 Thread GitBox
Traktormaster commented on issue #1123: LUCENE-9093: Unified highlighter with 
word separator never gives context to the left
URL: https://github.com/apache/lucene-solr/pull/1123#issuecomment-569408594
 
 
   Basically the whole patch has been reworked, here are the highlights:
   - About half as many calls to `preceding()` and `following()` on the wrapped 
BI.
   - `fragsize` is about the length of the contextual text around the match. 
(it's relation to the snippet size is indirect)
   - `fragalign` has been renamed to `fragAlignRatio`. Changed it's default 
value to `0.5`.
   - Made the hidden `closestTo/targetLen` mode the default instead of 
`minimumLen`. This adds an automatic slop-like behaviour.
   - The new parameter `fragsizeIsMinimum` can be used to switch back to the 
previously used mode. (minimum mode is a little faster)
   - Changed the call of `following(start)` in `FieldHighlighter` to 
`following(end - 1)` for performance and correctness.
   - Tests and docs have been updated.
   
   With these defaults the UH will give more meaingful results as-is. To have 
even more similar results to the older highlighters the `hl.bs.type` can be 
switched to `WORD`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] Traktormaster commented on issue #1123: LUCENE-9093: Unified highlighter with word separator never gives context to the left

2019-12-29 Thread GitBox
Traktormaster commented on issue #1123: LUCENE-9093: Unified highlighter with 
word separator never gives context to the left
URL: https://github.com/apache/lucene-solr/pull/1123#issuecomment-569492072
 
 
   Most of your review points this round are about the correctness of the 
BreakIterator API/class impl. Yet the comment where the overlap handling is 
discussed, you would move that logic inside the LGBI which would conflict with 
the original intention of what the `preceding()` method should implement. (it's 
not supposed to only work after the current index)
   
   So far I've not made an attempt to make the behavior of this class 
"consistent" as it is very unclear to what I would want it to be consistent to. 
The exclusively one place it is ever used would not benefit from more 
consistency. It is explicitly stated in the class docstring: `Important: This 
is not a general purpose {@link BreakIterator}; it's only designed to work in a 
way compatible with the {@link UnifiedHighlighter}.  Some assumptions are 
checked with Java assertions.` Some methods are straight up unimplemented with 
an `assert false`.
   
   Basically I can make it behave more consistently, but there's no benefit 
right now. That's why I haven't done it so already. I guess you're thinking 
that in the future when someone will work on this again, it would be better to 
have a little more well rounded implementation. I can respect that.
   
   However in light of what we're going for now, I do not agree to move the 
`Math.max(..., lastPassageEnd)` functionality into the LGBI. (It would not have 
a benefit either way IMO)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] Traktormaster commented on issue #1123: LUCENE-9093: Unified highlighter with word separator never gives context to the left

2019-12-29 Thread GitBox
Traktormaster commented on issue #1123: LUCENE-9093: Unified highlighter with 
word separator never gives context to the left
URL: https://github.com/apache/lucene-solr/pull/1123#issuecomment-569509670
 
 
   I took your advice and made the LGBI "cache" the current index to avoid 
having to "fix" it on the underlying BI. I've also reevaluated the handling of 
DONE results.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org