I hacked up the test a bit so it would compile against 9.0 and confirmed
the problem existed there as well.
Going back a little farther, with some manual bisection (to account for
the transition from ant to gradle), led me to the following...
# first bad commit: [2719cf6630eb2bd7cb37d0e8462dc912d8fafd83]
LUCENE-9431: UnifiedHighlighter WEIGHT_MATCHES is now true by default
(#362)
...my impression here is that this probably must have existed for a
while somewhere in a 'WEIGHT_MATCHES' code path, and this commit just
exposed the problem "by default".
That impression seemed to be confirmed by tweaking my test patch (against
2719cf6630eb2bd7cb37d0e8462dc912d8fafd83) to use...
    UnifiedHighlighter highlighter = new UnifiedHighlighter(searcher, indexAnalyzer) {
      @Override
      protected Set<HighlightFlag> getFlags(String field) {
        final Set<HighlightFlag> x = new java.util.HashSet<>(super.getFlags(field));
        x.remove(HighlightFlag.WEIGHT_MATCHES);
        return x;
      }
    };
...and the tests started to pass.
Again, I don't really understand this code, but: knowing that the problem
happens when TermVectorOffsetStrategy is used means that usages of
WEIGHT_MATCHES in getOffsetStrategy's ANALYSIS codepath probably aren't
relevant -- which leads me to assume the source of the problem is
probably FieldOffsetStrategy.createOffsetsEnumsWeightMatcher ?
But this brings me back to not really understanding what code is "at
fault" here ? ... The existence of WEIGHT_MATCHES and the design of
FieldOffsetStrategy.createOffsetsEnumsWeightMatcher to return an
OffsetsEnum ordered by the "weighted" matches implies that it's
expected/allowed for the offsets in Passages to be out of (ordinal) order
... so does that mean DefaultPassageFormatter is broken for not
expecting this?
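
To make that question concrete, here is a minimal standalone sketch. It is
NOT Lucene code: OffsetOrderSketch, Match, formatAssumingSorted and
formatSortingFirst are all hypothetical names made up for illustration, and
nothing here claims to reflect how DefaultPassageFormatter is actually
implemented. It only shows how a formatter that assumes ascending start
offsets can fail when handed matches in some other order, and that sorting
first avoids that particular failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class OffsetOrderSketch {

    /** Hypothetical stand-in for one match: [start, end) offsets into the content. */
    public static final class Match {
        public final int start;
        public final int end;

        public Match(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Naive formatter that assumes matches arrive sorted by start offset. */
    public static String formatAssumingSorted(String content, List<Match> matches) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        for (Match m : matches) {
            // If m.start < pos (an out-of-order match), this append is given
            // start > end and throws IndexOutOfBoundsException.
            sb.append(content, pos, m.start);
            sb.append("<b>").append(content, m.start, m.end).append("</b>");
            pos = m.end;
        }
        sb.append(content, pos, content.length());
        return sb.toString();
    }

    /** Same formatter, but it sorts a copy of the matches by start offset first. */
    public static String formatSortingFirst(String content, List<Match> matches) {
        List<Match> sorted = new ArrayList<>(matches);
        sorted.sort(Comparator.comparingInt((Match m) -> m.start));
        return formatAssumingSorted(content, sorted);
    }

    public static void main(String[] args) {
        String content = "alpha beta gamma";
        // "beta" is listed before "alpha", i.e. not in ordinal offset order.
        List<Match> outOfOrder = Arrays.asList(new Match(6, 10), new Match(0, 5));

        try {
            System.out.println(formatAssumingSorted(content, outOfOrder));
        } catch (IndexOutOfBoundsException e) {
            System.out.println("naive formatter failed: " + e);
        }
        System.out.println(formatSortingFirst(content, outOfOrder));
    }
}
```

Whether the right fix is for the producer to guarantee offset order or for
the formatter to tolerate any order is exactly the open question above; the
sketch takes no position on it.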
-Hoss
http://www.lucidworks.com/