scampi opened a new pull request, #13315: URL: https://github.com/apache/lucene/pull/13315
- **test: add unit tests to reproduce the IndexOutOfBoundsException in DefaultPassageFormatter**
- **doc: clarify javadoc of MatchesIterator**
- **fix: sort passages by offset if positions are missing**

Fix #12431

### Description

`DefaultPassageFormatter` may throw an `IndexOutOfBoundsException` when the matches of a passage are out of order. This happens when term vectors are stored but positions are not.

The tentative solution is to order matches by offset in `DisjunctionMatchesIterator` when positions don't exist.

However, I wonder whether it is correct to create such an iterator at all when positions are not available. Matches with no terms are explicitly removed, and the javadoc of `MATCH_WITH_NO_TERMS` says that it indicates a match with _no term positions_. Going by that doc, it shouldn't be possible to highlight text without positions. Is that javadoc correct? Should it be updated?

https://github.com/apache/lucene/blob/bc678ac67e32c55a27a4e8950c25144cc89cef66/lucene/core/src/java/org/apache/lucene/search/MatchesUtils.java#L67

https://github.com/apache/lucene/blob/bc678ac67e32c55a27a4e8950c25144cc89cef66/lucene/core/src/java/org/apache/lucene/search/MatchesUtils.java#L40-L44
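
To illustrate the failure mode and the idea behind the fix, here is a minimal, self-contained sketch. It does not use Lucene's classes: `OffsetOrderSketch`, `Match`, `sortByOffset`, and `highlight` are hypothetical names, and the formatter is a simplified stand-in, not the actual `DefaultPassageFormatter` code. It assumes only that a formatter walks the matches in order and copies the text between them, which fails once a match starts before the previous one ended.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class OffsetOrderSketch {

  /** Hypothetical stand-in for one match: character offsets [start, end) into the content. */
  static final class Match {
    final int start;
    final int end;

    Match(int start, int end) {
      this.start = start;
      this.end = end;
    }
  }

  /** Returns a copy of the matches ordered by start offset, then by end offset. */
  static List<Match> sortByOffset(List<Match> matches) {
    List<Match> sorted = new ArrayList<>(matches);
    sorted.sort(Comparator.comparingInt((Match m) -> m.start).thenComparingInt(m -> m.end));
    return sorted;
  }

  /** Simplified formatter that assumes non-overlapping matches in ascending offset order. */
  static String highlight(String content, List<Match> matches) {
    StringBuilder sb = new StringBuilder();
    int pos = 0;
    for (Match m : matches) {
      // Copy the text between the previous match and this one.
      // StringBuilder.append throws IndexOutOfBoundsException if m.start < pos.
      sb.append(content, pos, m.start);
      sb.append("<b>").append(content, m.start, m.end).append("</b>");
      pos = m.end;
    }
    sb.append(content, pos, content.length());
    return sb.toString();
  }

  public static void main(String[] args) {
    String content = "foo bar baz";
    // Matches arrive out of offset order, as they can when positions are missing.
    List<Match> unordered = List.of(new Match(8, 11), new Match(0, 3));
    System.out.println(highlight(content, sortByOffset(unordered)));
    // prints: <b>foo</b> bar <b>baz</b>
  }
}
```

Calling `highlight(content, unordered)` without sorting first throws an `IndexOutOfBoundsException` in this sketch, because the second match starts before the position reached after the first one.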