[ https://issues.apache.org/jira/browse/SOLR-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Seunghan Jung resolved SOLR-17474.
----------------------------------
    Resolution: Duplicate

This should have been posted to the “Lucene” project, but it was mistakenly posted to the “Solr” project.
We are working on it in the following PR: https://github.com/apache/lucene/pull/13832.

> The snippet formatting does not work as intended when PassageSort is not startOffset.
> -------------------------------------------------------------------------------------
>
>                 Key: SOLR-17474
>                 URL: https://issues.apache.org/jira/browse/SOLR-17474
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: highlighter
>    Affects Versions: main (10.0)
>            Reporter: Seunghan Jung
>            Priority: Critical
>         Attachments: image-2024-10-03-04-14-50-717.png
>
>
> Suppose the DefaultPassageFormatter.format method is given passages that are
> sorted by score rather than by startOffset, as shown here:
> !image-2024-10-03-04-14-50-717.png!
> The content is "When indexing data in Solr, each document is composed of various
> fields. A document essentially represents a single record, and each document
> typically contains a unique ID field.", so the passages are:
> * Passages[0] -> "A document essentially represents a single record, and
> each document typically contains a unique ID field."
> * Passages[1] -> "When indexing data in Solr, each document is composed of
> various fields. "
>
> The intended formatting result is:
> "A <b>document</b> essentially represents a single record, and each
> <b>document</b> typically contains a unique ID field.{{ellipsis}}When
> indexing data in Solr, each <b>document</b> is composed of various fields."
>
> However, the condition that decides whether two passages are contiguous was
> written on the assumption that passages are sorted by startOffset. As a result,
> the two passages are not separated by the ellipsis and are instead joined into
> a single snippet:
>
> "A <b>document</b> essentially represents a single record, and each
> <b>document</b> typically contains a unique ID field.When indexing data in
> Solr, each <b>document</b> is composed of various fields."
>
> We therefore fix this condition.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
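
The behaviour described in the issue can be reproduced with a small standalone sketch. It does not use Lucene's classes: `Span`, `joinAssumingSorted` and `joinAnyOrder` are hypothetical names, the `<b>` highlighting is left out, and the first method only assumes that the contiguity check has the form "start offset greater than the previous end offset". The second method shows one possible order-independent check; the actual change in the linked PR may differ.

```java
import java.util.List;

class PassageJoinSketch {

    /** Hypothetical stand-in for a highlighter passage: the slice [start, end) of the content. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Contiguity check that is only correct when spans arrive sorted by start offset. */
    static String joinAssumingSorted(String content, List<Span> spans, String ellipsis) {
        StringBuilder sb = new StringBuilder();
        int pos = 0; // end offset of the previously appended span
        for (Span s : spans) {
            // A span that starts before the previous end is wrongly treated as connected.
            if (s.start > pos && pos > 0) {
                sb.append(ellipsis);
            }
            sb.append(content, s.start, s.end);
            pos = s.end;
        }
        return sb.toString();
    }

    /** Order-independent variant: spans are connected only if one starts exactly where the previous ended. */
    static String joinAnyOrder(String content, List<Span> spans, String ellipsis) {
        StringBuilder sb = new StringBuilder();
        int prevEnd = -1; // -1 means no span has been appended yet
        for (Span s : spans) {
            if (prevEnd >= 0 && s.start != prevEnd) {
                sb.append(ellipsis);
            }
            sb.append(content, s.start, s.end);
            prevEnd = s.end;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String content = "When indexing data in Solr, each document is composed of various fields. "
            + "A document essentially represents a single record, and each document "
            + "typically contains a unique ID field.";
        int split = content.indexOf("A document essentially");
        // Score order from the issue: the second sentence comes first.
        List<Span> byScore = List.of(new Span(split, content.length()), new Span(0, split));

        System.out.println(joinAssumingSorted(content, byScore, "... "));
        System.out.println(joinAnyOrder(content, byScore, "... "));
    }
}
```

With the score-ordered input, the first method glues the two sentences together with no separator, while the second places the ellipsis between them. For input sorted by start offset, both methods return the same text.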