[ https://issues.apache.org/jira/browse/OAK-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15286204#comment-15286204 ]
Tommaso Teofili commented on OAK-4368:
--------------------------------------

the fallback would apply to:
- previous versions of the index that don't have offsets
- full text Lucene indexes whose properties are not analyzed and therefore the Lucene fields are of type {{TextField}}

> Excerpt extraction from the Lucene index should be more selective
> -----------------------------------------------------------------
>
>                 Key: OAK-4368
>                 URL: https://issues.apache.org/jira/browse/OAK-4368
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>    Affects Versions: 1.0.30, 1.2.14, 1.4.2, 1.5.2
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 1.5.3
>
>         Attachments: OAK-4368.0.patch
>
>
> The Lucene index can be used to extract _rep:excerpt_ using
> {{Highlighter}}.
> The current implementation may suffer performance issues when the result set
> of the original query contains a lot of results, each of them possibly
> containing lots of (stored) properties that get passed to the highlighter in
> order to try to extract the excerpt. This process doesn't stop as soon as
> the first excerpt is found, so the excerpt is composed using text from all
> stored properties in all results (if there's a match on the query).
> While we can accept some cost of extracting the excerpt at query time (whereas
> it was generated at excerpt retrieval time before OAK-3580, e.g. via
> _row.getValue("rep:excerpt")_), that cost should be bounded and mitigated as
> much as possible.
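
To illustrate the "stop as soon as the first excerpt is found" idea from the description, here is a minimal sketch in plain Java. It is not Oak's implementation and not the attached patch: the class {{FirstMatchExcerpt}}, the {{maxFields}} bound and the toy highlighter are all hypothetical names made up for this example, and the {{Function}} stands in for a real highlighter call.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Illustrative sketch only: hypothetical names, not Oak's code or the OAK-4368 patch.
final class FirstMatchExcerpt {

    /**
     * Returns the first excerpt found among the stored values of one result,
     * examining at most maxFields values.
     *
     * @param storedValues text of the stored properties (hypothetical input)
     * @param highlighter  stand-in for a highlighter; empty result means no match
     * @param maxFields    hypothetical upper bound on the values examined
     */
    static Optional<String> extract(List<String> storedValues,
                                    Function<String, Optional<String>> highlighter,
                                    int maxFields) {
        int examined = 0;
        for (String value : storedValues) {
            if (examined++ >= maxFields) {
                break; // keep the per-result cost bounded
            }
            Optional<String> excerpt = highlighter.apply(value);
            if (excerpt.isPresent()) {
                return excerpt; // stop at the first excerpt instead of scanning everything
            }
        }
        return Optional.empty();
    }

    /** Toy highlighter: wraps the first occurrence of term in strong tags. */
    static Function<String, Optional<String>> toyHighlighter(String term) {
        return text -> {
            int i = text.indexOf(term);
            if (i < 0) {
                return Optional.empty();
            }
            return Optional.of(text.substring(0, i)
                    + "<strong>" + term + "</strong>"
                    + text.substring(i + term.length()));
        };
    }
}
```

With the values "no match here", "an oak tree", "another oak" and the term "oak", {{extract}} returns the excerpt from the second value and never looks at the third; with {{maxFields}} set to 1 it gives up after the first value and returns an empty result.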