[ 
https://issues.apache.org/jira/browse/OAK-7071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dirk Rudolph updated OAK-7071:
------------------------------
    Description: 
*PostingsHighlighter* returns, for example, 
{quote} 
[my text with any <b>highlighting</b> followed by more text]
{quote}
because the PostingsHighlighter itself returns, for each field, a {{String[]}} of 
phrases limited to the maximum number of phrases given beforehand. This 
{{String[]}} is then transformed to a String using {{Arrays.toString()}} at 
[LucenePropertyIndex.java#L688|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LucenePropertyIndex.java#L688]
 causing the value to be wrapped in square brackets.
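
The bracket wrapping can be reproduced outside Oak with plain JDK calls. The following is a minimal sketch, not the code in LucenePropertyIndex: the class name, the method names and the " ... " separator are assumptions made for illustration, and {{String.join}} is shown only as one possible way to avoid the brackets.

```java
import java.util.Arrays;

// Hypothetical demo class; none of these names exist in Oak.
final class ExcerptJoinSketch {

    // Reproduces the reported behaviour: Arrays.toString() wraps the
    // elements in "[" and "]" and separates them with ", ".
    static String viaArraysToString(String[] passages) {
        return Arrays.toString(passages);
    }

    // One possible alternative (an assumption, not the actual fix):
    // join the passages with an explicit separator and no brackets.
    static String viaJoin(String[] passages) {
        return String.join(" ... ", passages);
    }

    public static void main(String[] args) {
        String[] passages = {
            "my text with any <b>highlighting</b> followed by more text"
        };
        System.out.println(viaArraysToString(passages));
        System.out.println(viaJoin(passages));
    }
}
```

With a single passage, the first call prints the text wrapped in square brackets and the second prints it unchanged.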

*Highlighter* returns 
{quote}
my text with any <strong>highlighting</strong> followed by more text 
{quote}

*SimpleExcerptProvider* returns
{quote}
<div><span>my text with any <strong>highlighting</strong> followed by more 
text</span></div>
{quote}

As the PostingsHighlighter cannot be given a custom prefix or suffix, I would 
suggest setting <b></b> as the default for the others as well, to avoid any 
further text transformation after extracting the excerpts.
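
To illustrate the kind of post-processing a consumer needs while the three formats differ, here is a rough sketch under stated assumptions. The class and method names are hypothetical, and it handles only the three example outputs quoted in this description, not excerpts in general.

```java
// Hypothetical helper; not part of Oak.
final class ExcerptNormalizerSketch {

    private static final String OPEN = "<div><span>";
    private static final String CLOSE = "</span></div>";

    // Maps the three example formats onto the plain <b>...</b> form.
    static String normalize(String excerpt) {
        String s = excerpt.trim();
        // Brackets added by Arrays.toString(). Caveat: this would also
        // strip brackets that genuinely belong to the indexed text.
        if (s.startsWith("[") && s.endsWith("]")) {
            s = s.substring(1, s.length() - 1);
        }
        // Wrapper seen in the SimpleExcerptProvider example.
        if (s.startsWith(OPEN) && s.endsWith(CLOSE)) {
            s = s.substring(OPEN.length(), s.length() - CLOSE.length());
        }
        // Marker seen in the Highlighter and SimpleExcerptProvider examples.
        return s.replace("<strong>", "<b>").replace("</strong>", "</b>");
    }
}
```

A shared <b></b> default would make a helper like this unnecessary, which is the point of the suggestion above.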


  was:
*PostingsHighlighter* returns, for example, 
{quote} 
[my text with any <b>highlighting</b> followed by more text]
{quote}
because the PostingsHighlighter itself returns, for each field, a {{String[]}} of 
phrases limited to the maximum number of phrases given beforehand. This 
{{String[]}} is then transformed to a String using {{Arrays.toString()}} at 
[LucenePropertyIndex.java#L688|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LucenePropertyIndex.java#L688]
 causing the value to be wrapped in square brackets.

*Highlighter* returns 
{quote}
my text with any <strong>highlighting</strong> followed by more text 
{quote}

*SimpleExcerptProvider* returns
{quote}
my text with any <div><span>highlighting</span></div> followed by more text 
{quote}

As the PostingsHighlighter cannot be given a custom prefix or suffix, I would 
suggest setting <b></b> as the default for the others as well, to avoid any 
further text transformation after extracting the excerpts.



> PostingsHighlighter, Highlighter and SimpleExcerptProvider return all 
> different formats for excerpts
> ----------------------------------------------------------------------------------------------------
>
>                 Key: OAK-7071
>                 URL: https://issues.apache.org/jira/browse/OAK-7071
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>    Affects Versions: 1.6.7, 1.8
>            Reporter: Dirk Rudolph
>              Labels: excerpt
>
> *PostingsHighlighter* returns, for example, 
> {quote} 
> [my text with any <b>highlighting</b> followed by more text]
> {quote}
> because the PostingsHighlighter itself returns, for each field, a {{String[]}} 
> of phrases limited to the maximum number of phrases given beforehand. This 
> {{String[]}} is then transformed to a String using {{Arrays.toString()}} at 
> [LucenePropertyIndex.java#L688|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LucenePropertyIndex.java#L688]
>  causing the value to be wrapped in square brackets.
> *Highlighter* returns 
> {quote}
> my text with any <strong>highlighting</strong> followed by more text 
> {quote}
> *SimpleExcerptProvider* returns
> {quote}
> <div><span>my text with any <strong>highlighting</strong> followed by more 
> text</span></div>
> {quote}
> As the PostingsHighlighter cannot be given a custom prefix or suffix, I would 
> suggest setting <b></b> as the default for the others as well, to avoid any 
> further text transformation after extracting the excerpts.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
