[
https://issues.apache.org/jira/browse/SOLR-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628775#comment-13628775
]
Holger Floerke commented on SOLR-4686:
--------------------------------------
"""
Have you seen the XmlCharFilter on SOLR-2597 ?
"""
No, but that is a two-year-old bug report that never reached a release...
You are right about phrase highlighting. I didn't think about that; it is a
case where HTMLStripCharFilter (or XmlCharFilter) does not stand a chance.
Given the high volume of unresolved bugs for Solr, I would suggest closing
this bug as "won't change". I will reopen it if I have a good idea for
this issue.
> HTMLStripCharFilter and Highlighter generates invalid HTML
> ----------------------------------------------------------
>
> Key: SOLR-4686
> URL: https://issues.apache.org/jira/browse/SOLR-4686
> Project: Solr
> Issue Type: Bug
> Components: highlighter
> Affects Versions: 4.1
> Reporter: Holger Floerke
> Labels: HTML, highlighter
>
> Using the HTMLStripCharFilter may yield an invalid HTML highlight.
> The HTMLStripCharFilter treats inline elements (e.g. "a", "b", ...)
> specially. For these elements the CharFilter ignores the tag and does not
> insert any split character.
> If you index
> """
> <a>xxx</a>
> """
> you get the word "xxx" starting at position 3 and ending at position 10(!).
> If you highlight a search on "xxx", you will get
> """
> <a><em>xxx</a></em>
> """
> which is invalid HTML.
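
The effect described above can be reproduced with plain string slicing. The
following is only a minimal sketch, not Solr's highlighter: the helper
`highlight` is a hypothetical name, and it assumes the highlighter simply
inserts its markers at the token's start and end offsets in the raw stored
text. The offsets 3 and 10 come from the report; 6 is the assumed end offset
if the closing tag were not included in the token.

```python
def highlight(raw: str, start: int, end: int,
              pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap raw[start:end] in pre/post markers (hypothetical helper,
    standing in for an offset-based highlighter)."""
    return raw[:start] + pre + raw[start:end] + post + raw[end:]


raw = "<a>xxx</a>"

# Offsets as reported in the issue: the token spans 3..10, so the
# closing </a> falls inside the highlighted region.
reported = highlight(raw, 3, 10)

# Assumed "tight" offsets covering only the visible text "xxx".
tight = highlight(raw, 3, 6)

print(reported)  # <a><em>xxx</a></em>
print(tight)     # <a><em>xxx</em></a>
```

With the reported offsets the closing tags are interleaved, which is the
invalid markup shown above. With the tight offsets the markers nest properly.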