[ https://issues.apache.org/jira/browse/SOLR-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dawid Weiss updated SOLR-882:
-----------------------------

    Attachment: patch

- Hex. entity handling improved (more issues with proper padding when entities were not terminated with a ';')
- Added recognition of all-uppercase entities (exceptions).

> HTMLStripReader improvement - padding corrected for hexadecimal entities, option not to emit padding at all added
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-882
>                 URL: https://issues.apache.org/jira/browse/SOLR-882
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Dawid Weiss
>            Priority: Trivial
>         Attachments: patch
>
>
> Improvements to HTMLStripHighlighter:
> - fix padding of hexadecimal entities (currently off by 1)
> - add an option not to emit padding at all. In certain applications padding emitted after entities such as ó may split words that are in fact single terms.
> - add entities that are recognized when written all in uppercase and recognized by browsers.
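
For readers unfamiliar with the padding discussed in this issue, here is a minimal sketch of the idea. It is not the attached patch and not the HTMLStripReader code. The class name EntityPaddingSketch, the method decode and the flag emitPadding are hypothetical names chosen for illustration. The sketch decodes decimal and hexadecimal character references, with or without a terminating ';', and optionally emits spaces after the decoded character so that the output keeps the length of the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: hypothetical class, not part of Solr.
public class EntityPaddingSketch {
    // Decimal (&#243;) or hexadecimal (&#xF3;) reference; the ';' is optional.
    private static final Pattern NUMERIC_ENTITY =
        Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));?");

    static String decode(String input, boolean emitPadding) {
        Matcher m = NUMERIC_ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            int codePoint = (m.group(1) != null)
                ? Integer.parseInt(m.group(1), 16)
                : Integer.parseInt(m.group(2), 10);
            if (Character.isValidCodePoint(codePoint)) {
                String decoded = new String(Character.toChars(codePoint));
                out.append(decoded);
                if (emitPadding) {
                    // Pad with the number of characters the entity occupied,
                    // minus what was emitted, so later offsets still line up.
                    int pad = (m.end() - m.start()) - decoded.length();
                    for (int i = 0; i < pad; i++) {
                        out.append(' ');
                    }
                }
            } else {
                // Not a valid code point: copy the text through unchanged.
                out.append(m.group());
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "sp&#xF3;&#x142;ka";
        System.out.println("[" + decode(s, true) + "]");
        System.out.println("[" + decode(s, false) + "]");
    }
}
```

With padding enabled the output has the same length as the input, which keeps character offsets aligned with the original markup, but the spaces land in the middle of the word. With padding disabled the word stays in one piece, at the cost of offsets no longer matching the source.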