[ https://issues.apache.org/jira/browse/TIKA-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-190.
--------------------------------
Resolution: Fixed
Fix Version/s: 0.3
Assignee: Jukka Zitting
Patch applied in revision 737578. Thanks!
> wrong handling of ignorableWhitespace/characters in SafeContentHandler and
> WriteoutContentHandler
> -------------------------------------------------------------------------------------------------
>
> Key: TIKA-190
> URL: https://issues.apache.org/jira/browse/TIKA-190
> Project: Tika
> Issue Type: Bug
> Affects Versions: 0.3
> Reporter: Uwe Schindler
> Assignee: Jukka Zitting
> Fix For: 0.3
>
> Attachments: TIKA-190.patch
>
>
> While investigating TIKA-189, I found the following:
> The TIKA-188 patch does everything correctly (judging by the output), but
> the internal handling is incorrect. XHTMLContentHandler inserts the tabs
> and newlines as ignorableWhitespace, but its superclass SafeContentHandler
> has a bug: it forwards ignorableWhitespace() to the decorator's characters()
> event (a copy-and-paste error). Once this is fixed, the tests fail, because
> WriteoutContentHandler has no ignorableWhitespace() and so removes all
> whitespace.
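
The interaction described above can be reproduced with plain SAX classes from the JDK. The following is only a minimal sketch, not the Tika code or the attached patch: the class names (WhitespaceForwardingSketch, Decorator, CharactersOnlySink, WhitespaceAwareSink) and the "buggy" flag are hypothetical stand-ins for SafeContentHandler and WriteoutContentHandler. It relies on the fact that org.xml.sax.helpers.DefaultHandler implements ignorableWhitespace() as a no-op.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceForwardingSketch {

    /** Hypothetical sink that only overrides characters(); the inherited
     *  DefaultHandler.ignorableWhitespace() does nothing, so such events are lost. */
    static class CharactersOnlySink extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    /** Hypothetical sink that also writes out ignorable whitespace. */
    static class WhitespaceAwareSink extends CharactersOnlySink {
        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    /** Hypothetical decorator; with buggy == true it reproduces the
     *  copy-and-paste error by routing ignorableWhitespace() to characters(). */
    static class Decorator extends DefaultHandler {
        private final ContentHandler delegate;
        private final boolean buggy;

        Decorator(ContentHandler delegate, boolean buggy) {
            this.delegate = delegate;
            this.buggy = buggy;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            delegate.characters(ch, start, length);
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
            if (buggy) {
                delegate.characters(ch, start, length);          // wrong event type
            } else {
                delegate.ignorableWhitespace(ch, start, length); // correct forwarding
            }
        }
    }

    /** Sends "a", an ignorable newline, then "b" through the chain and returns the sink's text. */
    public static String run(boolean buggyDecorator, boolean whitespaceAwareSink) {
        CharactersOnlySink sink =
                whitespaceAwareSink ? new WhitespaceAwareSink() : new CharactersOnlySink();
        ContentHandler handler = new Decorator(sink, buggyDecorator);
        char[] a = {'a'}, newline = {'\n'}, b = {'b'};
        try {
            handler.characters(a, 0, 1);
            handler.ignorableWhitespace(newline, 0, 1);
            handler.characters(b, 0, 1);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sink.out.toString();
    }

    public static void main(String[] args) {
        // Buggy decorator hides the missing override: prints a\nb
        System.out.println(run(true, false).replace("\n", "\\n"));
        // Decorator fixed, sink unchanged: whitespace is dropped, prints ab
        System.out.println(run(false, false).replace("\n", "\\n"));
        // Decorator fixed and sink handles ignorableWhitespace(): prints a\nb
        System.out.println(run(false, true).replace("\n", "\\n"));
    }
}
```

The three runs mirror the report: the output looks right while the bug is present, fixing only the decorator loses the whitespace, and the output is restored once the writing handler also handles ignorableWhitespace().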