[ https://issues.apache.org/jira/browse/TIKA-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-319.
--------------------------------
Resolution: Fixed
Fix Version/s: 0.5
Assignee: Jukka Zitting
Good point! Fixed as suggested in revision 835726.
> HtmlParser - use encoding hint only if charset is supported
> -----------------------------------------------------------
>
> Key: TIKA-319
> URL: https://issues.apache.org/jira/browse/TIKA-319
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 0.4
> Reporter: Piotr B.
> Assignee: Jukka Zitting
> Fix For: 0.5
>
>
> The encoding hint should be applied only if that encoding is supported.
> Diff of my fix:
> --- HtmlParser.java (revision 835302)
> +++ HtmlParser.java (working copy)
> @@ -46,7 +46,7 @@
> // Prepare the input source using the encoding hint if available
> InputSource source = new InputSource(stream);
> String encoding = metadata.get(Metadata.CONTENT_ENCODING);
> - if (encoding != null) {
> + if (encoding != null && Charset.isSupported(encoding)) {
> source.setEncoding(encoding);
> }
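
The guard in the diff above can be sketched as a standalone helper. This is only an illustration, not the code committed to Tika: the class name `EncodingHint` and the method `isUsableCharset` are hypothetical. One detail worth noting is that `Charset.isSupported` throws `IllegalCharsetNameException` when the name is syntactically illegal (and `IllegalArgumentException` when it is null), so the sketch treats both cases as "not supported" rather than letting the exception propagate.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of Tika: decides whether an encoding hint is safe to use.
class EncodingHint {

    // Returns true only if the name is non-null, syntactically legal,
    // and names a charset that this JVM supports.
    static boolean isUsableCharset(String name) {
        if (name == null) {
            return false; // no hint available
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false; // malformed name, e.g. one containing a space
        }
    }

    public static void main(String[] args) {
        // Assumed sample hints, chosen for illustration only.
        String[] hints = { "UTF-8", "x-no-such-charset", "utf 8", null };
        for (String hint : hints) {
            System.out.println(hint + " -> " + isUsableCharset(hint));
        }
    }
}
```

A caller would then write `if (isUsableCharset(encoding)) { source.setEncoding(encoding); }`, leaving the `InputSource` without an explicit encoding when the hint is unusable.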
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.