Illegal Charset Name crashes HTMLParser
---------------------------------------
Key: TIKA-454
URL: https://issues.apache.org/jira/browse/TIKA-454
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 0.7
Reporter: Julien Nioche
Fix For: 0.8
As reported by Andrzej [1], the HTMLParser crashes when the charset declared in
a meta tag is not a legal charset name, e.g. one containing a space:
<meta http-equiv="Content-Type" content="text/html; charset=ISO 8859-1"/>
[1]
http://mail-archives.apache.org/mod_mbox/tika-user/201006.mbox/%[email protected]%3e
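
For illustration, a minimal sketch of the failure mode and one possible guard.
It relies only on the JDK: Charset.forName() throws
IllegalCharsetNameException for a name containing a space, such as
"ISO 8859-1". The class CharsetGuard and the method charsetOrDefault are
hypothetical names made up for this example; they are not part of Tika and
this is not necessarily how the issue is fixed there.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

public class CharsetGuard {

    /**
     * Hypothetical helper (not a Tika API): resolves a charset name taken
     * from untrusted input, such as a meta tag, and returns the fallback
     * instead of propagating an exception when the name is illegal or
     * unsupported.
     */
    public static Charset charsetOrDefault(String name, Charset fallback) {
        if (name == null) {
            return fallback;
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Illegal syntax (e.g. a space in the name) or unknown charset.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Unguarded lookup: the space makes the name illegal.
        try {
            Charset.forName("ISO 8859-1");
        } catch (IllegalCharsetNameException e) {
            System.out.println("Illegal charset name: " + e.getCharsetName());
        }

        // Guarded lookup: falls back to UTF-8 instead of failing.
        Charset cs = charsetOrDefault("ISO 8859-1", StandardCharsets.UTF_8);
        System.out.println("Resolved charset: " + cs.name());
    }
}
```

The fallback charset is passed in by the caller, so the choice of default
(UTF-8 above is only an assumption) stays outside the helper.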