HtmlParser's content-type handling code needs to be more flexible
-----------------------------------------------------------------
Key: TIKA-350
URL: https://issues.apache.org/jira/browse/TIKA-350
Project: Tika
Issue Type: Improvement
Affects Versions: 0.6
Reporter: Ken Krugler
Priority: Minor
Some servers return a Content-Type response header with the charset parameter
in a non-standard position, ahead of the media type. For example:
    Content-Type: charset=utf-8; text/html
The HtmlParser code that extracts the charset should find the parameter
wherever it appears in the header value.
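One possible approach, sketched below, is to search the whole header value for
a charset parameter instead of expecting it after the media type. This is only
an illustration and not the actual HtmlParser code: the class name
CharsetExtractor, the method extractCharset, and the regular expression are
all assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: finds a charset parameter
// anywhere in a Content-Type header value.
final class CharsetExtractor {

    // Case-insensitive match for "charset=<value>" at any position.
    // Whitespace around '=' and a single or double quote before the
    // value are tolerated; the value ends at whitespace, ';' or a quote.
    private static final Pattern CHARSET_PATTERN = Pattern.compile(
            "(?i)\\bcharset\\s*=\\s*[\"']?([^\\s;\"']+)");

    private CharsetExtractor() {
    }

    // Returns the charset name as written in the header, or null when
    // the header is null or carries no charset parameter.
    static String extractCharset(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher matcher = CHARSET_PATTERN.matcher(contentType);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

With this sketch, CharsetExtractor.extractCharset("charset=utf-8; text/html")
and CharsetExtractor.extractCharset("text/html; charset=utf-8") both return
"utf-8". The returned name is not validated, so a caller would still need to
check that it names a supported charset.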