TXT parser does not honour the specified encoding
-------------------------------------------------

                 Key: TIKA-868
                 URL: https://issues.apache.org/jira/browse/TIKA-868
             Project: Tika
          Issue Type: Bug
            Reporter: Daniel Bonniot de Ruisselet
             Fix For: 1.1


With input text "Indanyl", the encoding is recognized as IBM500, even when 
"UTF-8" is specified explicitly.

I would argue that detection should be used only when the declared encoding 
is incorrect, which saves time and avoids wrong detections, as proposed by Ken 
Krugler in TIKA-539.
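
One way to read the proposal above is "trust the declared encoding first, and 
detect only if it fails to decode". The following is a minimal sketch of that 
idea using only the JDK; it is not Tika's implementation. The class name 
DeclaredEncodingFirst, the decode method, and the detector parameter are all 
hypothetical, and the detector stands in for whatever charset detection the 
parser would otherwise run.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class DeclaredEncodingFirst {

    /**
     * Decodes data with the declared charset if it decodes cleanly;
     * otherwise falls back to the charset chosen by the detector.
     * (Hypothetical helper, not part of any Tika API.)
     */
    static String decode(byte[] data, Charset declared,
                         Function<byte[], Charset> detector) {
        if (declared != null) {
            try {
                // REPORT makes the decoder throw on invalid input instead of
                // silently substituting replacement characters.
                return declared.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(data))
                        .toString();
            } catch (CharacterCodingException e) {
                // The declared encoding does not fit the bytes: fall through
                // to detection.
            }
        }
        return new String(data, detector.apply(data));
    }

    public static void main(String[] args) {
        // Stand-in for a detector that guesses wrongly (IBM500 in this
        // report); ISO-8859-1 is used here only because every JVM has it.
        Function<byte[], Charset> detector = bytes -> StandardCharsets.ISO_8859_1;

        byte[] input = "Indanyl".getBytes(StandardCharsets.UTF_8);
        // The declared UTF-8 decodes cleanly, so the detector is never called.
        System.out.println(decode(input, StandardCharsets.UTF_8, detector));
    }
}
```

A strict decode only catches declarations that produce invalid byte sequences; 
a declared encoding that is wrong but still decodes without error would pass 
unnoticed, so this is a conservative fallback rather than a full check.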
