Hi,

On Wed, Aug 29, 2012 at 6:02 PM, chraj007 <[email protected]> wrote:
> http://lucene.472066.n3.nabble.com/file/n4004078/test.html test.html

Looks like that file has an incorrect http-equiv declaration:
<META http-equiv="Content-Type" content="text/html; charset=utf-16">
The encoding of the file is not UTF-16.
Can you file a TIKA issue about this? Tika should be able to
automatically detect the correct encoding and use it if the declared
one is obviously incorrect.
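
For illustration, here is a rough sketch of one way a declared charset could be sanity-checked. This is not Tika's actual detection code; the class and method names (MetaCharsetCheck, declaredCharset, effectiveCharset) are hypothetical, and the UTF-8 fallback is an assumption borrowed from how HTML5 parsers treat a UTF-16 declaration found in a meta tag. The idea: if the declaration can be read by scanning the raw bytes as single-byte ASCII, the file cannot really be UTF-16.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika.
public class MetaCharsetCheck {
    // Matches charset=... inside a meta declaration (optionally quoted).
    private static final Pattern META_CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([A-Za-z0-9_\\-]+)", Pattern.CASE_INSENSITIVE);

    // Reads the first 1024 bytes as single-byte text and returns the
    // declared charset name, or null if no declaration is found.
    public static String declaredCharset(byte[] data) {
        int len = Math.min(data.length, 1024);
        String head = new String(data, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    // Returns the charset to decode with. Assumption: fall back to UTF-8
    // when the declaration is missing, unknown, or obviously wrong.
    // BOM handling and statistical detection are deliberately omitted.
    public static Charset effectiveCharset(byte[] data) {
        String declared = declaredCharset(data);
        if (declared == null) {
            return StandardCharsets.UTF_8;
        }
        // The declaration was readable as ASCII bytes, so a UTF-16
        // claim contradicts the bytes it was found in.
        if (declared.toUpperCase(Locale.ROOT).startsWith("UTF-16")) {
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(declared);
        } catch (IllegalArgumentException e) {
            // Illegal or unsupported charset name.
            return StandardCharsets.UTF_8;
        }
    }
}
```

With the META line quoted above, declaredCharset would return "utf-16" and effectiveCharset would override it with UTF-8. A real detector would also need to look at the byte content itself before settling on a replacement encoding.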
BR,
Jukka Zitting
