Ok, here it is:
https://issues.apache.org/jira/browse/TIKA-1793
-Vjeran
On 13.11.2015 13:48, Nick Burch wrote:
On Fri, 13 Nov 2015, Vjeran Marcinko wrote:
Yep, I'm using v1.11
On 13.11.2015 11:51, Nick Burch wrote:
On Fri, 13 Nov 2015, Vjeran Marcinko wrote:
I saved two .eml files from Thunderbird: one of them contained plain text
content, whereas the other one contained rich HTML content.
Did you try with the latest version of Apache Tika? IIRC we did some fixes
around this moderately recently
Nick
Hello,
I saved two .eml files from Thunderbird: one of them contained plain text
content, whereas the other one contained rich HTML content.
The plain text one got recognized by Tika as "message/rfc822" file, but
the other one incorrectly as "text/html" (and textual content being
incorrectly extr