On Mon, 25 Jul 2016, Vjeran Marcinko wrote:
I first noticed that my .mbox file doesn't get parsed by MBoxParser,
and later, after debugging the Tika source code, I found what the problem
is - the default detector doesn't even recognize it as "application/mbox"
MIME type, and although the file extension is .mbox, it ignores this hint
because its "magic" way of detecting file type, based on some number of
initial bytes, detects it as "text/html".

Can you try with a recent Tika nightly build? There have been some tweaks around that sort of thing recently.

If a nightly build / build from Git still shows the issue, please open a bug in Jira and attach a problematic file, then we can take a look!
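Before attaching the file, it may help to check what it actually starts with, since the magic detection only looks at the leading bytes. The following is a rough sketch that does not use Tika at all. It assumes the conventional mbox layout, where the file begins with a "From " separator line. The function names, the 1024-byte window and the list of HTML markers are illustrative assumptions, not Tika's actual detection rules.

```python
# Hypothetical markers for illustration only; these are NOT Tika's magic rules.
HTML_MARKERS = (b"<html", b"<!doctype html", b"<head", b"<body")


def describe_leading_bytes(data: bytes, window: int = 1024) -> dict:
    """Summarise what a magic-based sniffer could see in the first bytes."""
    head = data[:window]
    lowered = head.lower()
    return {
        # Conventional mbox files begin with a "From " separator line.
        "starts_with_from": head.startswith(b"From "),
        # HTML-like tokens near the start could plausibly sway a sniffer.
        "html_markers": [m.decode() for m in HTML_MARKERS if m in lowered],
    }


def describe_file(path: str, window: int = 1024) -> dict:
    """Read only the first `window` bytes of a file and summarise them."""
    with open(path, "rb") as fh:
        return describe_leading_bytes(fh.read(window), window)
```

If `starts_with_from` is false, or HTML markers show up that early in the file, that would be worth mentioning in the Jira report alongside the attached file.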

Nick
