RE: Outlook For Mac (OLM) Parser?

2017-08-10 Thread Allison, Timothy B.
Please open a ticket on our JIRA and share an example file. We'll want to update our package detector to handle this format. As for parsing, XML is doable, and I'd be happy to try my hand at it...if we can find enough examples... Please no protobufs, please no protobufs... :)

Outlook For Mac (OLM) Parser?

2017-08-10 Thread Tucker Barbour
I have recently encountered a case where I need to parse an Outlook For Mac email archive (OLM). I have not found an officially published specification for the file format but after a bit of inspection it appears to be similar to the OOXML format. It's a ZIP file containing emails in an XML for

Re: Tika content detection and crawled "remote" content

2017-08-10 Thread Sebastian Nagel
Hi, a follow up based on Tika 1.16 for the July crawl:

  #        Tika-1.16          HTTP-Content-Type
  4580525  text/x-php         text/html
  842698   text/x-coldfusion  text/html
  579128   text/asp           text/html
  510323   text/aspdotn