Yes, please do open a ticket. And yes, I do need to read anything from decalage… he does some amazing work. 😊
I trust you wouldn’t, but please don’t post an actual malware file for us to use in our unit tests. 😉

From: Jim Idle [mailto:[email protected]]
Sent: Thursday, March 1, 2018 12:32 AM
To: [email protected]
Subject: Malware RTF is not detected as RTF

I can open a ticket for this but wanted to just run it by you first.

As explained here:

http://www.decalage.info/rtf_tricks

(no need to read if you don’t care 😉)

Malicious RTF files take advantage of the fact that Microsoft do not follow their own RTF spec. Specifically, Word et al only looks for the opening sequence:

{rt

Though the spec says it should be:

{rtf1

Where 1 is the version number.

Tika fails to identify a malware file starting:

{\rtf1{\pict\jpegblip\picw24\pich24\bin49922

as an RTF file; it says that it is application/octet-stream.

Could the Tika detector be modified to just look for {rt as per Office tools?

Cheers,

Jim
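
For the ticket, the difference between the strict and the lenient check could be sketched roughly as below. This is only an illustration in Python, not Tika's actual detector code. The function name `sniff_rtf` and both constants are hypothetical. It also assumes the sequences are written with the backslash that starts an RTF control word (`{\rt`, `{\rtf1`), whereas the message above writes them as `{rt` and `{rtf1`.

```python
# Hypothetical sketch of prefix-based RTF sniffing; not Tika's implementation.
# Assumption: the magic sequences include the RTF control-word backslash.
STRICT_MAGIC = b"{\\rtf1"   # opening sequence the spec calls for (version 1)
LENIENT_MAGIC = b"{\\rt"    # shorter prefix that Word is reported to accept


def sniff_rtf(header: bytes, lenient: bool = True) -> str:
    """Return a media type guess based only on the leading bytes."""
    magic = LENIENT_MAGIC if lenient else STRICT_MAGIC
    if header.startswith(magic):
        return "application/rtf"
    # Fall back to the generic type when the prefix does not match.
    return "application/octet-stream"
```

With the lenient prefix, a header such as `{\rt{\pict...` would be reported as RTF, while the strict check would fall back to application/octet-stream. The unit tests can use harmless synthetic headers like these rather than real samples.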
