I can open a ticket for this but wanted to just run it by you first.
As explained here: http://www.decalage.info/rtf_tricks (no need to read if you
don't care 🙂)
Malicious RTF files take advantage of the fact that Microsoft does not follow
its own RTF spec. Specifically, Word et al. only look for the opening
sequence:
{\rt
Though the spec says it should be:
{\rtf1
Where 1 is the version number.
Tika fails to identify a malware file starting:
{\rtf1{\pict\jpegblip\picw24\pich24\bin49922
as an RTF file; it says that it is application/octet-stream.
Could the Tika detector be modified to just look for {\rt, as the Office tools do?
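
To make the ask concrete, here is a minimal Python sketch of the kind of check I mean. It is not Tika's actual detector code: the function name `sniff_rtf`, the `lenient` flag and the sample bytes below are all made up for illustration, and I am assuming a plain prefix match on the first bytes of the file.

```python
# Hypothetical sketch only; this is NOT how Tika implements detection.
STRICT_MAGIC = b"{\\rtf1"   # opening sequence the RTF spec asks for
LENIENT_MAGIC = b"{\\rt"    # shorter prefix that Word reportedly accepts


def sniff_rtf(header: bytes, lenient: bool = True) -> str:
    """Guess a media type from the first bytes of a file."""
    magic = LENIENT_MAGIC if lenient else STRICT_MAGIC
    if header.startswith(magic):
        return "application/rtf"
    # Fall back to the generic type when the prefix does not match.
    return "application/octet-stream"


if __name__ == "__main__":
    sample = b"{\\rt0{\\pict"  # made-up header that skips the full "{\rtf1"
    print(sniff_rtf(sample, lenient=False))
    print(sniff_rtf(sample, lenient=True))
```

With the strict prefix the made-up sample falls through to application/octet-stream; with the lenient prefix it is reported as application/rtf, which is the behaviour I am asking about.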
Cheers,
Jim