On Fri, 29 May 2015 15:38:33 +0200
Benoit Panizzon <benoit.paniz...@imp.ch> wrote:

> => Extract text from PDF and pass it to spamassassin to match
> blacklisted URI's within the PDF.

There is a program called pdftotext, which on Debian systems is part
of the poppler-utils package.  I'm sure it's packaged in most Linux distros.

So I'm thinking you could run the PDF through that, add the result as a
text/plain part to INPUTMSG with MIME-tools, and pass that to SpamAssassin.
You wouldn't actually modify the original message; you would just add the
text/plain part temporarily.  Something like this:

1) Convert each PDF to text and add the text as a text/plain attachment
   using MIME-tools methods.

2) Rename ./INPUTMSG to ./INPUTMSG.ORIG

3) Write out the modified message to ./INPUTMSG

4) Call SpamAssassin

5) Rename ./INPUTMSG.ORIG to ./INPUTMSG

I haven't tried this, but it seems that it should work.
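
To make the sequence of steps concrete, here is a minimal sketch in Python.
It is only an illustration: a real MIMEDefang filter is written in Perl and
would use MIME-tools instead.  The function names (pdf_to_text,
add_pdf_text_parts, scan_with_pdf_text), the "pdf-text.txt" filename and the
run_scanner callback are all made up for this example; run_scanner stands in
for whatever actually calls SpamAssassin on ./INPUTMSG.

```python
import os
import subprocess
import tempfile
from email import message_from_bytes, policy


def pdf_to_text(pdf_bytes):
    """Extract text from a PDF by running pdftotext (poppler-utils)."""
    with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(pdf_bytes)
        tmp.flush()
        # "-" as the output file makes pdftotext write to stdout.
        result = subprocess.run(
            ["pdftotext", tmp.name, "-"],
            stdout=subprocess.PIPE,
            check=True,
        )
    return result.stdout.decode("utf-8", errors="replace")


def add_pdf_text_parts(raw_message, extract=pdf_to_text):
    """Step 1: return a copy of the message with one extra text/plain
    attachment per application/pdf part."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    texts = [
        extract(part.get_payload(decode=True))
        for part in msg.walk()
        if part.get_content_type() == "application/pdf"
    ]
    if not texts:
        return raw_message
    for text in texts:
        # Hypothetical filename; any name would do.
        msg.add_attachment(text, subtype="plain", filename="pdf-text.txt")
    return msg.as_bytes()


def scan_with_pdf_text(workdir, run_scanner, extract=pdf_to_text):
    """Steps 2-5: swap in the modified message, scan it, restore the original."""
    inputmsg = os.path.join(workdir, "INPUTMSG")
    backup = inputmsg + ".ORIG"
    with open(inputmsg, "rb") as fh:
        modified = add_pdf_text_parts(fh.read(), extract)
    os.rename(inputmsg, backup)              # step 2
    try:
        with open(inputmsg, "wb") as fh:     # step 3
            fh.write(modified)
        return run_scanner(inputmsg)         # step 4
    finally:
        os.replace(backup, inputmsg)         # step 5, even if the scan fails
```

Doing the restore in a "finally" block means the original INPUTMSG is put
back even when the scanner raises an error, so the message that gets
delivered is never the modified one.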

Regards,

Dianne
_______________________________________________
NOTE: If there is a disclaimer or other legal boilerplate in the above
message, it is NULL AND VOID.  You may ignore it.

Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang
