At 10:41 16-06-2009, Rosenbaum, Larry M. wrote:
> We get a significant number of 419 scam letters where the actual
> spam text is in a Word (.doc or .rtf) or PDF attachment. Example:

Don't limit yourself to that. Think of the next step.

> It would be really great if there was an SA plugin to extract the
> text from the attachment and then feed the text to the regular SA
> body rules. Has anybody looked at that possibility?
See http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
It is possible to modify that plugin to call the wv library to extract
the content. If you want to use the regular rules, you would have to
render the content as text before passing the modified message to
SpamAssassin.
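
A rough sketch of the "render first, then scan" idea, written as a
standalone pre-filter rather than an SA plugin. It is not taken from
FuzzyOcr. The function names (render_attachments, extract_with_tool)
are hypothetical, and the converter commands are assumptions: wvText
from the wv package for .doc and pdftotext from poppler/xpdf for .pdf.
RTF would need a different converter and is left out.

```python
import email
import os
import subprocess
import tempfile
from email import policy

# Assumed external converters, each called as: tool <input> <output.txt>
CONVERTERS = {
    ".doc": "wvText",     # assumption: installed with the wv package
    ".pdf": "pdftotext",  # assumption: installed with poppler or xpdf
}


def extract_with_tool(suffix, payload):
    """Write the attachment to a temp file, convert it, return the text."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "attachment" + suffix)
        dst = os.path.join(tmp, "attachment.txt")
        with open(src, "wb") as fh:
            fh.write(payload)
        subprocess.run([CONVERTERS[suffix], src, dst], check=True, timeout=30)
        with open(dst, encoding="utf-8", errors="replace") as fh:
            return fh.read()


def render_attachments(raw_message, extract=extract_with_tool):
    """Return the message bytes with extracted attachment text appended
    as inline text/plain parts, so body rules can see it."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    texts = []
    # Collect first, modify afterwards, so we never walk our own additions.
    for part in list(msg.walk()):
        if part.is_multipart():
            continue
        name = part.get_filename() or ""
        suffix = os.path.splitext(name)[1].lower()
        if suffix in CONVERTERS:
            texts.append(extract(suffix, part.get_payload(decode=True)))
    for text in texts:
        msg.add_attachment(text, disposition="inline")
    return msg.as_bytes()
```

The output of render_attachments() could then be piped to spamassassin
or spamc in place of the original message. The extract argument is
there only so the converter can be swapped out or stubbed for testing.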
Regards,
-sm