Re: PDFassassin

2012-11-19 Thread Jason Haar
ExtractText can handle any attachment you want. However, it's just a framework. We use it to call a script that extracts images, office documents and PDF files, passes them through a range of tools (antiword, unzip [newer Office documents are just XML files in a zip file], gocr) and converts them into text - whi
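
The remark above that newer Office documents are just XML files in a zip file can be illustrated with a minimal Python sketch. This is not the script the poster describes and it does not use ExtractText; the function name `docx_to_text` is hypothetical, and it assumes a Word (.docx) file whose main body is stored in the `word/document.xml` archive member.

```python
import zipfile
from xml.etree import ElementTree

# WordprocessingML namespace used for body text in .docx files.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the paragraph text of a .docx given as a path or file-like object.

    Hypothetical helper for illustration only.
    """
    # A .docx is a zip archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ElementTree.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each paragraph is split into runs; join the text nodes of its runs.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

The resulting plain text could then be handed to whatever scanner is in use, in the same way the thread describes for the output of antiword or gocr.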

Re: PDFassassin

2012-11-18 Thread Olivier Nicole
Thank you Jari, > > In the same way, I am wondering if something similar exists for all > > the (open|libre|MS)office documents? > ExtractText > > Works with these documents as well as with PDF. I had a look at ExtractText, but it only extracts text, not the images. And antiword, the extractor for

Re: PDFassassin

2012-11-15 Thread Jari Fredriksson
on anything newer than 3.2.5 > > Scott > >> -Original Message- >> From: Jari Fredriksson [mailto:ja...@iki.fi] >> Sent: Thursday, November 15, 2012 6:01 AM >> To: users@spamassassin.apache.org >> Subject: Re: PDFassassin >> >> 15.11.2012 13:31,

Re: PDFassassin

2012-11-15 Thread John Hardin
On Thu, 15 Nov 2012, Olivier Nicole wrote: Finally, I am wondering if FuzzyOCR is still of any interest? Like above, I'd like to see it push the strings it can identify to the body of the message, for further analysis by SA, rather than having its own list of spam words. I believe that FuzzyOCR

Re: PDFassassin

2012-11-15 Thread Jari Fredriksson
15.11.2012 13:31, Olivier Nicole wrote: > In the same way, I am wondering if something similar exists for all > the (open|libre|MS)office documents? ExtractText Works with these documents as well as with PDF. -- Fame is a vapor; popularity an accident; the only earthly certainty is oblivion.

PDFassassin

2012-11-15 Thread Olivier Nicole
Hi, While going through old stuff, I noticed I have been using a modified version of PDFassassin. I did not even know that I had ported it to my new mail server. What it does, basically, is extract the text from the PDF attachment and stuff it back to SA for further analysis. The only difference
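
The idea described above (pull the text out of a PDF attachment and hand it back to the scanner together with the message body) can be sketched in Python with the standard `email` package. This is an illustration only, not PDFassassin's code: the function names are hypothetical, `extract_text` stands for any caller-supplied PDF-to-text converter, and the message is assumed to be an `email.message.EmailMessage`.

```python
from email.message import EmailMessage


def pdf_attachments(message):
    """Yield (filename, raw bytes) for every PDF part of the message."""
    for part in message.walk():
        if part.get_content_type() == "application/pdf":
            # decode=True undoes the base64 transfer encoding.
            yield part.get_filename(), part.get_payload(decode=True)


def text_for_scanning(message, extract_text):
    """Return the plain-text body followed by the text of each PDF attachment.

    `extract_text` is a hypothetical callable taking PDF bytes and
    returning a string; the conversion itself is out of scope here.
    """
    body = message.get_body(preferencelist=("plain",))
    chunks = [body.get_content() if body is not None else ""]
    for _name, data in pdf_attachments(message):
        chunks.append(extract_text(data))
    return "\n".join(chunks)
```

Keeping the converter as a parameter means the same loop works whichever external tool does the actual PDF extraction.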

RE: PDFAssassin

2007-08-14 Thread Jean-Paul Natola
-Original Message- From: Bob Pierce [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 14, 2007 11:00 AM To: users@spamassassin.apache.org Subject: PDFAssassin Is anybody using the PDFAssassin module from http://blog.atmail.com/?p=61 I didn't think I saw it talked about on the lis

PDFAssassin

2007-08-14 Thread Bob Pierce
Is anybody using the PDFAssassin module from http://blog.atmail.com/?p=61 I didn't think I saw it talked about on the list yet. I'm looking for a good solution for catching PDF spam. Are there any better suggestions for catching PDF? Thanks again, Bob