James MacLean wrote, on 15/07/07 05:05 PM:
Subject: Re: PDFText Plugin for PDF file scoring - PDFText2.pm for ver 3.2
From: James MacLean <[EMAIL PROTECTED]>
Date: Sun, 15 Jul 2007 17:05:38 -0300
To: users@spamassassin.apache.org
Theo Van Dinter wrote,
Steve West wrote, on 26/07/07 10:59 AM:
decoder wrote:
Try using the SVN Version (revision 132). This is basically the same
as the latest 3.5.x release but some issues with SA 3.2.x were fixed.
Best regards,
Chris
We are running SA 3.2.1 and just wondering if anyone using the SVN
version
Hi JT,
There is the expectation that if the author requested that a PDF not be
copied, then the PDF is not to be copied. This is done by a password
protection mechanism that is applied when the PDF is saved and is stored
in the PDF file itself.
The author of Xpdf makes his position known on subverting this feature:
h
Hi Folks,
Noticed that my bodies were not being parsed any more. Found out that
spammers were creating PDFs that are copy protected. Xpdf utils from 3.0
will present the text, but at least 3.02 reports the file is copy
protected and does not parse it...
Simple fix here was to compile a _special_
JT DeLys wrote, on 16/07/07 07:02 PM:
Seems to me that, assuming I can get the prereqs for FuzzOCR+pdf built
correctly (working), that FuzzyOCR /for/ OCR plus PDFText2 for text
might be a solid solution ...
Wish I had your confidence :). PDFText2 is still too young to know if
it holds up u
Michael Parker wrote, on 16/07/07 01:58 PM:
Theo Van Dinter wrote:
IMO, if people find this a useful enough feature of 3.2, it's a relatively
trivial change in the code as I recall, so a bugzilla request to backport
may get somewhere for a future 3.1 release.
I would +1 a backport.
M
JT DeLys wrote, on 16/07/07 06:36 PM:
Hi,
With PDFText2, the found text is added (rendered) to the main
tests that SpamAssassin does.
Do you mean to those tests defined in 80_additional.cf? or others?
It means any test you do on the body of e-mail will test against this.
for example,
JT DeLys wrote, on 16/07/07 02:14 PM:
Hi,
Could someone perhaps succinctly summarize the various & sundry
anti-pdf-image-spam tools that are currently in play?
PDFText
-- works in 3.2, not 3.1
This one is my fault :(. PDFText _does_ work in 3.1 and that is where we
are getting the most
Theo Van Dinter wrote, on 14/07/07 02:13 PM:
On Sat, Jul 14, 2007 at 09:54:36AM -0300, James MacLean wrote:
Where do I find information on hooking into post_message_parse()? Tried
grepping in the module area with no luck :(. Certainly agree it would be
better to get the text out and let
Dallas Engelken wrote, on 14/07/07 12:17 AM:
James MacLean wrote:
Hi folks,
Regrets if this is the wrong list.
Wanted to be able to score on text found in PDF files. Did not see
any obvious route, so made a plugin that calls XPDF's pdfinfo and
pdftotext to get the text that is then s
Hi folks,
Regrets if this is the wrong list.
Wanted to be able to score on text found in PDF files. Did not see any
obvious route, so made a plugin that calls XPDF's pdfinfo and pdftotext
to get the text that is then scored.
Sample local.cf could be :
pdftotext_cmd /usr/local/bin/pdftotext
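
For anyone who wants to see the general idea outside of SpamAssassin, here is a minimal sketch in Python (the actual plugin is a Perl module, so this is not its code). It assumes pdftotext lives at the path from the pdftotext_cmd line above and relies on pdftotext writing to stdout when the output file is given as "-". The function names, the rule names, and the scores are made up for illustration only.

```python
import re
import subprocess

# Assumed location, taken from the pdftotext_cmd setting in the sample config.
PDFTOTEXT_CMD = "/usr/local/bin/pdftotext"


def build_pdftotext_argv(pdf_path, cmd=PDFTOTEXT_CMD):
    """Build the command line; "-" as the output file means stdout."""
    return [cmd, pdf_path, "-"]


def extract_text(pdf_path, cmd=PDFTOTEXT_CMD, run=subprocess.run):
    """Return the text pdftotext extracts, or "" if the tool reports failure.

    The `run` parameter exists only so the external call can be swapped
    out (for example in a test) without pdftotext being installed.
    """
    result = run(build_pdftotext_argv(pdf_path, cmd),
                 capture_output=True, timeout=30)
    if result.returncode != 0:
        return ""
    return result.stdout.decode("utf-8", errors="replace")


def score_text(text, rules):
    """Apply (name, regex, score) rules to the text.

    Returns the list of (name, score) hits and the total score.
    The rules are hypothetical; they are not SpamAssassin rules.
    """
    hits = [(name, score)
            for name, pattern, score in rules
            if re.search(pattern, text, re.IGNORECASE)]
    return hits, sum(score for _, score in hits)
```

A caller would run extract_text() on each PDF attachment saved to a temporary file and pass the result to score_text() with its own rule list. A file that pdftotext refuses to process simply yields an empty string here and therefore no hits.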