Re: [Mimedefang] LZW, Gifs, and fingerprinting stock spams

2006-11-02 Thread leloup
On Wednesday, November 1, 2006, 15:48, Dave Helton wrote:
 I have had very good success with this plugin for SA.

 http://wiki.apache.org/spamassassin/FuzzyOcrPlugin

 config file allows you to add/remove keywords, and the program
 keeps a hash of known images so that they are not ocr'ed again.

I have been using it as well for a few days. It works amazingly well, even 
with the short but reasonable wordlist provided.

Another nice thing is that you can tune it not to scan mails that already 
have a SpamAssassin score over a certain threshold, so you won't have to run 
OCR on mails with images that SpamAssassin has already marked as spam :-)
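
The two behaviours described above (remembering images that were already scanned, and skipping OCR once the score is high enough) can be sketched roughly as follows. This is only an illustration in Python, not the plugin's own code: OcrCache, should_ocr, ocr_func and skip_threshold are hypothetical names, and an exact SHA-256 digest is assumed here, whereas the plugin may well hash images differently.

```python
import hashlib


class OcrCache:
    """Remember OCR results keyed by an exact digest of the image bytes.

    Hypothetical helper: only byte-identical images hit the cache.
    """

    def __init__(self):
        self._results = {}

    def scan(self, image_bytes, ocr_func):
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest not in self._results:
            # Only run the (expensive) OCR for images not seen before.
            self._results[digest] = ocr_func(image_bytes)
        return self._results[digest]


def should_ocr(current_score, skip_threshold):
    """Skip OCR when the message already scores at or above the threshold."""
    return current_score < skip_threshold
```

Since stock spams randomize their images, an exact digest like this one would mostly help with repeated deliveries of the very same image.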

Jürgen

___
NOTE: If there is a disclaimer or other legal boilerplate in the above
message, it is NULL AND VOID.  You may ignore it.

Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang


Re: [Mimedefang] LZW, Gifs, and fingerprinting stock spams

2006-11-01 Thread Kevin A. McGrail

I'm trying to do some stochastic analysis of stock spams and
figure out if there's a common fingerprint that can be used to
identify them...


Philip:

Have you looked at Dallas' ImageInfo.pm?  See 
http://www.rulesemporium.com/plugins.htm.  It's a great place to start 
building image rules.  However, I think you are barking up the wrong tree. 
The spams have been very effective at being randomized.


I will also say that the stock image spams have been very effective at 
thwarting traditional anti-spam techniques.  It's been an ebb and flow 
battle for weeks (months?) with them.  But I am happy to say that if you use 
MIMEDefang, I've been VERY pleased with the results of the AOL-esque reverse 
DNS test that I wrote a few weeks ago.


I'm continuing to tweak it, but I just put the latest version up at 
http://www.peregrinehw.com/downloads/MIMEDefang/mimedefang-filter-KAM.  I 
use this in conjunction with my ruleset, which only SCORES the emails.  Unlike 
AOL, I do NOT use this technique to block email.  This may change.  The rules 
are in http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf
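
For readers who have not seen this kind of check, here is a minimal sketch in Python, not the filter linked above. It assumes the test boils down to asking whether the connecting relay's IP address has a reverse DNS (PTR) record at all, and it only adds to a score rather than rejecting, in line with the scoring-only approach described. has_reverse_dns, rdns_score and the penalty value are hypothetical.

```python
import socket


def has_reverse_dns(ip, lookup=socket.gethostbyaddr):
    """Return True if a PTR lookup for the IP address yields a hostname.

    `lookup` is injectable so the check can be exercised without a network.
    """
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except (socket.herror, socket.gaierror):
        return False
    return bool(hostname)


def rdns_score(ip, penalty=1.0, lookup=socket.gethostbyaddr):
    """Score-only use: add a penalty when the relay has no reverse DNS."""
    return 0.0 if has_reverse_dns(ip, lookup) else penalty
```

The actual filter may look at more than the mere presence of a PTR record, so treat this as the simplest possible form of the idea.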


Good Luck!

Regards,
KAM 




Re: [Mimedefang] LZW, Gifs, and fingerprinting stock spams

2006-11-01 Thread David F. Skoll
Kevin A. McGrail wrote:

 I will also say that the stock image spams have been very effective at
 thwarting traditional anti-spam techniques.

Not to blow our own horn too much, but...

We've had pretty good luck with our RPTN system.  It's a shared Bayes
database.  A couple of hundred of our customer sites submit votes with
word-counts to add to a large shared Bayes database, which we update
and redistribute every night.  The non-image junk in most of those
spams usually scores very high in our Bayes implementation.
Our current RPTN database contains words and word-pairs from 451,506
spams and 224,318 hams, for a total of just under 7 million tokens.
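
To make "words and word-pairs" concrete, here is a small Python sketch of how votes with token counts could be pooled and turned into a per-token spam probability. It is not the RPTN implementation: tokens, SharedCounts and the add-one smoothing are assumptions chosen only to illustrate the idea.

```python
from collections import Counter


def tokens(text):
    """Split a message into lowercase words plus adjacent word-pairs."""
    words = text.lower().split()
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


class SharedCounts:
    """Pooled per-token message counts from spam and ham votes (hypothetical)."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def vote(self, text, is_spam):
        seen = set(tokens(text))  # count each token once per message
        if is_spam:
            self.spam.update(seen)
            self.n_spam += 1
        else:
            self.ham.update(seen)
            self.n_ham += 1

    def spam_probability(self, token):
        # Add-one smoothing keeps unseen tokens at a neutral 0.5.
        p_spam = (self.spam[token] + 1) / (self.n_spam + 2)
        p_ham = (self.ham[token] + 1) / (self.n_ham + 2)
        return p_spam / (p_spam + p_ham)
```

Word-pairs matter for image spams because the filler text around the image is often made of individually innocent words in unusual combinations.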

Regards,

David.


RE: [Mimedefang] LZW, Gifs, and fingerprinting stock spams

2006-11-01 Thread Jason Bertoch [Electronet]


 -Original Message-
 Not to blow our own horn too much, but...
 
 We've had pretty good luck with our RPTN system.  
 
 Regards,
 
 David.


David,

Where might I find more information on the RPTN system?  The searchable
archives don't seem to be working at the moment.


Jason A. Bertoch
Network Administrator
[EMAIL PROTECTED]
ElectroNet Intermedia Consulting
3411 Capital Medical Blvd.
Tallahassee, FL 32308
(V) 850.222.0229 (F) 850.222.8771



RE: [Mimedefang] LZW, Gifs, and fingerprinting stock spams

2006-11-01 Thread Dave Helton
I have had very good success with this plugin for SA.

http://wiki.apache.org/spamassassin/FuzzyOcrPlugin 

config file allows you to add/remove keywords, and the program
keeps a hash of known images so that they are not ocr'ed again.

this plugin also understands animated gifs, something I've seen
recently.  I do not know how well it handles compressed images.
needs testing.

HTH

-Dave
 Hughes Network Technologies



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Philip
Prindeville
Sent: Tuesday, October 31, 2006 10:26 PM
To: mimedefang@lists.roaringpenguin.com
Subject: [Mimedefang] LZW, Gifs, and fingerprinting stock spams

I'm trying to do some stochastic analysis of stock spams and figure out if
there's a common fingerprint that can be used to identify them...

But first, I'm bumping up against some Perl issues.

Seems that there aren't many modules out there that help deconstruct GIF
formats.  I'm using Image::Info::GIF, but need to decompress the compressed
data portion.  I tried to take the data and pass it to Compress::LZW
directly, but most GIFs (at least for stocks, which don't use many
colors) use 4, 6, or 8 bit code sizes.

Unfortunately, Compress::LZW only handles 12 or 16 bits...  Anyone familiar
enough with either GIF formats or how to decompress the data to offer a leg
up?

Thanks,

-Philip



Re: [Mimedefang] LZW, Gifs, and fingerprinting stock spams

2006-11-01 Thread David F. Skoll
Jason Bertoch [Electronet] wrote:

   Where might I find more information on the RPTN system?  The searchable
 archives don't seem to be working at the moment.

We have a white paper (mostly marketing-oriented, alas):

http://www.roaringpenguin.com/files/images/resources_files/White-Paper-RPTN.pdf

(Sorry for the ridiculous URL; it predates our move to Drupal.)

If you want a more technical white paper, please e-mail me off-list.

RPTN is only available to CanIt customers, though we continue to mull
over the possibility of making a SpamAssassin plugin and selling RPTN
subscriptions.  We haven't had enough people say "Yes, I'm willing to pay!"
to prod us to do it yet, however.

Regards,

David.



[Mimedefang] LZW, Gifs, and fingerprinting stock spams

2006-10-31 Thread Philip Prindeville
I'm trying to do some stochastic analysis of stock spams and
figure out if there's a common fingerprint that can be used to
identify them...

But first, I'm bumping up against some Perl issues.

Seems that there aren't many modules out there that help
deconstruct GIF formats.  I'm using Image::Info::GIF, but
need to decompress the compressed data portion.  I tried
to take the data and pass it to Compress::LZW directly,
but most GIFs (at least for stocks, which don't use many
colors) use 4, 6, or 8 bit code sizes.

Unfortunately, Compress::LZW only handles 12 or 16
bits...  Anyone familiar enough with either GIF formats
or how to decompress the data to offer a leg up?
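
For reference, GIF's LZW variant uses variable-width codes: they start at (minimum code size + 1) bits, grow by one bit each time the table fills the current width, stop at 12 bits, and are packed least-significant-bit first, with a clear code and an end code reserved just above the color indices. A decoder that expects a fixed code width therefore does not fit. Below is a minimal sketch of a decoder, written in Python rather than Perl for brevity. It assumes the image data sub-blocks have already been concatenated with their length bytes stripped; gif_lzw_decode is a hypothetical name, not part of any module mentioned here.

```python
MAX_CODE_BITS = 12  # GIF never uses codes wider than 12 bits


def gif_lzw_decode(data, min_code_size):
    """Decode GIF LZW image data into a bytes object of color indices."""
    clear_code = 1 << min_code_size
    end_code = clear_code + 1

    def fresh_table():
        # One entry per color index, plus placeholder slots that reserve
        # the positions of the clear and end codes.
        return [bytes([i]) for i in range(clear_code)] + [b"", b""]

    table = fresh_table()
    code_size = min_code_size + 1
    prev = None
    out = bytearray()
    bit_buffer = 0
    bit_count = 0

    for byte in data:
        # Codes are packed starting from the least significant bit.
        bit_buffer |= byte << bit_count
        bit_count += 8
        while bit_count >= code_size:
            code = bit_buffer & ((1 << code_size) - 1)
            bit_buffer >>= code_size
            bit_count -= code_size

            if code == clear_code:
                table = fresh_table()
                code_size = min_code_size + 1
                prev = None
                continue
            if code == end_code:
                return bytes(out)

            if code < len(table):
                entry = table[code]
            elif code == len(table) and prev is not None:
                # Code not in the table yet: previous string plus its first byte.
                entry = prev + prev[:1]
            else:
                raise ValueError("corrupt LZW stream")

            if prev is not None and len(table) < (1 << MAX_CODE_BITS):
                table.append(prev + entry[:1])
                # Widen the codes once the table fills the current width.
                if len(table) == (1 << code_size) and code_size < MAX_CODE_BITS:
                    code_size += 1

            out += entry
            prev = entry

    return bytes(out)
```

The minimum code size is the single byte that precedes the image data sub-blocks in the file, so it has to be read from the GIF itself before calling something like gif_lzw_decode(data, min_code_size).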

Thanks,

-Philip
