Re: [Dspam-user] Spam Identification Deteriorates with Time

Jonathan Hall Fri, 20 Mar 2009 12:55:07 -0700

Why do you have SA's Bayes and AWL features turned on? In this setup, it doesn't seem to me that these would really help, as dspam does both of these already, right? Seems like you could reduce processing time in SA by disabling these features, without affecting accuracy at all.


Maybe you have reasons to have enabled these I haven't thought of?


--
Jonathan


Yan Seiner wrote:

On Fri, March 20, 2009 10:43 am, Chris Ryland wrote:

Interesting--can you elaborate just a bit?


OK, first mail passes through SA.  I have it configured to only add info
in the X- headers.

Then the mail passes through dspam.  dspam uses the info in SA's X headers
as tokens in its decision.  So your email has the following headers:

X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on selene.seiner.lan
X-Spam-Level:
X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,
     DNS_FROM_RFC_BOGUSMX autolearn=no version=3.2.5
X-DSPAM-Check: by www.seiner.com on Fri, 20 Mar 2009 11:08:38 -0700
X-DSPAM-Result: Innocent
X-DSPAM-Processed: Fri Mar 20 11:08:39 2009
X-DSPAM-Confidence: 0.9995
X-DSPAM-Probability: 0.0000
X-DSPAM-Signature: 49c3dba742621804284693
X-DSPAM-Factors: 27,
     Cc*lists.sourceforge.net, 0.00010,
     wrote+>>, 0.00010,
     On+Fri, 0.00010,
     Subject*user], 0.00010,
     as+>, 0.00011,
     >>+>>, 0.00013,
     wrote+>, 0.00015,
     >+On, 0.00017,
     Cc*user, 0.00021,
     the+>, 0.00022,
     References*mail.gmail.com>, 0.00023,
     References*mail.gmail.com>, 0.00023,
     same+>, 0.00024,
     Cc*user+lists.sourceforge.net, 0.00024,
     >+I, 0.00026,
     >+>, 0.00026,
     >+>, 0.00026,
     X-Mailer*Mail+(2.930.3), 0.00048,
     X-Mailer*(2.930.3), 0.00048,
     Mime-Version*v930.3), 0.00049,
     Mime-Version*framework+v930.3), 0.00049,
     References*www.datavault.us>, 0.00052,
     >+Yan, 0.00053,
     >>+Can, 0.00058,
     >+the, 0.00061,
     38+PM, 0.00067,
     From*Chris, 0.00092

Now let's look at a piece of junk:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on selene.seiner.lan
X-Spam-Level: ***********
X-Spam-Status: Yes, score=11.2 required=5.0 tests=AWL,BAYES_99,
     HTML_IMAGE_RATIO_04,HTML_MESSAGE,MIME_HTML_ONLY,RCVD_IN_XBL,URIBL_JP_SURBL,
     URIBL_RHS_DOB autolearn=no version=3.2.5
X-Spam-Report:
     * 1.5 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
     * [URIs: batiaceo.org]
     * 3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
     * [score: 1.0000]
     * 0.2 HTML_IMAGE_RATIO_04 BODY: HTML has a low ratio of text to image area
     * 0.0 HTML_MESSAGE BODY: HTML included in message
     * 1.5 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
     * 3.0 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
     * [64.18.137.4 listed in zen.spamhaus.org]
     * 1.1 URIBL_RHS_DOB Contains an URI of a new domain (Day Old Bread)
     * [URIs: batiaceo.org]
     * 0.4 AWL AWL: From: address is in the auto white-list
X-DSPAM-Check: by www.seiner.com on Fri, 20 Mar 2009 11:42:38 -0700
X-DSPAM-Result: Spam
X-DSPAM-Processed: Fri Mar 20 11:42:39 2009
X-DSPAM-Confidence: 0.9997
X-DSPAM-Probability: 1.0000
X-DSPAM-Signature: 49c3e39f74883847820380
X-DSPAM-Factors: 15,
     X-Spam-Report*[URIs, 0.99990,
     X-Spam-Report*URL, 0.99990,
     X-Spam-Report*URL+listed, 0.99990,
     X-Spam-Report*1.5+URIBL_JP_SURBL, 0.99990,
     X-Spam-Report*URIBL_JP_SURBL, 0.99990,
     X-Spam-Report*URI+of, 0.99990,
     X-Spam-Report*an+URI, 0.99990,
     jpg"/>, 0.99990,
     X-Spam-Report*the, 0.99990,
     X-Spam-Report*3.5, 0.99990,
     X-Spam-Report*the+JP, 0.99990,
     X-Spam-Report*URIBL_JP_SURBL+Contains, 0.99990,
     X-Spam-Report*3.5+BAYES_99, 0.99990,
     X-Spam-Report*MIME_HTML_ONLY, 0.99990,
     X-Spam-Report*RCVD_IN_XBL+RBL, 0.99990

You can see that almost all of the tokens dspam used (14 of the 15 factors
listed) came from the X-Spam headers.
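
For anyone who wants to check this on their own mail, here is a minimal Python sketch that parses an X-DSPAM-Factors value and counts how many factors carry an X-Spam- prefix. It is not part of dspam; the function names parse_dspam_factors and count_with_prefix are made up for illustration, and the parser assumes the header layout shown above (a count, then one "token, probability" pair per line).

```python
# Hypothetical helpers, not part of dspam: they only read the header text.

def parse_dspam_factors(value):
    """Return (declared_count, [(token, probability), ...])."""
    lines = [ln.strip() for ln in value.strip().splitlines() if ln.strip()]
    # First line is the factor count, e.g. "15,"
    declared = int(lines[0].rstrip(","))
    factors = []
    for ln in lines[1:]:
        # Each remaining line looks like "token, 0.99990," (comma optional
        # on the last line). Split on the last ", " so that a token
        # containing a comma would still be kept whole.
        token, prob = ln.rstrip(",").rsplit(", ", 1)
        factors.append((token, float(prob)))
    return declared, factors


def count_with_prefix(factors, prefix="X-Spam-"):
    """Return (matching, total) for tokens that start with the prefix."""
    matching = sum(1 for token, _ in factors if token.startswith(prefix))
    return matching, len(factors)


# Factor list copied from the junk message above.
SPAM_FACTORS = """15,
     X-Spam-Report*[URIs, 0.99990,
     X-Spam-Report*URL, 0.99990,
     X-Spam-Report*URL+listed, 0.99990,
     X-Spam-Report*1.5+URIBL_JP_SURBL, 0.99990,
     X-Spam-Report*URIBL_JP_SURBL, 0.99990,
     X-Spam-Report*URI+of, 0.99990,
     X-Spam-Report*an+URI, 0.99990,
     jpg"/>, 0.99990,
     X-Spam-Report*the, 0.99990,
     X-Spam-Report*3.5, 0.99990,
     X-Spam-Report*the+JP, 0.99990,
     X-Spam-Report*URIBL_JP_SURBL+Contains, 0.99990,
     X-Spam-Report*3.5+BAYES_99, 0.99990,
     X-Spam-Report*MIME_HTML_ONLY, 0.99990,
     X-Spam-Report*RCVD_IN_XBL+RBL, 0.99990
"""

declared, factors = parse_dspam_factors(SPAM_FACTORS)
matching, total = count_with_prefix(factors)
print(f"{matching} of {total} factors come from X-Spam-* headers")
```

For the junk message this reports 14 of 15; the only factor from the message body is the jpg"/> token.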

--Yan

On Mar 20, 2009, at 1:38 PM, Yan Seiner wrote:

On Fri, March 20, 2009 9:55 am, Chris Ryland wrote:

Very interesting, thanks.

Can I ask what SpamAssassin adds to the mix?

I use SA as input to dspam.  It allows dspam to be more accurate as the
header tokens are nearly always the same.
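
To illustrate why such header tokens stay stable, here is a rough Python sketch of how words in a header could become tokens of the "Header*word" and "Header*word1+word2" form that appears in X-DSPAM-Factors. The function name header_tokens and the whitespace-only splitting are assumptions made for illustration; dspam's real tokenizer is likely to differ in its delimiters and details.

```python
# Illustrative only: mimics the token *shape* seen in X-DSPAM-Factors,
# not dspam's actual tokenizer.

def header_tokens(name, value):
    """Build single-word and adjacent-word-pair tokens for one header."""
    words = value.split()  # assumption: split on whitespace only
    singles = [f"{name}*{w}" for w in words]
    pairs = [f"{name}*{a}+{b}" for a, b in zip(words, words[1:])]
    return singles + pairs


tokens = header_tokens(
    "X-Spam-Report",
    "1.5 URIBL_JP_SURBL Contains an URL listed",
)
for t in tokens:
    print(t)
```

Because SA writes the same rule names and descriptions into every message that hits a given rule, the same tokens (for example X-Spam-Report*URIBL_JP_SURBL) show up again and again, whatever the spammer changes in the body.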

--
Yan Seiner, PE

Support my bid for the 4J School Board
http://www.seiner.com

Cheers!
--Chris Ryland / Em Software, Inc. / www.emsoftware.com




