Re: inconsistent scoring issue?

2008-05-16 Thread Jeff Aitken
On Thu, May 15, 2008 at 08:53:57PM +0200, Karsten Bräckelmann wrote:
 Yes. Hence my question about mail hitting URIBL_BLACK on the first run,
 unlike that one example.
 
 The point is, whether *no* mail hits URIBL_BLACK, or at least *some*
 mail does. Do you get any URIBL_BLACK hits at all? Is that one example
 you pasted exemplary for all your incoming mail, never hitting
 URIBL_BLACK -- or is this an isolated case not triggering the BL?
 
 The answer to this might hint where to look next...

Ah, gotcha.  Yes, some messages do hit URIBL_BLACK; all examples that I've
found so far are also (properly) identified as spam.

I'm thinking you're probably right that this is a timing issue.  I just
checked another message that had different scoring results.  The initial
message was received on 5/15 at 1156UTC and did not hit URIBL_BLACK.  I
fed it to SA manually at 1203UTC and it DID hit URIBL_BLACK.  I looked up
the URI in question and it was listed on 5/15 at 1153UTC.
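
For anyone who wants to repeat that kind of lookup by hand, here is a minimal sketch of a URIBL-style DNS query. The zone name (multi.uribl.com) and the meaning of bit 2 in the answer are assumptions on my part, not something taken from this thread, so check them against the list operator's documentation. The function names are made up for illustration.

```python
import socket

# Assumptions: zone name and "black" bit are illustrative, not from the thread.
URIBL_ZONE = "multi.uribl.com"
BLACK_BIT = 2


def uribl_query_name(domain, zone=URIBL_ZONE):
    """Build the DNS name to query: the domain prepended to the list zone."""
    return f"{domain.strip('.').lower()}.{zone}"


def is_black(answer):
    """Treat the last octet of a 127.0.0.x answer as a bitmask and test one bit."""
    octets = answer.split(".")
    return octets[:3] == ["127", "0", "0"] and bool(int(octets[3]) & BLACK_BIT)


def lookup(domain):
    """Return the list's A record for the domain, or None if nothing resolves."""
    try:
        return socket.gethostbyname(uribl_query_name(domain))
    except socket.gaierror:
        # Either not listed (NXDOMAIN) or a resolver failure; this sketch
        # cannot tell the two apart.
        return None
```

Calling lookup("win-todayoo.com.cn") at two different times and feeding any answer to is_black() would show whether a listing appeared in between, which is the timing effect described above. Note that a timeout and "not listed" both come back as None here, so a DNS problem would look the same as a miss.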


--Jeff



Re: inconsistent scoring issue?

2008-05-16 Thread John Hardin

On Fri, 16 May 2008, Jeff Aitken wrote:

I'm thinking you're probably right that this is a timing issue.  I just 
checked another message that had different scoring results.  The initial 
message was received on 5/15 at 1156UTC and did not hit URIBL_BLACK.  I 
fed it to SA manually at 1203UTC and it DID hit URIBL_BLACK.  I looked 
up the URI in question and it was listed on 5/15 at 1153UTC.


One argument for implementing greylisting?

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174     pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  You do not examine legislation in the light of the benefits it
  will convey if properly administered, but in the light of the
  wrongs it would do and the harms it would cause if improperly
  administered.  -- Lyndon B. Johnson
---
 5 days until the 4th anniversary of SpaceshipOne winning the X-prize


inconsistent scoring issue?

2008-05-15 Thread Jeff Aitken
Hello,

Apologies if this is a FAQ or old news, but I did a bit of searching
yesterday and didn't find an answer to this one.

I'm using SA (3.2.4) site-wide on a FreeBSD-6.3 box in conjunction with
postfix, using procmail as the LDA.  I'm using spamd/spamc, so the individual
spamc processes are run as the recipient's userid (since they're spawned
by procmail).  I know this has implications for which bayes db gets
consulted (versus a true sitewide with shared bayes db) but I don't
think that's the issue I'm seeing here.  Anyway...

It seems like a lot more spam has been getting through in the last couple
of weeks.  This prompted me to enable Pyzor, which I had not done in my
initial install.  While that seems to work, I noticed that I'm getting
inconsistent scoring results on messages that should be tagged as spam but
which are not.

For example, a message that was just delivered to my inbox contained the
following report from SA:

X-Spam-Status: No, score=4.4 required=5.0 tests=BAYES_99,DATE_IN_FUTURE_03_06,
    RAZOR2_CHECK,RDNS_DYNAMIC autolearn=no version=3.2.4
X-Spam-Report:
*  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
*  [score: 1.]
*  0.3 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date
*  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
*  0.1 RDNS_DYNAMIC Delivered to trusted network by host with
*  dynamic-looking rDNS

If I save the original message and run SA manually (spamassassin -t < msg)
I get the following:

X-Spam-Status: Yes, score=7.3 required=5.0 tests=AWL,BAYES_99,
    DATE_IN_FUTURE_03_06,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,
    RAZOR2_CHECK,RCVD_IN_DSBL,RCVD_IN_SORBS_DUL,RDNS_DYNAMIC,URIBL_BLACK
    autolearn=no version=3.2.4
X-Spam-Report:
*  0.9 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
*  [88.73.238.103 listed in dnsbl.sorbs.net]
*  1.0 RCVD_IN_DSBL RBL: Received via a relay in list.dsbl.org
*  [http://dsbl.org/listing?88.73.238.103]
*  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
*  [score: 1.]
*  0.3 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date
*  1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
*  above 50%
*  [cf: 100]
*  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
*  0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
*  [cf: 100]
*  2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
*  [URIs: win-todayoo.com.cn]
*  0.1 RDNS_DYNAMIC Delivered to trusted network by host with
*  dynamic-looking rDNS
* -2.9 AWL AWL: From: address is in the auto white-list

I'm going to assume that the score being wrong by 0.1 (should be 7.4, not
7.3) is due to a rounding error or other similar issue.  However, I can't
figure out why the results are so different.  What's even more interesting
is that if I turn on debugging (spamassassin -D -t < msg) then I get a
*third* different result:

X-Spam-Status: Yes, score=8.7 required=5.0 tests=AWL,BAYES_99,
    DATE_IN_FUTURE_03_06,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,
    RAZOR2_CHECK,RCVD_IN_DSBL,RCVD_IN_SORBS_DUL,RDNS_DYNAMIC,URIBL_BLACK
    autolearn=no version=3.2.4
X-Spam-Report:
*  0.9 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
*  [88.73.238.103 listed in dnsbl.sorbs.net]
*  1.0 RCVD_IN_DSBL RBL: Received via a relay in list.dsbl.org
*  [http://dsbl.org/listing?88.73.238.103]
*  2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
*  [URIs: win-todayoo.com.cn]
*  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
*  [score: 1.]
*  0.3 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date
*  1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
*  above 50%
*  [cf: 100]
*  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
*  0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
*  [cf: 100]
*  0.1 RDNS_DYNAMIC Delivered to trusted network by host with
*  dynamic-looking rDNS
* -1.4 AWL AWL: From: address is in the auto white-list

The two commands were run on the same host, by the same user, within
seconds of one another, and yet the scores for the AWL test are 1.5
different.
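
To double-check the 0.1 discrepancy mentioned earlier, the per-rule scores from the second report can be summed mechanically. This is only a sketch: the regular expression and the function name are my own, and the report text is trimmed to the score lines plus a couple of continuation lines. One plausible explanation for the mismatch, which I have not confirmed, is that the report shows each rule's score rounded to one decimal place while the total is computed from the unrounded values.

```python
import re
from decimal import Decimal

# Score lines copied from the second report above; a few continuation
# lines are kept to show that they are skipped.
REPORT = """\
*  0.9 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
*  [88.73.238.103 listed in dnsbl.sorbs.net]
*  1.0 RCVD_IN_DSBL RBL: Received via a relay in list.dsbl.org
*  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
*  [score: 1.]
*  0.3 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date
*  1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
*  above 50%
*  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
*  0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
*  2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
*  0.1 RDNS_DYNAMIC Delivered to trusted network by host with
* -2.9 AWL AWL: From: address is in the auto white-list
"""

# A score line is "*", whitespace, a signed decimal, then the rule name.
RULE_LINE = re.compile(r"^\*\s+(-?\d+\.\d+)\s+([A-Z0-9_]+)", re.MULTILINE)


def rule_scores(report):
    """Map rule name to its displayed score, using Decimal to avoid float noise."""
    return {name: Decimal(score) for score, name in RULE_LINE.findall(report)}


scores = rule_scores(REPORT)
total = sum(scores.values())
print("sum of displayed scores:", total)
print("sum without AWL:", total - scores["AWL"])
```

The displayed scores add up to 7.4 against a reported 7.3, and the rules other than AWL add up to 10.3.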

Any thoughts on what I'm missing or doing wrong?

Thanks!


--Jeff



Re: inconsistent scoring issue?

2008-05-15 Thread Karsten Bräckelmann
On Thu, 2008-05-15 at 14:19 +, Jeff Aitken wrote:

 For example, a message that was just delivered to my inbox contained the
 following report from SA:
 
 X-Spam-Report:
 *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
 *  [score: 1.]
 *  0.3 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date
 *  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
 *  0.1 RDNS_DYNAMIC Delivered to trusted network by host with
 *  dynamic-looking rDNS
 
 If I save the original message and run SA manually (spamassassin -t < msg)
 I get the following:
 
 X-Spam-Report:
 *  0.9 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
 *  [88.73.238.103 listed in dnsbl.sorbs.net]
 *  1.0 RCVD_IN_DSBL RBL: Received via a relay in list.dsbl.org
 *  [http://dsbl.org/listing?88.73.238.103]
 *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
 *  [score: 1.]
 *  0.3 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date
 *  1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
 *  above 50%
 *  [cf: 100]
 *  0.5 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
 *  0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
 *  [cf: 100]
 *  2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
 *  [URIs: win-todayoo.com.cn]
 *  0.1 RDNS_DYNAMIC Delivered to trusted network by host with
 *  dynamic-looking rDNS
 * -2.9 AWL AWL: From: address is in the auto white-list

No DNSBLs in the original result... This *may* be due to the BLs
catching up, and the second run being done later. This specifically
seems to be the case for Razor (which hit in both runs, just differently)
and likely for URIBL_BLACK, too. Maybe DNS timeout issues.

Do you see URIBL_BLACK hits in the incoming stream at all?


 I'm going to assume that the score being wrong by 0.1 (should be 7.4, not
 7.3) is due to a rounding error or other similar issue.  However, I can't
 figure out why the results are so different.  What's even more interesting
 is that if I turn on debugging (spamassassin -D -t < msg) then I get a
 *third* different result:
[...]
 * -1.4 AWL AWL: From: address is in the auto white-list
 
 The two commands were run on the same host, by the same user, within
 seconds of one another, and yet the scores for the AWL test are 1.5
 different.

AWL is a score averager. It *will* change by run, unless the difference
between the current score and the previous average is about 0.

Please see these:
  http://wiki.apache.org/spamassassin/AutoWhitelist
  http://wiki.apache.org/spamassassin/AwlWrongWay

  guenther


-- 
char *t=[EMAIL PROTECTED];
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1:
(c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: inconsistent scoring issue?

2008-05-15 Thread Jeff Aitken
On Thu, May 15, 2008 at 05:35:52PM +0200, Karsten Bräckelmann wrote:
 No DNSBLs in the original result... This *may* be due to the BLs
 catching up, and the second run being done later. This specifically
 seems to be the case for Razor (which hit in both runs, just differently)
 and likely for URIBL_BLACK, too. Maybe DNS timeout issues.

Perhaps... don't see any evidence in the logs, but I might not without
lots of extra debugging enabled.  Whatever is happening, it's a definite
change from as recently as two weeks ago because other users on this 
system have reported a massive increase in spam not being properly
classified as such.

However, wrt the comment about the BLs catching up, if I'm reading it
right this host has been listed in at least DSBL since last year.  Still
could have been a timeout on my end, of course, but it seems unlikely that
I've been having timeouts for the last two weeks or so.


 Do you see URIBL_BLACK hits in the incoming stream at all?

Not sure exactly what you're asking here... but I included the entire
X-Spam-Status and X-Spam-Report headers, without removing any lines.  So
there was no URIBL_BLACK hit in the message as it was delivered to my
inbox, but the same message, when run through SA manually a few minutes
later, did trigger it.


 AWL is a score averager. It *will* change by run, unless the difference
 between the current score and the previous average is about 0.

Ah, right... didn't think about the effect that running SA manually has
on the AWL in subsequent runs.  The initial email scored 4.4, but
when I ran SA manually it scored 10.3 or so, which means the AWL should
have subtracted about 3 on the second exposure to that sender... and then
down from there, etc.  Sorry, my fault for not thinking that one through.
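
Spelled out, that arithmetic looks roughly like the sketch below. It assumes the AWL correction is (historical mean - current score) * factor with a factor of 0.5, and that each run records its pre-AWL score in the sender's history. Both are my understanding of how the AWL behaves rather than something confirmed in this thread, and the function name is made up.

```python
def awl_adjustment(history_total, history_count, score, factor=0.5):
    """Pull the current score part of the way toward the sender's mean."""
    mean = history_total / history_count
    return (mean - score) * factor


# History after the original delivery: one message that scored 4.4.
total, count = 4.4, 1

# First manual run: the rules other than AWL add up to 10.3.
first = awl_adjustment(total, count, 10.3)
total, count = total + 10.3, count + 1  # assumed: the pre-AWL score is recorded

# Second manual run (with -D): same rules, but the history has moved.
second = awl_adjustment(total, count, 10.3)

print(f"first manual run:  {first:+.3f}")   # roughly -2.95
print(f"second manual run: {second:+.3f}")  # roughly -1.475
```

Those two values are close to the -2.9 and -1.4 shown in the reports, which fits the idea that each manual run shifted the stored average for that sender.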


--Jeff



Re: inconsistent scoring issue?

2008-05-15 Thread Karsten Bräckelmann
On Thu, 2008-05-15 at 16:20 +, Jeff Aitken wrote:
 On Thu, May 15, 2008 at 05:35:52PM +0200, Karsten Bräckelmann wrote:

  Do you see URIBL_BLACK hits in the incoming stream at all?
 
 Not sure exactly what you're asking here... but I included the entire
 X-Spam-Status and X-Spam-Report headers, without removing any lines.  So
 there was no URIBL_BLACK hit in the message as it was delivered to my
 inbox, but the same message, when run through SA manually a few minutes
 later, did trigger it.

Yes. Hence my question about mail hitting URIBL_BLACK on the first run,
unlike that one example.

The point is, whether *no* mail hits URIBL_BLACK, or at least *some*
mail does. Do you get any URIBL_BLACK hits at all? Is that one example
you pasted exemplary for all your incoming mail, never hitting
URIBL_BLACK -- or is this an isolated case not triggering the BL?

The answer to this might hint where to look next...

  guenther

