> -----Original Message-----
> From: Matt Kettler [mailto:[EMAIL PROTECTED] 
> Sent: Friday, February 17, 2006 18:47
> To: Matt Kettler
> Cc: Jeff Chan; users@spamassassin.apache.org
> Subject: Re: Over-scoring of SURBL lists...
> 
> Matt Kettler wrote:
> 
> > I'll even re-quote myself:
> >> I personally would like to see some statistics, but at this point,
> >> we don't have any test data on this so we're arguing your theory
> >> vs mine.
> > And your quote that I was counter-pointing:
> >> As you can see the performance of the lists are different, and the
> >> way they're created is different too.
> > 
> > I don't see enough of a difference to clearly rule out significant
> > overlap.
> > 
> > I'll define my test of "significant overlap" as:
> >   >10% of total hits redundant across 3 or more lists and >1% nonspam
> >   hits redundant across 2 or more lists.
> > 
> 
> Messages received today that are double-listed in two or more 
> of SC, JP, AB, OB and WS:
> grep "SURBL_MULTI2" /var/log/maillog |grep "Feb 17" |wc -l
>     292
> 
> All surbl.org hits in same timeframe (includes ph, but no matter):
> 
> grep "_SURBL" /var/log/maillog |grep "Feb 17" |wc -l
>     583
> 
> So we have at least a 50% double-listing rate (292 of 583). That
> in and of itself isn't much of a problem, but it also doesn't
> rule out overlap. It's still a whole lot higher than my first
> criterion of 10% overlap.
> 
> However, right now I don't have more than 100 FPs so I can't 
> really comment on the nonspam hit rate of SURBL_MULTI2. 
> That's the important one.
> 
> I also added multi3, multi4 and another rule to detect 
> overlap between uribl.com's black and surbl.org:
> 
> meta URIBL_BLACK_OVERLAP (URIBL_BLACK && (URIBL_AB_SURBL || URIBL_JP_SURBL || URIBL_OB_SURBL || URIBL_WS_SURBL || URIBL_SC_SURBL))
> score URIBL_BLACK_OVERLAP -1.0
> 
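
For anyone who wants to repeat the double-listing count quoted above on their own logs, here is a rough Python sketch of the same two grep pipelines. The function names are made up for illustration, and it assumes (as the greps do) that each scanned message is one log line containing both the date string and the rule names. Like the greps, it counts lines, so a SURBL_MULTI2 line only shows up in the total if it also contains "_SURBL".

```python
def count_matching(lines, needle, day):
    """Count lines containing both substrings (grep needle | grep day | wc -l)."""
    return sum(1 for line in lines if needle in line and day in line)


def double_listing_rate(lines, day="Feb 17"):
    """Return (multi-listed count, all-SURBL count, ratio) for one day."""
    lines = list(lines)  # allow a file object or any iterable, read once
    multi = count_matching(lines, "SURBL_MULTI2", day)
    total = count_matching(lines, "_SURBL", day)
    rate = multi / total if total else 0.0
    return multi, total, rate


if __name__ == "__main__":
    # Sanity check using the counts quoted above (292 and 583).
    print(f"{292 / 583:.1%}")
```

To run it against a real log, pass an open file, e.g. double_listing_rate(open("/var/log/maillog")), and adjust the day string to match your syslog date format.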

If anyone is interested, here is an alternative scoring method for
25_uribl.cf: http://www.uribl.com/tools/25_uribl.cf (make sure you remove
the scores for the uribl tests from 50_scores.cf if you replace this file).

This should make SBL/URIBL/SURBL hits range in score from 2.0 to 5.5:

- 2.0 (SBL ONLY) 
- 2.5 (URIBL_ONLY)
- 2.5 (SURBL_ONLY)
- 3.0 (SBL + URIBL)
- 3.0 (SBL + SURBL)
- 3.0 (SURBL_ONLY x2)
- 4.0 (URIBL + SURBL)
- 5.0 (SBL + URIBL + SURBL)
- 5.5 (SBL + URIBL + SURBLx2)
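
To make the table above easier to check, here it is as a small Python lookup. This is only an illustration of the intended totals, not how the .cf file works (that is done with SpamAssassin rules and scores). The names SCORES and combined_score are made up, and it assumes "x2" means two or more SURBL lists. Combinations the table does not list return None rather than a guessed value.

```python
# Keys are (sbl_hit, uribl_hit, surbl_lists capped at 2); values are the
# total scores from the table above.
SCORES = {
    (True, False, 0): 2.0,   # SBL only
    (False, True, 0): 2.5,   # URIBL only
    (False, False, 1): 2.5,  # SURBL only
    (True, True, 0): 3.0,    # SBL + URIBL
    (True, False, 1): 3.0,   # SBL + SURBL
    (False, False, 2): 3.0,  # SURBL only, x2
    (False, True, 1): 4.0,   # URIBL + SURBL
    (True, True, 1): 5.0,    # SBL + URIBL + SURBL
    (True, True, 2): 5.5,    # SBL + URIBL + SURBL x2
}


def combined_score(sbl, uribl, surbl_lists):
    """Look up the table score, or None if the combination is not listed."""
    # Assumption: "x2" covers two or more SURBL lists, so cap the count at 2.
    key = (bool(sbl), bool(uribl), min(int(surbl_lists), 2))
    return SCORES.get(key)
```

For example, combined_score(False, True, 1) looks up the URIBL + SURBL row, while combined_score(False, True, 2) returns None because that combination is not in the table.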

If you want to reduce the possibility of URIBL-only FPs, this is the way to
go.  

D
