Re: spam scoring -2.6

John D. Hardin Wed, 18 Jul 2007 07:08:30 -0700

On Wed, 18 Jul 2007, Jean-Paul Natola wrote:

> I'm getting creamed with these spams but they are getting through
> due to;
>  
> -2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
>       [score: 0.0005]
> 
> Why is this happening?


Well, most likely one of two causes: (1) your bayes database is
mistrained, or (2) the spams contain enough text that looks like your
legitimate message traffic.

I'll repost the questions I posted here a bit ago in response to a 
similar question:

(0) Does the text in the low-scoring spam, apart from the commercial
pitch, look like legitimate message traffic that you would expect at
your site? (e.g. if you're in the medical profession, it contains a
lot of text about medical topics.)

(1) How often does this happen?

(2) How big is your bayes database? (use "sa-learn --dump magic")

(3) How are you training your bayes database? Autolearn? Or manual?

(4) If manual, do you keep your corpus around after you've trained it?

(5) If manual, do you let nontechnical users train it without review?
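
For question (2), here is a rough sketch of how the "sa-learn --dump magic"
output could be checked by script. It assumes the usual layout where each
"non-token data" line carries its value in the third column. The SAMPLE
text and the helper names (parse_magic, training_warnings) are made up for
illustration and are not part of SpamAssassin. The threshold of 200 matches
the default bayes_min_spam_num / bayes_min_ham_num settings; adjust it if
your configuration differs.

```python
import re

# Illustrative sample only; real numbers come from running
#   sa-learn --dump magic
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        152          0  non-token data: nspam
0.000          0       4310          0  non-token data: nham
0.000          0     118204          0  non-token data: ntokens
"""

# Assumed layout: four whitespace-separated columns, value in the third,
# followed by "non-token data: <name>".
MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (.+?)\s*$"
)


def parse_magic(text):
    """Return {name: value} for every 'non-token data' line in text."""
    values = {}
    for line in text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            values[match.group(2)] = int(match.group(1))
    return values


def training_warnings(values, min_msgs=200):
    """List message classes trained on fewer than min_msgs messages."""
    warnings = []
    for key, label in (("nspam", "spam"), ("nham", "ham")):
        count = values.get(key, 0)
        if count < min_msgs:
            warnings.append(
                "only %d %s messages learned (want at least %d)"
                % (count, label, min_msgs)
            )
    return warnings


if __name__ == "__main__":
    stats = parse_magic(SAMPLE)
    for warning in training_warnings(stats):
        print(warning)
```

To run it against a live database, pipe the real command output into
parse_magic() in place of SAMPLE. The counts only tell you how much was
learned, not whether it was learned correctly, so the other questions
still apply.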

Finally, have you installed any of the SARE rules from
www.rulesemporium.com? While not "fixing" a low bayes score, they may
counteract it when bayes gets fooled.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174     pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Where We Want You To Go Today 07/05/07: Microsoft patents in-OS
  adware architecture incorporating spyware, profiling, competitor
  suppression and delivery confirmation (U.S. Patent #20070157227)
-----------------------------------------------------------------------
 6 days until The 38th anniversary of Apollo 11 landing on the Moon
