On Mon, 19 Oct 2015, Larry Goldman wrote:

I found that much of the spam had a BAYES_00 score of -1.9, which was defeating the contribution of the other tests. A closer inspection of the raw source revealed invisible gibberish text which, I assume, is designed to thwart the default BAYES_00 test. Very clever. I have since changed the score of that test to 0.

Zeroing the score is the wrong response. Spam that scores BAYES_00 is a false negative (FN) and should be used to train Bayes as spam. That will correct the database. BAYES_00 should only hit on (your) ham, and it should have a slight negative score.

The invisible gibberish text ("Bayes poison") has not proven to be an effective tactic when Bayes is properly trained with FNs and sufficient "real" ham. The gibberish gets learned as a spam sign.
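
To illustrate why that works, here is a toy token-counting classifier in Python. It is only a sketch: it is not SpamAssassin's tokenizer or its probability-combining method, and the corpora and the gibberish words (zxqv, plorb, wibble) are made up for the example. It shows a gibberish token starting out neutral and becoming a spam sign once the false negative is trained as spam.

```python
import math
from collections import Counter

def train(messages):
    """Count, per token, how many messages contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def token_spam_prob(token, spam, ham, n_spam, n_ham):
    """Per-token spam probability with add-one smoothing."""
    p_spam = (spam[token] + 1) / (n_spam + 2)
    p_ham = (ham[token] + 1) / (n_ham + 2)
    return p_spam / (p_spam + p_ham)

def message_spam_prob(text, spam, ham, n_spam, n_ham):
    """Combine token probabilities in log-odds space (naive Bayes style)."""
    log_odds = 0.0
    for token in set(text.lower().split()):
        p = token_spam_prob(token, spam, ham, n_spam, n_ham)
        log_odds += math.log(p / (1 - p))
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical hand-reviewed corpora.
ham_corpus = ["meeting agenda for monday", "lunch on monday", "project agenda attached"]
spam_corpus = ["buy cheap pills now", "cheap watches online", "win money now"]

# A spam that slipped through: ham-looking words plus gibberish.
false_negative = "monday meeting agenda zxqv plorb wibble"

n_ham, n_spam = len(ham_corpus), len(spam_corpus)
ham, spam = train(ham_corpus), train(spam_corpus)
before_tok = token_spam_prob("zxqv", spam, ham, n_spam, n_ham)
before_msg = message_spam_prob(false_negative, spam, ham, n_spam, n_ham)

# Train the false negative as spam and score again.
spam = train(spam_corpus + [false_negative])
n_spam += 1
after_tok = token_spam_prob("zxqv", spam, ham, n_spam, n_ham)
after_msg = message_spam_prob(false_negative, spam, ham, n_spam, n_ham)

print("gibberish token:", before_tok, "->", after_tok)
print("whole message:  ", before_msg, "->", after_msg)
```

With more real ham in the corpus, the ham-looking words in the poison carry less weight, which is why sufficient "real" ham matters as much as the FN training.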

Question: do you have autolearn turned on? You may want to disable it, review your Bayes training corpora, and retrain from scratch. Best practice is to keep known-reliable, hand-reviewed training corpora around, including the FN and FP corrections, in case Bayes ever needs to be retrained from scratch.
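
A rough sketch of the retrain-from-scratch sequence, written as a small Python helper that only builds and prints the sa-learn command lines rather than running them. The corpus directory names are placeholders for wherever you keep your hand-reviewed mail, and I am assuming the commands get run as the user that owns the Bayes database. Autolearn itself is turned off separately with "bayes_auto_learn 0" in your SpamAssassin configuration.

```python
import shlex

def retrain_commands(ham_dir, spam_dir):
    """Return the sa-learn invocations for a from-scratch retrain.

    ham_dir and spam_dir are hypothetical paths to hand-reviewed corpora;
    the spam corpus should include the FN corrections and the ham corpus
    the FP corrections.
    """
    return [
        ["sa-learn", "--clear"],           # wipe the existing Bayes database
        ["sa-learn", "--ham", ham_dir],    # learn reviewed ham
        ["sa-learn", "--spam", spam_dir],  # learn reviewed spam, FNs included
        ["sa-learn", "--dump", "magic"],   # show the resulting ham/spam counts
    ]

# Print the commands for review before running them by hand.
for cmd in retrain_commands("corpus/ham", "corpus/spam"):
    print(shlex.join(cmd))
```

Printing first and running by hand is deliberate: --clear is destructive, so it is worth confirming the corpora paths before anything is wiped.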

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The ["assault weapons"] ban is the moral equivalent of banning red
  cars because they look too fast.  -- Steve Chapman, Chicago Tribune
-----------------------------------------------------------------------
