On Thu, 10 Mar 2011 15:01:34 -0500
dar...@chaosreigns.com wrote:

> On 03/10, Jason Bertoch wrote:
> > Wouldn't spam already scored at 15+ be considered a little redundant
> > to the corpus?  If not, I'm certain I could modify my config to keep
> > a copy for processing in the mass checks.
> 
> No.  If all spams scored 15+ hit similar tests, and none of those
> spams are included in the mass-checks, then those tests might not be
> scored highly enough to catch those spams in the future.
> 
> It's a big "if", but "redundant" is certainly not applicable.

This argument seems a bit far-fetched when you take into account that
corpora may retain spam for years, and that other sites will be
including the higher-scoring examples. The scores for high-scoring
spams are determined by other mail that scores close to 5.  If the
scores for a particular set of rules fell systematically over time,
they would drop below 15 before they dropped below 5, bringing in fresh
examples.
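
To put rough numbers on that last point, here is a toy sketch. It
assumes the score for one family of spam decays by a fixed amount at
each rescoring; the starting score (18), the decay step (1.0), and the
function name are all made up for illustration and do not come from any
real mass-check. The 15 is the rejection threshold discussed above, and
5 is the usual spam threshold.

```python
REJECT_AT = 15.0  # assumed threshold above which mail is rejected outright
SPAM_AT = 5.0     # usual threshold above which mail is tagged as spam


def first_round_below(start, decay_per_round, threshold):
    """Return the number of rescoring rounds until the score is below threshold.

    Hypothetical model: the score drops by decay_per_round each round.
    """
    if decay_per_round <= 0:
        raise ValueError("decay_per_round must be positive")
    score, rounds = start, 0
    while score >= threshold:
        score -= decay_per_round
        rounds += 1
    return rounds


# Illustrative values only.
accepted_from = first_round_below(18.0, 1.0, REJECT_AT)
missed_from = first_round_below(18.0, 1.0, SPAM_AT)

# Rounds in which the spam is no longer rejected but is still tagged,
# i.e. rounds in which it can reach the corpus as a fresh example.
corpus_window = missed_from - accepted_from
```

Under those assumptions there is a stretch of rounds in which the mail
is accepted and still tagged as spam, so it reaches the corpus well
before the rules stop catching it.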


It seems to me that rejecting on blocklists, or over-reliance on
spamtraps, is more of a problem than rejecting on high scores.

As far as BAYES is concerned, different people train it in different
ways, so I don't see the sense in strictly mandating
train-on-everything.
