On 09/25/2014 05:24 PM, Amir Caspi wrote:
> On Sep 25, 2014, at 8:51 AM, John Hardin <jhar...@impsec.org> wrote:
>
>> You *did* keep your initial Bayes training corpora, right?
>
> Does it matter if you keep the initial corpora, or just that you train on known corpora,
> even if they are "fluid?"

IMO, fresh spam is the best spam.
Tokens learned from old spam may not match fresh spam.
The age of ham is not as critical.

Nowadays, we tend to reject most good fodder with all kinds of methods at the SMTP level, and what's left is often hardly enough to keep a Bayes DB well fed.

A separate trap box/VM with a domain (or more) that takes all it gets (no rejects, no filtering) makes a great source of spam fodder.

With a few tricks you can even auto-feed Bayes into a shared SQL/Redis backend, giving you nice fresh spam tokens with minimal intervention.
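
As a rough illustration of the auto-feed idea, here is a minimal Python sketch that could run from cron on the trap box. It is only one possible approach, not a description of any particular setup. It assumes the trap mail lands in a Maildir, that sa-learn is on the PATH, and that the local SpamAssassin config already points Bayes at the shared SQL/Redis store. The function names and the 7-day freshness window are made up for the example.

```python
import subprocess
import time
from pathlib import Path


def fresh_messages(maildir, max_age_days=7, now=None):
    """Return message files under maildir/new and maildir/cur that were
    modified within the last max_age_days (the window is an assumption)."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    found = []
    for sub in ("new", "cur"):
        folder = Path(maildir) / sub
        if not folder.is_dir():
            continue
        for path in sorted(folder.iterdir()):
            if path.is_file() and path.stat().st_mtime >= cutoff:
                found.append(path)
    return found


def sa_learn_command(paths):
    """Build the sa-learn invocation that trains the given files as spam."""
    return ["sa-learn", "--spam", *[str(p) for p in paths]]


def feed_trap(maildir, max_age_days=7):
    """Train Bayes on fresh trap mail; returns the number of files submitted."""
    messages = fresh_messages(maildir, max_age_days)
    if messages:
        # sa-learn writes to whatever Bayes store the local config selects.
        subprocess.run(sa_learn_command(messages), check=True)
    return len(messages)
```

Calling feed_trap("/var/spool/trap/Maildir") (a hypothetical path) once an hour would keep pushing only recent trap mail into Bayes. Since everything the trap receives is learned as spam, the trap domain should never receive legitimate mail.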

