Hello Peter,

Wednesday, July 6, 2005, 7:48:36 AM, you wrote:

PF> Secondly, for a larger corpus, wondering if there is much difference
PF> between perceptron scores for last 6 months, and last 1 month.  If you
PF> already have the ham/spam.log for the last 6 months, complete with the
PF> "time" field, how much do the perceptron scores differ for the last 6
PF> months, and last 1 month?  The thinking behind this is in moving towards
PF> more regular rule score updates (at least locally), based on the current
PF> flavour of spam.  It may be a self defeating exercise though, if spam
PF> and scores are both moving targets.

I can't speak for the perceptron, but I launch mass-checks on the various
SARE rule set files I maintain approximately monthly (once they're
stable -- new files are mass-checked more frequently). The hit rates,
both ham and spam, vary significantly month to month.

That may be because SARE rules generally target lower hit rates than
official rules do, so our hit rates may be less statistically stable.
Extrapolating to the general case, though, I believe that "most recent
month" spam is different from "six months of spam."
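
For anyone who wants to try the one-month versus six-month comparison on
their own corpus, here is a rough sketch of the idea. It is not the
mass-check tooling and it does not parse the real ham/spam.log format.
It assumes each message has already been reduced to a (timestamp, list
of rule names) pair, and the names hit_rates and compare_windows are
made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def hit_rates(records, since):
    """Per-rule hit rate over messages dated on or after `since`.

    `records` is an iterable of (datetime, list_of_rule_names) pairs;
    this layout is an assumption, not the mass-check log format.
    """
    recent = [rules for when, rules in records if when >= since]
    if not recent:
        return {}
    # Count each rule at most once per message.
    counts = Counter(rule for rules in recent for rule in set(rules))
    return {rule: n / len(recent) for rule, n in counts.items()}


def compare_windows(records, now, short_days=30, long_days=180):
    """Map each rule to (short-window hit rate, long-window hit rate)."""
    records = list(records)
    short = hit_rates(records, now - timedelta(days=short_days))
    long_ = hit_rates(records, now - timedelta(days=long_days))
    rules = sorted(set(short) | set(long_))
    return {r: (short.get(r, 0.0), long_.get(r, 0.0)) for r in rules}


if __name__ == "__main__":
    now = datetime(2005, 7, 6)
    # Toy data only; real input would come from your own logs.
    sample = [
        (now - timedelta(days=10), ["RULE_A", "RULE_B"]),
        (now - timedelta(days=20), ["RULE_A"]),
        (now - timedelta(days=100), ["RULE_B"]),
        (now - timedelta(days=150), ["RULE_B"]),
    ]
    for rule, (short, long_) in compare_windows(sample, now).items():
        print(f"{rule}: last 30 days {short:.2f}, last 180 days {long_:.2f}")
```

Run separately over ham and spam, a large gap between the two columns
for a rule suggests its hit rate is drifting, though with low-hitting
rules the short window may simply hold too few messages to say much.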

Bob Menschel


