Raquel Rice <[EMAIL PROTECTED]> wrote:
> [...]
> That isn't what I asked.  I get over a thousand emails per day,
> personally.  Those are from all the lists I'm on, all the
> personal mail, and all the business mail.  I assume that
> Matt's email is similar.  What I'm asking is, how to select
> 125 per day out of 1000?

I just go through and delete "borderline" cases from my inbox (mbox),
that is, messages that are OK but "spammy", and occasionally run
sa-learn manually on what remains as ham.  I do the same against
folders for mailing lists that have low/no spam hits.  So I simply
PRUNE my inbox before training on any large amount of ham.
(more below)

> (I've been going through all my messages each day, manually
> moving "ham" to a ham directory and moving "spam" to a spam
> directory ... a long and tedious job ... then using that to
> train bayes)

The key for me is keeping spam OUT of my inbox altogether, for quick
downloads, reading and required daily maintenance.  Perhaps set a
lower spam threshold initially, then automatically sort messages
above the threshold into "obvious" and "maybe" spam folders?  This
would help keep your inbox (mostly) spam-free, while not dumping
useful but not-as-important stuff.

I manually sort the false positives out of the "maybe spam" folder
and just drag messages to "not spam" and "confirmed spam" folders.
I have a cron script that automatically runs sa-learn on those
folders several times a day.  Since anything in the "not" or
"confirmed" folder has been verified, I'm comfortable with this.

This way, I don't have to worry about training daily.  I just do it
as time allows, yet still enjoy a spam-free inbox.  Daily use is
virtually spam-free, and I just sort when convenient.

Once bayes came up to speed, I started dumping anything over the
bayes auto_learn threshold, since I had zero false positives at that
level.  So even the "maybe spam" folder isn't overwhelming.  If it
starts to get cumbersome, I might even crank this threshold back a
couple of points, as I've yet to have a false positive score much
more than 6.

I don't get 1,000 messages personally each day, but over 500 come
through regularly.  I find this quite manageable.

- Bob
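
For anyone who wants to try the "obvious"/"maybe" split described above, here is a rough sketch of the sorting decision only. It is not the actual setup from this message: the folder names, the function names and both threshold values are assumptions made up for illustration, and it presumes SpamAssassin has already tagged each message with an X-Spam-Status header (older releases write "hits=", newer ones "score=").

```python
import re
from email import message_from_string

# Hypothetical thresholds, chosen only for illustration; tune to your mail.
MAYBE_THRESHOLD = 5.0     # at or above this, keep the message out of the inbox
OBVIOUS_THRESHOLD = 12.0  # at or above this, treat the message as obvious spam

# Matches "score=7.3" as well as the older "hits=7.3" spelling.
_SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message):
    """Return the score found in X-Spam-Status, or None if there is none."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = _SCORE_RE.search(status)
    return float(match.group(1)) if match else None


def choose_folder(raw_message):
    """Pick a (hypothetical) folder name for one tagged message."""
    score = spam_score(raw_message)
    if score is None or score < MAYBE_THRESHOLD:
        return "inbox"
    if score < OBVIOUS_THRESHOLD:
        return "maybe-spam"
    return "obvious-spam"
```

Only "maybe-spam" then needs a human look; whatever gets dragged into the verified folders can be fed to sa-learn from cron as described above.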
