Daryl C. W. O'Shea writes:
> Michael Parker wrote:
> > Daryl C. W. O'Shea wrote:
>
> >> Are the hit-rates of the lists high enough that the results that aren't
> >> cached by the use of --reuse low enough to fall under the block
> >> triggering level? Either way, I guess we should get around to figuring
> >> out a way of caching the non-hits. I'm thinking of a method that
> >> assumes you ran the rules (based on the SA version in the message
> >> header) unless you've specifically told it you don't run a particular rule.
>
> > --reuse should take care of this. Everyone should save their X-Spam-*
> > headers in their corpus msgs. Reuse sets the rule score to zero so for
> > msgs that it didn't hit, and still have their X-Spam-Status header
> > present we shouldn't be doing any sort of lookup.
>
> Ah, that's right, thanks. For some reason I was thinking that any
> message that didn't previously have a hit recorded would have the tests run.
>
> > Maybe we should add a --force-reuse that would ignore any msgs that
> > can't be reused.
>
> I'm thinking that should be the only option for reuse.

+1. That sounds like a very good idea.

Well, at least, let's get an idea of how many mass-check lines we lose,
and we can make a more informed decision at that point; but I'm pretty
sure we should be doing this. Even if we lose 50% of the hits, we'll get
a much more accurate picture of the real accuracy of those rules.

The SpamAssassin.org spamtraps don't record network rule hits, but I
don't rely entirely on those in my mass-checks anyway. Most of my
personal spamtrap addresses are run through my normal mail account and
are scanned.

PS: by the way, I wouldn't worry too much (yet) about doing more
frequent --net mass-checks to generate new network-rule scores. That's
certainly going to be trickier than the non-net variant, and the latter
is more important to start with. ;)

PPS: good point about Spamhaus... a 100-user limit is tiny! We may
indeed have to do something about that, but I agree with Daryl; the
Spamhaus rules are very accurate. Could you open a bug, and let's see if
we can get some discussion underway...

--j.
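
One rough way to get an idea of how many mass-check lines a reuse-only
mode would lose is to count the corpus messages that still carry an
X-Spam-Status header. The Python sketch below is only an illustration
and is not how mass-check implements --reuse. The function names are
hypothetical, and it assumes the header uses the usual
"tests=RULE_A,RULE_B" layout, with "tests=none" when nothing hit.

```python
import re
from email import message_from_string


def parse_status_tests(raw_message):
    """Return the set of rule names recorded in X-Spam-Status.

    Returns None when the header is missing or has no tests= field,
    i.e. when there is nothing cached to reuse (assumed layout, see above).
    """
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status")
    if status is None:
        return None
    # Folded headers keep their line breaks and tabs; collapse them to spaces.
    flat = " ".join(status.split())
    # Capture everything after tests= up to the next lowercase key=value pair.
    match = re.search(r"tests=(.*?)(?=\s+[a-z]+=|$)", flat)
    if match is None:
        return None
    names = {name.strip() for name in match.group(1).split(",")}
    names.discard("")
    names.discard("none")  # assumed marker for "no rules hit"
    return names


def reuse_coverage(raw_messages):
    """Return (reusable, total) counts for an iterable of raw messages."""
    total = 0
    reusable = 0
    for raw in raw_messages:
        total += 1
        if parse_status_tests(raw) is not None:
            reusable += 1
    return reusable, total
```

Running reuse_coverage() over a corpus gives the number of messages that
would survive versus the total, which is the figure needed before
deciding whether reuse should skip everything else.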