Hello Sniffer, I think we've identified the cause of some reports of spam leakage over the past few weeks.
I've been testing submitted messages against customer rulebases, and in almost every case there were rules that matched the messages. One of the customers testing with me pointed out that the results from my tests were primarily rules from groups 60 and 62: the Experimental IP and Experimental Abstract rule groups, respectively. When I then reviewed other test results I found the same pattern.

We have been announcing on the list that the content of our Experimental rule groups has been changing and that these groups have become significantly more accurate in recent weeks. During the same period we have also increased the number of rules that are generated automatically from our spamtraps. The auto-rule AI runs every 20 minutes, far more frequently than we can review incoming spam manually, so the system is much more responsive to new spam. All of these rules are placed in the appropriate experimental rule groups. As a result, over the past few weeks a greater number of new rules have been generated in these groups rather than coded manually into other groups, and this trend will continue over time.

We have not implemented (and probably will not implement) a practice of recoding these rules into specific content categories, because doing so would be of little value. It turns out that the vast majority of the rule candidates generated by the AI are of a type that spammers re-use across multiple campaigns. For example, we might see a snake-oil spam, a porn spam, and a get-rich spam all within the same week using the same throw-away domain detected by our AI.

If you are using a weighting system such as Declude and you have not yet revisited your weights on groups 60 and 62, then you are probably seeing more "spam leakage" as a result.
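As a quick illustration of the weight formula recommended below (W = (SA^2) * HOLD_WEIGHT), here is a minimal sketch in Python. The accuracies are the ones from the analysis; the hold weight of 10.0 is just an example value, so substitute your own Declude setting:

```python
def derive_weight(sa: float, hold_weight: float) -> float:
    """W = (SA^2) * HOLD_WEIGHT: square the estimated accuracy, scale by hold weight."""
    return (sa ** 2) * hold_weight

HOLD_WEIGHT = 10.0  # hypothetical example; use your own system's hold weight

# Estimated accuracies for the two experimental groups
for name, sa in [("SNIFFER-IP (60)", 0.81), ("SNIFFER-EXP (62)", 0.92)]:
    w = derive_weight(sa, HOLD_WEIGHT)
    print(f"{name}: SA^2 = {sa ** 2:.4f} -> weight {w:.2f} ({sa ** 2:.0%} of hold weight)")
```

Squaring the accuracy simply discounts a group's weight more steeply as its estimated accuracy drops, which keeps less-proven groups from pushing messages over the hold threshold on their own.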
I recommend that you review your weights using a combination of your current experience and the spam test quality analysis found here: <http://www2.spamchk.com/public.html>

One formula you can use to derive your test weights from this analysis is W = (SA^2) * HOLD_WEIGHT. So, in the case of these two groups, you might select these weights for your system:

SNIFFER-IP (60): estimated accuracy 81%, (.81)*(.81) => .6561, recommended weight: 66% of hold weight.
SNIFFER-EXP (62): estimated accuracy 92%, (.92)*(.92) => .8464, recommended weight: 85% of hold weight.

We are continuing to refine these processes and improve our accuracy, so it is a good idea to review these settings periodically for the best performance. The days of the Gray-Hosting group with a high false positive rate are long gone and will not return ;-)

Thanks,

_M

Pete McNeil (Madscientist)
President, MicroNeil Research Corporation
Chief SortMonster (www.sortmonster.com)

This E-Mail came from the Message Sniffer mailing list. For information and (un)subscription instructions go to http://www.sortmonster.com/MessageSniffer/Help/Help.html