Thank you, Markus, for the very informative analysis. I see that we're not using some of the more accurate tests (because our global.cfg file is a little out of date). A number of these tests are not defined in Declude's example global.cfg file. Can you supply a global.cfg (or part of one) with an example test definition for each of these tests?
Thanks,
Todd Holt
Xidix Technologies, Inc
Las Vegas, NV USA
702.319.4349
www.xidix.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Markus Gufler
Sent: Monday, April 05, 2004 12:43 PM
To: [EMAIL PROTECTED]
Subject: RE: [IMail Forum] March 2004 Spam Statistics [This email took a suspicious route to arrive here; Suspected SPAM (4)]

Besides Scott's monthly stats, which show which tests catch the most spam, I wondered what each single test contributes to catching as much spam as possible with as few false positives as possible (on an MTA processing legit messages for over 1000 mailboxes).

The calculation is based on the assumption that the weighting system on our server catches over 97% of spam with around 0.01% false positives. (We review all held spam between 100% and 200% of our hold weight and keep note of every requeued legit message.) So when parsing the log files I assume that the final weight is "correct", and so I know whether a message is spam or legit.

Now I look at the individual result of each test and whether it has counted in the "right" direction. For example:

Final weight: 120 points => it's spam
BASE64: 10 points => right result
SPAMCOP: 10 points => right result
NOLEGITCONTENT: -5 points => wrong result
...

The result is a table with 4 values for each single test:

Dark green: right result for spam message
Light green: right result for legit message
Dark red: wrong result for spam message
Light red: wrong result for legit message

Besides the absolute numbers, I've also created a diagram with relative values, which also shows for how many messages the test hasn't returned any result (grey).

You can find the results at www.zcom.it/decludeupdater/spam_stats.htm

Notes:

1.) Most tests by design can return only positive or only negative results. But there are also tests that can return both positive (voting for spam) and negative (voting for legit) results.
So, for example, an IP4R test usually has (or should have) a positive result for spam and no result for legit messages. So it can't vote right for a legit message or wrong for a spam message.

2.) At first the table may be a little bit confusing. Hovering the mouse over the relative bars will show a short explanation.

3.) Briefly: the more green you can see, the better it is. Red is bad.

4.) If you can't see any red bar in the relative values, this means that there are not enough false positives to show at least 1% in the diagram. You may still see a few false positives in the absolute numbers. Very few tests are as free of false positives as John Tolmachoff's AUTOWHITE. (Its only FP was caused by a spam-test message containing a lot of typical spam keywords.)

5.) Because I assume that the final weight is right, it can happen that one or more tests vote "right" but the final weight is not correct (spam going through the filters, or a legit message held as a false positive). In this case the tests with the right vote will earn a count for the red values. But as I know that we already have a well-balanced weighting system, these wrong counts should be very rare.

Any comments or suggestions are welcome!

Hope this helps and you can understand my "english" :-)

Markus

To Unsubscribe: http://www.ipswitch.com/support/mailing-lists.html
List Archive: http://www.mail-archive.com/imail_forum%40list.ipswitch.com/
Knowledge Base/FAQ: http://www.ipswitch.com/support/IMail/

---
[This E-mail scanned for viruses by Declude Virus (http://www.declude.com)]
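
For anyone who wants to try the same kind of tally on their own logs, the per-test counting Markus describes could be sketched roughly as below. This is only an illustration in Python, not his actual script. The function name `tally_votes`, the input layout, and the hold weight of 100 are all assumptions (the post does not state the real hold weight or log format), and the second sample message is made up for the demonstration.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: the post does not state the actual hold weight.
HOLD_WEIGHT = 100


def tally_votes(messages, all_tests, hold_weight=HOLD_WEIGHT):
    """Count, per test, whether its weight agreed with the final verdict.

    messages:  iterable of (final_weight, {test_name: weight}) pairs
    all_tests: names of every test that could have returned a result
    """
    stats = defaultdict(Counter)
    for final_weight, test_weights in messages:
        # The final weight is assumed to be "correct", as in the post.
        is_spam = final_weight >= hold_weight
        kind = "spam" if is_spam else "legit"
        for test in all_tests:
            weight = test_weights.get(test, 0)
            if weight == 0:
                stats[test]["no_result"] += 1        # grey
            elif (weight > 0) == is_spam:
                stats[test][f"right_{kind}"] += 1    # green
            else:
                stats[test][f"wrong_{kind}"] += 1    # red
    return stats


# First message: the example from the post. Second message: invented sample.
messages = [
    (120, {"BASE64": 10, "SPAMCOP": 10, "NOLEGITCONTENT": -5}),
    (-5, {"NOLEGITCONTENT": -5}),
]
stats = tally_votes(messages, ["BASE64", "SPAMCOP", "NOLEGITCONTENT"])
for test, counts in stats.items():
    print(test, dict(counts))
```

The four `right_*` / `wrong_*` counters correspond to the dark/light green and dark/light red values in the table, and `no_result` to the grey share in the relative diagram.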
