On Tue, 2006-01-31 at 11:20 -0600, Kristopher Austin wrote:
> Hmm, I guess that's a question for Dallas.  This is the version I'm
> using:
> # file: sa-stats.pl
> # date: 2005-08-03
> # version: 1.0
> # author: Dallas Engelken <[EMAIL PROTECTED]>
> # desc: SA 3.1.x log parser
> 
> I don't seem to be the only one showing that strange math.  Dave had the
> same sort of entry in his:
> TOP HAM RULES FIRED
> RANK  RULE NAME               COUNT %OFRULES %OFMAIL %OFSPAM  %OFHAM
>     1 HTML_MESSAGE            63067    21.17   21.46   63.61   56.74
> 
> Dallas, is there a bug or are we interpreting these numbers incorrectly?
> 

OK, let's take the following sample data:

Email:     2766 
Spam:       975
Ham:       1791

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK    RULE NAME               COUNT  %OFMAIL %OFSPAM  %OFHAM
----------------------------------------------------------------------
   7    HTML_MESSAGE              629    22.74   64.51   34.51
----------------------------------------------------------------------

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK    RULE NAME               COUNT  %OFMAIL %OFSPAM  %OFHAM
----------------------------------------------------------------------
   6    HTML_MESSAGE              618    22.34   64.51   34.51
----------------------------------------------------------------------

We had 2766 emails in total.

For %OFMAIL:
629 spam messages hit HTML_MESSAGE, which is 629/2766 = 22.74%.
618 ham messages hit HTML_MESSAGE, which is 618/2766 = 22.34%.

For %OFSPAM:
629 spam messages hit HTML_MESSAGE, which is 629/975 = 64.51%.

For %OFHAM:
618 ham messages hit HTML_MESSAGE, which is 618/1791 = 34.51%.

If you want to know what percentage of all email triggered HTML_MESSAGE,
you'd need to compute (SPAM + HAM) / TOTAL:
(618 + 629) / 2766 = 45.08%.
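
For anyone who wants to check the arithmetic, here is a minimal Python
sketch that reproduces the figures above from the sample counts.  It is
not taken from sa-stats.pl; the names (pct, html_message_stats and the
constants) are made up for illustration.

```python
# Sample counts from the message above.
TOTAL_MAIL = 2766
TOTAL_SPAM = 975
TOTAL_HAM = 1791
SPAM_HITS = 629  # spam messages that hit HTML_MESSAGE
HAM_HITS = 618   # ham messages that hit HTML_MESSAGE


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def html_message_stats():
    """Hypothetical helper: recompute each column for HTML_MESSAGE."""
    return {
        # %OFMAIL as printed: hits on that line against all mail.
        "spam_of_mail": pct(SPAM_HITS, TOTAL_MAIL),
        "ham_of_mail": pct(HAM_HITS, TOTAL_MAIL),
        # %OFSPAM and %OFHAM: hits against their own class totals.
        "of_spam": pct(SPAM_HITS, TOTAL_SPAM),
        "of_ham": pct(HAM_HITS, TOTAL_HAM),
        # Share of all mail that hit the rule: (SPAM + HAM) / TOTAL.
        "all_of_mail": pct(SPAM_HITS + HAM_HITS, TOTAL_MAIL),
    }


if __name__ == "__main__":
    for name, value in html_message_stats().items():
        print(f"{name:>12}: {value:.2f}%")
```

Note the parentheses around the sum in the last entry: without them,
618 + 629 / 2766 would divide only the 629.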

The %OFMAIL column is misleading because it's comparing the hit count
(on that line) against the total email.  I've gone ahead and changed
that in v1.02 and v0.92 respectively.  If you like the way the old
version works, don't get the new version :)

SA 3.0.x - http://www.rulesemporium.com/programs/sa-stats.txt
SA 3.1.x - http://www.rulesemporium.com/programs/sa-stats-1.0.txt

Hope this clarifies!

Thanks,

-- 
Dallas Engelken <[EMAIL PROTECTED]>
http://uribl.com 
