That's the output from Dallas Engelken's "sa-stats.pl" log analyzer.
You feed it a segment of your spamd logs and it gives you
those rule hit statistics.

See: http://wiki.apache.org/spamassassin/StatsAndAnalyzers
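For anyone wondering what such an analyzer does with the logs, here is a rough Python sketch of the tallying step. It is not sa-stats.pl's actual code. It assumes each scanned message produces a spamd line of the form "spamd: result: Y 18 - RULE1,RULE2 scantime=...", with "Y" for spam and "." for ham. The regex, the function name tally and the sample lines are all made up for illustration.

```python
import re
from collections import Counter

# Assumed shape of a spamd "result:" line; not taken from sa-stats.pl itself.
RESULT_RE = re.compile(r"spamd: result: ([Y.]) +(-?\d+) - (\S*) scantime=")


def tally(lines):
    """Count messages and per-rule hits, split into spam and ham."""
    totals = Counter()
    hits = {"spam": Counter(), "ham": Counter()}
    for line in lines:
        m = RESULT_RE.search(line)
        if not m:
            continue  # not a result line
        kind = "spam" if m.group(1) == "Y" else "ham"
        totals[kind] += 1
        # The third group is the comma-separated list of rules that fired.
        hits[kind].update(r for r in m.group(3).split(",") if r)
    return totals, hits


# Made-up sample lines, for illustration only.
SAMPLE = [
    "spamd[101]: spamd: result: Y 18 - BAYES_99,HTML_MESSAGE,URIBL_BLACK scantime=1.2,size=2048,user=a",
    "spamd[102]: spamd: result: . 0 - BAYES_00,HTML_MESSAGE scantime=0.8,size=1024,user=a",
    "spamd[103]: spamd: result: . -1 - BAYES_00,DKIM_VALID scantime=0.5,size=512,user=b",
]

if __name__ == "__main__":
    totals, hits = tally(SAMPLE)
    for kind in ("spam", "ham"):
        for rule, count in hits[kind].most_common():
            pct = 100.0 * count / totals[kind]  # share of that class's mail
            print(f"{kind:4} {rule:14} {count:4d} {pct:6.2f}")
```

Dividing each rule's count by the number of spam (or ham) messages gives the %OFSPAM and %OFHAM style columns.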

Looking at that wiki page, I noticed that the copy available there is v0.93;
I've got v1.03.
Does anybody know which version was the newest one last available on the rulesemporium site? Has anybody got something newer than v1.03?

I've done a bit of hacking to my copy (such as adding the S/O ratio stats).
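The S/O column in the one-month tables quoted further down is consistent with spam hits divided by total hits (spam plus ham) for each rule. That reading is inferred from the numbers in this thread, not from the script's source, so treat the Python sketch below as an assumption. The function name s_over_o is hypothetical. The counts are copied from the two rules that appear in both the spam and the ham table.

```python
def s_over_o(spam_hits, ham_hits):
    """Fraction of a rule's hits that landed on spam: spam / (spam + ham)."""
    total = spam_hits + ham_hits
    if total == 0:
        return None  # rule never fired, so the ratio is undefined
    return spam_hits / total


# (spam count, ham count) taken from the one-month tables in this thread.
EXAMPLES = {
    "HTML_MESSAGE": (90850, 171992),
    "T__BOTNET_NOTRUST": (114907, 84741),
}

if __name__ == "__main__":
    for rule, (spam, ham) in EXAMPLES.items():
        print(f"{rule:20} {s_over_o(spam, ham):.4f}")
```

A value near 1 means the rule fires almost only on spam, a value near 0 almost only on ham, and a value around 0.5 means the rule says little on its own.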


On Thu, 10 Mar 2016, Erickarlo Porro wrote:


I would like to know how to get these stats too.

 

From: Robert Chalmers [mailto:rob...@chalmers.com.au]
Sent: Tuesday, March 08, 2016 5:25 AM
To: users@spamassassin.apache.org
Subject: Re: Missed spam, suggestions?

 

Can I ask, how are you getting these stats please?

 

Thanks

      On 8 Mar 2016, at 05:11, David B Funk <dbf...@engineering.uiowa.edu> wrote:

 

On Mon, 7 Mar 2016, Charles Sprickman wrote:


      I’ve been running with some daily training for a little over a week and
      I’m seeing less spam in my inbox.  A few things have slipped through
      because Bayes tipped them below the default score; these were two
      phishing emails.

      Here’s some rule stats for anyone interested:

      TOP SPAM RULES FIRED

      RANK  RULE NAME              COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

         1  TXREP                  13171      8.47    40.38    91.00   72.91
         2  HTML_MESSAGE           12714      8.18    38.98    87.85   90.80
         3  DCC_CHECK              10593      6.81    32.48    73.19   33.78
         4  RDNS_NONE              10269      6.60    31.48    70.95    5.63
         5  SPF_HELO_PASS          10070      6.48    30.87    69.58   23.41
         6  URIBL_BLACK             9711      6.25    29.77    67.10    1.58
         7  BODY_NEWDOMAIN_FMBLA    9550      6.14    29.28    65.98    1.64
         8  FROM_NEWDOMAIN_FMBLA    9483      6.10    29.07    65.52    1.36
         9  BAYES_99                8486      5.46    26.02    58.63    1.18
        10  BAYES_999               8141      5.24    24.96    56.25    1.06

      TOP HAM RULES FIRED

      RANK  RULE NAME              COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

         1  HTML_MESSAGE           16473      9.13    50.51    87.85   90.80
         2  DKIM_SIGNED            13776      7.64    42.24    13.81   75.93
         3  TXREP                  13228      7.33    40.56    91.00   72.91
         4  DKIM_VALID             12962      7.19    39.74    11.93   71.44
         5  RCVD_IN_DNSWL_NONE      9941      5.51    30.48     8.08   54.79
         6  DKIM_VALID_AU           8711      4.83    26.71     7.99   48.01
         7  BAYES_00                8390      4.65    25.72     1.84   46.24
         8  RCVD_IN_JMF_W           7369      4.09    22.59     2.54   40.62
         9  RCVD_IN_MSPIKE_WL       6713      3.72    20.58     4.39   37.00
        10  BAYES_50                6201      3.44    19.01    25.56   34.18


Based upon your stats, it looks like you need more Bayes training. Your
BAYES_00 and BAYES_99 hits should rank higher in the rules-fired stats, and
BAYES_50 shouldn't be in the top 10 at all.
(Of course, if you've only been training for a week, that would explain it.)

For example, here are my top-10 hits (for a one-month interval).

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK    RULE NAME                       COUNT  %OFMAIL %OFSPAM  %OFHAM  S/O
----------------------------------------------------------------------
  1    T__BOTNET_NOTRUST               114907   60.32   86.81   42.66  0.5755
  2    BAYES_99                        109138   32.98   82.45    0.01  0.9998
  3    BAYES_999                       104903   31.70   79.25    0.01  0.9999
  4    HTML_MESSAGE                    90850    79.41   68.63   86.59  0.3456
  5    URIBL_BLACK                     90845    27.61   68.63    0.27  0.9942
  6    T_QUARANTINE_1                  90640    27.40   68.47    0.02  0.9996
  7    URIBL_DBL_SPAM                  79152    24.02   59.79    0.17  0.9956
  8    KAM_VERY_BLACK_DBL              74301    22.45   56.13    0.00  1.0000
  9    L_FROM_SPAMMER1k                73667    22.26   55.65    0.00  1.0000
 10    T__RECEIVED_1                   72413    42.60   54.70   34.54  0.5135

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK    RULE NAME                       COUNT  %OFMAIL %OFSPAM  %OFHAM  S/O
----------------------------------------------------------------------
  1    BAYES_00                        182674   56.03    2.11   91.97  0.0150
  2    HTML_MESSAGE                    171992   79.41   68.63   86.59  0.3456
  3    SPF_PASS                        136623   63.08   54.52   68.78  0.3457
  4    T_RP_MATCHES_RCVD               130879   53.75   35.54   65.89  0.2644
  5    T__RECEIVED_2                   125492   53.76   39.62   63.18  0.2947
  6    DKIM_SIGNED                     114808   38.57    9.72   57.80  0.1008
  7    DKIM_VALID                      105385   34.70    7.16   53.06  0.0825
  8    RCVD_IN_DNSWL_NONE              92951    29.90    4.56   46.80  0.0609
  9    T__BOTNET_NOTRUST               84741    60.32   86.81   42.66  0.5755
 10    KHOP_RCVD_TRUST                 84623    26.44    2.19   42.60  0.0331

Note how highly BAYES_00 and BAYES_99 rank. What you don't see is that
BAYES_50 is way down in the mud (below rank 50).

BTW, this is with a Bayes that is mostly fed via auto-learning. I occasionally
hand-feed corner cases that get misclassified (usually things like phishes, or
conference announcements that can look shaky).


--
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

 

Robert Chalmers

rob...@chalmers.com.au  Quantum Radio: http://tinyurl.com/lwwddov

Mac mini 6.2 - 2012, Intel Core i7,2.3 GHz, Memory:16 GB. El-Capitan 10.11.  
XCode 7.2.1

2TB: Drive 0:HGST HTS721010A9E630. Upper bay. Drive 1:ST1000LM024 HN-M101MBB. 
Lower Bay

