On 22.03.2015 at 17:44, Alex Regan wrote:
Would it be helpful to have something that graphs the data to monitor the effect of learning changes? Does something already exist?
i have been doing something similar recently: once per night i iterate through all ham/spam samples to get an overview of how they are classified
i pipe all samples via spamc to a second spamd instance with everything but bayes disabled and store the total counts for each classification in a small database. that way it needs only two database records per run, and the script sends an alert with the filename if a spam sample goes below BAYES_80 or a ham sample goes above BAYES_40
but that works only if your training is manual and you have all learning messages as eml files - well, i found a few misclassified samples that way and moved them from spam to ham and vice versa (misclassified by user mistake, user = myself)
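The nightly check described above could be sketched roughly as follows. This is not the actual script, only a minimal illustration under stated assumptions: the second spamd instance is assumed to listen on 127.0.0.1 port 10783 (a made-up port), `spamc -y` is used to print the names of the tests hit, and the function names and the `(kind, path)` sample list are hypothetical. The database step is left out; the counts are simply returned.

```python
import re
import subprocess
from collections import Counter

BAYES_RE = re.compile(r"\bBAYES_(\d+)\b")


def bayes_level(tests):
    """Return the highest BAYES_* bucket in a comma-separated test list, or None."""
    levels = [99.9 if digits == "999" else float(digits)
              for digits in BAYES_RE.findall(tests)]
    return max(levels) if levels else None


def needs_alert(kind, level):
    """Spam below BAYES_80 or ham above BAYES_40 is suspicious."""
    if level is None:
        return False
    return level < 80 if kind == "spam" else level > 40


def spamc_tests(path, host="127.0.0.1", port=10783):
    """Ask the bayes-only spamd (host/port are assumptions) which tests hit."""
    with open(path, "rb") as msg:
        result = subprocess.run(
            ["spamc", "-d", host, "-p", str(port), "-y"],
            stdin=msg, capture_output=True, check=False,
        )
    return result.stdout.decode("ascii", "replace")


def scan(samples, run=spamc_tests):
    """samples: iterable of (kind, path) with kind 'ham' or 'spam'."""
    counts, alerts = Counter(), []
    for kind, path in samples:
        level = bayes_level(run(path))
        counts[(kind, level)] += 1
        if needs_alert(kind, level):
            alerts.append(path)
    return counts, alerts
```

Passing a different `run` callable makes it possible to try the classification logic without a running spamd; the `alerts` list holds the filenames that would go into the alert mail.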
hopefully the screenshot makes it through the list manager
___________________________________________________________________
here are some numbers about bayes and log parsing for the current month. keep in mind that a ton of MTA rules and RBL scoring sits in front of SA, hence 7.88 % milter rejects is not that bad
[root@mail-gw:~]$ bayes-stats.sh
0.000          0          3          0  non-token data: bayes db version
0.000          0      13983          0  non-token data: nspam
0.000          0      13567          0  non-token data: nham
0.000          0    1518122          0  non-token data: ntokens
0.000          0  958431600          0  non-token data: oldest atime
0.000          0 1427041902          0  non-token data: newest atime
0.000          0 1427044330          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

insgesamt 35M
-rw------- 1 sa-milt sa-milt 2,5M 2015-03-22 18:12 bayes_seen
-rw------- 1 sa-milt sa-milt  40M 2015-03-22 18:12 bayes_toks
-rw------- 1 sa-milt sa-milt   98 2015-02-17 11:37 user_prefs

BAYES_00      38283   83.09 %
BAYES_05        579    1.25 %
BAYES_20        682    1.48 %
BAYES_40        608    1.31 %
BAYES_50       3255    7.06 %
BAYES_60        325    0.70 %
BAYES_80        373    0.80 %
BAYES_95        245    0.53 %
BAYES_99       1723    3.73 %
BAYES_999      1461    3.17 %
DNSWL         40425   87.74 %
SPF           27816   60.37 %
SPF WL         1406    3.05 %
BLOCKED        3632    7.88 %
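
For anyone wanting to reproduce the percentage column: the script itself is not shown, but the figures above are consistent with dividing each count by the sum of BAYES_00 through BAYES_99 (46073 messages, leaving out BAYES_999 because it fires together with BAYES_99) and cutting off after two decimals. A small sketch of that reading, with a hypothetical helper name:

```python
# Counts copied from the statistics above.
counts = {
    "BAYES_00": 38283, "BAYES_05": 579, "BAYES_20": 682, "BAYES_40": 608,
    "BAYES_50": 3255, "BAYES_60": 325, "BAYES_80": 373, "BAYES_95": 245,
    "BAYES_99": 1723, "BAYES_999": 1461,
    "DNSWL": 40425, "SPF": 27816, "SPF WL": 1406, "BLOCKED": 3632,
}

# Assumption: every scanned message hits exactly one of BAYES_00..BAYES_99,
# while BAYES_999 is an extra hit on top of BAYES_99 and must not be counted twice.
total = sum(n for tag, n in counts.items()
            if tag.startswith("BAYES_") and tag != "BAYES_999")


def percent(count, total):
    """Share of total as a string, truncated (not rounded) to two decimals."""
    hundredths = count * 10000 // total
    return f"{hundredths // 100}.{hundredths % 100:02d} %"


for tag, n in counts.items():
    print(f"{tag:<10} {n:>8} {percent(n, total):>9}")
```

Integer arithmetic is used so the truncation is exact; with rounding instead, a few rows (for example BAYES_05 and BAYES_80) would come out 0.01 higher than in the listing.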