Luigi,

> I am interested in crm114 as a plugin of SA or as a scanner parallel to SA.
> Are there detailed HOWTOs, experiences, suggestions... on the web?

I'm using it as a SpamAssassin plugin, so that it can contribute its
score points and take advantage of auto-learning, much like another
bayes. I made some changes to the plugin and sent them to its author,
Martin Schütte; these are now incorporated into v0.8.0:

  http://wiki.apache.org/spamassassin/CustomPlugins
    -> crm114

or directly at:
  http://mschuette.name/wp/crm114-spamassassin-plugin/

The plugin's .cf file contains some guidelines for setting up crm114.
If you have a chance, go for version 20081111-BlameBarack or later,
although 20080326-BlameSentansoken works too (but does not support
the --report_only option).

For medium and larger sites the default size of css files (as built
by cssutil -b -r) may be too small; use the -s or -S option of cssutil
to specify more entries. Our current files have 8M entries each (as
reported by 'cssutil -b -r spam.css' and 'cssutil -b -r nonspam.css'),
but they are getting nearly full, so I'll need to bump the size up.

Initially the css files learn tokens at a faster rate,
but then autolearning slows down, as crm114 only learns on its mistakes.

I just let it start from scratch, depending entirely on autolearning
from SpamAssassin. This only works well if your SpamAssassin is producing
quality results (uses good network tests, updated rules, sought.rules.yerp.org 
rules, manual tweaks when necessary, ...). Occasionally when I see a
prominent false positive or false negative I feed it either to the
crm directly, or through spamassassin --report/--revoke when I want
Bayes to learn as well.

Here is my crm114.cf file (comments and empty lines removed):

loadplugin Mail::SpamAssassin::Plugin::CRM114 crm114.pm
full      CRM114_CHECK  eval:check_crm()
priority  CRM114_CHECK  899
crm114_command /usr/local/bin/crm -u /var/amavis/.crm114 mailreaver.crm
add_header all CRM114-Status _CRM114STATUS_ ( _CRM114SCORE_ )
crm114_dynscore 1
crm114_dynscore_factor -0.10
crm114_good_threshold  10
crm114_spam_threshold -10
crm114_learn 1
crm114_autolearn 1
crm114_lookup_cacheid 1
crm114_cache_dir /var/amavis/.crm114/reaver_cache

(initially you may want to start with a lower crm114_dynscore_factor)
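To get a feel for what these settings do, here is a rough sketch in
Python. It ASSUMES that with crm114_dynscore enabled the plugin simply
multiplies the crm114 score by crm114_dynscore_factor, and that the two
thresholds split the crm114 score into GOOD / UNSURE / SPAM; check
crm114.pm for the actual formula. The function names are made up for
illustration only.

```python
# Sketch only: the multiplication rule and threshold handling below are
# assumptions about the plugin, not code taken from crm114.pm.

def sa_points(crm_score, factor=-0.10):
    """Assumed mapping: SpamAssassin points = crm114 score * factor.

    crm114 reports negative scores for spam, so a negative factor
    turns a spammy crm114 score into positive SpamAssassin points.
    """
    return crm_score * factor


def crm_verdict(crm_score, good_threshold=10, spam_threshold=-10):
    """Assumed classification of a crm114 score by the two thresholds."""
    if crm_score >= good_threshold:
        return "GOOD"
    if crm_score <= spam_threshold:
        return "SPAM"
    return "UNSURE"


if __name__ == "__main__":
    for score in (-25.0, -3.0, 18.0):
        print(score, crm_verdict(score), round(sa_points(score), 2))
```

Under these assumptions a factor closer to zero makes crm114 contribute
fewer points in either direction, which is the cautious way to start.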

If your version of crm114 supports the --report_only option,
then use it, and then you may remove the --stats_only (in crm114.pm)
to obtain a bit more information from crm114, not just its score.

I choose not to let crm store a copy of every message in its cache
(for later manual learning) by keeping crm114_use_cacheid at 0,
which is the default. It then only stores the cases that needed to
be learned, in the subdirectories known_good and known_spam within
its crm114_cache_dir directory ( /var/amavis/.crm114/reaver_cache/ ).
Keeping these training cases can be useful if you ever decide to
restart from scratch but want to use them for initial training.
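If you want to keep an eye on how many training cases have piled up,
something along these lines would do. It is only a sketch: the function
name is hypothetical, and it assumes each stored case is a plain file
directly inside known_good or known_spam.

```python
from pathlib import Path


def count_training_samples(cache_dir):
    """Count stored training cases per subdirectory of the cache dir.

    Assumes one file per stored message; a missing subdirectory
    counts as zero.
    """
    cache = Path(cache_dir)
    counts = {}
    for name in ("known_good", "known_spam"):
        subdir = cache / name
        if subdir.is_dir():
            counts[name] = sum(1 for p in subdir.iterdir() if p.is_file())
        else:
            counts[name] = 0
    return counts


if __name__ == "__main__":
    # Path taken from the crm114_cache_dir setting above.
    print(count_training_samples("/var/amavis/.crm114/reaver_cache"))
```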

At log level 3 amavisd includes SpamAssassin debugging in its log,
so it may be interesting, at least initially, to enable the crm114
plugin's debugging:

  amavisd -d noall,crm114

Of these log entries, the 'CRM and SA disagree' ones are particularly
interesting, for example:

(42922-12) SA dbg: crm114: CRM and SA disagree, crm says SPAM, sa -3.142
(33786-11) SA dbg: crm114: CRM and SA disagree, crm says SPAM, sa -22.462

It is often worthwhile to check these messages. Some of these events
are likely to be genuine false positives or false negatives. They may
indicate that either the SA rules would benefit from some tweaking, or
that crm needs to learn some sample.
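To pull such entries out of a log for review, a small filter like the
following can help. It is a sketch that assumes the entries look exactly
like the two samples above; the function name is made up, and whatever
prefix your syslog adds in front of the "(42922-12)" part is ignored.

```python
import re

# Pattern modelled on the two sample log lines above; adjust it if your
# log format differs.
DISAGREE = re.compile(
    r"\((?P<id>[\d-]+)\) SA dbg: crm114: CRM and SA disagree, "
    r"crm says (?P<crm>\w+), sa (?P<sa>-?\d+(?:\.\d+)?)"
)


def find_disagreements(lines):
    """Return (amavis id, crm verdict, SA score) tuples, with the
    largest absolute SA score (the strongest disagreement) first."""
    hits = []
    for line in lines:
        m = DISAGREE.search(line)
        if m:
            hits.append((m.group("id"), m.group("crm"), float(m.group("sa"))))
    hits.sort(key=lambda h: abs(h[2]), reverse=True)
    return hits


if __name__ == "__main__":
    import sys
    # Usage: python thisscript.py < amavis.log
    for msg_id, verdict, sa in find_disagreements(sys.stdin):
        print(msg_id, verdict, sa)
```

The amavis id in the first column can then be used to find the message
in the rest of the log.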

Now with 2.6.3, amavisd can also use crm directly. But if you
already have SpamAssassin running and decide to keep it, I think
the plugin is the better choice, keeping the best of both worlds,
at the expense of the extra processing needs of SpamAssassin.

  Mark

_______________________________________________
AMaViS-user mailing list
AMaViS-user@lists.sourceforge.net 
https://lists.sourceforge.net/lists/listinfo/amavis-user 
 AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3 
 AMaViS-HowTos:http://www.amavis.org/howto/ 
