On Tue, 8 Aug 2006, Wolfgang Jeltsch wrote:
I was kind of shocked when I discovered that there is no SpamAssassin manual
or tutorial.  For me, it's unimaginable that the world's leading open source
spam detection software is missing such an important piece of documentation.

Well, it's not entirely true that there isn't a manual.
The various components do have manuals.  Here are the most
commonly useful ones:

    perldoc Mail::SpamAssassin
    perldoc Mail::SpamAssassin::Conf
    man spamassassin
    man sa-learn
    man sa-update

And some other ones:

    perldoc Mail::SpamAssassin::Plugin
    perldoc Mail::SpamAssassin::Bayes
    perldoc Mail::SpamAssassin::BayesStore
    perldoc Mail::SpamAssassin::Plugin::Hashcash

Not all the modules that should have documentation do have
documentation (for instance, Mail::SpamAssassin::BayesStore::DBM
doesn't have any), but there is at least some information.

You can root around in the Mail/SpamAssassin directory (should
be somewhere inside your site_perl directory) to find more
modules that might have documentation.  There may be a more
elegant way, but this is one way of seeing a list of modules
that have documentation:

    cd ...../site_perl/...../Mail/SpamAssassin
    find . -name '*.pm' -print | xargs grep -l '^=head'
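If you don't know where the modules were installed, perldoc can
usually tell you.  What follows is only a sketch: it assumes that
"perldoc -l" can locate Mail::SpamAssassin on your system, and
list_pod_modules and sa_module_dir are helper names made up for
this example, not anything that ships with SpamAssassin.

```shell
# list_pod_modules DIR
# Print every .pm file under DIR that contains at least one POD heading.
# (Hypothetical helper name, not part of SpamAssassin.)
list_pod_modules() {
    find "$1" -name '*.pm' -exec grep -l '^=head' {} + | sort
}

# sa_module_dir
# Print the directory that should hold the Mail::SpamAssassin::* modules.
# Assumes perldoc finds the documentation in Mail/SpamAssassin.pm, so that
# stripping ".pm" from the reported path gives the module directory.
sa_module_dir() {
    main=$(perldoc -l Mail::SpamAssassin) || return 1
    printf '%s\n' "${main%.pm}"
}
```

With those defined, list_pod_modules "$(sa_module_dir)" should give
the same list as the find command above without having to hunt for
the site_perl directory by hand.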

The wiki pages are more bits and pieces than coherent documentation and
often don't explain things in principle but give you finished configuration
files for procmail & Co.  But what if I don't use procmail?

Well, SpamAssassin doesn't deliver mail, so this question,
which is about delivery methods, isn't really relevant.

First, SpamAssassin seems to do autolearning.  What does this mean?  Does it
learn that messages which it already considers spam are spam, and messages
which it already considers ham are ham?  Wouldn't this mean that SpamAssassin
is just doing self-affirmation?

The Bayes database needs to be fed training data in order to
be effective.  It needs to see many (preferably hundreds and
hundreds of) known spam and known ham messages.  sa-learn is the
command that is used to do this manually.  Autolearning means
doing the same thing as sa-learn, but automatically.

Basically, the other rules work well enough that they can identify
obvious spam and ham.  Those messages can be used to train the
Bayes database.
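
To make "obvious" a bit more concrete, here is a toy sketch of the
decision.  It is not SpamAssassin's actual code: the real
autolearner applies further conditions, and autolearn_decision is
a name invented for this example.  The cutoffs of 12.0 and 0.1 are
what I understand the defaults of bayes_auto_learn_threshold_spam
and bayes_auto_learn_threshold_nonspam to be; check
perldoc Mail::SpamAssassin::Plugin::AutoLearnThreshold for your
version.

```shell
# autolearn_decision SCORE [SPAM_CUTOFF] [HAM_CUTOFF]
# Toy illustration only: decide whether a message's rule score is extreme
# enough to be used as Bayes training data.  The default cutoffs are
# assumed values; the real logic has additional conditions.
autolearn_decision() {
    awk -v s="$1" -v hi="${2:-12.0}" -v lo="${3:-0.1}" 'BEGIN {
        if (s + 0 >= hi + 0)     print "learn-as-spam"   # clearly spam
        else if (s + 0 < lo + 0) print "learn-as-ham"    # clearly ham
        else                     print "skip"            # unsure: do not train
    }'
}
```

So autolearn_decision 15.3 would train as spam and
autolearn_decision 4.0 would train nothing.  Messages in the
middle, where the rules are unsure, are left alone, and as far as
I know the Bayes rules' own score is left out of the score used
for this decision, which is what keeps it from being pure
self-affirmation.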

Second, I often have a message of the following form in my mail log:

        courierlocal: [???] Cannot open bayes databases
        /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists

What's the problem here, and how can I get rid of it?

Without any more information than that, I would say that
something is either still using the Bayes database in your
home directory or it is finished but the lock file hasn't
been removed.  I haven't tried using SpamAssassin with Courier
anything, so I'm not really familiar with how it's normally
invoked.
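
One way to tell those two cases apart is to look at the lock files
themselves.  This is a sketch under assumptions: the 'bayes*lock*'
pattern is my guess at how the lock files are named (adjust it to
whatever you actually see in the directory), list_bayes_locks is
a made-up helper name, and -maxdepth needs a find that supports it
(GNU and BSD find do).

```shell
# list_bayes_locks [DIR]
# Show lock-like files next to the Bayes database, with their timestamps.
# DIR defaults to ~/.spamassassin; the name pattern is an assumption.
list_bayes_locks() {
    dir=${1:-$HOME/.spamassassin}
    find "$dir" -maxdepth 1 -type f -name 'bayes*lock*' -exec ls -l {} +
}
```

A lock with an old timestamp while nothing SpamAssassin-related is
running suggests a leftover; a recent one suggests something is
still using the database.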

  - Logan
