On Tue, 8 Aug 2006, Wolfgang Jeltsch wrote:
I was kind of shocked when I discovered that there is no SpamAssassin manual or tutorial. For me, it's unimaginable that the world's leading open source spam detection software is missing such an important piece of documentation.
Well, it's not entirely true that there isn't a manual. The various components do have manuals. Here are the most commonly useful ones:

    perldoc Mail::SpamAssassin
    perldoc Mail::SpamAssassin::Conf
    man spamassassin
    man sa-learn
    man sa-update

And some other ones:

    perldoc Mail::SpamAssassin::Plugin
    perldoc Mail::SpamAssassin::Bayes
    perldoc Mail::SpamAssassin::BayesStore
    perldoc Mail::SpamAssassin::Plugin::Hashcash

Not all the modules that should have documentation do have documentation (for instance, Mail::SpamAssassin::BayesStore::DBM doesn't have any), but there is at least some information. You can root around in the Mail/SpamAssassin directory (should be somewhere inside your site_perl directory) to find more modules that might have documentation.

There may be a more elegant way, but this is one way of seeing a list of modules which have documentation:

    cd ...../site_perl/...../Mail/SpamAssassin
    find . -name '*.pm' -print | xargs grep -l '^=head'
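The find command above can be wrapped in a small shell function so that it works from any directory and copes with spaces in file names. This is only a sketch: the function name list_pod_modules is made up for illustration, and it uses the same heuristic as above (a line starting with "=head" counts as documentation).

```shell
#!/bin/sh
# list_pod_modules DIR (hypothetical helper name):
# print every .pm file under DIR that contains at least one line
# starting with "=head", i.e. that appears to carry POD documentation.
# Output is sorted so that repeated runs are easy to compare.
list_pod_modules() {
    dir=${1:-.}
    # "-exec ... {} +" batches file names like xargs does, but is safe
    # for names containing spaces; grep -l prints only matching file names.
    find "$dir" -name '*.pm' -exec grep -l '^=head' {} + | sort
}
```

If perldoc can locate the main module, its -l option prints the path of the file, which saves hunting for site_perl by hand:

    list_pod_modules "$(dirname "$(perldoc -l Mail::SpamAssassin)")/SpamAssassin"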
The wiki pages are more bits and pieces than coherent documentation, and often don't explain things in principle but give you finished configuration files for procmail & Co. But what if I don't use procmail?
Well, SpamAssassin doesn't deliver mail, so this question, which is about delivery methods, isn't really relevant.
First, SpamAssassin seems to do autolearning. What does this mean? Does it learn that messages which it already considers spam are spam, and messages which it already considers ham are ham? Wouldn't this mean that SpamAssassin is just doing self-affirmation?
The Bayes database needs to be fed training data in order to be effective. It needs to see many known spam and known ham messages, preferably hundreds of each. sa-learn is the command that is used to do this manually. Autolearning means doing the same thing as sa-learn, but automatically. Basically, the other rules work well enough that they can identify obvious spam and ham. Those messages can be used to train the Bayes database.
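Autolearning is controlled from the configuration file. The fragment below is a sketch of the relevant settings, assuming the option names and default values documented for the AutoLearnThreshold plugin in recent versions; check perldoc Mail::SpamAssassin::Plugin::AutoLearnThreshold for what your version actually supports. As far as I know, the score compared against these thresholds is computed without the Bayes rules themselves, which is what keeps autolearning from being pure self-affirmation.

```
# Assumed defaults; verify against the perldoc for your installed version.

# Turn autolearning on (1) or off (0).
bayes_auto_learn 1

# Messages scoring below this are learned as ham.
bayes_auto_learn_threshold_nonspam 0.1

# Messages scoring above this are learned as spam.
bayes_auto_learn_threshold_spam 12.0
```

Messages that score between the two thresholds are not learned from at all, so only the clear-cut cases end up in the database.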
Second, I often have a message of the following form in my mail log:

    courierlocal: [???] Cannot open bayes databases /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists

What's the problem here, and how can I get rid of it?
Without any more information than that, I would say that either something is still using the Bayes database in your home directory, or whatever was using it has finished but its lock file hasn't been removed. I haven't tried using SpamAssassin with any part of Courier, so I'm not really familiar with how it's normally invoked.

- Logan
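One way to tell the two cases apart is to look for lock files that have been sitting around for a while. The sketch below rests on assumptions: the function name stale_bayes_locks is made up, the pattern 'bayes*lock*' is a guess at how the lock files are named next to the bayes_* database files, and -maxdepth and -mmin are extensions found in GNU and BSD find rather than in POSIX.

```shell
#!/bin/sh
# stale_bayes_locks DIR MINUTES (hypothetical helper name):
# print files in DIR (not its subdirectories) whose names match the
# ASSUMED lock-file pattern 'bayes*lock*' and which were last modified
# more than MINUTES minutes ago. Defaults: ~/.spamassassin, 10 minutes.
stale_bayes_locks() {
    dir=${1:-"$HOME/.spamassassin"}
    minutes=${2:-10}
    # -mmin +N matches files modified more than N minutes ago.
    find "$dir" -maxdepth 1 -type f -name 'bayes*lock*' -mmin +"$minutes" | sort
}
```

An old lock file is only a hint. Before removing one, it is worth checking (for example with ps) that no spamassassin, spamd or sa-learn process is still running for that user.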