Dear SpamAssassin users and developers,
Recently, we've been setting up Bayesian learning on our existing Amavis
with SpamAssassin setup on Ubuntu 18.04 (SpamAssassin
3.4.2-0ubuntu0.18.04.3 and Amavis 1:2.11.0-1ubuntu1). We decided to
use a global db seeded with an aggregation of spam and ham
we've received, and then enabled autolearn to train it further. As
SpamAssassin runs inside Amavis, the Bayes database files are owned by
the amavis user. This setup works fine: the Bayes results are great
and are growing in accuracy through autolearning.
What confused us somewhat is that our daily cron job, which runs
sa-update and sa-compile, was giving us a permissions error:
May 25 00:31:25.488 [8381] warn: bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
While this makes a lot of sense, considering that the files are owned by
the amavis user, we were quite surprised that this cron job would need
to access these files in the first place. Looking further into the
issue, we figured out that sa-compile specifically was responsible, and
that the message probably originated from
/usr/share/perl5/Mail/SpamAssassin/BayesStore/DBM.pm. While I have some
programming experience, I was sadly unable to understand this Perl file
well enough to work out why this code was accessing bayes_journal
and what it was planning to do there.
My specific question is therefore: what exactly does sa-compile do
to the Bayes database files?
I've asked this same question on IRC but was unable to get an answer.
While a fix for this issue (changing permissions and user/group
ownership) is rather obvious, we'd first like to understand what
sa-compile is up to.
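For reference, the ownership mismatch itself can be confirmed with a
small check like the sketch below. It is Python rather than Perl, the
function name describe_access is made up for illustration, and the path
is simply the one from the warning. It only reports who owns the file
and whether the calling user may write to it; it says nothing about why
sa-compile touches the file.

```python
import os
import pwd
import stat


def describe_access(path):
    """Report owner, mode and write access for `path` as seen by the current user.

    Hypothetical helper for illustration; not part of SpamAssassin or Amavis.
    """
    try:
        st = os.stat(path)
    except OSError as exc:
        # Missing file, or a parent directory we may not traverse.
        return {"path": path, "error": exc.strerror}

    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # UID has no passwd entry; fall back to the numeric id.
        owner = str(st.st_uid)

    return {
        "path": path,
        "owner": owner,
        "mode": stat.filemode(st.st_mode),
        # True if the real uid/gid of this process may write to the file.
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Path taken from the warning; run as the user the cron job runs as.
    print(describe_access("/var/lib/spamassassin/bayes_db/bayes_journal"))
```

Run as the cron job's user, this should show the amavis owner and
"writable": False if the situation is as we describe it.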
Kind regards,
Bert Van de Poel
ULYSSIS