Re: Why does sa-compile access the bayes db?

2020-05-25 Thread RW
On Mon, 25 May 2020 23:34:27 +0200
Bert Van de Poel wrote:


> My question therefore specifically is: what exactly does sa-compile
> do to the bayes database files?

I don't know for sure, but it's probably just a side-effect of
initializing plugins. Possibly it's trying to perform an opportunistic
sync on the journal file.

sa-compile doesn't need to access Bayes, so you could just treat it as
a cosmetic error. I wouldn't change ownership or permissions just for
this.
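
If the warning is treated as cosmetic, one option is to filter just that
line out of the cron job's output instead of touching ownership. A minimal
sketch, assuming a POSIX shell; the function name
filter_bayes_journal_warning is hypothetical, and the commented-out cron
line only illustrates where it might go:

```shell
# Hypothetical helper: drop only the known Bayes journal warning and
# pass every other line through unchanged.
filter_bayes_journal_warning() {
    # grep -v exits non-zero when nothing is left to print; "|| true"
    # keeps that from being treated as a failure.
    grep -v 'bayes: cannot write to .*bayes_journal' || true
}

# Possible use in the daily cron job (assumption: sa-update and
# sa-compile are on PATH, and sa-compile only runs after a successful update):
# sa-update && sa-compile 2>&1 | filter_bayes_journal_warning
```

Note that piping through a filter like this hides sa-compile's own exit
status, so any other warnings still need to be read from the remaining output.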


Why does sa-compile access the bayes db?

2020-05-25 Thread Bert Van de Poel

Dear Spamassassin users and developers,

Recently, we've been setting up Bayesian learning on our existing Amavis
with SpamAssassin setup on Ubuntu 18.04 (SpamAssassin
3.4.2-0ubuntu0.18.04.3 and Amavis 1:2.11.0-1ubuntu1). We decided to
use a global db seeded with an aggregation of spam and ham
we've received, and then enabled autolearn to further train the set. As
SpamAssassin runs inside Amavis, the Bayes database files are owned by
the amavis user. This setup works fine: the Bayes results are great,
and accuracy keeps growing through autolearning.


What was somewhat confusing is that we noticed our daily cronjob running 
sa-update and sa-compile was giving us an error concerning permissions:
May 25 00:31:25.488 [8381] warn: bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied


While this makes sense, considering that the files are owned by
the amavis user, we were quite surprised that this cronjob would need to
access these files in the first place. Looking further into the issue,
we figured out that it was specifically sa-compile, and that the message
probably originated from
/usr/share/perl5/Mail/SpamAssassin/BayesStore/DBM.pm. While I have some
programming experience, I was sadly unable to understand this Perl file
well enough to see why this code was accessing bayes_journal
and what it was planning to do there.
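
For anyone wanting to trace a message like this back to its source, a
recursive search for the fixed message text is usually enough. A minimal
sketch, assuming a grep that supports -r; the function name
find_message_source is hypothetical, and the directory in the usage line
is the Ubuntu path mentioned above:

```shell
# Hypothetical helper: list the files under a directory that contain a
# fixed string (-F: no regex interpretation, -l: file names only).
find_message_source() {
    grep -rlF -- "$1" "$2"
}

# Example use with the text from the warning:
# find_message_source 'bayes db update ignored' /usr/share/perl5/Mail/SpamAssassin
```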


My question therefore specifically is: what exactly does sa-compile do 
to the bayes database files?


I've asked this same question on IRC but was unable to get an answer.
While fixing this by changing permissions and user/group ownership
is rather obvious, we'd first like to understand what sa-compile is up to.


Kind regards,
Bert Van de Poel
ULYSSIS