Re: No BAYES_XX tags in X-Spam-Report

Bill Cole Tue, 23 Jun 2015 09:49:33 -0700

On 23 Jun 2015, at 0:05, Michael B Allen wrote:

On Mon, Jun 22, 2015 at 10:42 PM, Bill Cole
<sausers-20150...@billmail.scconsult.com> wrote:

On 22 Jun 2015, at 21:45, Michael B Allen wrote:

On Mon, Jun 22, 2015 at 8:01 PM, Reindl Harald <h.rei...@thelounge.net> wrote:


[root@www .spamassassin]# pwd
/var/log/spamassassin/.spamassassin
[root@www .spamassassin]# ls -la
total 1100
drwx------ 2 spamd spamd    4096 Jun 22 19:42 .
drwx------ 3 spamd spamd    4096 Jun  7 00:41 ..
-rw------- 1 spamd spamd   45056 Jun 22 19:42 bayes_seen
-rw------- 1 spamd spamd 1290240 Jun 22 19:42 bayes_toks
-rw-r--r-- 1 spamd spamd    1869 Jun  7 00:41 user_prefs




i doubt that SA is using the bayes of root
so you just train the wrong bayes



So with a default install (CentOS 7 in my case and I suspect pretty
much all other systems), bayes will NOT just work by default unless
you explicitly modify /etc/mail/spamassassin/local.cf to tell sa-learn
to use the bayes db owned by spamd
(/var/log/spamassassin/.spamassassin/bayes in my case) and NOT the one
owned by root?

However, I have done this:

bayes_path /var/log/spamassassin/.spamassassin/bayes
bayes_file_mode 0777

Don't do that, ever, on any regular file, on any system that has
processes running as more than just root. I know it's in the SA Wiki,
but it's an irresponsible recommendation.


Yeah, I was going to ask about this because it seems to me if the db
is owned by spamd and spamassassin is running as user spamd and
sa-learn is running as root then 0600 should be fine (although it's
not obvious to me why SA needs a "file mode" in the first place).

A diversity of rigs. SA isn't the spamd daemon or the spamc client or
$PERLLIB/Mail/SpamAssassin.pm or the configured ruleset, it is the whole
tree of Perl modules in the Mail::SpamAssassin namespace plus *maybe*
spamd/spamc, the rules, and subsidiary utilities using them like
sa-learn. Different sites use the Perl framework and tools in different
ways, so they need different ownership & permission settings. As I don't
use SpamAssassin on CentOS (or RHEL) I'm not sure precisely what the
default SA rig there looks like, and how (if at all) RedHat has hooked
it into Postfix(?) so I can't explain much about the specifics of what
you get from 'yum install spamassassin'.

More specifically: in its simplest form, SA is designed to be used by
each of many unprivileged users with independent Bayes DBs fed & used by
local mail delivery and pre-delivery filtering processes and sa-learn or
an equivalent tool for learning messages post-delivery. The
bayes_file_mode defaults to 0700 and usually need not be changed, but on
some OS's with some mail subsystems it may be necessary to adjust that
to allow a delivery agent or other component (e.g. filtering tools)
running as something other than root OR the individual mail recipient to
read or maybe even write to the users' individual Bayes DBs. You should
NOT need it changed on a system that only uses a system-wide Bayes DB.

So then what do you recommend that the bayes_file_mode value be
precisely?

The default is usually fine. That's why it is the default. Note that
this value is only applied when creating a new file in the Bayes DB
(which is composed of multiple files) so it is possible for the effects
of changing it to be delayed. If RedHat's packaging of SpamAssassin
includes a different value, I'd suggest not changing it. Also, moving
your DB into /var/log/spamassassin/ is a quirky choice that might not be
compatible with RedHat's integration choices in the package they
distribute (and which CentOS replicates.) It's your system and your
choices of course...

At any rate, the whole thing seems to be working now incidentally. I
am getting BAYES_XX tags now.

Yes. As documented, you don't get messages scored by the Bayes component
until it has built an adequate learned history of both ham and spam to
do valid scoring.
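
To make the "adequate learned history" point concrete, here is a minimal sketch that reads the output of 'sa-learn --dump magic' on stdin and reports whether both message counters have reached a threshold. It assumes the usual dump layout, where the third column of the "non-token data: nspam" and "non-token data: nham" lines holds the count, and a threshold of 200, which is the documented default for bayes_min_spam_num and bayes_min_ham_num. The function name bayes_ready is made up for this example.

```shell
#!/bin/sh
# Read `sa-learn --dump magic` output on stdin and report whether both
# learned-message counters reach the threshold.
# Assumption: the count is the third whitespace-separated column.
# $1 = threshold (optional; 200 mirrors the documented defaults for
#      bayes_min_spam_num and bayes_min_ham_num).
bayes_ready() {
    min=${1:-200}
    awk -v min="$min" '
        /non-token data: nspam$/ { nspam = $3 }
        /non-token data: nham$/  { nham  = $3 }
        END {
            ready = (nspam + 0 >= min + 0 && nham + 0 >= min + 0) ? "yes" : "no"
            printf "nspam=%d nham=%d ready=%s\n", nspam, nham, ready
        }'
}
```

One way to use it would be to pipe the dump into the function while running as the account that owns the DB, for example 'sa-learn --dump magic | bayes_ready'. Run as any other user, sa-learn will most likely report on a different DB.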

As stated in my other followup message,
SA seems to have detected the broken db and fixed it because it
suddenly just started working and sa-learn --dump magic works and is
showing the right numbers.

Well, I'm not convinced that's exactly how it worked, but I'm glad you
seem to have it working.

Note that 'sa-learn' DOES NOT talk to spamd, it uses the SA config that
it finds for the user running it to figure out which rules it should use
and where to find the Bayes DB (and AWL or TxRep DBs) for that user. If
you have spamd running to use a system-wide
config/ruleset/Bayes/(AWL|TxRep) you should get in the habit of using
spamc to communicate with the daemon rather than running sa-learn as
root and relying on a quirky config to assure that you are handling DB
files that are global and owned by the right non-root user. If in doing
that you cause the creation of a file in the DB that is owned by root
and can't be deleted by spamd, your DB will be broken.
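
The failure mode described above (a DB file owned by root that spamd cannot remove) can be checked for with a short sketch like the following. It lists bayes_* files in a directory that are not owned by the expected account. The function name foreign_files is hypothetical, and the directory and owner are whatever applies on your system.

```shell
#!/bin/sh
# List Bayes DB files that are NOT owned by the expected account.
# $1 = directory holding the bayes_* files, $2 = expected owner.
# Prints one path per line; no output means nothing looks out of place.
foreign_files() {
    dir=$1
    owner=$2
    find "$dir" -type f -name 'bayes_*' ! -user "$owner"
}
```

With the layout shown earlier in this thread that would be 'foreign_files /var/log/spamassassin/.spamassassin spamd'. Any path it prints is a candidate for the broken-DB situation.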

So just for posterity, the problem was I just needed "bayes_path
/var/log/spamassassin/.spamassassin/bayes" in local.cf to make
sa-learn use that db instead of /root/.spamassassin/bayes. Looks like
it choked initially but somehow it's working now.

Yeah, that seems like a very wrong solution. Not saying it didn't work
for you, but it would not be my choice. Since you seem set on having a
weird place for your DB, I won't argue the issue.

Everything is installed as user / group spamd and postfix is set to
call spamassassin with user=spamd. And I assume I must run sa-learn as
root so that it can access Maildir directories and that bayes_path
tells sa-learn where the db is. So now what's the problem?

Wrong assumption.

The sa-learn program is for anyone to manually work with their own
Bayes DB, including for the owner of a system-wide Bayes DB to work with
that Bayes DB. If you have a system-wide Bayes DB, it should be fed by
either a system-wide filtering mechanism operating as part of the
delivery process and running as the owner of the global DB or by users
running the spamc client under their own ids to feed a spamd daemon
running as the owner of the global DB or by a combination of the two.
The CentOS 7 package installs spamd and spamc, and if you want to learn
already-delivered mail into a global BayesDB, those are the tools to
use.

Yes, I want a system-wide bayes db. And I am running spamd and spamc
and I assume that is all working (but of course I have no idea if it
really is).

But I want users to be able to put spams that get through into
~/Maildir/.LearnAsSpam and then, every once in a while, I want to run
sa-learn on all of those messages for the system-wide db.

So can that be done without running sa-learn as root?

Of course. As I said in other words that you quoted but apparently
misunderstood:

***** sa-learn IS NOT THE RIGHT TOOL FOR LEARNING MESSAGES INTO A SYSTEM-WIDE DB ****

Use 'spamc -L (spam|ham)'. Have users run it if they like, or have it
run as the user whose magic maildirs are being learned. It talks to the
spamd daemon, running as the spamd user, managing the system-wide Bayes
DB. If it isn't run as root, it can't do random violence limited only by
your capacity for typos.
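
As a rough sketch of that approach, the function below feeds every message in one Maildir folder to spamd through 'spamc -L'. It is meant to run as the mailbox owner, not root. The function name learn_folder and the SPAMC override variable are invented for this example. The exit-status handling relies on spamc's documented behaviour for -L (5 = learned, 6 = already learned), so check that against the spamc man page for your version. It also assumes spamd was started with --allow-tell, without which it does not accept learning requests.

```shell
#!/bin/sh
# Feed each message in a Maildir folder to spamd via `spamc -L`.
# SPAMC can be overridden for testing; by default the real client is used.
SPAMC="${SPAMC:-spamc}"

# $1 = learn type (spam or ham), $2 = Maildir folder containing cur/ and new/
# Prints the number of messages spamc reported as learned or already learned.
learn_folder() {
    kind=$1
    folder=$2
    count=0
    for msg in "$folder"/cur/* "$folder"/new/*; do
        [ -f "$msg" ] || continue      # skip unmatched globs and non-files
        "$SPAMC" -L "$kind" < "$msg" > /dev/null
        rc=$?
        case $rc in
            5|6) count=$((count + 1)) ;;   # learned / already learned
            *)   echo "spamc exited $rc for $msg" >&2 ;;
        esac
    done
    echo "$count"
}
```

A user could then run something like 'learn_folder spam "$HOME/Maildir/.LearnAsSpam"' from their own cron job. The script never touches the Bayes DB files itself, since spamd does the writing under its own account.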

Ideally I would think sa-learn should be able to run as root just to
access files but use a spamd child to process them and update the
bayes db. Possible?


That's not how any of this works...

The reason for the 'd' in spamd is that it is a daemon: a long-running
process that other processes (or network entities) can talk to via a
local unix socket in the filesystem or a TCP port using a defined
protocol. The sa-learn program is not a client of spamd speaking that
protocol but rather a direct manipulator of the BayesDB, just as spamd
is. You can usually get away with using sa-learn to work with the same
BayesDB that spamd uses, but you are likely to eventually do something a
little wrong and either screw up the BayesDB with a file spamd can't
write to or accidentally and blindly work with a brand new different
BayesDB because of some environmental change or you've re-installed SA
or whatever. I don't think there's a real risk of deadlock or data
corruption or anything like that from using spamd and sa-learn on the
same DB, but you do have 2 tools that are unaware of each other
potentially trying to write to the same files, so there is at least some
possibility for contention problems. And as you've noticed: to learn
messages in anyone's maildirs into the system BayesDB, you have to run
sa-learn as root, because it isn't talking to spamd at all but fiddling
with spamd's file behind spamd's back. Running things as root should be
resisted and avoided. Use spamc instead, avoid the risks.

--

Bill Cole                  Email: b...@scconsult.com
18847 Rosetta Ave.         USE THE FROM HEADER IF IT DIFFERS!
Eastpointe, MI USA 48021   MAIN ADDRESS IS HEAVILY SPAM-FILTERED!

Phone: +1-586-774-4357
