On 23 Jun 2015, at 0:05, Michael B Allen wrote:
On Mon, Jun 22, 2015 at 10:42 PM, Bill Cole
<sausers-20150...@billmail.scconsult.com> wrote:
On 22 Jun 2015, at 21:45, Michael B Allen wrote:
On Mon, Jun 22, 2015 at 8:01 PM, Reindl Harald
<h.rei...@thelounge.net> wrote:
[root@www .spamassassin]# pwd
/var/log/spamassassin/.spamassassin
[root@www .spamassassin]# ls -la
total 1100
drwx------ 2 spamd spamd 4096 Jun 22 19:42 .
drwx------ 3 spamd spamd 4096 Jun 7 00:41 ..
-rw------- 1 spamd spamd 45056 Jun 22 19:42 bayes_seen
-rw------- 1 spamd spamd 1290240 Jun 22 19:42 bayes_toks
-rw-r--r-- 1 spamd spamd 1869 Jun 7 00:41 user_prefs
i doubt that SA is using the bayes of root
so you just train the wrong bayes
So with a default install (CentOS 7 in my case and I suspect pretty
much all other systems), bayes will NOT just work by default unless
you explicitly modify /etc/mail/spamassassin/local.cf to tell sa-learn
to use the bayes db owned by spamd
(/var/log/spamassassin/.spamassassin/bayes in my case) and NOT the one
owned by root?
However, I have done this:
bayes_path /var/log/spamassassin/.spamassassin/bayes
bayes_file_mode 0777
Don't do that, ever, on any regular file, on any system that has
processes running as more than just root. I know it's in the SA Wiki,
but it's an irresponsible recommendation.
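To see whether a 0777 setting has already left world-writable files behind, something like the following rough sketch may help. The function name is made up for illustration, and the default directory is simply the path quoted in this thread; pass your own directory as the first argument.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory that are
# world-writable. The default directory is the one mentioned in this
# thread; it is an assumption, not a packaging default.
world_writable_files() {
    dir=${1:-/var/log/spamassassin/.spamassassin}
    # -perm -002 matches files whose "other" write bit is set (POSIX find).
    find "$dir" -type f -perm -002
}
```

Running `world_writable_files /var/log/spamassassin/.spamassassin` and getting no output means no regular file there is world-writable.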
Yeah, I was going to ask about this because it seems to me if the db
is owned by spamd and spamassassin is running as user spamd and
sa-learn is running as root then 0600 should be fine (although it's
not obvious to me why SA needs a "file mode" in the first place).
A diversity of rigs. SA isn't the spamd daemon or the spamc client or
$PERLLIB/Mail/SpamAssassin.pm or the configured ruleset, it is the whole
tree of Perl modules in the Mail::SpamAssassin namespace plus *maybe*
spamd/spamc, the rules, and subsidiary utilities using them like
sa-learn. Different sites use the Perl framework and tools in different
ways, so they need different ownership & permission settings. As I don't
use SpamAssassin on CentOS (or RHEL) I'm not sure precisely what the
default SA rig there looks like, and how (if at all) RedHat has hooked
it into Postfix(?) so I can't explain much about the specifics of what
you get from 'yum install spamassassin'.
More specifically: in its simplest form, SA is designed to be used by
each of many unprivileged users with independent Bayes DBs fed & used by
local mail delivery and pre-delivery filtering processes and sa-learn or
an equivalent tool for learning messages post-delivery. The
bayes_file_mode defaults to 0700 and usually need not be changed, but on
some OS's with some mail subsystems it may be necessary to adjust that
to allow a delivery agent or other component (e.g. filtering tools)
running as something other than root OR the individual mail recipient to
read or maybe even write to the users' individual Bayes DBs. You should
NOT need it changed on a system that only uses a system-wide Bayes DB.
So then what do you recommend that the bayes_file_mode value be
precisely?
The default is usually fine. That's why it is the default. Note that
this value is only applied when creating a new file in the Bayes DB
(which is composed of multiple files) so it is possible for the effects
of changing it to be delayed. If RedHat's packaging of SpamAssassin
includes a different value, I'd suggest not changing it. Also, moving
your DB into /var/log/spamassassin/ is a quirky choice that might not be
compatible with RedHat's integration choices in the package they
distribute (and which CentOS replicates.) It's your system and your
choices of course...
At any rate, the whole thing seems to be working now incidentally. I
am getting BAYES_XX tags now.
Yes. As documented, you don't get messages scored by the Bayes component
until it has built an adequate learned history of both ham and spam to
do valid scoring.
As stated in my other followup message,
SA seems to have detected the broken db and fixed it because it
suddenly just started working and sa-learn --dump magic works and is
showing the right numbers.
Well, I'm not convinced that's exactly how it worked, but I'm glad you
seem to have it working.
Note that 'sa-learn' DOES NOT talk to spamd, it uses the SA config that
it finds for the user running it to figure out which rules it should use
and where to find the Bayes DB (and AWL or TxRep DBs) for that user. If
you have spamd running to use a system-wide
config/ruleset/Bayes/(AWL|TxRep) you should get in the habit of using
spamc to communicate with the daemon rather than running sa-learn as
root and relying on a quirky config to assure that you are handling DB
files that are global and owned by the right non-root user. If in doing
that you cause the creation of a file in the DB that is owned by root
and can't be deleted by spamd, your DB will be broken.
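A quick way to spot that failure mode is to look for Bayes files that are not owned by the user spamd runs as. This is only a sketch: the function name is invented here, the `bayes_*` file naming is taken from the directory listing earlier in the thread, and `spamd` as the default owner is an assumption about your setup.

```shell
#!/bin/sh
# Hypothetical helper: print Bayes DB files NOT owned by the expected user.
#   $1: directory holding the bayes_* files
#   $2: expected owner, as a user name or numeric uid (assumed default: spamd)
bayes_wrong_owner() {
    dir=$1
    owner=${2:-spamd}
    # find reports an error if $owner is a name that does not exist.
    find "$dir" -type f -name 'bayes_*' ! -user "$owner"
}
```

For example, `bayes_wrong_owner /var/log/spamassassin/.spamassassin spamd` should print nothing on a healthy system-wide DB; any path it prints is a file that was probably created by a tool run under another id.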
So just for posterity, the problem was I just needed "bayes_path
/var/log/spamassassin/.spamassassin/bayes" in local.cf to make
sa-learn use that db instead of /root/.spamassassin/bayes. Looks like
it choked initially but somehow it's working now.
Yeah, that seems like a very wrong solution. Not saying it didn't work
for you, but it would not be my choice. Since you seem set on having a
weird place for your DB, I won't argue the issue.
Everything is installed as user / group spamd and postfix is set to
call spamassassin with user=spamd. And I assume I must run sa-learn as
root so that it can access Maildir directories and that bayes_path
tells sa-learn where the db is. So now what's the problem?
Wrong assumption.
The sa-learn program is for anyone to manually work with their own
Bayes DB, including for the owner of a system-wide Bayes DB to work
with that Bayes DB. If you have a system-wide Bayes DB, it should be
fed by either a system-wide filtering mechanism operating as part of
the delivery process and running as the owner of the global DB or by
users running the spamc client under their own ids to feed a spamd
daemon running as the owner of the global DB or by a combination of the
two. The CentOS 7 package installs spamd and spamc, and if you want to
learn already-delivered mail into a global BayesDB, those are the tools
to use.
Yes, I want a system-wide bayes db. And I am running spamd and spamc
and I assume that is all working (but of course I have no idea if it
really is).
But I want users to be able to put spams that get through into
~/Maildir/.LearnAsSpam and then, every once in a while, I want to run
sa-learn on all of those messages for the system-wide db.
So can that be done without running sa-learn as root?
Of course. As I said in other words that you quoted but apparently
misunderstood:
***** sa-learn IS NOT THE RIGHT TOOL FOR LEARNING MESSAGES INTO A
SYSTEM-WIDE DB *****
Use 'spamc -L (spam|ham)'. Have users run it if they like, or have it
run as the user whose magic maildirs are being learned. It talks to the
spamd daemon, running as the spamd user, managing the system-wide Bayes
DB. If it isn't run as root, it can't do random violence limited only by
your capacity for typos.
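As a rough sketch of what that could look like for the ~/Maildir/.LearnAsSpam folder described above, run as the mailbox owner rather than root: the function name and the SPAMC override variable are made up for illustration, and the cur/ and new/ subdirectories are assumed from the usual Maildir layout. As far as I know, spamd only accepts learning requests when it was started with its --allow-tell option, so check how your package starts the daemon.

```shell
#!/bin/sh
# Hypothetical helper: feed every message in a Maildir folder to spamd
# for learning as spam. SPAMC may be set to another command or a full
# path; it defaults to "spamc" found on PATH.
learn_folder_as_spam() {
    folder=${1:-$HOME/Maildir/.LearnAsSpam}
    for msg in "$folder"/cur/* "$folder"/new/*; do
        [ -f "$msg" ] || continue    # skip unmatched globs and non-files
        # spamc reads the message on stdin; -L spam asks spamd to learn it.
        # The exit status is deliberately not checked in this sketch.
        "${SPAMC:-spamc}" -L spam < "$msg" > /dev/null
    done
}
```

Each user could call `learn_folder_as_spam` from their own crontab. The sketch does not move or delete messages after learning them, so the same messages are offered again on the next run.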
Ideally I would think sa-learn should be able to run as root just to
access files but use a spamd child to process them and update the
bayes db. Possible?
That's not how any of this works...
The reason for the 'd' in spamd is that it is a daemon: a long-running
process that other processes (or network entities) can talk to via a
local unix socket in the filesystem or a TCP port using a defined
protocol. The sa-learn program is not a client of spamd speaking that
protocol but rather a direct manipulator of the BayesDB, just as spamd
is. You can usually get away with using sa-learn to work with the same
BayesDB that spamd uses, but you are likely to eventually do something a
little wrong and either screw up the BayesDB with a file spamd can't
write to or accidentally and blindly work with a brand new different
BayesDB because of some environmental change or you've re-installed SA
or whatever. I don't think there's a real risk of deadlock or data
corruption or anything like that from using spamd and sa-learn on the
same DB, but you do have 2 tools that are unaware of each other
potentially trying to write to the same files, so there is at least some
possibility for contention problems. And as you've noticed: to learn
messages in anyone's maildirs into the system BayesDB, you have to run
sa-learn as root, because it isn't talking to spamd at all but fiddling
with spamd's file behind spamd's back. Running things as root should be
resisted and avoided. Use spamc instead, avoid the risks.
--
Bill Cole
Email: b...@scconsult.com
(USE THE FROM HEADER IF IT DIFFERS! MAIN ADDRESS IS HEAVILY SPAM-FILTERED!)
18847 Rosetta Ave.
Eastpointe, MI USA 48021
Phone: +1-586-774-4357