Re: Debugging bayes w/ '--virtual-config-dir'

2012-12-03 Thread Tres Seaver

On 11/28/2012 04:17 PM, Tres Seaver wrote:
 Running SA 3.3.2 on Ubuntu 12.04.
 
 Here is how spamd is running:
 
  $ pgrep -lf spamd
  26110 /usr/bin/perl -T -w /usr/sbin/spamd --create-prefs --max-children 5
   --helper-home-dir --username=vmail --nouser-config
   --virtual-config-dir=/home/vmail/spamassassin/%d/%l
   --syslog=/var/log/spamd.log --debug=all,bayes,check,config
   -d --pidfile=/var/run/spamd.pid
  26112 spamd child
  26113 spamd child
 
 And the tokens for my account:
 
  # sa-learn --dump=magic \
      --dbpath=/home/vmail/spamassassin/example.com/localname
  0.000          0          3          0  non-token data: bayes db version
  0.000          0       3109          0  non-token data: nspam
  0.000          0      24458          0  non-token data: nham
  0.000          0     177188          0  non-token data: ntokens
  0.000          0 1351290514          0  non-token data: oldest atime
  0.000          0 1354054449          0  non-token data: newest atime
  0.000          0          0          0  non-token data: last journal sync atime
  0.000          0 1354062194          0  non-token data: last expiry atime
  0.000          0    2764800          0  non-token data: last expire atime delta
  0.000          0       7488          0  non-token data: last expire reduction count
 
 But I see nothing in the log for 'bayes' during normal processing;  I 
 only see entries immediately after restart (e.g., the nightly restart 
 after updating rulesets):
 
  # grep : bayes /var/log/spamd.log
  ... [25810] dbg: logger: adding facilities: all, bayes, check, config
  ... [26110] dbg: config: fixed relative path:
  /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
  ... [26110] dbg: config: using
  /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
  for included file
  ... [26110] dbg: bayes:
  learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x2db7fe0),
  bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
  ... [26110] dbg: bayes:
  learner_new: got
  store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x26392f8)
  ... [26110] dbg: bayes: no dbs present, cannot tie DB R/O:
  /tmp/spamd-26110-init/.spamassassin/bayes_toks
  ... [26110] dbg: bayes: no dbs present, cannot tie DB R/O:
  /tmp/spamd-26110-init/.spamassassin/bayes_toks
 
 Which smells to me as though the bayesian stuff is not enabled.  But:
 
  # grep bayes /etc/spamassassin/local.cf
  use_bayes 1
  bayes_auto_learn 1
  bayes_ignore_header X-Bogosity
  bayes_ignore_header X-Spam-Flag
  bayes_ignore_header X-Spam-Status
  #   and a well-trained bayes DB can save running rules, too
 
 Any suggestions where I should be looking?

I think I have identified the issue:  using amavisd's built-in SA support
means that per-recipient spam checks aren't feasible (amavisd wants to
pass each message through SA exactly once, rather than once per
recipient).  I think that I'm not seeing any log activity in the spamd
log because amavisd isn't delegating to spamd, but rather using the SA
perl modules directly.  Can anyone confirm that hypothesis?

I believe the workaround is going to be to disable SA support inside
amavisd and instead do the SA processing during the delivery phase, where
I can run 'spamc -u user' to play nicely with spamd's --virtual-config-dir.
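
As a rough sketch of that delivery-phase step (an assumption about how it might be wired up, not a tested setup): the filter below pipes one message through spamc with the recipient as the user name. The function name `filter_for_recipient` and the `SPAMC` override variable are hypothetical; only `spamc -u` comes from the discussion above.

```shell
#!/bin/sh
# Hypothetical delivery-phase filter: reads one message on stdin and writes
# whatever spamc returns to stdout.
# SPAMC is an illustrative override for pointing at a different client binary.
filter_for_recipient() {
    recipient="$1"    # full address, e.g. localname@example.com
    # -u passes the user name to spamd, which can then select that user's
    # directory under --virtual-config-dir.
    "${SPAMC:-spamc}" -u "$recipient"
}

# Example (assumes spamd is running and message.eml exists):
#   filter_for_recipient localname@example.com < message.eml > scanned.eml
```

The delivery agent would call this once per recipient, which is the per-user pass that amavisd's built-in SA support does not do.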



Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design    http://palladion.com


Re: Debugging bayes w/ '--virtual-config-dir'

2012-12-03 Thread Bowie Bailey

On 12/3/2012 10:27 AM, Tres Seaver wrote:


On 11/28/2012 04:17 PM, Tres Seaver wrote:

Running SA 3.3.2 on Ubuntu 12.04.

Here is how spamd is running:

 $ pgrep -lf spamd
 26110 /usr/bin/perl -T -w /usr/sbin/spamd --create-prefs --max-children 5
  --helper-home-dir --username=vmail --nouser-config
  --virtual-config-dir=/home/vmail/spamassassin/%d/%l
  --syslog=/var/log/spamd.log --debug=all,bayes,check,config
  -d --pidfile=/var/run/spamd.pid
 26112 spamd child
 26113 spamd child

And the tokens for my account:

 # sa-learn --dump=magic \
     --dbpath=/home/vmail/spamassassin/example.com/localname
 0.000          0          3          0  non-token data: bayes db version
 0.000          0       3109          0  non-token data: nspam
 0.000          0      24458          0  non-token data: nham
 0.000          0     177188          0  non-token data: ntokens
 0.000          0 1351290514          0  non-token data: oldest atime
 0.000          0 1354054449          0  non-token data: newest atime
 0.000          0          0          0  non-token data: last journal sync atime
 0.000          0 1354062194          0  non-token data: last expiry atime
 0.000          0    2764800          0  non-token data: last expire atime delta
 0.000          0       7488          0  non-token data: last expire reduction count

But I see nothing in the log for 'bayes' during normal processing;  I
only see entries immediately after restart (e.g., the nightly restart
after updating rulesets):

 # grep : bayes /var/log/spamd.log
 ... [25810] dbg: logger: adding facilities: all, bayes, check, config
 ... [26110] dbg: config: fixed relative path:
 /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
 ... [26110] dbg: config: using
 /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
 for included file
 ... [26110] dbg: bayes:
 learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x2db7fe0),
 bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
 ... [26110] dbg: bayes:
 learner_new: got
 store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x26392f8)
 ... [26110] dbg: bayes: no dbs present, cannot tie DB R/O:
 /tmp/spamd-26110-init/.spamassassin/bayes_toks
 ... [26110] dbg: bayes: no dbs present, cannot tie DB R/O:
 /tmp/spamd-26110-init/.spamassassin/bayes_toks

Which smells to me as though the bayesian stuff is not enabled.  But:

 # grep bayes /etc/spamassassin/local.cf
 use_bayes 1
 bayes_auto_learn 1
 bayes_ignore_header X-Bogosity
 bayes_ignore_header X-Spam-Flag
 bayes_ignore_header X-Spam-Status
 #   and a well-trained bayes DB can save running rules, too

Any suggestions where I should be looking?

I think I have identified the issue:  using amavisd's built-in SA support
means that per-recipient spam checks aren't feasible (amavisd wants to
pass each message through SA exactly once, rather than once per
recipient).  I think that I'm not seeing any log activity in the spamd
log because amavisd isn't delegating to spamd, but rather using the SA
perl modules directly.  Can anyone confirm that hypothesis?


Confirmed.  Amavisd does not do per-user SA and uses the SA libraries 
internally rather than talking to spamd.



I believe the workaround is going to be to disable SA support inside
amavisd and instead do the SA processing during the delivery phase, where
I can run 'spamc -u user' to play nicely with spamd's --virtual-config-dir.


That is what I do.  Pass in the user with 'spamc -u email-address' and 
use the --virtual-config-dir to tell spamd where to find the directories 
for each user.
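
To make the directory lookup concrete, here is a small sketch of how a pattern such as /home/vmail/spamassassin/%d/%l would expand for one address. It is an illustration written for this thread, not spamd's own code: the function name is made up, it handles only the %u, %l and %d escapes, and it assumes the user name is a full email address with no characters special to sed.

```shell
#!/bin/sh
# Sketch only: expand a --virtual-config-dir style pattern for one user name.
# expand_virtual_config_dir is a hypothetical helper, not part of SpamAssassin.
expand_virtual_config_dir() {
    pattern="$1"
    user="$2"
    local_part="${user%%@*}"   # text before the first @
    domain="${user#*@}"        # text after the first @
    # Substitute the three escapes handled by this sketch.
    printf '%s\n' "$pattern" | sed \
        -e "s|%u|$user|g" \
        -e "s|%l|$local_part|g" \
        -e "s|%d|$domain|g"
}

expand_virtual_config_dir '/home/vmail/spamassassin/%d/%l' 'localname@example.com'
# prints /home/vmail/spamassassin/example.com/localname
```

Running it with the address passed to 'spamc -u' shows which directory needs to exist and be readable by the user spamd runs as.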


--
Bowie


Debugging bayes w/ '--virtual-config-dir'

2012-11-28 Thread Tres Seaver

Running SA 3.3.2 on Ubuntu 12.04.

Here is how spamd is running:

 $ pgrep -lf spamd
 26110 /usr/bin/perl -T -w /usr/sbin/spamd --create-prefs --max-children 5
  --helper-home-dir --username=vmail --nouser-config
  --virtual-config-dir=/home/vmail/spamassassin/%d/%l
  --syslog=/var/log/spamd.log --debug=all,bayes,check,config
  -d --pidfile=/var/run/spamd.pid
 26112 spamd child
 26113 spamd child

And the tokens for my account:

 # sa-learn --dump=magic \
     --dbpath=/home/vmail/spamassassin/example.com/localname
 0.000          0          3          0  non-token data: bayes db version
 0.000          0       3109          0  non-token data: nspam
 0.000          0      24458          0  non-token data: nham
 0.000          0     177188          0  non-token data: ntokens
 0.000          0 1351290514          0  non-token data: oldest atime
 0.000          0 1354054449          0  non-token data: newest atime
 0.000          0          0          0  non-token data: last journal sync atime
 0.000          0 1354062194          0  non-token data: last expiry atime
 0.000          0    2764800          0  non-token data: last expire atime delta
 0.000          0       7488          0  non-token data: last expire reduction count

But I see nothing in the log for 'bayes' during normal processing;  I
only see entries immediately after restart (e.g., the nightly restart
after updating rulesets):

 # grep : bayes /var/log/spamd.log
 ... [25810] dbg: logger: adding facilities: all, bayes, check, config
 ... [26110] dbg: config: fixed relative path:
 /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
 ... [26110] dbg: config: using
 /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
 for included file
 ... [26110] dbg: bayes:
 learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x2db7fe0),
 bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
 ... [26110] dbg: bayes:
 learner_new: got
 store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x26392f8)
 ... [26110] dbg: bayes: no dbs present, cannot tie DB R/O:
 /tmp/spamd-26110-init/.spamassassin/bayes_toks
 ... [26110] dbg: bayes: no dbs present, cannot tie DB R/O:
 /tmp/spamd-26110-init/.spamassassin/bayes_toks

Which smells to me as though the bayesian stuff is not enabled.  But:

 # grep bayes /etc/spamassassin/local.cf
 use_bayes 1
 bayes_auto_learn 1
 bayes_ignore_header X-Bogosity
 bayes_ignore_header X-Spam-Flag
 bayes_ignore_header X-Spam-Status
 #   and a well-trained bayes DB can save running rules, too

Any suggestions where I should be looking?


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   Excellence by Design    http://palladion.com