RE: Question about user specific bayes

2022-01-18 Thread Dino Edwards


> Note that SA will try to create an empty DB if none exists. I'm not sure that 
> I can think up a circumstance (other than a disappearing user) where fallback 
> to global Bayes would happen. SA will not fall back to a global Bayes DB 
> just because an otherwise perfectly good per-user DB isn't properly seeded.

It doesn't seem to be creating an empty database at all. I'm not sure why.

> -Original Message-
> From: Bill Cole 
> Sent: Tuesday, January 18, 2022 12:23 PM
> To: users@spamassassin.apache.org
> Subject: Re: Question about user specific bayes
>
> On 2022-01-18 at 11:12:01 UTC-0500 (Tue, 18 Jan 2022 16:12:01 +) 
> Dino Edwards  is rumored to have said:
>
>> Hi,
>>
>> I'm trying to implement user-specific Bayes. My current global Bayes 
>> setup is as follows. I'm also using amavis:
>>
>> bayes_path /opt/sa-bayes/bayes
>> bayes_file_mode 0777
>
> Don't do that anywhere. It's not safe.
>
>> use_bayes 1
>> use_bayes_rules 1
>> bayes_auto_learn 0
>> bayes_auto_learn_threshold_spam 15
>> bayes_auto_learn_threshold_nonspam -5
> [...]
>>
>> and it did seem to create bayes_toks and bayes_seen files under the 
>> /opt/sa-bayes-users/b...@domain.tld
>> directory as expected.
>
> So, it is working.
>
>> Is this all that's required to get this working?
>
> Yes
>
>> What happens to the global bayes file in local.cf? Is that no longer 
>> used?
>
> I believe that it would be used if for some reason SA couldn't figure 
> out which user to pick for a scan at runtime. Maybe if spamd was 
> launched as a user that was later deleted?
>
> But generally, working per-user Bayes setup makes the global file 
> pointless and unused.
>
>>
>> How do the following settings from the local.cf figure in the user 
>> specific bayes files?
>>
>> use_bayes 1
>> use_bayes_rules 1
>> bayes_auto_learn 0
>> bayes_auto_learn_threshold_spam 15
>> bayes_auto_learn_threshold_nonspam -5
>
> The local.cf file is loaded before user_prefs, which is the last 
> config file loaded, so anything that can be changed in user_prefs 
> (i.e. all of those, I believe) which is set in user_prefs will 'stick'
>
> Note that in this case you're choosing to disable auto-learn, so the 
> threshold values are never used.
>
>> Do the user specific bayes have the same requirements to train them 
>> with at least 200 messages?
>
> Yes. Each Bayes DB must be seeded before it can be used. You should 
> also plan a way to regularly feed known spam and ham to those 
> databases, since you aren't auto-learning.
>
>> before they start working?
>
> Before SA will determine a Bayes score on incoming messages, yes.
>
>
>
>
> --
> Bill Cole
> b...@scconsult.com or billc...@apache.org (AKA @grumpybozo and many 
> *@billmail.scconsult.com addresses) Not Currently Available For Hire




RE: Question about user specific bayes

2022-01-18 Thread Dino Edwards
Hi, thanks for the quick reply. So when amavis calls SA for an incoming 
message, it passes the recipient (e-mail address) in the %u variable. SA then 
looks in the /opt/sa-bayes-users/%u directory for a Bayes database and, if it 
finds one that is properly seeded, uses it. If not, it falls back to the 
global Bayes. Is that correct?

Thanks



-Original Message-
From: Bill Cole  
Sent: Tuesday, January 18, 2022 12:23 PM
To: users@spamassassin.apache.org
Subject: Re: Question about user specific bayes

On 2022-01-18 at 11:12:01 UTC-0500 (Tue, 18 Jan 2022 16:12:01 +) Dino 
Edwards  is rumored to have said:

> Hi,
>
> I'm trying to implement user-specific Bayes. My current global Bayes 
> setup is as follows. I'm also using amavis:
>
> bayes_path /opt/sa-bayes/bayes
> bayes_file_mode 0777

Don't do that anywhere. It's not safe.

> use_bayes 1
> use_bayes_rules 1
> bayes_auto_learn 0
> bayes_auto_learn_threshold_spam 15
> bayes_auto_learn_threshold_nonspam -5
[...]
>
> and it did seem to create bayes_toks and bayes_seen files under the 
> /opt/sa-bayes-users/b...@domain.tld
> directory as expected.

So, it is working.

> Is this all that's required to get this working?

Yes

> What happens to the global bayes file in local.cf? Is that no longer 
> used?

I believe that it would be used if for some reason SA couldn't figure out which 
user to pick for a scan at runtime. Maybe if spamd was launched as a user that 
was later deleted?

But generally, working per-user Bayes setup makes the global file pointless and 
unused.

>
> How do the following settings from the local.cf figure in the user 
> specific bayes files?
>
> use_bayes 1
> use_bayes_rules 1
> bayes_auto_learn 0
> bayes_auto_learn_threshold_spam 15
> bayes_auto_learn_threshold_nonspam -5

The local.cf file is loaded before user_prefs, which is the last config file 
loaded, so any of those settings that is also set in user_prefs (all of them 
can be, I believe) will 'stick'.

Note that in this case you're choosing to disable auto-learn, so the threshold 
values are never used.

> Do the user specific bayes have the same requirements to train them 
> with at least 200 messages?

Yes. Each Bayes DB must be seeded before it can be used. You should also plan a 
way to regularly feed known spam and ham to those databases, since you aren't 
auto-learning.
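
A minimal sketch of such a regular feed, assuming a hypothetical corpus layout 
of CORPUS/<user>/spam and CORPUS/<user>/ham holding hand-sorted mail. The 
function name train_all, the corpus layout, and the SA_LEARN override are 
illustrative assumptions; only sa-learn's --spam, --ham and --dbpath options 
come from the tool itself:

```shell
#!/bin/sh
# train_all DB_BASE CORPUS_BASE
# For every per-user directory under DB_BASE, learn from the matching
# CORPUS_BASE/<user>/spam and CORPUS_BASE/<user>/ham directories, if present.
# The corpus layout is a hypothetical example, not something SA prescribes.
train_all() {
    db_base=$1
    corpus_base=$2
    for dir in "$db_base"/*/; do
        [ -d "$dir" ] || continue          # no per-user directories at all
        user=$(basename "$dir")
        for kind in spam ham; do
            src=$corpus_base/$user/$kind
            [ -d "$src" ] || continue      # nothing sorted for this user yet
            # SA_LEARN can be set to "echo" for a dry run (assumption).
            ${SA_LEARN:-sa-learn} "--$kind" --dbpath "${dir%/}" "$src"
        done
    done
}
```

For example, "train_all /opt/sa-bayes-users /path/to/corpus" run from cron as 
the user that owns the databases (amavis in this setup) would keep each 
per-user DB fed.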

> before they start working?

Before SA will determine a Bayes score on incoming messages, yes.
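
One way to check whether a given per-user DB has been seeded is to look at the 
nspam and nham counters printed by "sa-learn --dump magic". The sketch below 
assumes the usual dump layout (count in the third column, label 
"non-token data: nspam" / "non-token data: nham") and the default minimum of 
200 spam and 200 ham; the function name bayes_seeded is made up for this 
example:

```shell
#!/bin/sh
# bayes_seeded: read `sa-learn --dump magic` output on stdin and report
# whether both counters reach the default minimum of 200.
# Assumes the count is the third whitespace-separated field of each line.
bayes_seeded() {
    awk '
        /non-token data: nspam$/ { nspam = $3 }
        /non-token data: nham$/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d ", nspam, nham
            if (nspam >= 200 && nham >= 200) print "seeded"
            else print "not-seeded"
        }'
}
```

Usage would be along the lines of 
"sa-learn --dbpath /opt/sa-bayes-users/<user> --dump magic | bayes_seeded", 
with <user> standing in for one of the per-user directory names.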




--
Bill Cole
b...@scconsult.com or billc...@apache.org (AKA @grumpybozo and many 
*@billmail.scconsult.com addresses) Not Currently Available For Hire


Question about user specific bayes

2022-01-18 Thread Dino Edwards
Hi,

I'm trying to implement user-specific Bayes. My current global Bayes setup is 
as follows. I'm also using amavis:

bayes_path /opt/sa-bayes/bayes
bayes_file_mode 0777
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 0
bayes_auto_learn_threshold_spam 15
bayes_auto_learn_threshold_nonspam -5



Based on various things I've read online, I've set up the following in 
/etc/default/spamassassin in an attempt to enable user-specific Bayes:


OPTIONS="--create-prefs --max-children 5 
--helper-home-dir=/opt/sa-bayes-users/%u -x -u amavis"

I've also created a bunch of subdirectories with usernames under 
/opt/sa-bayes-users. Example:

/opt/sa-bayes-users/b...@domain.tld
/opt/sa-bayes-users/la...@domain.tld

Etc...

I've set the owner of /opt/sa-bayes-users/ to amavis and set the permissions 
to 700.

I've run a test sa-learn as follows where /mnt/data/amavis/clean/n/nTutbwTMVWzK 
is the actual e-mail file I use to train SA:

sa-learn --spam --dbpath /opt/sa-bayes-users/b...@domain.tld 
/mnt/data/amavis/clean/n/nTutbwTMVWzK

and it did seem to create bayes_toks and bayes_seen files under the 
/opt/sa-bayes-users/b...@domain.tld directory as expected.

Is this all that's required to get this working?

What happens to the global bayes file in local.cf? Is that no longer used?

How do the following settings from the local.cf figure in the user specific 
bayes files?

use_bayes 1
use_bayes_rules 1
bayes_auto_learn 0
bayes_auto_learn_threshold_spam 15
bayes_auto_learn_threshold_nonspam -5


Do the user-specific Bayes databases have the same requirement to be trained 
with at least 200 messages before they start working?

Thanks in advance