Re: training bayes database

2018-05-16 Thread Alex Woick

David B Funk wrote on 10.05.2018 at 20:23:

On Thu, 10 May 2018, John Hardin wrote:


On Thu, 10 May 2018, Matthew Broadhead wrote:


On 09/05/18 20:43, David Jones wrote:

On 05/09/2018 01:29 PM, Matthew Broadhead wrote:

On 09/05/18 16:37, Reindl Harald wrote:


quoting URIBL_BLOCKED is a joke - set up a *recursive* *non-forwarding*
nameserver, no dnsmasq or such crap

http://uribl.com/refused.shtml

with your setup you *obviously* exceed rate limits and have most
DNSBL/URIBL not working and so you can't expect useful results at all


X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
 tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
 MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
 autolearn=ham autolearn_force=no


i followed the guidance at that url and it gave me
[root@ns1 ~]# host -tTXT 2.0.0.127.multi.uribl.com
2.0.0.127.multi.uribl.com descriptive text "127.0.0.1 -> Query 
Refused. See http://uribl.com/refused.shtml for more information 
[Your DNS IP: 213.171.193.134]"
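
The refusal text above names the resolver address that URIBL actually saw, which is the quickest way to confirm whose DNS server is sending the queries. A minimal Python sketch, assuming the TXT answer keeps the "[Your DNS IP: ...]" suffix shown here; the function name `parse_uribl_txt` is made up for illustration:

```python
import re

# Pattern for the "[Your DNS IP: a.b.c.d]" suffix seen in the refusal text.
# Assumption: the suffix format matches the sample quoted in this thread.
_DNS_IP = re.compile(r"\[Your DNS IP:\s*([0-9a-fA-F.:]+)\]")


def parse_uribl_txt(txt):
    """Return (refused, resolver_ip) for a TXT answer from the test lookup."""
    refused = "Query Refused" in txt
    match = _DNS_IP.search(txt)
    return refused, (match.group(1) if match else None)


sample = ("127.0.0.1 -> Query Refused. See http://uribl.com/refused.shtml "
          "for more information [Your DNS IP: 213.171.193.134]")
refused, resolver_ip = parse_uribl_txt(sample)
print(refused, resolver_ip)
```

If the reported address is not the mail server's own public IP, the lookups are still going out through someone else's resolver.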


i guess my dns is set to use my isp's dns server.  do i need to 
set up dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so i 
should qualify for the free lookup?


Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that 
is not forwarding to another DNS server then set your 
/etc/resolv.conf or SA dns_server to 127.0.0.1.  This will make 
your DNS queries isolated from your IP to stay under their daily 
limit.


Keep in mind that if your SA box is behind NAT that is not 
dedicated to your server then other DNS queries could get combined 
with your shared public IP.  This is not likely since others are 
not going to query RBL/URIBL servers but it's possible.  If your SA 
server is directly on the Internet as an edge mail gateway then 
this won't be a problem.


i already had bind handling my dns.  i just had to add to 
/etc/named.conf


allow-query-cache {localhost; any;};
recursion yes;


Don't forget to *turn off forwarding*.


and to /etc/resolv.conf

nameserver 127.0.0.1


That is the most important point in this whole discussion.

It doesn't matter (much) what DNS server/software you use so long as 
it supports recursive NON-FORWARDED queries.
Caching is desirable but is only a secondary consideration VS the 
first point.


Security point; when you run a recursive server it is a potential DDOS 
risk, so protect it from being used/abused by untrusted clients. (best 
if it only listens on the loopback address, 127.* or has strong 
ACL/access control support that is properly configured).


I saw in the above quotes that Matthew opened his server to answer any 
recursive query - this is what makes it a security risk if that server 
is directly facing the internet. If you have a server like mine that 
hosts the primary (or secondary) DNS zone for your domain and also runs 
the mail server, you want to allow DNS recursion only for local queries 
and disable recursion for everyone else, while still allowing 
non-recursive queries for your zone. You achieve this in BIND with:


    allow-query { any; };
    allow-query-cache   { localhost; };
    allow-recursion { localhost; };

What matters is that allow-query-cache and allow-recursion list only 
localhost. Matthew has an "any;" included and a global "recursion yes" - 
remove this! If you need this on a public-facing server, you're doing 
something wrong. Don't put a recursive DNS server online these days.
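
The difference between the two configurations can be checked mechanically. The following Python sketch is only an illustration, not a BIND tool: the helper name `open_recursion_options` is hypothetical, and it assumes simple single-level `{ ...; }` address lists with no comments or named ACLs.

```python
import re

# Options that decide who may use the server as a recursive resolver.
RECURSION_OPTIONS = ("allow-recursion", "allow-query-cache")


def open_recursion_options(named_conf_text):
    """Return the recursion-related options whose address list contains 'any'."""
    risky = []
    for option in RECURSION_OPTIONS:
        # Matches e.g. "allow-query-cache { localhost; any; };"
        pattern = re.compile(r"\b" + re.escape(option) + r"\s*\{([^}]*)\}")
        for match in pattern.finditer(named_conf_text):
            entries = [e.strip() for e in match.group(1).split(";") if e.strip()]
            if "any" in entries:
                risky.append(option)
    return risky


matthew = "allow-query-cache {localhost; any;};\nrecursion yes;\n"
alex = ("allow-query { any; };\n"
        "allow-query-cache   { localhost; };\n"
        "allow-recursion { localhost; };\n")

print(open_recursion_options(matthew))
print(open_recursion_options(alex))
```

Note that `allow-query { any; };` is deliberately not flagged: answering non-recursive queries for your own zones from anywhere is the normal job of an authoritative server.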


Alex



Re: training bayes database

2018-05-10 Thread David B Funk

On Thu, 10 May 2018, John Hardin wrote:


On Thu, 10 May 2018, Matthew Broadhead wrote:


On 09/05/18 20:43, David Jones wrote:

On 05/09/2018 01:29 PM, Matthew Broadhead wrote:

On 09/05/18 16:37, Reindl Harald wrote:


quoting URIBL_BLOCKED is a joke - set up a *recursive* *non-forwarding*
nameserver, no dnsmasq or such crap

http://uribl.com/refused.shtml

with your setup you *obviously* exceed rate limits and have most
DNSBL/URIBL not working and so you can't expect useful results at all

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
 tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
 MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
 autolearn=ham autolearn_force=no


i followed the guidance at that url and it gave me
[root@ns1 ~]# host -tTXT 2.0.0.127.multi.uribl.com
2.0.0.127.multi.uribl.com descriptive text "127.0.0.1 -> Query Refused. 
See http://uribl.com/refused.shtml for more information [Your DNS IP: 
213.171.193.134]"


i guess my dns is set to use my isp's dns server.  do i need to set up 
dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so i should 
qualify for the free lookup?


Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that is not 
forwarding to another DNS server then set your /etc/resolv.conf or SA 
dns_server to 127.0.0.1.  This will make your DNS queries isolated from 
your IP to stay under their daily limit.


Keep in mind that if your SA box is behind NAT that is not dedicated to 
your server then other DNS queries could get combined with your shared 
public IP.  This is not likely since others are not going to query 
RBL/URIBL servers but it's possible.  If your SA server is directly on the 
Internet as an edge mail gateway then this won't be a problem.



i already had bind handling my dns.  i just had to add to /etc/named.conf

allow-query-cache {localhost; any;};
recursion yes;


Don't forget to *turn off forwarding*.


and to /etc/resolv.conf

nameserver 127.0.0.1


That is the most important point in this whole discussion.

It doesn't matter (much) what DNS server/software you use so long as it supports 
recursive NON-FORWARDED queries.

Caching is desirable but is only a secondary consideration VS the first point.

Security point; when you run a recursive server it is a potential DDOS risk, so 
protect it from being used/abused by untrusted clients. (best if it only listens 
on the loopback address, 127.* or has strong ACL/access control support that is 
properly configured).
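
The resolv.conf side of this setup can be sanity-checked with a few lines of Python. This is a sketch under stated assumptions: the function name `non_loopback_nameservers` is invented here, and it only reads `nameserver` lines. It cannot tell whether the local resolver itself is forwarding.

```python
import ipaddress


def non_loopback_nameservers(resolv_conf_text):
    """Return nameserver addresses in resolv.conf text that are not loopback."""
    offenders = []
    for line in resolv_conf_text.splitlines():
        # Ignore trailing "#" comments; lines starting with ";" fail the keyword test.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            # Drop an IPv6 zone id such as "%eth0" before parsing.
            address = ipaddress.ip_address(fields[1].split("%", 1)[0])
            if not address.is_loopback:
                offenders.append(fields[1])
    return offenders


print(non_loopback_nameservers("nameserver 127.0.0.1\n"))
print(non_loopback_nameservers("nameserver 213.171.193.134\nnameserver 127.0.0.1\n"))
```

An empty list means every configured resolver is on the loopback interface, which is what the advice in this thread aims for.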


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_admin   Iowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{

Re: training bayes database

2018-05-10 Thread John Hardin

On Thu, 10 May 2018, Matthew Broadhead wrote:


On 09/05/18 20:43, David Jones wrote:

On 05/09/2018 01:29 PM, Matthew Broadhead wrote:

On 09/05/18 16:37, Reindl Harald wrote:


On 09.05.2018 at 16:28, Matthew Broadhead wrote:

it looks like it is working.  so maybe it is just not flagging or moving
the spam?

in a different post you showed this status header which *clearly* shows
bayes is working - bayes alone doesn't flag, the total score does, moving
doesn't happen at all at this layer - other software like sieve is
responsible for acting on the headers of a message

quoting URIBL_BLOCKED is a joke - set up a *recursive* *non-forwarding*
nameserver, no dnsmasq or such crap

http://uribl.com/refused.shtml

with your setup you *obviously* exceed rate limits and have most
DNSBL/URIBL not working and so you can't expect useful results at all

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
 tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
 MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
 autolearn=ham autolearn_force=no
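
To make the point about the total score concrete, the header above can be taken apart and the per-test points summed. This is a rough Python sketch, assuming the header layout shown in this thread; `parse_spam_status` is a made-up helper, not part of SpamAssassin or amavisd, and the `score >= required` comparison is the usual threshold rule as I understand it.

```python
import re


def parse_spam_status(header_value):
    """Parse an X-Spam-Status value into (score, required, {test: points})."""
    flat = " ".join(header_value.split())  # undo header line folding
    score = float(re.search(r"\bscore=(-?[\d.]+)", flat).group(1))
    required = float(re.search(r"\brequired=(-?[\d.]+)", flat).group(1))
    tests_field = re.search(r"tests=\[([^\]]*)\]", flat).group(1)
    tests = {}
    for item in tests_field.split(","):
        name, _, points = item.strip().partition("=")
        tests[name] = float(points)
    return score, required, tests


header = """No, score=-18.15 tagged_above=-999 required=6.2
 tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
 MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
 autolearn=ham autolearn_force=no"""

score, required, tests = parse_spam_status(header)
total = round(sum(tests.values()), 2)  # per-test points add up to the score
is_spam = score >= required            # assumed threshold rule
print(score, required, total, is_spam)
```

BAYES_00 contributes only -1.9 of the -18.15 total here; the whitelist and DNSWL hits dominate, and URIBL_BLOCKED shows the URIBL lookups returned nothing usable.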


i followed the guidance at that url and it gave me
[root@ns1 ~]# host -tTXT 2.0.0.127.multi.uribl.com
2.0.0.127.multi.uribl.com descriptive text "127.0.0.1 -> Query Refused. 
See http://uribl.com/refused.shtml for more information [Your DNS IP: 
213.171.193.134]"


i guess my dns is set to use my isp's dns server.  do i need to set up dns 
relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so i should 
qualify for the free lookup?


Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that is not 
forwarding to another DNS server then set your /etc/resolv.conf or SA 
dns_server to 127.0.0.1.  This will make your DNS queries isolated from 
your IP to stay under their daily limit.


Keep in mind that if your SA box is behind NAT that is not dedicated to 
your server then other DNS queries could get combined with your shared 
public IP.  This is not likely since others are not going to query 
RBL/URIBL servers but it's possible.  If your SA server is directly on the 
Internet as an edge mail gateway then this won't be a problem.



i already had bind handling my dns.  i just had to add to /etc/named.conf

allow-query-cache {localhost; any;};
recursion yes;


Don't forget to *turn off forwarding*.


and to /etc/resolv.conf

nameserver 127.0.0.1

i cannot believe that is not the default.  i always assumed my dns was 
working correctly.




--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The Constitution is a written instrument. As such its meaning does
  not alter. That which it meant when adopted, it means now.
-- U.S. Supreme Court
   SOUTH CAROLINA v. US, 199 U.S. 437, 448 (1905)
---
 406 days since the first commercial re-flight of an orbital booster (SpaceX)

Re: training bayes database

2018-05-10 Thread Reio Remma

On 10.05.18 15:23, David Jones wrote:

On 05/10/2018 07:12 AM, Reio Remma wrote:

On 10.05.18 15:08, David Jones wrote:

On 05/10/2018 07:02 AM, Reio Remma wrote:
On a slightly related note: we're running a pfSense firewall with 
DNS Forwarder (dnsmasq) in front of our mail server. From what I've 
gleaned from the net, it caches as well. Should I still install a 
local nameserver (BIND) on the mail server?


Thanks!
Reio


YES!  As I was corrected on this mailing list last year, dnsmasq is 
only a forwarding DNS server so it will cause your queries to be 
lumped into whatever it's forwarding to.  Set up a real recursive DNS 
server locally on your mail server since it should have its own 
dedicated NAT or real public IP on your pfSense firewall so your DNS 
queries will be completely isolated.


There's also the option of DNS Resolver (unbound) on the firewall - 
would that be better?


Reio


No.  Your DNS traffic for your general network served by your firewall 
is much different from your mail server DNS lookup.  You will probably 
want to forward your firewall DNS server to OpenDNS, Google, or even 
do DNS over TLS someday.


https://wiki.apache.org/spamassassin/CachingNameserver

My favorite is PowerDNS Recursor but Unbound is very popular. 


That seems to have worked - installed unbound and set dns_server 
127.0.0.1 in local.cf


Thanks,
Reio


Re: training bayes database

2018-05-10 Thread David Jones

On 05/10/2018 07:12 AM, Reio Remma wrote:

On 10.05.18 15:08, David Jones wrote:

On 05/10/2018 07:02 AM, Reio Remma wrote:

On 10.05.18 14:58, Matus UHLAR - fantomas wrote:

On 09.05.2018 at 16:28, Matthew Broadhead wrote:
i guess my dns is set to use my isp's dns server.  do i need to 
set up dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so 
i should qualify for the free lookup?



On 09/05/18 20:43, David Jones wrote:
Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that 
is not forwarding to another DNS server then set your 
/etc/resolv.conf or SA dns_server to 127.0.0.1.  This will make 
your DNS queries isolated from your IP to stay under their daily 
limit.


Keep in mind that if your SA box is behind NAT that is not 
dedicated to your server then other DNS queries could get combined 
with your shared public IP.  This is not likely since others are 
not going to query RBL/URIBL servers but it's possible.  If your 
SA server is directly on the Internet as an edge mail gateway then 
this won't be a problem.




On 10.05.18 12:15, Matthew Broadhead wrote:
i already had bind handling my dns.  i just had to add to 
/etc/named.conf


allow-query-cache {localhost; any;};


NO!
this way everyone is allowed to use your server as recursive DNS.

only allow "localhost;" - it matches all IPv4 and IPv6 addresses on your 
system.


It's also better to define allow-recursion instead.
While it means something different, both have the same defaults, and
allow-recursion has a clearer meaning.


recursion yes;


not needed by default.


and to /etc/resolv.conf

nameserver 127.0.0.1

i cannot believe that is not the default.  i always assumed my dns 
was working correctly.


It's not the default to have a DNS server on your system. And it's not 
the default to have localhost in resolv.conf - the local server may be 
authoritative-only.


On a slightly related note: we're running a pfSense firewall with DNS 
Forwarder (dnsmasq) in front of our mail server. From what I've 
gleaned from the net, it caches as well. Should I still install a 
local nameserver (BIND) on the mail server?


Thanks!
Reio


YES!  As I was corrected on this mailing list last year, dnsmasq is 
only a forwarding DNS server so it will cause your queries to be 
lumped into whatever it's forwarding to.  Set up a real recursive DNS 
server locally on your mail server since it should have its own 
dedicated NAT or real public IP on your pfSense firewall so your DNS 
queries will be completely isolated.


There's also the option of DNS Resolver (unbound) on the firewall - 
would that be better?


Reio


No.  Your DNS traffic for your general network served by your firewall 
is much different from your mail server DNS lookup.  You will probably 
want to forward your firewall DNS server to OpenDNS, Google, or even do 
DNS over TLS someday.


https://wiki.apache.org/spamassassin/CachingNameserver

My favorite is PowerDNS Recursor but Unbound is very popular.

--
David Jones


Re: training bayes database

2018-05-10 Thread Matus UHLAR - fantomas

On 09.05.2018 at 16:28, Matthew Broadhead wrote:
i guess my dns is set to use my isp's dns server.  do i need 
to set up dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain 
so i should qualify for the free lookup?



On 09/05/18 20:43, David Jones wrote:
Yes.  Setup BIND, unbound, or pdns_recursor on your SA server 
that is not forwarding to another DNS server then set your 
/etc/resolv.conf or SA dns_server to 127.0.0.1.  This will make 
your DNS queries isolated from your IP to stay under their 
daily limit.


Keep in mind that if your SA box is behind NAT that is not 
dedicated to your server then other DNS queries could get 
combined with your shared public IP.  This is not likely since 
others are not going to query RBL/URIBL servers but it's 
possible.  If your SA server is directly on the Internet as an 
edge mail gateway then this won't be a problem.


On 10.05.18 15:02, Reio Remma wrote:
On a slightly related note: we're running a pfSense firewall with DNS 
Forwarder (dnsmasq) in front of our mail server. From what I've 
gleaned from the net, it caches as well. Should I still install a 
local nameserver (BIND) on the mail server?


The requirement is not for a caching server - it's for a recursing server.

dnsmasq is a forwarding server, get rid of it when possible. It's even
documented:

https://wiki.apache.org/spamassassin/CachingNameserver

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
If Barbie is so popular, why do you have to buy her friends? 


Re: training bayes database

2018-05-10 Thread Reio Remma

On 10.05.18 15:08, David Jones wrote:

On 05/10/2018 07:02 AM, Reio Remma wrote:

On 10.05.18 14:58, Matus UHLAR - fantomas wrote:

On 09.05.2018 at 16:28, Matthew Broadhead wrote:
i guess my dns is set to use my isp's dns server.  do i need to 
set up dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so 
i should qualify for the free lookup?



On 09/05/18 20:43, David Jones wrote:
Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that 
is not forwarding to another DNS server then set your 
/etc/resolv.conf or SA dns_server to 127.0.0.1.  This will make 
your DNS queries isolated from your IP to stay under their daily 
limit.


Keep in mind that if your SA box is behind NAT that is not 
dedicated to your server then other DNS queries could get combined 
with your shared public IP.  This is not likely since others are 
not going to query RBL/URIBL servers but it's possible.  If your 
SA server is directly on the Internet as an edge mail gateway then 
this won't be a problem.




On 10.05.18 12:15, Matthew Broadhead wrote:
i already had bind handling my dns.  i just had to add to 
/etc/named.conf


allow-query-cache {localhost; any;};


NO!
this way everyone is allowed to use your server as recursive DNS.

only allow "localhost;" - it matches all IPv4 and IPv6 addresses on your 
system.


It's also better to define allow-recursion instead.
While it means something different, both have the same defaults, and
allow-recursion has a clearer meaning.


recursion yes;


not needed by default.


and to /etc/resolv.conf

nameserver 127.0.0.1

i cannot believe that is not the default.  i always assumed my dns 
was working correctly.


It's not the default to have a DNS server on your system. And it's not 
the default to have localhost in resolv.conf - the local server may be 
authoritative-only.


On a slightly related note: we're running a pfSense firewall with DNS 
Forwarder (dnsmasq) in front of our mail server. From what I've 
gleaned from the net, it caches as well. Should I still install a 
local nameserver (BIND) on the mail server?


Thanks!
Reio


YES!  As I was corrected on this mailing list last year, dnsmasq is 
only a forwarding DNS server so it will cause your queries to be 
lumped into whatever it's forwarding to.  Set up a real recursive DNS 
server locally on your mail server since it should have its own 
dedicated NAT or real public IP on your pfSense firewall so your DNS 
queries will be completely isolated.


There's also the option of DNS Resolver (unbound) on the firewall - 
would that be better?


Reio


Re: training bayes database

2018-05-10 Thread David Jones

On 05/10/2018 07:02 AM, Reio Remma wrote:

On 10.05.18 14:58, Matus UHLAR - fantomas wrote:

On 09.05.2018 at 16:28, Matthew Broadhead wrote:
i guess my dns is set to use my isp's dns server.  do i need to set 
up dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so i 
should qualify for the free lookup?



On 09/05/18 20:43, David Jones wrote:
Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that 
is not forwarding to another DNS server then set your 
/etc/resolv.conf or SA dns_server to 127.0.0.1.  This will make your 
DNS queries isolated from your IP to stay under their daily limit.


Keep in mind that if your SA box is behind NAT that is not dedicated 
to your server then other DNS queries could get combined with your 
shared public IP.  This is not likely since others are not going to 
query RBL/URIBL servers but it's possible.  If your SA server is 
directly on the Internet as an edge mail gateway then this won't be 
a problem.




On 10.05.18 12:15, Matthew Broadhead wrote:
i already had bind handling my dns.  i just had to add to 
/etc/named.conf


allow-query-cache {localhost; any;};


NO!
this way everyone is allowed to use your server as recursive DNS.

only allow "localhost;" - it matches all IPv4 and IPv6 addresses on your 
system.


It's also better to define allow-recursion instead.
While it means something different, both have the same defaults, and
allow-recursion has a clearer meaning.


recursion yes;


not needed by default.


and to /etc/resolv.conf

nameserver 127.0.0.1

i cannot believe that is not the default.  i always assumed my dns 
was working correctly.


It's not the default to have a DNS server on your system. And it's not 
the default to have localhost in resolv.conf - the local server may be 
authoritative-only.


On a slightly related note: we're running a pfSense firewall with DNS 
Forwarder (dnsmasq) in front of our mail server. From what I've gleaned 
from the net, it caches as well. Should I still install a local 
nameserver (BIND) on the mail server?


Thanks!
Reio


YES!  As I was corrected on this mailing list last year, dnsmasq is only 
a forwarding DNS server so it will cause your queries to be lumped into 
whatever it's forwarding to.  Set up a real recursive DNS server locally 
on your mail server since it should have its own dedicated NAT or real 
public IP on your pfSense firewall so your DNS queries will be 
completely isolated.


--
David Jones


Re: training bayes database

2018-05-10 Thread Reio Remma

On 10.05.18 14:58, Matus UHLAR - fantomas wrote:

On 09.05.2018 at 16:28, Matthew Broadhead wrote:
i guess my dns is set to use my isp's dns server.  do i need to set 
up dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so i 
should qualify for the free lookup?



On 09/05/18 20:43, David Jones wrote:
Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that 
is not forwarding to another DNS server then set your 
/etc/resolv.conf or SA dns_server to 127.0.0.1.  This will make your 
DNS queries isolated from your IP to stay under their daily limit.


Keep in mind that if your SA box is behind NAT that is not dedicated 
to your server then other DNS queries could get combined with your 
shared public IP.  This is not likely since others are not going to 
query RBL/URIBL servers but it's possible.  If your SA server is 
directly on the Internet as an edge mail gateway then this won't be 
a problem.




On 10.05.18 12:15, Matthew Broadhead wrote:
i already had bind handling my dns.  i just had to add to 
/etc/named.conf


allow-query-cache {localhost; any;};


NO!
this way everyone is allowed to use your server as recursive DNS.

only allow "localhost;" - it matches all IPv4 and IPv6 addresses on your 
system.


It's also better to define allow-recursion instead.
While it means something different, both have the same defaults, and
allow-recursion has a clearer meaning.


recursion yes;


not needed by default.


and to /etc/resolv.conf

nameserver 127.0.0.1

i cannot believe that is not the default.  i always assumed my dns 
was working correctly.


It's not the default to have a DNS server on your system. And it's not 
the default to have localhost in resolv.conf - the local server may be 
authoritative-only.


On a slightly related note: we're running a pfSense firewall with DNS 
Forwarder (dnsmasq) in front of our mail server. From what I've gleaned 
from the net, it caches as well. Should I still install a local 
nameserver (BIND) on the mail server?


Thanks!
Reio


Re: training bayes database

2018-05-10 Thread Matus UHLAR - fantomas

On 09.05.2018 at 16:28, Matthew Broadhead wrote:
i guess my dns is set to use my isp's dns server.  do i need to 
set up dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so 
i should qualify for the free lookup?



On 09/05/18 20:43, David Jones wrote:
Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that 
is not forwarding to another DNS server then set your 
/etc/resolv.conf or SA dns_server to 127.0.0.1.  This will make 
your DNS queries isolated from your IP to stay under their daily 
limit.


Keep in mind that if your SA box is behind NAT that is not 
dedicated to your server then other DNS queries could get combined 
with your shared public IP.  This is not likely since others are 
not going to query RBL/URIBL servers but it's possible.  If your SA 
server is directly on the Internet as an edge mail gateway then 
this won't be a problem.




On 10.05.18 12:15, Matthew Broadhead wrote:

i already had bind handling my dns.  i just had to add to /etc/named.conf

allow-query-cache {localhost; any;};


NO!
this way everyone is allowed to use your server as recursive DNS.

only allow "localhost;" - it matches all IPv4 and IPv6 addresses on your system.

It's also better to define allow-recursion instead.
While it means something different, both have the same defaults, and
allow-recursion has a clearer meaning.


recursion yes;


not needed by default.


and to /etc/resolv.conf

nameserver 127.0.0.1

i cannot believe that is not the default.  i always assumed my dns 
was working correctly.


It's not the default to have a DNS server on your system. And it's not the
default to have localhost in resolv.conf - the local server may be
authoritative-only.



--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Eagles may soar, but weasels don't get sucked into jet engines. 


Re: training bayes database

2018-05-10 Thread Matthew Broadhead

On 09/05/18 20:43, David Jones wrote:

On 05/09/2018 01:29 PM, Matthew Broadhead wrote:

On 09/05/18 16:37, Reindl Harald wrote:


On 09.05.2018 at 16:28, Matthew Broadhead wrote:
it looks like it is working.  so maybe it is just not flagging or moving
the spam?

in a different post you showed this status header which *clearly* shows
bayes is working - bayes alone doesn't flag, the total score does, moving
doesn't happen at all at this layer - other software like sieve is
responsible for acting on the headers of a message

quoting URIBL_BLOCKED is a joke - set up a *recursive* *non-forwarding*
nameserver, no dnsmasq or such crap

http://uribl.com/refused.shtml

with your setup you *obviously* exceed rate limits and have most
DNSBL/URIBL not working and so you can't expect useful results at all

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
 tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
 MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
 autolearn=ham autolearn_force=no


i followed the guidance at that url and it gave me
[root@ns1 ~]# host -tTXT 2.0.0.127.multi.uribl.com
2.0.0.127.multi.uribl.com descriptive text "127.0.0.1 -> Query 
Refused. See http://uribl.com/refused.shtml for more information 
[Your DNS IP: 213.171.193.134]"


i guess my dns is set to use my isp's dns server.  do i need to set 
up dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so i 
should qualify for the free lookup?


Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that is 
not forwarding to another DNS server then set your /etc/resolv.conf or 
SA dns_server to 127.0.0.1.  This will make your DNS queries isolated 
from your IP to stay under their daily limit.


Keep in mind that if your SA box is behind NAT that is not dedicated 
to your server then other DNS queries could get combined with your 
shared public IP.  This is not likely since others are not going to 
query RBL/URIBL servers but it's possible.  If your SA server is 
directly on the Internet as an edge mail gateway then this won't be a 
problem.



i already had bind handling my dns.  i just had to add to /etc/named.conf

allow-query-cache {localhost; any;};
recursion yes;

and to /etc/resolv.conf

nameserver 127.0.0.1

i cannot believe that is not the default.  i always assumed my dns was 
working correctly.


Re: training bayes database

2018-05-09 Thread David Jones

On 05/09/2018 01:29 PM, Matthew Broadhead wrote:

On 09/05/18 16:37, Reindl Harald wrote:


On 09.05.2018 at 16:28, Matthew Broadhead wrote:

it looks like it is working.  so maybe it is just not flagging or moving
the spam?

in a different post you showed this status header which *clearly* shows
bayes is working - bayes alone doesn't flag, the total score does, moving
doesn't happen at all at this layer - other software like sieve is
responsible for acting on the headers of a message

quoting URIBL_BLOCKED is a joke - set up a *recursive* *non-forwarding*
nameserver, no dnsmasq or such crap

http://uribl.com/refused.shtml

with your setup you *obviously* exceed rate limits and have most
DNSBL/URIBL not working and so you can't expect useful results at all

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
 tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
 MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
 autolearn=ham autolearn_force=no


i followed the guidance at that url and it gave me
[root@ns1 ~]# host -tTXT 2.0.0.127.multi.uribl.com
2.0.0.127.multi.uribl.com descriptive text "127.0.0.1 -> Query Refused. 
See http://uribl.com/refused.shtml for more information [Your DNS IP: 
213.171.193.134]"


i guess my dns is set to use my isp's dns server.  do i need to set up 
dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so i 
should qualify for the free lookup?


Yes.  Setup BIND, unbound, or pdns_recursor on your SA server that is 
not forwarding to another DNS server then set your /etc/resolv.conf or 
SA dns_server to 127.0.0.1.  This will make your DNS queries isolated 
from your IP to stay under their daily limit.


Keep in mind that if your SA box is behind NAT that is not dedicated to 
your server then other DNS queries could get combined with your shared 
public IP.  This is not likely since others are not going to query 
RBL/URIBL servers but it's possible.  If your SA server is directly on 
the Internet as an edge mail gateway then this won't be a problem.


--
David Jones


Re: training bayes database

2018-05-09 Thread Matthew Broadhead

On 09/05/18 16:37, Reindl Harald wrote:


On 09.05.2018 at 16:28, Matthew Broadhead wrote:

it looks like it is working.  so maybe it is just not flagging or moving
the spam?

in a different post you showed this status header which *clearly* shows
bayes is working - bayes alone doesn't flag, the total score does, moving
doesn't happen at all at this layer - other software like sieve is
responsible for acting on the headers of a message

quoting URIBL_BLOCKED is a joke - set up a *recursive* *non-forwarding*
nameserver, no dnsmasq or such crap

http://uribl.com/refused.shtml

with your setup you *obviously* exceed rate limits and have most
DNSBL/URIBL not working and so you can't expect useful results at all

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
 tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
 MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
 autolearn=ham autolearn_force=no


i followed the guidance at that url and it gave me
[root@ns1 ~]# host -tTXT 2.0.0.127.multi.uribl.com
2.0.0.127.multi.uribl.com descriptive text "127.0.0.1 -> Query Refused. 
See http://uribl.com/refused.shtml for more information [Your DNS IP: 
213.171.193.134]"


i guess my dns is set to use my isp's dns server.  do i need to set up 
dns relay on my machine so it comes from my ip?


there is no way we send more than 500k emails from our domain so i 
should qualify for the free lookup?


Re: training bayes database

2018-05-09 Thread Matthew Broadhead

On 09/05/18 16:37, Reindl Harald wrote:


On 09.05.2018 at 16:28, Matthew Broadhead wrote:

it looks like it is working.  so maybe it is just not flagging or moving
the spam?

in a different post you showed this status header which *clearly* shows
bayes is working - bayes alone doesn't flag, the total score does, moving
doesn't happen at all at this layer - other software like sieve is
responsible for acting on the headers of a message

quoting URIBL_BLOCKED is a joke - set up a *recursive* *non-forwarding*
nameserver, no dnsmasq or such crap

http://uribl.com/refused.shtml

with your setup you *obviously* exceed rate limits and have most
DNSBL/URIBL not working and so you can't expect useful results at all

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
 tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
 MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
 URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
 autolearn=ham autolearn_force=no

ah i got it...https://wiki.apache.org/spamassassin/CachingNameserver


Re: training bayes database

2018-05-09 Thread John Hardin

On Wed, 9 May 2018, Reio Remma wrote:


On 9 May 2018, at 18:33, John Hardin  wrote:

Also:


On Wed, 9 May 2018, Matthew Broadhead wrote:

your message has

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2


Setting the threshold higher will result in more spam getting through. The 
scores calculated by the masscheck processes are based on the assumption that 
the threshold is set to 5.0

Is there some specific reason you set the threshold higher than 5.0?


IIRC 6.2 is the default in amavisd in CentOS 7.


Ah. Ok.

That's odd. I would presume Amavis has some mechanism to compensate for 
that.
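
To make the threshold point concrete, here is a minimal sketch that pulls `score` and `required` out of an X-Spam-Status value and compares them. It assumes a message is tagged once its total score reaches the threshold; `parse_status` and `is_flagged` are hypothetical names for illustration, and the 5.5-point message is invented.

```python
import re


def parse_status(header: str):
    """Extract (score, required) from an X-Spam-Status header value."""
    score = float(re.search(r"score=(-?[0-9.]+)", header).group(1))
    required = float(re.search(r"required=(-?[0-9.]+)", header).group(1))
    return score, required


def is_flagged(score: float, required: float) -> bool:
    # Assumption: tagging happens when the total score reaches the threshold.
    return score >= required


# The header quoted in this thread: far below either threshold.
score, required = parse_status("No, score=-18.15 tagged_above=-999 required=6.2")
print(is_flagged(score, required))

# An invented 5.5-point spam is caught at 5.0 but slips through at 6.2.
print(is_flagged(5.5, 5.0), is_flagged(5.5, 6.2))
```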


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Your mouse has moved. Your Windows Operating System must be
  relicensed due to this hardware change. Please contact Microsoft
  to obtain a new activation key. If this hardware change results in
  added functionality you may be subject to additional license fees.
  Your system will now shut down. Thank you for choosing Microsoft.
---
 405 days since the first commercial re-flight of an orbital booster (SpaceX)


Re: training bayes database

2018-05-09 Thread Reio Remma


> On 9 May 2018, at 18:33, John Hardin  wrote:
> 
> Also:
> 
>> On Wed, 9 May 2018, Matthew Broadhead wrote:
>> 
>> your message has
>> 
>> X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
> 
> Setting the threshold higher will result in more spam getting through. The 
> scores calculated by the masscheck processes are based on the assumption that 
> the threshold is set to 5.0
> 
> Is there some specific reason you set the threshold higher than 5.0?

IIRC 6.2 is the default in amavisd in CentOS 7.

Reio


Re: training bayes database

2018-05-09 Thread John Hardin

Also:

On Wed, 9 May 2018, Matthew Broadhead wrote:


your message has

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2


Setting the threshold higher will result in more spam getting through. The 
scores calculated by the masscheck processes are based on the assumption 
that the threshold is set to 5.0


Is there some specific reason you set the threshold higher than 5.0?



Re: training bayes database

2018-05-09 Thread John Hardin

On Wed, 9 May 2018, Matthew Broadhead wrote:


[root@ns1 ~]# sudo -H -u amavis bash -c '/usr/bin/sa-learn --dump magic'
0.000  0  3  0  non-token data: bayes db version
0.000  0  32225  0  non-token data: nspam
0.000  0 440420  0  non-token data: nham


So you have a bunch of stuff trained, biased towards ham.
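
The size of that bias can be read straight off the dump. A minimal sketch, assuming the `sa-learn --dump magic` layout shown in this thread (four numeric columns followed by `non-token data: <label>`); `parse_magic` is a hypothetical helper, not an SpamAssassin API:

```python
import re

# Assumed line layout, taken from the output pasted in this thread.
LINE = re.compile(r"^\s*[\d.]+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+?)\s*$")


def parse_magic(text: str) -> dict:
    """Map each 'non-token data' label to its integer value."""
    values = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:  # wrapped or unrelated lines are skipped
            values[match.group(2)] = int(match.group(1))
    return values


sample = """\
0.000  0  3  0  non-token data: bayes db version
0.000  0  32225  0  non-token data: nspam
0.000  0 440420  0  non-token data: nham
"""
magic = parse_magic(sample)
spam_share = magic["nspam"] / (magic["nspam"] + magic["nham"])
print(f"spam share of trained messages: {spam_share:.1%}")
```

With the counts above, spam makes up under 7% of the trained messages.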


(3)

your message has

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
    tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,


BAYES_00 - bayes *is* working and *is* seeing the trained data.

Can you provide the X-Spam-Status from an obvious spam that got through, 
for comparison?



    MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
    URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]


Side note: you will want to set up a local recursive (NON-FORWARDING!!!) 
DNS server for the MTA's use so you avoid the URIBL_BLOCKED issue. That 
will help quite a lot.



    autolearn=ham autolearn_force=no

(4)

around 50 users.  they are all working in same industry


OK, that's small enough that manual training should not be an issue.

Speculation:

Autotrain has strongly biased your database towards ham.

I *assume* you didn't collect a manual initial training corpus, that you 
just turned on autotrain and let it run from scratch, and that you have no 
manual corpus available to evaluate and verify the ham/spam 
classification.


Recommendation:

(1) Turn off autotrain and autoexpire
(2) Collect and manually review several hundred ham and spam messages and 
do initial retraining from scratch using them

(3) Review Bayes performance
(4) Going forward, train using misses (e.g. a spam with BAYES < 50, or a 
ham with BAYES > 50) - add them to your retained training corpus


You may be able to recruit some clueful, responsible users to help with 
the training, but make sure you review what they submit unless you 
*really* trust their judgement.
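
Step (4) can be sketched as a small check on the `tests=[...]` list of a reviewed message. This is only an illustration: `bayes_bucket` and `is_training_miss` are hypothetical helpers, and whether a message really is spam still has to come from a human review.

```python
import re


def bayes_bucket(tests: str):
    """Return nn from the first BAYES_nn rule in a tests list, else None."""
    match = re.search(r"\bBAYES_(\d+)\b", tests)
    return int(match.group(1)) if match else None


def is_training_miss(tests: str, actually_spam: bool) -> bool:
    """True when Bayes leaned the wrong way for a manually reviewed message."""
    bucket = bayes_bucket(tests)
    if bucket is None:
        return False  # no BAYES_nn hit to compare against
    # A miss is a spam with BAYES < 50, or a ham with BAYES > 50.
    return bucket < 50 if actually_spam else bucket > 50


# A spam that scored BAYES_00 belongs in the retained training corpus.
print(is_training_miss("AM.WBL=-3, BAYES_00=-1.9", actually_spam=True))
```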





On 08/05/18 21:08, John Hardin wrote:

On Tue, 8 May 2018, Matthew Broadhead wrote:

system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to be 
filtering any mail.


(1) What is the output of: /usr/bin/sa-learn --dump magic

(2) What user are you running sa-learn as for training, and what user is 
spamd running as?


(3) Are you seeing any BAYES_nn rule hits on messages at all, on either ham 
or spam?


(4) How large is your environment (rough # and diversity of users)?

I'm not familiar with SQL Bayes, others may have other 
questions/recommendations.


Some general comments:

I don't recommend using auto-learn for initial bayes training at least, 
particularly in smaller environments. Manual initial training with careful 
review, followed by manual training of misclassifications after review, is 
more reliable. Others may offer different advice, particularly for large 
installs with a diverse user community (which I don't manage).


Always keep your training corpora so that you can review and fix training 
errors, and wipe and retrain from scratch if Bayes goes completely off the 
rails for some reason.


If you're not auto-learning, auto-expire is not needed. If you *are*, it's 
recommended to expire from a scheduled job rather than take the hit from 
spamd.







Re: training bayes database

2018-05-09 Thread Matthew Broadhead

On 09/05/18 16:03, Reio Remma wrote:

On 09.05.18 16:59, Matthew Broadhead wrote:
setting log_level and sa_debug in /etc/amavisd/amavisd.conf didn't 
seem to make any difference. should i be doing it in 
/etc/mail/spamassassin/local.cf?


See if $sa_debug=1 works (for full debug)? (and restart amavisd).

Reio
ok now i am getting a lot of output.  what am i looking for in 
particular?  is it safe to post those logs on here?


I would grep through it looking for error, fail, warn and bayes. :)

Reio



ok i am getting output for bayes using
tail -f /var/log/maillog | grep bayes

May  9 15:25:07 ns1 amavis[15270]: (15270-01) SA dbg: bayes: database connection established
May  9 15:25:07 ns1 amavis[15270]: (15270-01) SA dbg: bayes: found bayes db version 3
May  9 15:25:07 ns1 amavis[15270]: (15270-01) SA dbg: bayes: Using userid: 1
May  9 15:25:07 ns1 amavis[15270]: (15270-01) SA dbg: bayes: corpus size: nspam = 32226, nham = 440969


then i get loads of
 SA dbg: bayes: header tokens for
and
SA dbg: bayes: token

it looks like it is working.  so maybe it is just not flagging or moving 
the spam?




Re: training bayes database

2018-05-09 Thread Reio Remma

On 09.05.18 16:59, Matthew Broadhead wrote:
setting log_level and sa_debug in /etc/amavisd/amavisd.conf didn't 
seem to make any difference. should i be doing it in 
/etc/mail/spamassassin/local.cf?


See if $sa_debug=1 works (for full debug)? (and restart amavisd).

Reio
ok now i am getting a lot of output.  what am i looking for in 
particular?  is it safe to post those logs on here?


I would grep through it looking for error, fail, warn and bayes. :)

Reio
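
The same filtering can be done on a saved copy of the log. A minimal sketch, assuming a case-insensitive substring match is good enough; `interesting` is a hypothetical helper and the sample lines are made up in the style of /var/log/maillog:

```python
import re

# The four keywords suggested above, matched case-insensitively.
KEYWORDS = re.compile(r"error|fail|warn|bayes", re.IGNORECASE)


def interesting(lines):
    """Return only the log lines that mention one of the keywords."""
    return [line for line in lines if KEYWORDS.search(line)]


log = [
    "May  9 15:25:07 ns1 amavis[15270]: (15270-01) SA dbg: bayes: database connection established",
    "May  9 15:25:07 ns1 amavis[15270]: (15270-01) SA dbg: dns: is Net::DNS::Resolver available? yes",
    "May  9 15:25:08 ns1 amavis[15270]: (15270-01) SA dbg: config: WARNING: example only",
]
for line in interesting(log):
    print(line)
```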



Re: training bayes database

2018-05-09 Thread Matthew Broadhead

On 09/05/18 15:48, Reio Remma wrote:

On 09.05.18 16:33, Matthew Broadhead wrote:

On 08/05/18 21:53, Reio Remma wrote:

On 08.05.2018 22:08, John Hardin wrote:

On Tue, 8 May 2018, Matthew Broadhead wrote:

system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn   DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to 
be filtering any mail.


(1) What is the output of: /usr/bin/sa-learn --dump magic

(2) What user are you running sa-learn as for training, and what 
user is spamd running as?


(3) Are you seeing any BAYES_nn rule hits on messages at all, on 
either ham or spam?


You'll probably need to look at your amavisd-new config.

To debug SpamAssassin via amavisd, you need to set the following in 
amavisd.conf and then look at what's happening in /var/log/maillog


$log_level = 5;
$sa_debug = '1,bayes';

By not filtering do you mean bayes specifically isn't working, 
SpamAssassin in general isn't working via amavisd-new or ...?


Good luck,
Reio


setting log_level and sa_debug in /etc/amavisd/amavisd.conf didn't 
seem to make any difference.  should i be doing it in 
/etc/mail/spamassassin/local.cf?


See if $sa_debug=1 works (for full debug)? (and restart amavisd).

Reio
ok now i am getting a lot of output.  what am i looking for in 
particular?  is it safe to post those logs on here?


Re: training bayes database

2018-05-09 Thread Reio Remma

On 09.05.18 16:33, Matthew Broadhead wrote:

On 08/05/18 21:53, Reio Remma wrote:

On 08.05.2018 22:08, John Hardin wrote:

On Tue, 8 May 2018, Matthew Broadhead wrote:

system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn   DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to 
be filtering any mail.


(1) What is the output of: /usr/bin/sa-learn --dump magic

(2) What user are you running sa-learn as for training, and what 
user is spamd running as?


(3) Are you seeing any BAYES_nn rule hits on messages at all, on 
either ham or spam?


You'll probably need to look at your amavisd-new config.

To debug SpamAssassin via amavisd, you need to set the following in 
amavisd.conf and then look at what's happening in /var/log/maillog


$log_level = 5;
$sa_debug = '1,bayes';

By not filtering do you mean bayes specifically isn't working, 
SpamAssassin in general isn't working via amavisd-new or ...?


Good luck,
Reio


setting log_level and sa_debug in /etc/amavisd/amavisd.conf didn't 
seem to make any difference.  should i be doing it in 
/etc/mail/spamassassin/local.cf?


See if $sa_debug=1 works (for full debug)? (and restart amavisd).

Reio


Re: training bayes database

2018-05-09 Thread Matthew Broadhead

On 08/05/18 21:53, Reio Remma wrote:

On 08.05.2018 22:08, John Hardin wrote:

On Tue, 8 May 2018, Matthew Broadhead wrote:

system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to 
be filtering any mail.


(1) What is the output of: /usr/bin/sa-learn --dump magic

(2) What user are you running sa-learn as for training, and what user 
is spamd running as?


(3) Are you seeing any BAYES_nn rule hits on messages at all, on 
either ham or spam?


You'll probably need to look at your amavisd-new config.

To debug SpamAssassin via amavisd, you need to set the following in 
amavisd.conf and then look at what's happening in /var/log/maillog


$log_level = 5;
$sa_debug = '1,bayes';

By not filtering do you mean bayes specifically isn't working, 
SpamAssassin in general isn't working via amavisd-new or ...?


Good luck,
Reio


setting log_level and sa_debug in /etc/amavisd/amavisd.conf didn't seem 
to make any difference.  should i be doing it in 
/etc/mail/spamassassin/local.cf?


Re: training bayes database

2018-05-09 Thread Matthew Broadhead


On 09/05/18 09:09, Reio Remma wrote:

On 09.05.18 9:57, Matthew Broadhead wrote:

BAYES_00=-1.9


I've personally set *bayes_sql_override_username amavis* in my local.cf

If at all possible, run amavisd with SA bayes debug to see if/how it's 
using the database.


Good luck,
Reio



Thanks Reio, I was just trying your debug suggestion now.  my local.cf 
had vmail commented out so i added amavis

#bayes_sql_override_username vmail
bayes_sql_override_username amavis


Re: training bayes database

2018-05-09 Thread Reio Remma

On 09.05.18 9:57, Matthew Broadhead wrote:

BAYES_00=-1.9


I've personally set *bayes_sql_override_username amavis* in my local.cf

If at all possible, run amavisd with SA bayes debug to see if/how it's 
using the database.


Good luck,
Reio



Re: training bayes database

2018-05-08 Thread Matthew Broadhead

(1)

[root@ns1 ~]# sudo -H -u amavis bash -c '/usr/bin/sa-learn --dump magic'
0.000  0          3  0  non-token data: bayes db version
0.000  0      32225  0  non-token data: nspam
0.000  0     440420  0  non-token data: nham
0.000  0     159483  0  non-token data: ntokens
0.000  0 1525435204  0  non-token data: oldest atime
0.000  0 1525848687  0  non-token data: newest atime
0.000  0          0  0  non-token data: last journal sync atime
0.000  0 1525824089  0  non-token data: last expiry atime
0.000  0     443565  0  non-token data: last expire atime delta
0.000  0          0  0  non-token data: last expire reduction count


(2)

as you say i think it is amavis user

(3)

your message has

X-Spam-Status: No, score=-18.15 tagged_above=-999 required=6.2
    tests=[AM.WBL=-3, BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
    MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001,
    URIBL_BLOCKED=0.001, USER_IN_DEF_SPF_WL=-7.5]
    autolearn=ham autolearn_force=no

(4)

around 50 users.  they are all working in same industry


On 08/05/18 21:08, John Hardin wrote:

On Tue, 8 May 2018, Matthew Broadhead wrote:

system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to be 
filtering any mail.


(1) What is the output of: /usr/bin/sa-learn --dump magic

(2) What user are you running sa-learn as for training, and what user 
is spamd running as?


(3) Are you seeing any BAYES_nn rule hits on messages at all, on 
either ham or spam?


(4) How large is your environment (rough # and diversity of users)?

I'm not familiar with SQL Bayes, others may have other 
questions/recommendations.


Some general comments:

I don't recommend using auto-learn for initial bayes training at 
least, particularly in smaller environments. Manual initial training 
with careful review, followed by manual training of misclassifications 
after review, is more reliable. Others may offer different advice, 
particularly for large installs with a diverse user community (which I 
don't manage).


Always keep your training corpora so that you can review and fix 
training errors, and wipe and retrain from scratch if Bayes goes 
completely off the rails for some reason.


If you're not auto-learning, auto-expire is not needed. If you *are*, 
it's recommended to expire from a scheduled job rather than take the 
hit from spamd.






Re: training bayes database

2018-05-08 Thread John Hardin

On Tue, 8 May 2018, Reio Remma wrote:


On 08.05.2018 22:08, John Hardin wrote:

On Tue, 8 May 2018, Matthew Broadhead wrote:

system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to be 
filtering any mail.


(1) What is the output of: /usr/bin/sa-learn --dump magic

(2) What user are you running sa-learn as for training, and what user is 
spamd running as?


(3) Are you seeing any BAYES_nn rule hits on messages at all, on either ham 
or spam?


You'll probably need to look at your amavisd-new config.


Ugh, I missed that detail, sorry. My question 2 is just "what user are you 
running sa-learn as?" amavis is obviously running as amavisd... :)


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Maxim XXXVII: There is no 'overkill.' There is only 'open fire' and
'time to reload.'
---
 Today: the 73rd anniversary of VE day

Re: training bayes database

2018-05-08 Thread Reio Remma

On 08.05.2018 22:08, John Hardin wrote:

On Tue, 8 May 2018, Matthew Broadhead wrote:

system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to be 
filtering any mail.


(1) What is the output of: /usr/bin/sa-learn --dump magic

(2) What user are you running sa-learn as for training, and what user 
is spamd running as?


(3) Are you seeing any BAYES_nn rule hits on messages at all, on 
either ham or spam?


You'll probably need to look at your amavisd-new config.

To debug SpamAssassin via amavisd, you need to set the following in 
amavisd.conf and then look at what's happening in /var/log/maillog


$log_level = 5;
$sa_debug = '1,bayes';

By not filtering do you mean bayes specifically isn't working, 
SpamAssassin in general isn't working via amavisd-new or ...?


Good luck,
Reio


Re: training bayes database

2018-05-08 Thread John Hardin

On Tue, 8 May 2018, Matthew Broadhead wrote:

system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to be 
filtering any mail.


(1) What is the output of: /usr/bin/sa-learn --dump magic

(2) What user are you running sa-learn as for training, and what user is 
spamd running as?


(3) Are you seeing any BAYES_nn rule hits on messages at all, on either 
ham or spam?


(4) How large is your environment (rough # and diversity of users)?

I'm not familiar with SQL Bayes, others may have other 
questions/recommendations.


Some general comments:

I don't recommend using auto-learn for initial bayes training at least, 
particularly in smaller environments. Manual initial training with careful 
review, followed by manual training of misclassifications after review, is 
more reliable. Others may offer different advice, particularly for large 
installs with a diverse user community (which I don't manage).


Always keep your training corpora so that you can review and fix training 
errors, and wipe and retrain from scratch if Bayes goes completely off the 
rails for some reason.


If you're not auto-learning, auto-expire is not needed. If you *are*, it's 
recommended to expire from a scheduled job rather than take the hit from 
spamd.
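
A scheduled expiry could look roughly like the sketch below, meant to be called from cron or a timer with `bayes_auto_expire 0` set in local.cf. It assumes the Bayes database is owned by the `amavis` user and that `sudo` is allowed to switch to it, as in the `sa-learn --dump magic` call elsewhere in this thread; `expire_command` and `run_expiry` are hypothetical names for illustration.

```python
import subprocess


def expire_command(user: str = "amavis", sa_learn: str = "/usr/bin/sa-learn"):
    """Build the argv for a forced Bayes expiry run as the given user."""
    # Assumption: the amavis user owns the Bayes data on this system.
    return ["sudo", "-H", "-u", user, sa_learn, "--force-expire"]


def run_expiry(dry_run: bool = True):
    """Run the expiry, or just return the command when dry_run is set."""
    cmd = expire_command()
    if dry_run:
        return cmd  # report what would be executed without touching the db
    return subprocess.run(cmd, check=True)


print(" ".join(run_expiry()))
```

The dry-run default makes it easy to check the command line before letting a scheduler execute it.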


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Gun Control enables genocide while doing little to reduce crime.
---
 Today: the 73rd anniversary of VE day

training bayes database

2018-05-08 Thread Matthew Broadhead
system setup centos-release-7-4.1708.el7.centos.x86_64, 
spamassassin-3.4.0-2.el7.x86_64, amavisd-new-2.11.0-3.el7.noarch


/etc/mail/spamassassin/local.cf:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

use_bayes  1
bayes_auto_learn   1
bayes_auto_expire  1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost:3306

it is storing the info to the database ok.  but it doesn't seem to be 
filtering any mail.