RE: spamassassin working very poorly

2014-10-09 Thread Nick
Wanted to send an update. After successfully getting Bayes to read through 
hand-sorted SPAM and HAM, as well as getting URIBL working, it appears 
spamassassin is working MUCH, MUCH better. For example, compare the original: 
(http://i.imgur.com/CRzX9Mu.jpg) to the SPAM I've received today: 
(http://i.imgur.com/wjUvfLj.jpg). It's catching pretty much 100% of SPAM, and 
so far, no false positives. Thanks to the helpful members of this list!

 -Nick 


Re: spamassassin working very poorly

2014-10-08 Thread Bowie Bailey


On 10/8/2014 2:58 PM, Nick wrote:

> Thanks Bowie, sure enough the actual spamd process was running as a different
> user. The file to configure the user it runs under is
> /etc/sysconfig/spamassassin


But it still should have used the bayes_path option from local.cf 
regardless of the user.  Unless it was running as a user that couldn't 
read the bayes_path directory...not sure what would happen in that case.


Out of curiosity, what options are being fed to spamd?  The simplest way 
to get them is probably to grab the line from the "ps -ef" list.


--
Bowie


RE: spamassassin working very poorly

2014-10-08 Thread Nick
Thanks Bowie, sure enough the actual spamd process was running as a different 
user. The file to configure the user it runs under is 
/etc/sysconfig/spamassassin

So I'm now seeing Bayes show up in the mail headers!

Many thanks,
Nick


Re: spamassassin working very poorly

2014-10-08 Thread Bowie Bailey

On 10/8/2014 2:13 PM, Nick wrote:

> In postfix, I'm calling spamassassin with the 2 lines:
> smtp  inet  n   -   n   -   -   smtpd -o
> content_filter=spamassassin
> spamassassin unix - n   n   -   -   pipe flags=R user=spamd
> argv=/usr/bin/spamc -e /usr/sbin/sendmail -oi -f ${sender} ${recipient}

This shows spamc being called.

> In /etc/cron.d/sa-learn I have:
> 51 * * * * spamd sa-learn --spam /var/log/spamassassin/SPAM/ >/dev/null 2>&1
> 52 * * * * spamd sa-learn --ham /var/log/spamassassin/HAM/ >/dev/null 2>&1
> (/var/log/spamassassin is spamd's home directory, and it's where the SPAM/HAM
> is getting copied for learning)

This shows sa-learn being called.

> My /etc/mail/spamassassin/local.cf file is:
> required_hits 5
> report_safe 0
> rewrite_header Subject [SPAM]
> required_score 5.0
> use_bayes 1
> use_bayes_rules 1
> bayes_auto_learn 0
> bayes_path /var/log/spamassassin/.spamassassin/bayes

And this shows a site-wide bayes db, which should be used by both spamd 
and sa-learn regardless of user.

But I still don't see how you start spamd.  For CentOS, it should be 
started by /etc/init.d/spamd (or something similar).  There may also be 
options defined in /etc/sysconfig/spamd (or similar).
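
One way to see which user spamd really runs as is to look at the process table. The following is a minimal Python sketch, not something posted in the thread. It assumes a Linux ps that accepts "-eo user,args"; the sample process list, including the "root" owner, is invented purely for illustration.

```python
import os

# Hypothetical `ps -eo user,args` output; none of these lines come from the thread.
SAMPLE_PS = """\
USER     COMMAND
root     /usr/bin/perl -T -w /usr/bin/spamd -d -c -m5 -H
root     spamd child
postfix  pipe -n spamassassin -t unix flags=R user=spamd argv=/usr/bin/spamc
spamd    sa-learn --spam /var/log/spamassassin/SPAM/
"""


def spamd_users(ps_output: str) -> set:
    """Return the set of users that own a process whose command line names spamd."""
    users = set()
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) < 2 or parts[0] == "USER":
            continue  # skip blank lines and the header row
        user, args = parts
        # Match a token whose basename is exactly "spamd", so that
        # "user=spamd" or "/usr/bin/spamc" do not count as spamd itself.
        if any(os.path.basename(tok) == "spamd" for tok in args.split()):
            users.add(user)
    return users


print(spamd_users(SAMPLE_PS))

# Live use (assumption: procps-style ps is installed):
#   import subprocess
#   out = subprocess.run(["ps", "-eo", "user,args"],
#                        capture_output=True, text=True).stdout
#   print(spamd_users(out))
```

If the user reported here differs from the one the sa-learn cron jobs run as, the two may be working with different Bayes databases unless bayes_path points both at the same files.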


--
Bowie


RE: spamassassin working very poorly

2014-10-08 Thread Nick
In postfix, I'm calling spamassassin with the 2 lines:
smtp  inet  n   -   n   -   -   smtpd
  -o content_filter=spamassassin
spamassassin unix - n   n   -   -   pipe
  flags=R user=spamd argv=/usr/bin/spamc -e /usr/sbin/sendmail -oi -f ${sender} ${recipient}

In /etc/cron.d/sa-learn I have:
51 * * * * spamd sa-learn --spam /var/log/spamassassin/SPAM/ >/dev/null 2>&1
52 * * * * spamd sa-learn --ham /var/log/spamassassin/HAM/ >/dev/null 2>&1
(/var/log/spamassassin is spamd's home directory, and it's where the SPAM/HAM 
is getting copied for learning)

My /etc/mail/spamassassin/local.cf file is:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]
required_score 5.0
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 0
bayes_path /var/log/spamassassin/.spamassassin/bayes

Would the above config make spamassassin run as the spamd user? (It's CentOS 
6.5) I've verified the Bayes database is good and populated for user spamd.
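
For double-checking which Bayes settings a local.cf really contains, a small parser can help. This is a minimal Python sketch under stated assumptions: the helper names are hypothetical, only simple "key value" lines are handled, and the fallbacks reflect SpamAssassin's documented defaults (use_bayes 1, bayes_path ~/.spamassassin/bayes). It does not show which user spamd runs as.

```python
# The local.cf contents quoted above, reproduced as sample input.
SAMPLE_LOCAL_CF = """\
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]
required_score 5.0
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 0
bayes_path /var/log/spamassassin/.spamassassin/bayes
"""


def parse_local_cf(text: str) -> dict:
    """Parse simple 'key value' lines; a later line overrides an earlier one."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def bayes_summary(settings: dict) -> dict:
    """Summarise the Bayes-related options found in the parsed settings."""
    return {
        # use_bayes defaults to 1 when it is not set at all.
        "enabled": settings.get("use_bayes", "1") == "1",
        # None means SpamAssassin falls back to the per-user default path.
        "path": settings.get("bayes_path"),
    }


print(bayes_summary(parse_local_cf(SAMPLE_LOCAL_CF)))
```

An explicit bayes_path, as here, is what makes the database site-wide; without it each user gets a separate database under that user's home directory.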

Thanks,
Nick




Re: spamassassin working very poorly

2014-10-08 Thread Bowie Bailey

On 10/8/2014 11:15 AM, Nick wrote:

> I seem to be catching a lot more SPAM, but no matter what I try, it seems
> Bayes isn't getting utilized. I have ~700 SPAMS and 2400 HAMS. When I run
> "spamassassin -D --lint" (as the same user Postfix is running spamc as), it
> comes back with a report that seems to utilize Bayes, but when normal e-mail
> flows through, I don't see any indication of Bayes in the headers. Also, when
> I run "sa-learn --dump magic" (as user spamd), I can see that nspam and nham
> are correct. I've also tried setting bayes_path, but still no Bayes in the
> headers. Any idea what could be wrong? Here is a most recent header:
>
> http://pastebin.com/J6TbrVG8 (had to use pastebin as the mailing list
> was rejecting me!)


So...

1) The Bayes DB for spamd has enough ham and spam to be used
2) Incoming email does not get a Bayes result

This means that the SA process scanning your mail is NOT using the same 
database you are querying.  The most common cause of this is running 
sa-learn as the wrong user.  Are you 100% sure spamd is running as the 
spamd user?  It may also be possible for config options or run-time 
flags to affect which database is being used. Double-check your config 
and the options you are passing to spamd.
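
The first of the two observations above can be checked mechanically. A minimal Python sketch follows, assuming the usual column layout of "sa-learn --dump magic" output. The sample text is illustrative only, with counts taken from the approximate figures quoted above, and the 200-message minimum is SpamAssassin's default for both ham and spam.

```python
import re

# Default minimum for both bayes_min_ham_num and bayes_min_spam_num.
MIN_SAMPLES = 200

# Illustrative output; the column layout is an assumption and the counts
# are the rough numbers mentioned in the thread (~700 spam, 2400 ham).
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        700          0  non-token data: nspam
0.000          0       2400          0  non-token data: nham
"""

_LINE = re.compile(r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (nspam|nham)\s*$")


def bayes_counts(dump: str) -> dict:
    """Extract the nspam and nham counters from `sa-learn --dump magic` text."""
    counts = {}
    for line in dump.splitlines():
        match = _LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def bayes_ready(counts: dict, minimum: int = MIN_SAMPLES) -> bool:
    """True when both counters meet the minimum needed for Bayes to score."""
    return counts.get("nspam", 0) >= minimum and counts.get("nham", 0) >= minimum


counts = bayes_counts(SAMPLE_DUMP)
print(counts, bayes_ready(counts))
```

Running the dump as each candidate user and comparing the counters is one way to tell whether the scanning process and sa-learn see the same database.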


--
Bowie


Re: spamassassin working very poorly

2014-10-08 Thread Axb

On 10/08/2014 05:15 PM, Nick wrote:

> I seem to be catching a lot more SPAM, but no matter what I try, it
> seems Bayes isn't getting utilized. I have ~700 SPAMS and 2400 HAMS.
> When I run "spamassassin -D --lint" (as the same user Postfix is
> running spamc as), it comes back with a report that seems to utilize
> Bayes, but when normal e-mail flows through, I don't see any
> indication of Bayes in the headers. Also, when I run "sa-learn --dump
> magic" (as user spamd), I can see that nspam and nham are correct.
> I've also tried setting bayes_path, but still no Bayes in the
> headers. Any idea what could be wrong? Here is a most recent header:
>
> http://pastebin.com/J6TbrVG8 (had to use pastebin as the mailing list
> was rejecting me!)



What options are you using in your spamd init script ?




RE: spamassassin working very poorly

2014-10-08 Thread Nick
I seem to be catching a lot more SPAM, but no matter what I try, it seems Bayes 
isn't getting utilized. I have ~700 SPAMS and 2400 HAMS. When I run 
"spamassassin -D --lint" (as the same user Postfix is running spamc as), it 
comes back with a report that seems to utilize Bayes, but when normal e-mail 
flows through, I don't see any indication of Bayes in the headers. Also, when I 
run "sa-learn --dump magic" (as user spamd), I can see that nspam and nham are 
correct. I've also tried setting bayes_path, but still no Bayes in the headers. 
Any idea what could be wrong? Here is a most recent header:

http://pastebin.com/J6TbrVG8 (had to use pastebin as the mailing list was 
rejecting me!)

Thanks,
Nick


Re: spamassassin working very poorly

2014-10-04 Thread Reindl Harald


On 04.10.2014 at 18:36, andybalholm wrote:
> On Oct 4, 2014, at 4:39 AM, Benny Pedersen-2 wrote:
>
> > So anti spammer would now stop reading here ? :)
>
> No, but I sometimes wonder if it’s wise to post my anti-spam ideas here,
> since that makes it easier for spammers to work around them


a valid point

on the other hand, if you post your ideas as well as get ideas from 
others, and people implement the combination of all the ideas


well, in the end it makes spammers' lives harder and i still have not 
given up the idea that sooner or later spam dies because it may no 
longer be a business case


frankly, if *every* MX out there implemented Postscreen or something 
else that lets any new IP wait 10 seconds before answering with REJECT 
for whatever reason, and even if the client is on the 7-days-whitelist 
for this test waits 2 seconds before trying to receive data, i doubt 
that it would be a business case


simple mathematics: how much mail you can in theory deliver in a 
timeframe, while completely ignoring filters in that calculation
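
That arithmetic can be written down directly. This is a minimal Python sketch under simplifying assumptions that are not from the post: every delivery attempt costs exactly the enforced delay and nothing else, and a sender's parallel connections are modelled as a simple multiplier. The function name is hypothetical.

```python
def max_attempts(window_seconds: int, delay_seconds: int, connections: int = 1) -> int:
    """Upper bound on delivery attempts in a time window under a fixed per-attempt delay."""
    if delay_seconds <= 0:
        raise ValueError("delay_seconds must be positive")
    # Each connection can start at most one attempt per delay interval.
    return connections * (window_seconds // delay_seconds)


# One hour with the 10-second delay for unknown clients.
print(max_attempts(3600, 10))
# One hour with the 2-second delay for whitelisted clients.
print(max_attempts(3600, 2))
# Parallel connections scale the bound linearly, which is the weak spot of the argument.
print(max_attempts(3600, 10, connections=100))
```

The bound only bites if the delay is enforced everywhere, since a sender with many parallel connections simply multiplies its ceiling.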


that, combined with every ISP closing outgoing port 25 for endusers and 
forcing them to use 587 with smtp-out, as well as starting every 
enduser's PTR with "dynamic-" until one says "i run a mailserver here 
and need 25 opened as well as PTR xyz", and spam would be dead from one 
day to the next, leaving only hacked real accounts, which can be fixed 
with abuse mails and by blacklisting straight away everybody bouncing on 
postmaster/abuse


there are enough weapons to let spam die completely if every mailadmin 
and every tech person on the ISP side takes 30 minutes to brainstorm 
how to solve the problem and starts to act






Re: spamassassin working very poorly

2014-10-04 Thread andybalholm

On Oct 4, 2014, at 4:39 AM, Benny Pedersen-2 wrote:

> So anti spammer would now stop reading here ? :) 

No, but I sometimes wonder if it’s wise to post my anti-spam ideas here, since 
that makes it easier for spammers to work around them…



--
View this message in context: 
http://spamassassin.1065346.n5.nabble.com/spamassassin-working-very-poorly-tp112068p112109.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: spamassassin working very poorly

2014-10-04 Thread Benny Pedersen

On October 4, 2014 12:42:15 AM andybalholm  wrote:


> > Spammers also learn.
>
> I'm pretty sure some of them read this list. (I sure would if I were a
> spammer.)


So anti spammer would now stop reading here ? :)


Re: spamassassin working very poorly

2014-10-03 Thread andybalholm
> Spammers also learn.

I'm pretty sure some of them read this list. (I sure would if I were a
spammer.)





RE: spamassassin working very poorly

2014-10-03 Thread Nick
Thanks guys, I just trained in 2089 legitimate ham messages, so hopefully that 
will do the trick. And also thanks to you John, as I didn't even see that 
URIBL_BLOCKED. I've set up a local recursive DNS server, which seems to have 
taken care of it.  Crossing my fingers that this has a positive impact on 
things. I'll update after some time has gone by.

 - Nick


Re: spamassassin working very poorly

2014-10-03 Thread John Hardin

On Fri, 3 Oct 2014, Nick wrote:


> X-Spam-Status: No, score=1.1 required=5.0 tests=HTML_FONT_LOW_CONTRAST,
> HTML_MESSAGE,MIME_HTML_ONLY,SPF_PASS,T_REMOTE_IMAGE,T_RP_MATCHES_RCVD,
> URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0


URIBL_BLOCKED = set up a local recursing (NOT forwarding!) name server for 
your mail subsystem (MTA + SA).


You're currently using a forwarding nameserver that is forwarding to an 
upstream nameserver that is aggregating your URIBL query traffic with 
others' to the degree that the free usage limit is exceeded.


And, as already noted, train ham as well. No BAYES_* hits at all means 
bayes is either disabled, or not sufficiently trained.
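
A quick first check on the resolver side is whether the mail host is pointed at a local name server at all. The sketch below is a rough Python heuristic, not a test from the thread: it only reads resolv.conf-style text and reports whether every listed nameserver is a loopback address. It cannot tell a recursing server from a forwarding one, so a positive result is necessary rather than sufficient. The function names and sample addresses are hypothetical.

```python
import ipaddress


def nameservers(resolv_conf: str) -> list:
    """Return the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for raw in resolv_conf.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def uses_only_local_resolver(resolv_conf: str) -> bool:
    """True when at least one nameserver is listed and all are loopback addresses."""
    servers = nameservers(resolv_conf)
    if not servers:
        return False
    for server in servers:
        try:
            if not ipaddress.ip_address(server).is_loopback:
                return False
        except ValueError:
            return False  # unparsable address: treat as not local
    return True


# Illustrative inputs; 192.0.2.53 is a documentation-range address.
print(uses_only_local_resolver("nameserver 127.0.0.1\n"))
print(uses_only_local_resolver("nameserver 127.0.0.1\nnameserver 192.0.2.53\n"))
```

Whether URIBL_BLOCKED has actually stopped appearing in the X-Spam-Status header remains the real test.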


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
   "A well educated Electorate, being necessary to the liberty of a
free State, the Right of the People to Keep and Read Books,
shall not be infringed."
  ...means only registered voters can read books, and only those books
  obtained with State permission from State-controlled bookstores?
---
 Tomorrow: the 10th anniversary of SpaceshipOne winning the X-prize


Re: spamassassin working very poorly

2014-10-03 Thread Bowie Bailey

On 10/3/2014 3:07 PM, Nick wrote:

> Over the last few months, spamassassin has begun barely working for me. SPAM is
> so bad that I've actually started training it - which is something I've never
> had to do in the past. So I've collected 370+ e-mails over the last few days,
> and had sa-learn regularly read in these messages. Training it doesn't seem to
> have made any impact.
>
> It's adding the header information. Here is the header from a spam that just
> got through:
>
> X-Spam-Status: No, score=1.1 required=5.0 tests=HTML_FONT_LOW_CONTRAST,
> HTML_MESSAGE,MIME_HTML_ONLY,SPF_PASS,T_REMOTE_IMAGE,T_RP_MATCHES_RCVD,
> URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0


No Bayes rule matched.  This means one of three things:

1) You have disabled Bayes, in which case learning will do nothing.

2) You are only training on spam and have not yet trained the minimum 
200 ham for Bayes to start scoring.  You have to train regularly on both 
ham and spam for best results.


3) You are training the wrong database.  Make sure you are running 
sa-learn as the same user SpamAssassin is running as.
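
Spotting a missing Bayes hit in a header like the one quoted above can be automated. This is a minimal Python sketch with a hypothetical helper name; it assumes the header follows the same "score= required= tests=" layout as the quoted example and does nothing beyond string parsing.

```python
import re

# The header quoted above, including its folded continuation lines.
SAMPLE_HEADER = """\
X-Spam-Status: No, score=1.1 required=5.0 tests=HTML_FONT_LOW_CONTRAST,
 HTML_MESSAGE,MIME_HTML_ONLY,SPF_PASS,T_REMOTE_IMAGE,T_RP_MATCHES_RCVD,
 URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0
"""


def parse_spam_status(header: str) -> dict:
    """Extract score, threshold and rule names from an X-Spam-Status header."""
    flat = " ".join(header.split())     # unfold continuation lines
    flat = re.sub(r",\s+", ",", flat)   # rejoin a test list split at commas
    score = re.search(r"score=(-?\d+(?:\.\d+)?)", flat)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", flat)
    tests = re.search(r"tests=(\S*)", flat)
    names = []
    if tests:
        # "none" is what appears when no rule hit at all.
        names = [t for t in tests.group(1).split(",") if t and t != "none"]
    return {
        "score": float(score.group(1)) if score else None,
        "required": float(required.group(1)) if required else None,
        "tests": names,
        "bayes_hits": [t for t in names if t.startswith("BAYES_")],
        "uribl_blocked": "URIBL_BLOCKED" in names,
    }


result = parse_spam_status(SAMPLE_HEADER)
print(result["bayes_hits"], result["uribl_blocked"])
```

An empty bayes_hits list on every scanned message points at one of the three causes listed above; it does not say which one.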


--
Bowie


Re: spamassassin working very poorly

2014-10-03 Thread Reindl Harald

On 03.10.2014 at 21:07, Nick wrote:
> Over the last few months, spamassassin has begun barely working for me

spammers also learn

> SPAM is so bad that I've actually started training it - which is something 
> I've never had to do in the past. So I've collected 370+ e-mails over the 
> last few days, and had sa-learn regularly read in these messages.
> Training it doesn't seem to have made any impact.

if you only train spam samples nothing will happen

you need *at least* 200 ham samples before bayes starts
being used, and you really, really don't want it any other
way because it would kill all your legit mail - the filter
needs to know the differences, and not every single word
that appeared in the spam-only samples should give a spam score

you need to carefully follow this:
https://wiki.apache.org/spamassassin/BayesInSpamAssassin

> X-Spam-Status: No, score=1.1 required=5.0 tests=HTML_FONT_LOW_CONTRAST,
> HTML_MESSAGE,MIME_HTML_ONLY,SPF_PASS,T_REMOTE_IMAGE,T_RP_MATCHES_RCVD,
> URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0

there is no BAYES tag and so it is not used


