spamd 'exceeded time limit' inquiry

2019-11-29 Thread Tom H
Hey everyone,

I have a few questions about something I'm encountering with spamd.

I've noticed cases where a bounce message to the server results in spamd
'exceeded time limit'. It reaches the limit of 300+ seconds. In this
particular case, the message size is 564kB and contains an attachment.
Normally, such a message size would not cause spamd to 'hang' for 5 minutes.

I've pulled the bounce message in question to troubleshoot, and noticed that
with spamc it does not recognize the attachment (no attachment rules hit).
When scanning the original message (non bounce), spamc does recognize the
attachment. I also noticed a significantly longer wait time using spamc with
the bounce message compared to the original message.
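
One way to check whether the bounce still carries the attachment as a real MIME part, rather than as pasted text in the body, is to list the parts of both messages. The following is only a sketch using Python's standard email package, not SpamAssassin itself. The two sample messages, the file name report.bin, and the helper list_parts are hypothetical, and the bounce here is assumed to embed the original as plain text, which may not match the bounce in question.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def list_parts(raw_message):
    """Return (content_type, transfer_encoding, filename) for each MIME part."""
    msg = message_from_string(raw_message, policy=policy.default)
    return [
        (part.get_content_type(),
         str(part.get("Content-Transfer-Encoding", "")),
         part.get_filename())
        for part in msg.walk()
    ]


# Hypothetical original message with one binary attachment.
original = EmailMessage()
original["Subject"] = "report"
original.set_content("See attached.")
original.add_attachment(b"\x00" * 64, maintype="application",
                        subtype="octet-stream", filename="report.bin")

# Hypothetical bounce that pastes the original into a plain-text body
# (an assumption; real bounces may attach it as message/rfc822 instead).
bounce = EmailMessage()
bounce["Subject"] = "Undelivered Mail Returned to Sender"
bounce.set_content("Delivery failed. Original message follows:\n\n"
                   + original.as_string())

for label, message in (("original", original), ("bounce", bounce)):
    print(label)
    for row in list_parts(message.as_string()):
        print("  ", row)
```

If the bounce shows only a single text/plain part, the base64 lines are ordinary body text as far as a MIME parser is concerned, which would be consistent with no attachment rules hitting.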

I also used the 'HitFreqsRuleTiming' plugin to see the performance of rule
scan time between the bounce message and original message. I noticed that
the bounce message had rules taking 4+ seconds (upstream rules such as
__FILL_THIS_FORM_SHORT2 and __FILL_THIS_FORM_LONG2), while this was not the
case in the original message.


I have two questions:

1) By default, does SpamAssassin *not* decode/scan the base64 of the
attachment?

2) Is the longer scan time of the 'bounce' message due to SpamAssassin
scanning the attachment text lines in a way that it normally would not if it
had recognized that it is an attachment?


Thanks in advance.








Re: Yet another simple question - how to reprocess an email

2019-11-29 Thread RW
On Fri, 29 Nov 2019 08:21:39 -0500
Joe Acquisto-j4 wrote:

> >>>  
> > On Thu, 2019-11-28 at 22:12 -0500, Joe Acquisto-j4 wrote:  


> > So, I switched to getmail and these problems
> > went away. Getmail worked just fine using the MDA script I wrote for
> > fetchmail and its configuration file is similar to the fetchmail
> > one. 
> >>
> >> /usr/bin/spamc -s 75 < test.txt | /usr/sbin/sendmail -t -i "$@"
> >>   
> > Good. I'm pleased that works for you.
> > 
> > Martin  
> 
> I basically just copied that line from master.cf and altered it to
> eliminate some things it complained about.  Not perfect, as the
> "From" in the resultant message is enclosed in "<>", and the log
> complains about unknown user (running as), but I can look into that.
> 
> I noticed that fetchmail behavior as well, in earlier versions.
> Since I am now  a few revisions behind, not only with fetchmail, I
> may give getmail a look.

FWIW the main reason getmail was created was to avoid having retrieved
mail fed into an mta. getmail can pass mail through multiple filters
(e.g. spamc) itself and then deliver to a maildir or an mda. If you use
something like dovecot-lda with sieve filtering you don't need
anything else in between.


Re: Bayes

2019-11-29 Thread Jerry Malcolm
Can I bump this one to the top again?  I had great bayes reports in 
every email for 30 minutes.  Then nothing for the last three days, even 
after restarting SA.  Is it possible that my bayes db got corrupted?  
The sa-learn --dump magic looks ok as far as I can tell.


Thanks for any suggestions.

Jerry

On 11/26/2019 11:15 PM, Jerry Malcolm wrote:


This is getting stranger by the minute... After playing around and 
verifying permissions and everything, I actually started getting a 
bayes score item for each email.  So I celebrated and went to dinner.  
Came back a few hours later and checked the logs.  Bayes consistently 
added a score line to every email for exactly 30 minutes of run time.  
Then for the last ~5 hours nothing.  It's like the bayes component 
crashed, but SA kept running otherwise.  Mail is still being processed 
via SA and being scored as usual.  But bayes hasn't participated in 
any mail for ~5 hours.  I'm running spamd with -D option.  There was a 
lot of bayes activity in the log while it was participating. Now 
periodically I get "tie-ing to DB file R/O 
/home/spamd/bayes/bayes_toks" and "found bayes db version 3" in the 
log.  That's still occurring.  But nothing else related to bayes in 
the log after the first 30 minutes.


Anybody have any idea about why it would work for 30 minutes, then 
just bypass from then on?


Jerry

On 11/26/2019 6:29 PM, Jerry Malcolm wrote:

On 11/25/2019 3:02 PM, Mikael Syska wrote:

Try and run:
sa-learn --dump magic

Should give you some information like:
0.000  0  3  0  non-token data: bayes db version
0.000  0 493422  0  non-token data: nspam
0.000  0    3867414  0  non-token data: nham
0.000  0 937781  0  non-token data: ntokens
0.000  0 1573870989  0  non-token data: oldest atime
0.000  0 1574715499  0  non-token data: newest atime
0.000  0  0  0  non-token data: last journal sync atime
0.000  0 1574712064  0  non-token data: last expiry atime
0.000  0 812640  0  non-token data: last expire atime delta
0.000  0 204342  0  non-token data: last expire reduction count

Thank you so much for the suggestion.  I tried running it as user 
spamd: sa-learn -u spamd --dump magic


I got this response:  ERROR: Bayes dump returned an error, please 
re-run with -D for more information.   I ran it with -D and got the 
following relevant lines:


Nov 27 00:13:23.325 [17354] dbg: bayes: no dbs present, cannot tie DB 
R/O: /home/spamd/bayes/bayes_toks
Nov 27 00:13:23.326 [17354] dbg: bayes: no dbs present, cannot tie DB 
R/O: /home/spamd/bayes/bayes_toks


Just for fun, I tried sudo su and reran it, and got the output 
below.   So for whatever reason, either I'm not specifying to run 
sa-learn as spamd user correctly or user spamd appears to not have 
permissions to its own home dir.  I did check that the bayes folder 
and all bayes_* files inside it were owned by spamd.  Any ideas?  BTW 
the /var/log/maillog shows that spamd itself is indeed finding and 
using the db in that directory.


Thx

Here's the sudo su output:

[root@ip-172-31-47-84 ec2-user]# sa-learn --dump magic
0.000  0  3  0  non-token data: bayes db version
0.000  0  51391  0  non-token data: nspam
0.000  0   6813  0  non-token data: nham
0.000  0 134303  0  non-token data: ntokens
0.000  0 1566344099  0  non-token data: oldest atime
0.000  0 1568249462  0  non-token data: newest atime
0.000  0  0  0  non-token data: last journal sync atime
0.000  0  0  0  non-token data: last expiry atime
0.000  0  0  0  non-token data: last expire atime delta
0.000  0  0  0  non-token data: last expire reduction count
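
The counters and atime fields in a dump like the one above are easier to compare once parsed. The following is a minimal sketch, assuming the atime values are Unix epoch seconds and that each line has the "non-token data: name" layout shown in this thread. The helper names parse_dump_magic and atime_to_utc are made up for illustration and are not part of sa-learn.

```python
import re
from datetime import datetime, timezone

# Sample copied from the root-run dump quoted in this thread.
SAMPLE = """\
0.000  0  3  0  non-token data: bayes db version
0.000  0  51391  0  non-token data: nspam
0.000  0   6813  0  non-token data: nham
0.000  0 134303  0  non-token data: ntokens
0.000  0 1566344099  0  non-token data: oldest atime
0.000  0 1568249462  0  non-token data: newest atime
"""

# Four numeric columns, then the label; the third column holds the value.
LINE_RE = re.compile(
    r"^\s*[\d.]+\s+\d+\s+(\d+)\s+\d+\s+non-token data:\s+(.+?)\s*$"
)


def parse_dump_magic(text):
    """Return {name: int_value} for every 'non-token data' line in text."""
    values = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            values[match.group(2)] = int(match.group(1))
    return values


def atime_to_utc(epoch):
    """Interpret an atime as Unix epoch seconds (an assumption) in UTC."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


magic = parse_dump_magic(SAMPLE)
print("nspam:", magic["nspam"], "nham:", magic["nham"])
for key in ("oldest atime", "newest atime"):
    print(key, atime_to_utc(magic[key]).isoformat())
```

Comparing the "newest atime" of two dumps taken as different users is one quick way to see whether they are reading the same database.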


Do note that every user has its own DB ... can be overridden with
bayes_sql_override_username bayes

So if either "spamd" user or your own runs it  ... you should get 
the same result ...


( I'm actually running it in PgSQL, can't remember what the file 
based is called )


mvh
Mikael Syska


On Sun, 24 Nov 2019 19:35:16 +0100 Jerry Malcolm wrote:


Thanks again to everyone who helped me get everything up and running
over the past couple of days.

Now that I have SA finding my bayes database, I'm curious about bayes
reporting.  My bayes db is not new.  I migrated my previous bayes db
from my old installation.  And I've been training it with ~100+ new spam
messages in the past day.  But I'm concerned that I'm not seeing
anything related to bayes in any scoring report. Nothing on any spam
message or good message over the past hundreds of messages that have
arrived since getting bayes up and running in SA.  I would
  

Re: Yet another simple question - how to reprocess an email

2019-11-29 Thread Joe Acquisto-j4
>>>
> On Thu, 2019-11-28 at 22:12 -0500, Joe Acquisto-j4 wrote:
>> I use fetchmail on a different box to pull mail from several
>> accounts at an ISP and send those messages to the SA/postfix box.
>>
> OK, more similar to my setup, then, than I'd guessed.
> 
> FWIW I used to use fetchmail, but found bugs, such as periodically
> having to delete old messages from the ISP mailbox which fetchmail had
> failed to delete. So, I switched to getmail and these problems went
> away. Getmail worked just fine using the MDA script I wrote for
> fetchmail and its configuration file is similar to the fetchmail one.
>   
>>
>> /usr/bin/spamc -s 75 < test.txt | /usr/sbin/sendmail -t -i "$@"
>> 
> Good. I'm pleased that works for you.
> 
> Martin

I basically just copied that line from master.cf and altered it to eliminate
some things it complained about.  Not perfect, as the "From" in the 
resultant message is enclosed in "<>", and the log complains about 
unknown user (running as), but I can look into that.

I noticed that fetchmail behavior as well, in earlier versions.
Since I am now  a few revisions behind, not only with fetchmail, I may give
getmail a look.



-- 
+++
 joea@@j4computers.com
  https://www.j4computers.com
   845-687-3734
+++


Re: Yet another simple question - how to reprocess an email

2019-11-29 Thread Martin Gregorie
On Thu, 2019-11-28 at 22:12 -0500, Joe Acquisto-j4 wrote:
> I use fetchmail on a different box to pull mail from several
> accounts at an ISP and send those messages to the SA/postfix box.
>
OK, more similar to my setup, then, than I'd guessed.

FWIW I used to use fetchmail, but found bugs, such as periodically
having to delete old messages from the ISP mailbox which fetchmail had
failed to delete. So, I switched to getmail and these problems went
away. Getmail worked just fine using the MDA script I wrote for
fetchmail and its configuration file is similar to the fetchmail one.
  
>
> /usr/bin/spamc -s 75 < test.txt | /usr/sbin/sendmail -t -i "$@"
> 
Good. I'm pleased that works for you.

Martin