Patrick,

I'm using Amavisd-new for years with success on MX servers @work, but few days 
ago I've noticed few timeouts during email reception. I'm using Postfix 2.10.0 
and amavisd-new-2.8.0_2 on FreeBSD, in before queue content filtering.
The timeout occurs only for two different sender-recipient couple. I've 
activated amavisd-new debuging for both sender, to get some data, but I can't 
find the culprit. Any help would be greatly appreciated.

(I'm subscribed to digest, feel free to Cc me)

Postfix side of the error:

Jul 15 13:20:26 host postfix/smtpd[60906]: connect from 
sender.domain.tld[xxx.yyy.130.2]
Jul 15 13:20:26 host postfix/smtpd[60906]: NOQUEUE: 
client=sender.domain.tld[xxx.yyy.130.2]
Jul 15 13:22:57 host postfix/smtpd[60906]: warning: timeout talking to proxy 
127.0.0.1:10024
Jul 15 13:22:57 host postfix/smtpd[60906]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue 
file write error; from=<sen...@domain.tld> to=<use...@at-work.tld> proto=ESMTP 
helo=<sender.domain.tld>
Jul 15 13:24:33 host postfix/smtpd[60906]: disconnect from 
sender.domain.tld[xxx.yyy.130.2]

Postfix has a timeout on a proxy content filter set to 2 minutes.

Amavisd-new debug logs:

Jul 15 13:20:26 host amavis[58256]: (58256-16)
  get_deadline switch_to_my_time(rx SMTP RCPT) - deadline in 480.0 s, set to 
336.000 s

Your amavisd timeout is set to 8 minutes.

Set the $child_timeout to a value less than the postfix timeout
(time in seconds),
e.g.  $child_timeout = 90;

or increase the postfix timeout (although the 120 seconds may
already be on a high side for some impatient SMTP clients,
so I wouldn't increase it).

Jul 15 13:20:27 host amavis[58256]: (58256-16) CALLING SA check (0)
Jul 15 13:22:57 host postfix/smtpd[60906]: warning: timeout talking to proxy 
127.0.0.1:10024
Jul 15 13:23:20 host amavis[58256]: (58256-16) DONE SA check (0)

Here you can see that postfix gave up after 2 minutes
and SpamAssassin took 3 minutes to complete.


Jul 15 13:23:20 host amavis[58256]: (58256-16) TIMING-SA total 173319 ms- 
parse: 27 (0.0%), extract_message_metadata: 578 (0.3%), get_uri_detail_list: 
834 (0.5%), tests_pri_-1000: 18 (0.0%), tests_pri_-950: 3 (0.0%), 
tests_pri_-900: 3 (0.0%), tests_pri_-400: 9418 (5.4%), check_bayes: 9393 
(5.4%), tests_pri_0: 162515 (93.8%), check_dkim_adsp: 5 (0.0%), check_razor2: 
517 (0.3%), check_pyzor: 187 (0.1%), tests_pri_500: 13 (0.0%), tests_pri_899: 
260 (0.2%), check_crm114: 256 (0.1%), learn: 456 (0.3%), get_report: 1.33 (0.0%)

Bayes took 9 seconds which seems high.
Most of the time was taken by rules at priority 0.
It could indicate a busy host, or a message hitting
some worst case situation in a regexp of some rule.
Capturing such message and running it from am
command line to spamassassin -D  may tell more.

  Mark

Reply via email to