Hi Alex,

You could try running amavisd in the foreground under strace (which records system calls) until the error occurs. You may then be able to see from the end of the log what is going wrong.
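Once such a log exists, the failing system calls near the end are usually the interesting part. Here is a rough sketch of how one might pull them out. The function name last_failed_syscalls is made up for illustration, and it assumes the usual strace convention that a failed call is shown as returning -1 followed by an errno name:

```shell
# Hypothetical helper (not part of amavisd or strace): print the last
# failed system calls recorded in an strace log.
# Assumption: strace shows a failed call as "... = -1 ENAME (description)",
# e.g. open("/etc/amavisd.conf", O_RDONLY) = -1 EACCES (Permission denied)
last_failed_syscalls() {
    log_file=$1        # path to the strace output file
    count=${2:-20}     # how many failing calls to show (default: 20)
    # Keep only lines with a -1 return value and an errno name,
    # then show the most recent ones.
    grep -E '= -1 E[A-Z0-9]+' "$log_file" | tail -n "$count"
}
```

It would be called as: last_failed_syscalls strace-amavisd.log 20. Many failed calls are harmless (ENOENT while searching paths, for instance), so the output is only a starting point. The log itself can come from a plain strace run.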
For example: "strace amavisd foreground >strace-amavisd.log 2>&1"

But be careful, the log file will grow very quickly.

By the way, I had a similar problem where amavisd would not reload because I had forgotten to set the group of /etc/amavisd.conf to that of the user under which amavisd runs in our environment.

Regards,
Tilman

-----Original Message-----
From: Alex [mailto:[email protected]]
Sent: Wednesday, 25 April 2012 18:31
To: [email protected]; Mark Martinec
Subject: Re: Amavis just stops and exits

Hi,

>>> > I have a fedora15 server with amavisd-new-2.6.6-1 and
>>> > spamassassin-3.3.2-7 and occasionally amavisd just exits
>>> > and has to be restarted.
>>
>>> > Apr 20 10:38:33 mail amavis[8442]:
>>> > logging initialized, log level 2, syslog: amavis.mail
>>> > Apr 20 10:38:33 mail amavis[8442]:
>>> > Valid PID file (younger than sys uptime 14 6:55:00)
>>
>>> The second event looks like a normal restart: the process
>>> 17612 is shutting down, while the new 8442 is starting.
>>> Something is sending a restart signal to amavisd, perhaps
>>> some cron job or a log file rotator.
>>
>> P.S.:
>>
>> if the new starting process (like the 8442 from your log)
>> does not come up but fails during a start, a possible reason
>> is that some files (like a config file or DKIM keys file) are
>> too strongly protected, so a start as root succeeds, but a
>> start with reduced privileges fails.
>
> After your suggestions, I'm really thinking it's due to an unreliable
> "service amavisd reload" on fedora15, and instead amavisd exiting and
> not being restarted.

It happened again today, and it looks like there are a few stuck amavisd processes:

# ps ax|grep amavisd
  944 ?        Ssl  483:00 clamd.amavisd -c /etc/clamd.d/amavisd.conf --pid /var/run/clamd.amavisd/clamd.pid
10877 ?        S      0:00 smtp -n smtp-amavis -t unix -u -o smtp_data_done_timeout=1200 -o smtp_send_xforward_command=yes -o disable_dns_lookups=yes -o max_use=20
11086 pts/0    S+     0:00 grep --color=auto amavisd
25025 ?        S      0:39 amavisd (ch20-finish)
26842 ?        S      0:36 amavisd (ch20-finish)
28176 ?        S      1:58 amavisd (ch20-finish)
28255 ?        S      1:30 amavisd (ch15-finish)

The remaining processes wouldn't exit with just kill, and required "killall -9 amavisd" to kill them. There were then also these entries in the logs:

Apr 25 11:58:49 portal amavis[10966]: (10966-01-3) (!)TempDir removal: tempdir is to be PRESERVED: /var/spool/amavisd/tmp/amavis-20120425T115816-10966
Apr 25 11:58:49 portal amavis[10017]: (10017-01-23) (!)TempDir removal: tempdir is to be PRESERVED: /var/spool/amavisd/tmp/amavis-20120425T115402-10017

What could be the cause of amavisd just going catatonic?

Thanks,
Alex
