We are seeing an intermittent issue with Postfix where all outbound smtp
processes stop delivering email and stop logging anything while the
active queue continues to grow. This suggests to me that every active
smtp process is hanging, since nothing from smtp is recorded in the
logs at all. During this time, inbound email keeps coming in fine and
smtpd continues to log activity, while the smtp processes slowly die
one by one over the course of several minutes. Once the last smtp
process has died, the number of smtp processes instantly jumps to the
max of 110 and outbound email delivery (and logging) resumes.
We are going to try to catch it while it is actually happening so that we
can run strace on one of the hung smtp processes, but I'm curious
whether others have seen these symptoms before and could share some
insight into the possible cause. What I'm most curious about is this:
why would Postfix wait for all existing smtp processes to die before
spawning new ones to handle a rapidly growing active queue?
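For the strace step, this is roughly what we plan to run once we catch
a hang. It is only a sketch and assumes Linux with procps ps and strace
installed. The function names (smtp_client_pids, strace_command) and the
/tmp output path are our own choices for illustration, not anything
Postfix provides. It matches the command name exactly so that smtpd is
not picked up by mistake.

```python
import subprocess
import sys


def smtp_client_pids(ps_output):
    """Return PIDs whose command name is exactly 'smtp' (not 'smtpd')."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[1].strip() == "smtp":
            pids.append(int(parts[0]))
    return pids


def strace_command(pid, out_dir="/tmp"):
    """Build the strace command line for one process (hypothetical output path)."""
    # -tt: wall-clock timestamps, -T: time spent in each syscall,
    # -f: follow child processes, -o: write the trace to a file.
    return ["strace", "-tt", "-T", "-f", "-p", str(pid),
            "-o", f"{out_dir}/smtp-{pid}.strace"]


if __name__ == "__main__":
    # 'pid=,comm=' asks ps for just those two columns, without a header line.
    ps = subprocess.run(["ps", "-eo", "pid=,comm="],
                        capture_output=True, text=True, check=True)
    pids = smtp_client_pids(ps.stdout)
    if not pids:
        sys.exit("no smtp client processes found")
    # Attach to the first one; this needs root and runs until interrupted.
    subprocess.run(strace_command(pids[0]))
```

The idea is to see which system call a hung process is sitting in and
for how long, which the -tt and -T timestamps should make visible.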
We are running Postfix 2.9.1.
Thanks,
Curtis