Johannes Erdfelt writes:
On Thu, Dec 06, 2001, Gordon Messmer [EMAIL PROTECTED] wrote:
On Thu, 6 Dec 2001, Johannes Erdfelt wrote:
The mail server is busy much of the time, but I don't think it's busy
enough to naturally hit the respawnhi timeout. It looks like somehow
courier missed that a child finished and that's why it hit the respawnhi
timeout.
I was wrong about that. The child processes are still legitimately
running. As fate would have it, just as I started this email I was pulled
into some mail server issues and noticed that the respawnhi thing had
happened again. All of the couriersmtp processes were stuck in a read()
system call on fd 5. I have the control files from a couple of them, and
there are lots of DNS failures recorded.
It's much too late to do any debugging right now, but I'll look into this
tomorrow. In any case, it's not that courierd isn't harvesting children,
it's that the children are blocking on an unprotected read(). (I thought
they all had alarms in place... /me shrugs)
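For reference, an alarm-protected read() usually looks something like the
sketch below. This is only an illustration of the general technique, not
Courier's actual code: read_with_alarm(), on_alarm() and the -2 return
value are hypothetical names and conventions made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Set by the handler so the caller can tell a timeout from another signal. */
static volatile sig_atomic_t alarm_fired;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/* Hypothetical helper: read() guarded by alarm().
 * Returns bytes read, 0 on EOF, -1 on error, -2 if the alarm expired. */
ssize_t read_with_alarm(int fd, void *buf, size_t len, unsigned secs)
{
    struct sigaction sa, old_sa;
    ssize_t n;
    int saved_errno;

    assert(buf != NULL);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: read() must fail with EINTR */
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    alarm_fired = 0;
    alarm(secs);                /* arm the timer */
    n = read(fd, buf, len);
    saved_errno = errno;
    alarm(0);                   /* disarm before returning */
    sigaction(SIGALRM, &old_sa, NULL);

    if (n < 0 && saved_errno == EINTR && alarm_fired)
        return -2;
    errno = saved_errno;
    return n;
}
```

Note that this pattern has a small race: if SIGALRM is delivered after
alarm() but before the process actually enters read(), nothing interrupts
the read and it can block indefinitely.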
I checked for any running processes, but I couldn't find any. I do have
lots of Courier-related processes running (authdaemon, pop and imap), so I
may have missed one.
Either way, my system sat for 6 hours or so doing nothing. If you're
right that there was a process still running, something is missing a
timeout.
I wonder what the longest timeout is. Presumably a respawnhi could be
triggered right after a legitimate process is spawned, and that process
then has to time out against a client. In that case there will always be
a chance that courier just stops delivering email for a while.
respawnhi seems to need some sort of timeout, even if it's extremely
long.
The server is designed to restart itself only when no mail is pending.
The problem is that the client should not be stuck like that. There's a
select() before every read from the socket, so if anything, it should be
stuck in a select().
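A select()-before-read() guard of the kind described above can be sketched
roughly as follows. This is a minimal illustration under assumptions, not
the code Courier actually uses: read_with_timeout() and its -2 return
value are hypothetical, and fd is assumed to be below FD_SETSIZE.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait in select() until fd is readable, then read().
 * Returns bytes read, 0 on EOF, -1 on error, -2 on timeout. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_secs)
{
    fd_set readfds;
    struct timeval tv;
    int ready;

    assert(buf != NULL);

    do {
        /* Both the fd set and the timeout may be modified by select(),
         * so they are rebuilt on every attempt. A retry after EINTR
         * restarts the full timeout in this simple version. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_secs;
        tv.tv_usec = 0;
        ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;              /* select() failed */
    if (ready == 0)
        return -2;              /* nothing arrived within the timeout */

    /* Data or EOF is pending, so this read() should not block. */
    return read(fd, buf, len);
}
```

With a guard like this, a process waiting on a silent peer shows up as
blocked in select() rather than read(), and it wakes up on its own once
the timeout expires.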
Get the date of the stuck message, and review your logs to see if there are
any errors in syslog around that time, or a little bit later.
--
Sam
___
courier-users mailing list