On Tue, Nov 15, 2005, Bill Campbell wrote:

> I had a customer e-mail me today saying that their mail system had stopped
> processing mail last night, finding that amavisd had stopped.  Larry sent
> me an excerpt from the amavisd log that indicated that the TCP process
> couldn't bind because the port was already in use.
>
> This sounded like a problem that might occur when a process is restarted
> using rc run control, and looking at the rc.amavisd script, sure enough it
> does a restart during log processing.
>
> There is a two second sleep in the %restart section of the rc.amavisd run
> control script.  I think it might be a good idea to bump this
> significantly, say to 20 seconds or so.

Hmmm... I tried it multiple times on rm0.openpkg.net and amavis seems
to stop fine and easily within 2 seconds. It's no problem to increase
the time a little bit, but 20 seconds is rather long. For restarting,
longer delays could be ok, but in general one has to be very careful
not to stretch the delays in rc scripts too much: I know of servers
with over a dozen OpenPKG instances and lots of daemons in each
instance, where cleanly shutting down the machine is not possible
without explicitly increasing the shutdown timeout in init(8) ;-) I
would recommend that you retry with 4 seconds or perhaps 8 seconds and
see what happens. I guess this should be enough.
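
One way to reconcile both concerns (enough time on a slow box, no wasted
time on a fast one) would be to poll for the old process instead of
sleeping for a fixed time. The following is only a rough sketch of that
idea, not what rc.amavisd actually does; the function name
wait_for_exit and its two arguments are made up for illustration, and it
assumes the daemon's PID is known to the script.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of rc.amavisd: wait until the process
# with PID $1 has exited, checking once per second, for at most $2 seconds.
wait_for_exit () {
    pid=$1
    max=$2
    n=0
    # "kill -0" sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$n" -ge "$max" ]; then
            return 1        # still running after the timeout
        fi
        sleep 1
        n=$((n + 1))
    done
    return 0                # process is gone
}
```

Something like "wait_for_exit $pid 8" would then return as soon as the
daemon is gone and give up after 8 seconds at the latest, so the upper
bound on the delay stays explicit.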

> This would explain several occasions where our main mailing list server
> here stopped working with a dead amavisd process.  That machine isn't
> particularly fast, and the load average can get pretty high when delivering
> large Mailman lists.

Ok, a slow machine and a high load certainly can make the 2 second
delay too short. I've now committed a 4 second delay, and if your
tests show that we really need even more we can easily bump it up
again, of course. I just want to avoid increasing the delays too much
without a definite need.

> I think this also is applicable to apache as I've seen many instances where
> it takes longer than the 2 second sleep time in the apache %stop section
> before all the apache processes are complete.

Yes, that's correct. I've now added an extra delay of 4 seconds to the
"restart" part of rc.apache. This way the effective delay on Apache
restarts is now 6 seconds instead of 2, while plain stops still have
just the 2 second delay.
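
To make the arithmetic concrete, here is a simplified sketch of how the
two delays add up. It is not the actual rc.apache; the variable and
function names are placeholders chosen for illustration, and the echo
lines stand in for the real stop and start commands.

```shell
#!/bin/sh
# Hypothetical layout, NOT the actual rc.apache: a restart runs the stop
# action (with its own delay) and then waits an extra delay before starting.
STOP_DELAY=2        # delay already present in the stop action
RESTART_EXTRA=4     # extra delay applied on restarts only

do_stop () {
    echo "stopping daemon"      # placeholder for the real stop command
    sleep "$STOP_DELAY"
}

do_start () {
    echo "starting daemon"      # placeholder for the real start command
}

do_restart () {
    do_stop
    sleep "$RESTART_EXTRA"
    do_start
}

# 2 + 4 = 6 seconds on restart; a plain stop still waits only 2 seconds.
echo "effective restart delay: $((STOP_DELAY + RESTART_EXTRA)) seconds"
```
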
                                       Ralf S. Engelschall
                                       [EMAIL PROTECTED]
                                       www.engelschall.com

______________________________________________________________________
The OpenPKG Project                                    www.openpkg.org
Developer Communication List                   openpkg-dev@openpkg.org
