+10000 to this. The serial behavior of supervisor for batched-up commands is really frustrating.

What I've done to get around it is stupidly hacky:

    supervisorctl restart job1 &
    supervisorctl restart job2 &
    etc...
    wait

On Fri, Dec 2, 2011 at 2:13 PM, Timothy Jones <[email protected]> wrote:
> Hello supervisor users....
>
> In my environment, it is taking a long time for supervisor to shut down a
> large number of child processes in response to clicking the STOP ALL button
> in the GUI. Each of our child processes is typically in the middle of batch
> processing, so the behavior that we have sought and achieved is for each
> process to finish what it is doing when the terminate signal is received,
> then exit cleanly. It may be 2 to 5 minutes for this processing to stop,
> which is acceptable to us. (We chose SIGUSR2 because the child processes
> are Java JVMs, and Java traps SIGINT/SIGTERM for its own purposes.)
>
> Since supervisor is designed to catch unexpected terminations of any of its
> child processes at any time, I would have expected supervisord to send a
> SIGUSR2 to all processes at once and wait for each SIGCHLD signal to arrive
> from each of the children.
>
> However, what I think we are seeing is that supervisord enters a loop where
> it sends the configured signal to the first process, then waits for the first
> process to terminate, and ONLY THEN moves on to terminate the second
> process, waits for it to exit, and so on. This is taking a very long time,
> because of the long time I must allow for each process to finish its
> current processing. I found an illuminating comment in rpcinterface.py (at
> the end of method make_allfunc) that confirms this symptom, but doesn't
> offer a lot of hope for an easy fix.
>
> What about an alternate approach where the list of programs has the
> 'autorestart' property changed to false, at least in memory, then the
> termination signal is sent to all processes at once, and as the
> SIGCHLD signals arrive, the code that normally handles SIGCHLD signals
> would not do an autorestart...
> Wouldn't this achieve the goal of shutting
> down all the processes more quickly, and still allow them to terminate in any
> order?
>
> tlj
>
> _______________________________________________
> Supervisor-users mailing list
> [email protected]
> http://lists.supervisord.org/mailman/listinfo/supervisor-users
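
For anyone who would rather script the backgrounded supervisorctl workaround from the reply above than type it out, here is a rough Python sketch of the same idea: start every command at once, then wait for all of them. The helper names run_concurrently and restart_all are made up for illustration and the job names are placeholders; the only thing assumed about supervisor is the `supervisorctl restart <name>` invocation already shown above. This does not change how supervisord itself stops processes, it only avoids issuing the client-side calls one after another.

```python
import subprocess


def run_concurrently(commands):
    """Start every command immediately, then wait for all of them.

    `commands` is a list of argv lists. Returns the exit codes in the
    same order the commands were given.
    """
    # Launch everything first so no command waits on an earlier one.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # Only then block until each child has exited.
    return [proc.wait() for proc in procs]


def restart_all(jobs):
    """Hypothetical helper: one `supervisorctl restart <job>` per job, in parallel."""
    return run_concurrently([["supervisorctl", "restart", job] for job in jobs])
```

Calling restart_all(["job1", "job2"]) would be the equivalent of the two backgrounded commands followed by wait, and the returned list shows which restarts exited non-zero.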
