Hi there. I'm using supervisor to control two processes as a group; let's call them A and B. Process A depends on B, so B should be started first and stopped last. To get that ordering I have set priority B > priority A in the configuration file.
The problem is this: when I stop the group, I want A to receive a USR1 so that it can do some cleanup, which takes about 60 seconds; after that, a KILL is sent to A to really terminate it. That part is fine: we set stopwaitsecs in the configuration file and it's done.

But since I stopped them as a group, B received a TERM as soon as A received its USR1, and B terminated immediately. In my opinion this is not desirable, because I've set stopwaitsecs on A and I've set the priorities. I think supervisor should send a USR1 to A, wait the 60 seconds, kill A with KILL, and only then send a TERM to B.

Here are the relevant excerpts from my configuration file:

[group:somegroup]
programs=A,B
priority=15

[program:A]
command=/home/xiaket/A
priority=10
autostart=false
autorestart=false
startsecs=0
stopsignal=USR1
stopwaitsecs=60
stopasgroup=true

[program:B]
command=/home/xiaket/B
priority=100
autostart=false
autorestart=false
startsecs=0
stopasgroup=true

Here's some debug info
----------------------
2013-04-08 11:11:25,046 BLAT read event caused by <socket._socketobject object at 0x80382c2f0>
2013-04-08 11:11:25,047 BLAT read event caused by <socket._socketobject object at 0x803963360>
2013-04-08 11:11:25,048 TRAC XML-RPC method called: supervisor.getVersion()
2013-04-08 11:11:25,048 TRAC XML-RPC method supervisor.getVersion() returned successfully
2013-04-08 11:11:25,048 TRAC localhost:0 - - [08/Apr/2013:03:11:25 +0800] "POST /RPC2 HTTP/1.1" 200 251
2013-04-08 11:11:25,048 BLAT write event caused by <socket._socketobject object at 0x803963360>
2013-04-08 11:11:25,050 BLAT read event caused by <socket._socketobject object at 0x803963360>
2013-04-08 11:11:25,050 BLAT read event caused by <socket._socketobject object at 0x80382c2f0>
2013-04-08 11:11:25,050 BLAT read event caused by <socket._socketobject object at 0x803963440>
2013-04-08 11:11:25,051 TRAC XML-RPC method called: supervisor.stopProcessGroup()
2013-04-08 11:11:25,051 TRAC XML-RPC method supervisor.stopProcessGroup() returned successfully
2013-04-08 11:11:25,051 DEBG killing A (pid 53926) process group with signal SIGUSR1
2013-04-08 11:11:25,051 BLAT write event caused by <socket._socketobject object at 0x803963440>
2013-04-08 11:11:25,052 DEBG killing B (pid 53927) process group with signal SIGTERM
2013-04-08 11:11:25,052 BLAT read event caused by <POutputDispatcher at 34419983856 for <Subprocess at 34415711368 with name B in state STOPPING> (stdout)>
2013-04-08 11:11:25,052 DEBG fd 11 closed, stopped monitoring <POutputDispatcher at 34419983856 for <Subprocess at 34415711368 with name B in state STOPPING> (stdout)>
2013-04-08 11:11:25,052 BLAT read event caused by <POutputDispatcher at 34419984072 for <Subprocess at 34415711368 with name B in state STOPPING> (stderr)>
2013-04-08 11:11:25,052 DEBG fd 15 closed, stopped monitoring <POutputDispatcher at 34419984072 for <Subprocess at 34415711368 with name B in state STOPPING> (stderr)>
2013-04-08 11:11:25,052 INFO stopped: B (terminated by SIGTERM)
2013-04-08 11:11:25,052 DEBG received SIGCHLD indicating a child quit
2013-04-08 11:11:26,053 BLAT write event caused by <socket._socketobject object at 0x803963440>
2013-04-08 11:11:26,065 BLAT read event caused by <POutputDispatcher at 34419983496 for <Subprocess at 34415712160 with name A in state STOPPING> (stdout)>
2013-04-08 11:11:26,068 DEBG 'A' stdout output: B error:17 arg:0x800617c08
2013-04-08 11:11:26,070 BLAT read event caused by <POutputDispatcher at 34419983496 for <Subprocess at 34415712160 with name A in state STOPPING> (stdout)>
2013-04-08 11:11:26,076 DEBG fd 8 closed, stopped monitoring <POutputDispatcher at 34419983496 for <Subprocess at 34415712160 with name A in state STOPPING> (stdout)>
2013-04-08 11:11:26,076 BLAT read event caused by <POutputDispatcher at 34419983424 for <Subprocess at 34415712160 with name A in state STOPPING> (stderr)>
2013-04-08 11:11:26,076 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 34419983424 for <Subprocess at 34415712160 with name A in state STOPPING> (stderr)>
2013-04-08 11:11:26,076 INFO stopped: A (exit status 0)
2013-04-08 11:11:26,076 DEBG received SIGCHLD indicating a child quit

I think supervisor could simply hold off on stopping further processes when there is a priority ordering and one of the processes has a stopwaitsecs setting.

Thanks for your patience.

夏 恺(Xia Kai)
http://blog.xiaket.org/

_______________________________________________
Supervisor-users mailing list
[email protected]
https://lists.supervisord.org/mailman/listinfo/supervisor-users
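For reference, the stop ordering asked for above (stop A, wait until it has really exited, then stop B) can be approximated from the client side instead of stopping the whole group in one call. The following is only a rough sketch under some assumptions: it uses Python 3's xmlrpc.client, it assumes supervisord's XML-RPC interface is reachable at http://localhost:9001/RPC2 (that URL is a guess, not taken from the configuration above), and the helper names stop_in_order and stop_somegroup are made up for illustration. It relies on the supervisor.stopProcess(name, wait) XML-RPC method, which with wait set to true does not return until the process has stopped.

```python
import xmlrpc.client


def stop_in_order(supervisor_rpc, group, names):
    """Stop processes one at a time, in the order given.

    supervisor_rpc is anything exposing stopProcess(name, wait), e.g. the
    `supervisor` namespace of an xmlrpc.client.ServerProxy.
    """
    results = []
    for name in names:
        # wait=True blocks until this process has stopped (for A that means
        # up to stopwaitsecs before supervisord falls back to KILL), so the
        # next process is not signalled before this one is gone.
        results.append(supervisor_rpc.stopProcess(f"{group}:{name}", True))
    return results


def stop_somegroup(url="http://localhost:9001/RPC2"):
    """Hypothetical entry point: stop A first, then B (URL is an assumption)."""
    server = xmlrpc.client.ServerProxy(url)
    return stop_in_order(server.supervisor, "somegroup", ["A", "B"])
```

Calling stop_somegroup() against a running supervisord would stop somegroup:A and then somegroup:B. Note that stopProcess raises an xmlrpc.client.Fault if the named process is not running, which this sketch does not handle.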
