Hi there,

  I'm using supervisor to control two processes, A and B, as a group. Process A
depends on B, so B should be started first and stopped last. To get that
ordering I set priority B > priority A in the configuration file.

  The problem appears when I stop the group. I want supervisor to send A a USR1
so that it can do some cleanup, which takes about 60 seconds, and then send A a
KILL to really terminate it. On its own this works: we set stopwaitsecs in the
configuration file and it's done. But because I stop the two as a group, B
receives a TERM as soon as A receives its USR1, and B terminates immediately.
In my opinion this is not desirable, since I have set stopwaitsecs for A and I
have set the priorities. I think supervisor should send USR1 to A, wait the 60
seconds, kill A with KILL, and only then send TERM to B.
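
The ordering I'm asking for can be approximated from a client today by stopping
the processes one at a time instead of as a group. This is only a minimal
sketch, not something supervisor does for you: the helper names stop_in_order
and main are made up for illustration, and the URL assumes an
[inet_http_server] listening on localhost:9001, which is not part of my
configuration excerpt. It relies on the XML-RPC call supervisor.stopProcess
with wait set to true, so each call should return only once that process has
stopped.

```python
import xmlrpc.client


def stop_in_order(supervisor, names):
    """Stop each named process in turn, waiting for it to stop
    before the next one is signalled. Returns the names stopped."""
    stopped = []
    for name in names:
        # Second argument is `wait`: block until the process has stopped.
        supervisor.stopProcess(name, True)
        stopped.append(name)
    return stopped


def main(url="http://localhost:9001/RPC2"):
    # Assumed URL; adjust to match your own supervisord HTTP server settings.
    server = xmlrpc.client.ServerProxy(url)
    # Processes in a group are addressed as "group:process".
    return stop_in_order(server.supervisor, ["somegroup:A", "somegroup:B"])
```

Calling main() against a running supervisord would stop A first (USR1, then
KILL after stopwaitsecs if needed) and signal B only afterwards.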


Here are excerpts from my configuration file:


[group:somegroup]
programs=A,B
priority=15

[program:A]
command=/home/xiaket/A
priority=10
autostart=false
autorestart=false
startsecs=0
stopsignal=USR1
stopwaitsecs=60
stopasgroup=true

[program:B]
command=/home/xiaket/B
priority=100
autostart=false
autorestart=false
startsecs=0
stopasgroup=true




Here's some debug info
----------------------

2013-04-08 11:11:25,046 BLAT read event caused by <socket._socketobject object at 0x80382c2f0>
2013-04-08 11:11:25,047 BLAT read event caused by <socket._socketobject object at 0x803963360>
2013-04-08 11:11:25,048 TRAC XML-RPC method called: supervisor.getVersion()
2013-04-08 11:11:25,048 TRAC XML-RPC method supervisor.getVersion() returned successfully
2013-04-08 11:11:25,048 TRAC localhost:0 - - [08/Apr/2013:03:11:25 +0800] "POST /RPC2 HTTP/1.1" 200 251
2013-04-08 11:11:25,048 BLAT write event caused by <socket._socketobject object at 0x803963360>
2013-04-08 11:11:25,050 BLAT read event caused by <socket._socketobject object at 0x803963360>
2013-04-08 11:11:25,050 BLAT read event caused by <socket._socketobject object at 0x80382c2f0>
2013-04-08 11:11:25,050 BLAT read event caused by <socket._socketobject object at 0x803963440>
2013-04-08 11:11:25,051 TRAC XML-RPC method called: supervisor.stopProcessGroup()
2013-04-08 11:11:25,051 TRAC XML-RPC method supervisor.stopProcessGroup() returned successfully
2013-04-08 11:11:25,051 DEBG killing A (pid 53926) process group with signal SIGUSR1
2013-04-08 11:11:25,051 BLAT write event caused by <socket._socketobject object at 0x803963440>
2013-04-08 11:11:25,052 DEBG killing B (pid 53927) process group with signal SIGTERM
2013-04-08 11:11:25,052 BLAT read event caused by <POutputDispatcher at 34419983856 for <Subprocess at 34415711368 with name B in state STOPPING> (stdout)>
2013-04-08 11:11:25,052 DEBG fd 11 closed, stopped monitoring <POutputDispatcher at 34419983856 for <Subprocess at 34415711368 with name B in state STOPPING> (stdout)>
2013-04-08 11:11:25,052 BLAT read event caused by <POutputDispatcher at 34419984072 for <Subprocess at 34415711368 with name B in state STOPPING> (stderr)>
2013-04-08 11:11:25,052 DEBG fd 15 closed, stopped monitoring <POutputDispatcher at 34419984072 for <Subprocess at 34415711368 with name B in state STOPPING> (stderr)>
2013-04-08 11:11:25,052 INFO stopped: B (terminated by SIGTERM)
2013-04-08 11:11:25,052 DEBG received SIGCHLD indicating a child quit
2013-04-08 11:11:26,053 BLAT write event caused by <socket._socketobject object at 0x803963440>
2013-04-08 11:11:26,065 BLAT read event caused by <POutputDispatcher at 34419983496 for <Subprocess at 34415712160 with name A in state STOPPING> (stdout)>
2013-04-08 11:11:26,068 DEBG 'A' stdout output:

B error:17 arg:0x800617c08

2013-04-08 11:11:26,070 BLAT read event caused by <POutputDispatcher at 34419983496 for <Subprocess at 34415712160 with name A in state STOPPING> (stdout)>
2013-04-08 11:11:26,076 DEBG fd 8 closed, stopped monitoring <POutputDispatcher at 34419983496 for <Subprocess at 34415712160 with name A in state STOPPING> (stdout)>
2013-04-08 11:11:26,076 BLAT read event caused by <POutputDispatcher at 34419983424 for <Subprocess at 34415712160 with name A in state STOPPING> (stderr)>
2013-04-08 11:11:26,076 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 34419983424 for <Subprocess at 34415712160 with name A in state STOPPING> (stderr)>
2013-04-08 11:11:26,076 INFO stopped: A (exit status 0)
2013-04-08 11:11:26,076 DEBG received SIGCHLD indicating a child quit



I think supervisor could simply hold back further process kills when there is a
priority list and one of the processes has a stopwaitsecs property.
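
To make the proposal concrete, here is a toy model of the stop sequence I have
in mind. It is purely illustrative and is not supervisor's code: Proc,
exits_after and sequential_stop are hypothetical names, time is simulated
rather than measured, and the caller passes the processes in the order they
should be stopped. The only supervisor facts it borrows are the stopsignal and
stopwaitsecs options and their defaults (TERM and 10 seconds).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Proc:
    name: str
    stopsignal: str = "TERM"
    stopwaitsecs: float = 10
    # Simulated seconds the process needs to exit after its stop signal;
    # None means it never exits on its own and has to be killed.
    exits_after: Optional[float] = 0.0


def sequential_stop(procs):
    """Return a (time, event) timeline in which each process is fully
    stopped before the next one receives its stop signal."""
    clock = 0.0
    timeline = []
    for p in procs:
        timeline.append((clock, f"{p.stopsignal} -> {p.name}"))
        if p.exits_after is not None and p.exits_after <= p.stopwaitsecs:
            # The process exits by itself within its grace period.
            clock += p.exits_after
        else:
            # Grace period expired: escalate to KILL.
            clock += p.stopwaitsecs
            timeline.append((clock, f"KILL -> {p.name}"))
        timeline.append((clock, f"{p.name} stopped"))
    return timeline


a = Proc("A", stopsignal="USR1", stopwaitsecs=60, exits_after=None)
b = Proc("B")
for when, event in sequential_stop([a, b]):
    print(f"{when:5.1f}s  {event}")
```

With these inputs B is not signalled until the 60-second mark, after A has been
killed, which is the behaviour I expected from stopping the group.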

Thanks for your patience.

夏 恺(Xia Kai)
http://blog.xiaket.org/




_______________________________________________
Supervisor-users mailing list
[email protected]
https://lists.supervisord.org/mailman/listinfo/supervisor-users
