Hi Terence,
Thank you for your response. I'm sure the AM received the request from the TwillController. Here is the log:

2016-08-09T18:25:00,100Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 0
2016-08-09T18:25:00,080Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 1
2016-08-09T18:25:00,878Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 2
2016-08-09T18:25:00,907Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] [runnable-command-executor] CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command SimpleCommand{command='instances', options={count=2}} instanceid 1
2016-08-09T18:25:00,886Z INFO o.a.t.i.a.ApplicationMasterService [cp01-yarn-test2] [message-callback] ApplicationMasterService:handleSetInstances(ApplicationMasterService.java:735) - Received change instances request for fetcher, from 3 to 2.
2016-08-09T18:25:00,888Z INFO o.a.t.i.a.ApplicationMasterService [cp01-yarn-test2] [instanceChanger] ApplicationMasterService$6:run(ApplicationMasterService.java:756) - Processing change instance request for fetcher, from 3 to 2.
2016-08-09T18:25:00,890Z INFO o.a.t.i.a.ApplicationMasterService [cp01-yarn-test2] [instanceChanger] ApplicationMasterService$6:run(ApplicationMasterService.java:760) - Confirmed 3 containers running for fetcher.
2016-08-09T18:25:00,891Z INFO o.a.t.i.a.RunningContainers [cp01-yarn-test2] [instanceChanger] RunningContainers:removeInstanceById(RunningContainers.java:226) - Stopping service: fetcher fbd0c443-d7b5-4292-a18b-144510c499c4-2
2016-08-09T18:25:00,919Z INFO o.a.t.i.a.ApplicationMasterService [cp01-yarn-test2] [instanceChanger] ApplicationMasterService$6:run(ApplicationMasterService.java:776) - Change instances request completed. From 3 to 2.
2016-08-09T18:25:00,936Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] [runnable-command-executor] CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command SimpleCommand{command='instances', options={count=2}} instanceid 0
2016-08-09T18:25:01,100Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 0
2016-08-09T18:25:01,081Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 1

In case my actual work could not be terminated, I made the Twill runnable just sleep and log its instance ID. I've checked that the TwillLauncher process is still running, and I have no idea why this process is not killed. Also, the restart API does not kill the process either; it just launches new instances.

Thanks!
Haosu Guo

------------------ Original Message ------------------
From: "chtyim" <cht...@apache.org>
Sent: August 10, 2016 (Wednesday), 1:50
To: "...the end" <549198...@qq.com>; "dev" <dev@twill.apache.org>
Subject: Re: A question about Twill changing instances number

Hi Haosu,

Do you have the application master log? It tells whether the AM actually received the request and tried to terminate the container. Also, if you have access to the cluster, please check whether the container process has actually terminated. We've seen cases where the container process keeps running because of an unterminated user thread, so the container is never returned to YARN.

Terence

On Tue, Aug 9, 2016 at 2:22 AM, ...the end <549198...@qq.com> wrote:

Hi Terence,

I'm a user of Apache Twill and I have a question about changing the number of instances. I'm using twill-incubating-0.7.0.0, yarn-2.6.4, zookeeper-3.4.6. When I increase the number of instances, it works well. But when I try to decrease the number of instances, I think something goes wrong.
Here is the log:

16:48:53.365 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f0105, packet:: clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg finished:false header:: 21,1 replyHeader:: 21,3320,0 request:: '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg,#7b2274797065223a2253595354454d222c2273636f7065223a2252554e4e41424c45222c2272756e6e61626c654e616d65223a2266657463686572222c22636f6d6d616e64223a7b22636f6d6d616e64223a22696e7374616e636573222c226f7074696f6e73223a7b22636f756e74223a2233227d7d7d,v{s{31,s{'world,'anyone}}},2 response:: '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
16:48:53.369 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f0105, packet:: clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001 serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001 finished:false header:: 22,3 replyHeader:: 22,3320,0 request:: '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001,T response:: s{3320,3320,1470732533363,1470732533363,0,0,0,0,119,0,3320}
16:48:56.706 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 0x155de15272f0105 after 0ms
16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got notification sessionid:0x155de15272f0105
16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected type:NodeDeleted path:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001 for sessionid 0x155de15272f0105

But the number of running containers shown on YARN's 'Nodes of the Cluster' page does not decrease. I use the API like this:

    Future<Integer> future = twillController.changeInstances(name, num);
    JsonObject result = new JsonObject();
    try {
        int newCount = future.get();
        result.addProperty("status", 0);
        result.addProperty("new_count", newCount);
    } catch (InterruptedException | ExecutionException e) {
        result.addProperty("status", -1);
        result.addProperty("errMsg", e.getMessage());
        // Pass the exception itself so the logger records its message and stack trace.
        LOG.error("set container number error", e);
    }

Do you have any idea why this does not work? Hoping for your response.

Thank you!
Haosu Guo
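
A note on the reply quoted above, which mentions container processes that keep running because of an unterminated user thread: one way a sleep loop like the one described in this thread can be made to exit promptly is to wait on a latch that a stop method releases, instead of sleeping unconditionally. The following is a minimal plain-Java sketch, not Twill code and not a confirmed diagnosis of the problem reported here. The class StoppableSleeper and its stop() method are hypothetical names chosen for illustration; in a real Twill runnable the equivalent logic would presumably sit behind the runnable's own stop callback.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a runnable's work loop; it uses no Twill classes. */
class StoppableSleeper implements Runnable {
    private final int instanceId;
    // Released by stop(); the loop waits on it instead of calling Thread.sleep().
    private final CountDownLatch stopLatch = new CountDownLatch(1);

    StoppableSleeper(int instanceId) {
        this.instanceId = instanceId;
    }

    @Override
    public void run() {
        try {
            // await() returns true once stop() has counted the latch down,
            // and false when the one-second timeout elapses first.
            while (!stopLatch.await(1, TimeUnit.SECONDS)) {
                System.out.println("sleep instanceid " + instanceId);
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and fall through so the thread exits.
            Thread.currentThread().interrupt();
        }
    }

    /** Asks the loop in run() to finish; safe to call from another thread. */
    public void stop() {
        stopLatch.countDown();
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableSleeper sleeper = new StoppableSleeper(0);
        Thread worker = new Thread(sleeper, "worker");
        worker.start();
        Thread.sleep(2500);   // let the loop run for a few iterations
        sleeper.stop();       // request shutdown
        worker.join();        // returns once run() has exited
    }
}
```

If the loop instead called Thread.sleep() with no stop check, the worker thread would keep running after a stop request, and because it is a non-daemon thread the JVM would stay alive with it.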