Hi Terence,

Thank you for your response. I'm sure the AM received the request from the 
TwillController. Here is the log:


 2016-08-09T18:25:00,100Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
[TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep 
instanceid 0
2016-08-09T18:25:00,080Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
[TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep 
instanceid 1
2016-08-09T18:25:00,878Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
[TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep 
instanceid 2
2016-08-09T18:25:00,907Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
[runnable-command-executor] 
CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command 
SimpleCommand{command='instances', options={count=2}} instanceid 1
2016-08-09T18:25:00,886Z INFO  o.a.t.i.a.ApplicationMasterService 
[cp01-yarn-test2] [message-callback] 
ApplicationMasterService:handleSetInstances(ApplicationMasterService.java:735) 
- Received change instances request for fetcher, from 3 to 2.
2016-08-09T18:25:00,888Z INFO  o.a.t.i.a.ApplicationMasterService 
[cp01-yarn-test2] [instanceChanger] 
ApplicationMasterService$6:run(ApplicationMasterService.java:756) - Processing 
change instance request for fetcher, from 3 to 2.
2016-08-09T18:25:00,890Z INFO  o.a.t.i.a.ApplicationMasterService 
[cp01-yarn-test2] [instanceChanger] 
ApplicationMasterService$6:run(ApplicationMasterService.java:760) - Confirmed 3 
containers running for fetcher.
2016-08-09T18:25:00,891Z INFO  o.a.t.i.a.RunningContainers [cp01-yarn-test2] 
[instanceChanger] 
RunningContainers:removeInstanceById(RunningContainers.java:226) - Stopping 
service: fetcher fbd0c443-d7b5-4292-a18b-144510c499c4-2
2016-08-09T18:25:00,919Z INFO  o.a.t.i.a.ApplicationMasterService 
[cp01-yarn-test2] [instanceChanger] 
ApplicationMasterService$6:run(ApplicationMasterService.java:776) - Change 
instances request completed. From 3 to 2.
2016-08-09T18:25:00,936Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
[runnable-command-executor] 
CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command 
SimpleCommand{command='instances', options={count=2}} instanceid 0
2016-08-09T18:25:01,100Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
[TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep 
instanceid 0
2016-08-09T18:25:01,081Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
[TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep 
instanceid 1


Because I was afraid that my real work could not be terminated, I just make the 
TwillRunnable sleep and log the instance ID. I've checked that the TwillLauncher 
process is still running. I have no idea why this process is not killed.


Also, the restart API does not kill the process either; it just launches the 
new instances.


Thanks!


Haosu Guo


------------------ Original Message ------------------
From: "chtyim";<cht...@apache.org>;
Sent: Aug 10, 2016 (Wednesday) 1:50
To: "...the end"<549198...@qq.com>; "dev"<dev@twill.apache.org>; 

Subject: Re: A question about Twill changing instances number



Hi Haosu,

Do you have the application master log? It tells whether the AM actually 
received the request and tried to terminate the container. Also, if you have 
access to the cluster, please check whether the container process has actually 
terminated. We've seen cases where the container process keeps running because 
of an unterminated user thread, so the container is never returned to YARN.
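
As a minimal sketch of the pattern that avoids this, assuming the container 
stays alive because run() never returns: the loop below waits on a latch 
instead of sleeping unconditionally, and stop() releases it. This is plain 
Java, not the Twill API; the class name StoppableSleepLoop and the interval 
parameter are made up for illustration. In a TwillRunnable, the same latch 
would be released from its stop() method.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a runnable whose run() loop can be ended by stop().
public class StoppableSleepLoop implements Runnable {
    private final CountDownLatch stopLatch = new CountDownLatch(1);
    private final long intervalMillis;

    public StoppableSleepLoop(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    @Override
    public void run() {
        try {
            // await() returns false on timeout and true once stop() was called,
            // so the loop ends as soon as a stop is requested.
            while (!stopLatch.await(intervalMillis, TimeUnit.MILLISECONDS)) {
                System.out.println("sleep");
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and fall through so run() returns.
            Thread.currentThread().interrupt();
        }
    }

    // Called from another thread to ask run() to return.
    public void stop() {
        stopLatch.countDown();
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableSleepLoop loop = new StoppableSleepLoop(1000);
        Thread worker = new Thread(loop, "worker");
        worker.start();
        Thread.sleep(100);
        loop.stop();
        worker.join(5000);
        System.out.println("worker alive after stop: " + worker.isAlive());
    }
}
```

Any extra threads started by user code would need the same treatment (or be 
daemon threads), otherwise the JVM keeps running after run() returns.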


Terence


On Tue, Aug 9, 2016 at 2:22 AM, ...the end <549198...@qq.com> wrote:
Hi Terence,


I'm a user of Apache Twill and now I have a question about changing the 
instance number.


I'm using twill-incubating-0.7.0.0, yarn-2.6.4, zookeeper-3.4.6. When I 
increase the number of instances, it works well. But when I try to decrease 
the number of instances, something seems to go wrong.


Here is the log:
16:48:53.365 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f0105, 
packet:: 
clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg 
serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg 
finished:false header:: 21,1  replyHeader:: 21,3320,0  request:: 
'/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg,#7b2274797065223a2253595354454d222c2273636f7065223a2252554e4e41424c45222c2272756e6e61626c654e616d65223a2266657463686572222c22636f6d6d616e64223a7b22636f6d6d616e64223a22696e7374616e636573222c226f7074696f6e73223a7b22636f756e74223a2233227d7d7d,v{s{31,s{'world,'anyone}}},2
  response:: 
'/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
16:48:53.369 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f0105, 
packet:: 
clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
 
serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
 finished:false header:: 22,3  replyHeader:: 22,3320,0  request:: 
'/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001,T 
 response:: s{3320,3320,1470732533363,1470732533363,0,0,0,0,119,0,3320}
16:48:56.706 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
0x155de15272f0105 after 0ms
16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
org.apache.zookeeper.ClientCnxn - Got notification sessionid:0x155de15272f0105
16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected 
type:NodeDeleted 
path:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
 for sessionid 0x155de15272f0105
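
The hex string after the '#' in the create request above is the message body 
written to ZooKeeper. A small sketch for decoding it, in case it helps when 
reading such logs; the class and method names here are made up for 
illustration, and only the hex string itself comes from the log.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes the hex-encoded payload printed by the ZooKeeper client log.
public class ZkPayloadDecoder {
    // Payload copied from the create request in the log above (the leading '#' is dropped).
    static final String PAYLOAD_HEX =
        "7b2274797065223a2253595354454d222c"
        + "2273636f7065223a2252554e4e41424c45222c"
        + "2272756e6e61626c654e616d65223a2266657463686572222c"
        + "22636f6d6d616e64223a7b"
        + "22636f6d6d616e64223a22696e7374616e636573222c"
        + "226f7074696f6e73223a7b"
        + "22636f756e74223a2233227d7d7d";

    // Converts each pair of hex digits to one byte and reads the result as UTF-8.
    static String decodeHex(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints {"type":"SYSTEM","scope":"RUNNABLE","runnableName":"fetcher","command":{"command":"instances","options":{"count":"3"}}}
        System.out.println(decodeHex(PAYLOAD_HEX));
    }
}
```

Decoded, this particular message is an 'instances' command for the runnable 
'fetcher' with count 3.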



But the number of running containers shown on the 'Nodes of the Cluster' page 
in YARN does not decrease.


I use the API like this:

        // Ask the AM to resize the runnable; the future carries the new count.
        Future<Integer> future = twillController.changeInstances(name, num);
        JsonObject result = new JsonObject();
        try {
            int newCount = future.get();
            result.addProperty("status", 0);
            result.addProperty("new_count", newCount);
        } catch (InterruptedException | ExecutionException e) {
            result.addProperty("status", -1);
            result.addProperty("errMsg", e.getMessage());
            // Pass the exception itself so the stack trace is logged.
            LOG.error("set container number error", e);
        }




Do you have any idea why this does not work? I look forward to your response. 
Thank you!


Haosu Guo
