Hi,

I see. In that case, the run thread is blocked in the sleep call. You'll 
need to implement stop() as well. It is called from a different thread than 
the one executing run(), so you can interrupt the run thread from there.
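
A minimal sketch of that pattern, assuming a runnable whose run() just sleeps 
in a loop. The class below is a plain-Java stand-in (the name SleepingRunnable 
and its fields are made up for illustration); in a real application the same 
two methods would be the run() and stop() of your TwillRunnable.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a TwillRunnable: only run() and stop() are modelled.
class SleepingRunnable implements Runnable {
    // Thread currently executing run(); written by run(), read by stop().
    private volatile Thread runThread;
    // Set by stop() so the loop also exits if no sleep is in progress.
    private volatile boolean stopped;

    @Override
    public void run() {
        runThread = Thread.currentThread();
        try {
            while (!stopped) {
                TimeUnit.SECONDS.sleep(1);  // blocks the run thread
            }
        } catch (InterruptedException e) {
            // Interrupted by stop(): fall through so run() returns.
        }
    }

    // Called from a thread other than the one executing run().
    public void stop() {
        stopped = true;
        Thread t = runThread;
        if (t != null) {
            t.interrupt();  // wakes the run thread out of sleep()
        }
    }
}
```

Recording the thread inside run() is what lets stop() reach it. The volatile 
flag covers the case where stop() arrives before run() has started sleeping.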

Terence

> On Aug 9, 2016, at 7:28 PM, ...the end <549198...@qq.com> wrote:
> 
> Hi Terence,
> 
> Thank you for your response. I'm sure the AM received the request from the 
> TwillController. Here is the log:
> 
>  2016-08-09T18:25:00,100Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
> sleep instanceid 0
> 2016-08-09T18:25:00,080Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
> sleep instanceid 1
> 2016-08-09T18:25:00,878Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
> sleep instanceid 2
> 2016-08-09T18:25:00,907Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
> [runnable-command-executor] 
> CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command 
> SimpleCommand{command='instances', options={count=2}} instanceid 1
> 2016-08-09T18:25:00,886Z INFO  o.a.t.i.a.ApplicationMasterService 
> [cp01-yarn-test2] [message-callback] 
> ApplicationMasterService:handleSetInstances(ApplicationMasterService.java:735)
>  - Received change instances request for fetcher, from 3 to 2.
> 2016-08-09T18:25:00,888Z INFO  o.a.t.i.a.ApplicationMasterService 
> [cp01-yarn-test2] [instanceChanger] 
> ApplicationMasterService$6:run(ApplicationMasterService.java:756) - 
> Processing change instance request for fetcher, from 3 to 2.
> 2016-08-09T18:25:00,890Z INFO  o.a.t.i.a.ApplicationMasterService 
> [cp01-yarn-test2] [instanceChanger] 
> ApplicationMasterService$6:run(ApplicationMasterService.java:760) - Confirmed 
> 3 containers running for fetcher.
> 2016-08-09T18:25:00,891Z INFO  o.a.t.i.a.RunningContainers [cp01-yarn-test2] 
> [instanceChanger] 
> RunningContainers:removeInstanceById(RunningContainers.java:226) - Stopping 
> service: fetcher fbd0c443-d7b5-4292-a18b-144510c499c4-2
> 2016-08-09T18:25:00,919Z INFO  o.a.t.i.a.ApplicationMasterService 
> [cp01-yarn-test2] [instanceChanger] 
> ApplicationMasterService$6:run(ApplicationMasterService.java:776) - Change 
> instances request completed. From 3 to 2.
> 2016-08-09T18:25:00,936Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
> [runnable-command-executor] 
> CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command 
> SimpleCommand{command='instances', options={count=2}} instanceid 0
> 2016-08-09T18:25:01,100Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
> sleep instanceid 0
> 2016-08-09T18:25:01,081Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
> sleep instanceid 1
> 
> Because I was afraid my real work could not be terminated, I just make the 
> Twill Runnable sleep and log its instance ID. I've checked that the 
> TwillLauncher process is still running. I have no idea why this process is 
> not killed.
> 
> Also, the restart API does not kill the process either; it just launches the 
> new instances.
> 
> Thanks!
> 
> Haosu Guo
> 
> ------------------ Original Message ------------------
> From: "chtyim";<cht...@apache.org>;
> Sent: Wednesday, August 10, 2016, 1:50 AM
> To: "...the end"<549198...@qq.com>; "dev"<dev@twill.apache.org>;
> Subject: Re: A question about Twill changing instances number
> 
> Hi Haosu,
> 
> Do you have the application master log? It tells whether the AM actually 
> received the request and tried to terminate the container. Also, if you have 
> access to the cluster, please check whether the container process has 
> actually terminated. We've seen cases where the container process keeps 
> running because of an unterminated user thread, so the container is never 
> returned to YARN.
> 
> Terence
> 
>> On Tue, Aug 9, 2016 at 2:22 AM, ...the end <549198...@qq.com> wrote:
>> hi Terence,
>> 
>> I'm a user of Apache Twill and I have a question about changing the 
>> number of instances.
>> 
>> I'm using twill-incubating-0.7.0.0, yarn-2.6.4, zookeeper-3.4.6. When I 
>> increase the number of instances, it works well. But when I try to 
>> decrease it, something seems to go wrong.
>> 
>> Here is the log:
>> 16:48:53.365 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>> org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f0105, 
>> packet:: 
>> clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg
>>  
>> serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg
>>  finished:false header:: 21,1  replyHeader:: 21,3320,0  request:: 
>> '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg,#7b2274797065223a2253595354454d222c2273636f7065223a2252554e4e41424c45222c2272756e6e61626c654e616d65223a2266657463686572222c22636f6d6d616e64223a7b22636f6d6d616e64223a22696e7374616e636573222c226f7074696f6e73223a7b22636f756e74223a2233227d7d7d,v{s{31,s{'world,'anyone}}},2
>>   response:: 
>> '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>> 16:48:53.369 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>> org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f0105, 
>> packet:: 
>> clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>>  
>> serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>>  finished:false header:: 22,3  replyHeader:: 22,3320,0  request:: 
>> '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001,T
>>   response:: s{3320,3320,1470732533363,1470732533363,0,0,0,0,119,0,3320}
>> 16:48:56.706 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
>> 0x155de15272f0105 after 0ms
>> 16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>> org.apache.zookeeper.ClientCnxn - Got notification 
>> sessionid:0x155de15272f0105
>> 16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>> org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected 
>> type:NodeDeleted 
>> path:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>>  for sessionid 0x155de15272f0105
>> 
>> But the number of running containers shown on YARN's 'Nodes of the Cluster' 
>> page does not decrease.
>> 
>> I use the API like this:
>>         Future<Integer> future = twillController.changeInstances(name, num);
>>         JsonObject result = new JsonObject();
>>         try {
>>             // Block until the future completes.
>>             int newCount = future.get();
>>             result.addProperty("status", 0);
>>             result.addProperty("new_count", newCount);
>>         } catch (InterruptedException | ExecutionException e) {
>>             result.addProperty("status", -1);
>>             result.addProperty("errMsg", e.getMessage());
>>             // Pass the exception itself so the stack trace is logged.
>>             LOG.error("set container number error", e);
>>         }
>> 
>> Do you have any idea why this does not work? I look forward to your 
>> response. Thank you!
>> 
>> Haosu Guo
> 
