You are very welcome. Please feel free to ask questions on the Twill mailing 
list, dev@twill.apache.org; we are happy to help.

Terence

> On Aug 9, 2016, at 10:45 PM, ...the end <549198...@qq.com> wrote:
> 
> Hi, 
> 
> Thank you for the advice; I'll try it! You are very helpful! 
> Thank you very much!
> 
> Haosu Guo
> 
> 
> ------------------ Original Message ------------------
> From: "Terence Yim";<cht...@gmail.com>;
> Sent: Wednesday, August 10, 2016, 1:12 PM
> To: "...the end"<549198...@qq.com>;
> Cc: "chtyim"<cht...@apache.org>; "dev"<dev@twill.apache.org>;
> Subject: Re: Re: A question about Twill changing instances number
> 
> Hi,
> 
> I see. In that case, the run thread is blocked in the sleep call. 
> You'll need to implement stop() as well. It is called from a different 
> thread than the run() thread, so you can interrupt the run thread from 
> the stop thread.
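
The pattern described above can be sketched in plain Java. This is a minimal illustration, not code from the thread: the class name InterruptibleSleeper and its fields are hypothetical, and it implements java.lang.Runnable only so that it runs without Twill on the classpath. In a real Twill application the same run()/stop() pair would presumably live in a class extending org.apache.twill.api.AbstractTwillRunnable.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a Twill runnable. It uses java.lang.Runnable so the
// sketch is self-contained; a real one would extend AbstractTwillRunnable.
class InterruptibleSleeper implements Runnable {
    // Remembers which thread is executing run() so stop() can interrupt it.
    private volatile Thread runThread;
    // Lets run() exit even if stop() arrives while the thread is not sleeping.
    private volatile boolean stopped;

    @Override
    public void run() {
        runThread = Thread.currentThread();
        try {
            while (!stopped) {
                // Placeholder for real work; the runnable in the thread only sleeps and logs.
                TimeUnit.SECONDS.sleep(1);
            }
        } catch (InterruptedException e) {
            // stop() interrupted the sleep: restore the flag and let run() return.
            Thread.currentThread().interrupt();
        }
    }

    // Invoked from a different thread than the one executing run().
    public void stop() {
        stopped = true;
        Thread t = runThread;
        if (t != null) {
            t.interrupt();
        }
    }
}
```

Once run() returns, the container's user thread is gone, so the container process can exit and be released back to YARN. The stopped flag covers the case where stop() arrives while the run thread is between sleeps.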
> 
> Terence
> 
>> On Aug 9, 2016, at 7:28 PM, ...the end <549198...@qq.com> wrote:
>> 
>> Hi Terence,
>> 
>> Thank you for your response. I'm sure the AM received the request from the 
>> TwillController. Here is the log:
>> 
>>  2016-08-09T18:25:00,100Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
>> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
>> sleep instanceid 0
>> 2016-08-09T18:25:00,080Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
>> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
>> sleep instanceid 1
>> 2016-08-09T18:25:00,878Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
>> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
>> sleep instanceid 2
>> 2016-08-09T18:25:00,907Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
>> [runnable-command-executor] 
>> CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command 
>> SimpleCommand{command='instances', options={count=2}} instanceid 1
>> 2016-08-09T18:25:00,886Z INFO  o.a.t.i.a.ApplicationMasterService 
>> [cp01-yarn-test2] [message-callback] 
>> ApplicationMasterService:handleSetInstances(ApplicationMasterService.java:735)
>>  - Received change instances request for fetcher, from 3 to 2.
>> 2016-08-09T18:25:00,888Z INFO  o.a.t.i.a.ApplicationMasterService 
>> [cp01-yarn-test2] [instanceChanger] 
>> ApplicationMasterService$6:run(ApplicationMasterService.java:756) - 
>> Processing change instance request for fetcher, from 3 to 2.
>> 2016-08-09T18:25:00,890Z INFO  o.a.t.i.a.ApplicationMasterService 
>> [cp01-yarn-test2] [instanceChanger] 
>> ApplicationMasterService$6:run(ApplicationMasterService.java:760) - 
>> Confirmed 3 containers running for fetcher.
>> 2016-08-09T18:25:00,891Z INFO  o.a.t.i.a.RunningContainers [cp01-yarn-test2] 
>> [instanceChanger] 
>> RunningContainers:removeInstanceById(RunningContainers.java:226) - Stopping 
>> service: fetcher fbd0c443-d7b5-4292-a18b-144510c499c4-2
>> 2016-08-09T18:25:00,919Z INFO  o.a.t.i.a.ApplicationMasterService 
>> [cp01-yarn-test2] [instanceChanger] 
>> ApplicationMasterService$6:run(ApplicationMasterService.java:776) - Change 
>> instances request completed. From 3 to 2.
>> 2016-08-09T18:25:00,936Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
>> [runnable-command-executor] 
>> CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command 
>> SimpleCommand{command='instances', options={count=2}} instanceid 0
>> 2016-08-09T18:25:01,100Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] 
>> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
>> sleep instanceid 0
>> 2016-08-09T18:25:01,081Z INFO  c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] 
>> [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - 
>> sleep instanceid 1
>> 
>> Because I was worried that my actual work could not be terminated, the Twill 
>> Runnable just sleeps and logs the instance number. I've checked that the 
>> TwillLauncher process is still running. I have no idea why this process was 
>> not killed.
>> 
>> Also, the restart API does not kill the process either; it just launches the 
>> new instances.
>> 
>> Thanks!
>> 
>> Haosu Guo
>> 
>> ------------------ Original Message ------------------
>> From: "chtyim";<cht...@apache.org>;
>> Sent: Wednesday, August 10, 2016, 1:50 AM
>> To: "...the end"<549198...@qq.com>; "dev"<dev@twill.apache.org>;
>> Subject: Re: A question about Twill changing instances number
>> 
>> Hi Haosu,
>> 
>> Do you have the application master log? It tells whether the AM actually 
>> received the request and tried to terminate the container. Also, if you have 
>> access to the cluster, please check whether the container process has 
>> actually terminated. We've seen cases where the container process is still 
>> running because of an unterminated user thread, so the container is never 
>> returned to YARN.
>> 
>> Terence
>> 
>>> On Tue, Aug 9, 2016 at 2:22 AM, ...the end <549198...@qq.com> wrote:
>>> Hi Terence,
>>> 
>>> I'm a user of Apache Twill and I have a question about changing the 
>>> number of instances.
>>> 
>>> I'm using twill-incubating-0.7.0.0, yarn-2.6.4, zookeeper-3.4.6. When I 
>>> increase the number of instances, it works well. But when I try to 
>>> decrease the number of instances, something seems to go wrong.
>>> 
>>> Here is the log:
>>> 16:48:53.365 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>>> org.apache.zookeeper.ClientCnxn - Reading reply 
>>> sessionid:0x155de15272f0105, packet:: 
>>> clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg
>>>  
>>> serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg
>>>  finished:false header:: 21,1  replyHeader:: 21,3320,0  request:: 
>>> '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg,#7b2274797065223a2253595354454d222c2273636f7065223a2252554e4e41424c45222c2272756e6e61626c654e616d65223a2266657463686572222c22636f6d6d616e64223a7b22636f6d6d616e64223a22696e7374616e636573222c226f7074696f6e73223a7b22636f756e74223a2233227d7d7d,v{s{31,s{'world,'anyone}}},2
>>>   response:: 
>>> '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>>> 16:48:53.369 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>>> org.apache.zookeeper.ClientCnxn - Reading reply 
>>> sessionid:0x155de15272f0105, packet:: 
>>> clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>>>  
>>> serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>>>  finished:false header:: 22,3  replyHeader:: 22,3320,0  request:: 
>>> '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001,T
>>>   response:: s{3320,3320,1470732533363,1470732533363,0,0,0,0,119,0,3320}
>>> 16:48:56.706 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>>> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
>>> 0x155de15272f0105 after 0ms
>>> 16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>>> org.apache.zookeeper.ClientCnxn - Got notification 
>>> sessionid:0x155de15272f0105
>>> 16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG 
>>> org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected 
>>> type:NodeDeleted 
>>> path:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>>>  for sessionid 0x155de15272f0105
>>> 
>>> But the number of running containers shown on the 'Nodes of the Cluster' 
>>> page in YARN does not decrease.
>>> 
>>> I use the API like this:
>>>         Future<Integer> future = twillController.changeInstances(name, num);
>>>         JsonObject result = new JsonObject();
>>>         try {
>>>             int newCount = future.get();
>>>             result.addProperty("status", 0);
>>>             result.addProperty("new_count", newCount);
>>>         } catch (InterruptedException | ExecutionException e) {
>>>             result.addProperty("status", -1);
>>>             result.addProperty("errMsg", e.getMessage());
>>>             // Pass the exception itself so the logger records the stack trace.
>>>             LOG.error("set container number error", e);
>>>         }
>>> 
>>> Do you have any idea why this does not work? I look forward to your 
>>> response. Thank you!
>>> 
>>> Haosu Guo
>> 
