You are very welcome. Please feel free to ask questions on the twill mailing list dev@twill.apache.org and we are very happy to help.
Terence

Sent from my iPhone

> On Aug 9, 2016, at 10:45 PM, ...the end <549198...@qq.com> wrote:
>
> Hi,
>
> Thank you for the advice; I'll try it! You are very helpful!
> Thank you very much!
>
> Haosu Guo
>
>
> ------------------ Original Message ------------------
> From: "Terence Yim";<cht...@gmail.com>;
> Sent: Wednesday, August 10, 2016, 1:12 PM
> To: "...the end"<549198...@qq.com>;
> Cc: "chtyim"<cht...@apache.org>; "dev"<dev@twill.apache.org>;
> Subject: Re: Reply: A question about Twill changing instances number
>
> Hi,
>
> I see. In that case, the run thread is being blocked by the sleep call.
> You'll need to implement stop() as well, which will be called from a
> different thread than the run() thread, and you can interrupt the run thread
> from the stop thread.
>
> Terence
>
> Sent from my iPhone
>
>> On Aug 9, 2016, at 7:28 PM, ...the end <549198...@qq.com> wrote:
>>
>> Hi Terence,
>>
>> Thank you for your response. I'm sure the AM received the request from the
>> TwillController. Here is the log:
>>
>> 2016-08-09T18:25:00,100Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 0
>> 2016-08-09T18:25:00,080Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 1
>> 2016-08-09T18:25:00,878Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 2
>> 2016-08-09T18:25:00,907Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] [runnable-command-executor] CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command SimpleCommand{command='instances', options={count=2}} instanceid 1
>> 2016-08-09T18:25:00,886Z INFO o.a.t.i.a.ApplicationMasterService [cp01-yarn-test2] [message-callback] ApplicationMasterService:handleSetInstances(ApplicationMasterService.java:735) - Received change instances request for fetcher, from 3 to 2.
>> 2016-08-09T18:25:00,888Z INFO o.a.t.i.a.ApplicationMasterService [cp01-yarn-test2] [instanceChanger] ApplicationMasterService$6:run(ApplicationMasterService.java:756) - Processing change instance request for fetcher, from 3 to 2.
>> 2016-08-09T18:25:00,890Z INFO o.a.t.i.a.ApplicationMasterService [cp01-yarn-test2] [instanceChanger] ApplicationMasterService$6:run(ApplicationMasterService.java:760) - Confirmed 3 containers running for fetcher.
>> 2016-08-09T18:25:00,891Z INFO o.a.t.i.a.RunningContainers [cp01-yarn-test2] [instanceChanger] RunningContainers:removeInstanceById(RunningContainers.java:226) - Stopping service: fetcher fbd0c443-d7b5-4292-a18b-144510c499c4-2
>> 2016-08-09T18:25:00,919Z INFO o.a.t.i.a.ApplicationMasterService [cp01-yarn-test2] [instanceChanger] ApplicationMasterService$6:run(ApplicationMasterService.java:776) - Change instances request completed. From 3 to 2.
>> 2016-08-09T18:25:00,936Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] [runnable-command-executor] CppShellRunnable:handleCommand(CppShellRunnable.java:98) - handle command SimpleCommand{command='instances', options={count=2}} instanceid 0
>> 2016-08-09T18:25:01,100Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test2] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 0
>> 2016-08-09T18:25:01,081Z INFO c.b.b.o.c.CppShellRunnable [cp01-yarn-test3] [TwillContainerService] CppShellRunnable:run(CppShellRunnable.java:87) - sleep instanceid 1
>>
>> To rule out my real work being the part that cannot be terminated, I just
>> make the Twill Runnable sleep and log the instance number. I've checked that
>> the TwillLauncher process is still running. I have no idea why this process
>> was not killed.
>>
>> Also, the restart API does not kill the process either; it just launches the
>> new instances.
>>
>> Thanks!
>>
>> Haosu Guo
>>
>> ------------------ Original Message ------------------
>> From: "chtyim";<cht...@apache.org>;
>> Sent: Wednesday, August 10, 2016, 1:50 AM
>> To: "...the end"<549198...@qq.com>; "dev"<dev@twill.apache.org>;
>> Subject: Re: A question about Twill changing instances number
>>
>> Hi Haosu,
>>
>> Do you have the application master log? It tells whether the AM actually
>> received the request and tried to terminate the container. Also, if you have
>> access to the cluster, please check whether the container process has
>> actually terminated. We've seen cases where the container process is
>> still running because of an unterminated user thread, so the container is
>> never returned to YARN.
>>
>> Terence
>>
>>> On Tue, Aug 9, 2016 at 2:22 AM, ...the end <549198...@qq.com> wrote:
>>> Hi Terence,
>>>
>>> I'm a user of Apache Twill and I have a question about changing the
>>> number of instances.
>>>
>>> I'm using twill-incubating-0.7.0.0, yarn-2.6.4, zookeeper-3.4.6. When I
>>> increase the number of instances, it works well. But when I try to
>>> decrease the number of instances, I think something goes wrong.
>>>
>>> Here is the log:
>>> 16:48:53.365 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f0105, packet:: clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg finished:false header:: 21,1 replyHeader:: 21,3320,0 request:: '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg,#7b2274797065223a2253595354454d222c2273636f7065223a2252554e4e41424c45222c2272756e6e61626c654e616d65223a2266657463686572222c22636f6d6d616e64223a7b22636f6d6d616e64223a22696e7374616e636573222c226f7074696f6e73223a7b22636f756e74223a2233227d7d7d,v{s{31,s{'world,'anyone}}},2 response:: '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001
>>> 16:48:53.369 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f0105, packet:: clientPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001 serverPath:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001 finished:false header:: 22,3 replyHeader:: 22,3320,0 request:: '/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001,T response:: s{3320,3320,1470732533363,1470732533363,0,0,0,0,119,0,3320}
>>> 16:48:56.706 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 0x155de15272f0105 after 0ms
>>> 16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got notification sessionid:0x155de15272f0105
>>> 16:48:59.354 [ STARTING-SendThread(cp01-yarn-test1:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected type:NodeDeleted path:/Cpp-Application/8047096c-0a25-40ec-8f21-ca8569c40f8c/messages/msg0000000001 for sessionid 0x155de15272f0105
>>>
>>> But the number of running containers shown on YARN's 'Nodes of the Cluster'
>>> page does not decrease.
>>>
>>> I use the API like this:
>>> Future<Integer> future = twillController.changeInstances(name, num);
>>> JsonObject result = new JsonObject();
>>> try {
>>>   // Block until the instance change request completes
>>>   int newCount = future.get();
>>>   result.addProperty("status", 0);
>>>   result.addProperty("new_count", newCount);
>>> } catch (InterruptedException | ExecutionException e) {
>>>   result.addProperty("status", -1);
>>>   result.addProperty("errMsg", e.getMessage());
>>>   // Pass the exception itself so the logger records the stack trace
>>>   LOG.error("set container number error", e);
>>> }
>>>
>>> Do you have any idea why this does not work? I look forward to your response.
>>> Thank you!
>>>
>>> Haosu Guo
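
A minimal sketch of the fix suggested in the thread (implement stop() and interrupt the run thread from it). The class name SleepingRunnable is hypothetical, and a plain java.lang.Runnable stands in for the Twill type so the example compiles without Twill on the classpath; in a real application the class would presumably extend org.apache.twill.api.AbstractTwillRunnable and override its stop() method instead.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a TwillRunnable whose run() blocks in sleep().
class SleepingRunnable implements Runnable {
    // Thread executing run(); recorded so stop() can interrupt it.
    private volatile Thread runThread;
    // Set by stop() so run() also exits if it is between sleeps.
    private volatile boolean stopped;

    @Override
    public void run() {
        runThread = Thread.currentThread();
        try {
            while (!stopped) {
                // Placeholder for real work; blocks the run thread.
                TimeUnit.SECONDS.sleep(1);
            }
        } catch (InterruptedException e) {
            // Interrupted by stop(): restore the flag and let run() return.
            Thread.currentThread().interrupt();
        }
    }

    // Called from a different thread than the one executing run().
    public void stop() {
        stopped = true;
        Thread t = runThread;
        if (t != null) {
            t.interrupt();
        }
    }
}
```

Setting the flag before interrupting covers both cases: a thread blocked in sleep() is woken by the interrupt, and a thread that is between sleeps sees the flag on its next loop check. Once run() returns and no other non-daemon user threads remain, the container process can exit.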