Hi Terence,

I've found something, and I'm not sure whether it is a bug. First I start up some applications. Then I start the runner, and it finds all the live applications. Later I kill some applications, and within a minute the runner synchronizes with the state in ZooKeeper. But if I start a new application, the runner cannot find it, although I think it is already watching the state changes. Here is the log:

15:33:59.151 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 0x155de15272f01e4 after 0ms
15:34:01.881 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got notification sessionid:0x155de15272f01e4
15:34:01.881 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/ for sessionid 0x155de15272f01e4
15:34:01.888 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f01e4, packet:: clientPath:/ serverPath:/ finished:false header:: 17,12 replyHeader:: 17,4628,0 request:: '/,T response:: v{'Cpp-Application,'my-twil-app,'zookeeper,'HelloWordRunnable,'Parserwork-Test},s{0,0,0,0,0,3,0,0,0,5,4627}
15:34:01.891 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f01e4, packet:: clientPath:/Parserwork-Test/instances serverPath:/Parserwork-Test/instances finished:false header:: 18,12 replyHeader:: 18,4628,-101 request:: '/Parserwork-Test/instances,T response:: v{}
15:34:01.901 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f01e4, packet:: clientPath:/Parserwork-Test/instances serverPath:/Parserwork-Test/instances finished:false header:: 19,3 replyHeader:: 19,4628,-101 request:: '/Parserwork-Test/instances,T response::
15:34:03.676 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got notification sessionid:0x155de15272f01e4
15:34:03.676 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected type:NodeCreated path:/Parserwork-Test/instances for sessionid 0x155de15272f01e4
15:34:03.683 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f01e4, packet:: clientPath:/Parserwork-Test/instances serverPath:/Parserwork-Test/instances finished:false header:: 20,3 replyHeader:: 20,4657,0 request:: '/Parserwork-Test/instances,T response:: s{4657,4657,1471419243674,1471419243674,0,0,0,0,0,0,4657}
15:34:03.688 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x155de15272f01e4, packet:: clientPath:/Parserwork-Test/instances serverPath:/Parserwork-Test/instances finished:false header:: 21,12 replyHeader:: 21,4657,0 request:: '/Parserwork-Test/instances,T response:: v{},s{4657,4657,1471419243674,1471419243674,0,0,0,0,0,0,4657}
15:34:03.689 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got notification sessionid:0x155de15272f01e4
15:34:03.689 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/Parserwork-Test/instances for sessionid 0x155de15272f01e4
15:34:07.027 [ STARTING-SendThread(cp01-yarn-test1.epc.baidu.com:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 0x155de15272f01e4 after 6ms

Based on the log, I think the runner has seen the state changes of the application 'Parserwork-Test', but when I used the lookup API or the lookupLive API, I got nothing.
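
For reading the "Reading reply" packet lines in the log above: the `header::` field carries the request xid and op code, and `replyHeader::` carries the xid, zxid, and error code. In ZooKeeper, op code 3 is `exists`, op code 12 is `getChildren2`, and error -101 is `NONODE`. The following is only a small decoding sketch for these lines; the class `ZkReplyDecoder` and its `describe` method are hypothetical helper names, not part of ZooKeeper or Twill, and only the codes that appear in this log are mapped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: decodes "header:: xid,type replyHeader:: xid,zxid,err" in a ClientCnxn DEBUG line. */
final class ZkReplyDecoder {

    private static final Pattern HEADERS =
        Pattern.compile("header:: (-?\\d+),(-?\\d+) replyHeader:: (-?\\d+),(-?\\d+),(-?\\d+)");

    // Op codes from ZooKeeper's ZooDefs.OpCode; only the two seen in this log are named.
    private static String opName(int op) {
        switch (op) {
            case 3:  return "exists";
            case 12: return "getChildren2";
            default: return "op#" + op;
        }
    }

    // Error codes from KeeperException.Code; only the two seen in this log are named.
    private static String errName(int err) {
        switch (err) {
            case 0:    return "OK";
            case -101: return "NONODE";
            default:   return "err#" + err;
        }
    }

    /** Returns a short summary of the request/reply headers found in the given log line. */
    static String describe(String logLine) {
        Matcher m = HEADERS.matcher(logLine);
        if (!m.find()) {
            return "no reply header found";
        }
        int op = Integer.parseInt(m.group(2));
        int err = Integer.parseInt(m.group(5));
        return "xid=" + m.group(1)
            + " op=" + opName(op)
            + " zxid=" + m.group(4)
            + " result=" + errName(err);
    }

    public static void main(String[] args) {
        // Header fields copied from the 15:34:01.891 and 15:34:03.688 lines of the log.
        System.out.println(describe("header:: 18,12 replyHeader:: 18,4628,-101"));
        System.out.println(describe("header:: 21,12 replyHeader:: 21,4657,0"));
    }
}
```

Read this way, requests 18 and 19 (getChildren2 and exists on /Parserwork-Test/instances) both returned NONODE, request 20 (exists) succeeded after the NodeCreated event, and request 21 (getChildren2) succeeded with an empty child list. In the excerpt as pasted, no further read of that path appears after the final NodeChildrenChanged event.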

I'm using twill-0.7.0-incubating and ZooKeeper 3.4.6.

Thanks!
Haosu Guo

------------------ Original ------------------
From: "Terence Yim";<cht...@gmail.com>;
Date: Wed, Aug 17, 2016 02:30 PM
To: "...the end"<549198...@qq.com>;
Cc: "dev"<dev@twill.apache.org>;
Subject: Re: A question about lookupLive of TwillRunner

Hi,

From your code, I believe you are using it correctly. You mentioned that you sometimes see the newly launched app and sometimes not. Is the change eventually consistent, or does it keep flipping back and forth?

Terence

On Aug 15, 2016, at 9:40 PM, ...the end <549198...@qq.com> wrote:

Hi Terence,

Here is my code. My HTTP server extends Hadoop's AbstractService. When the service initializes, I start the runner:

    @Override
    protected void serviceInit(Configuration conf) throws Exception {
      super.serviceInit(conf);
      LOG.info("service init");
      injector = Guice.createInjector(new TwillServiceModule());
      TwillRunnerService runner = injector.getInstance(TwillRunnerService.class);
      runner.start();
    }

When the service stops, I stop the runner:

    @Override
    protected void serviceStop() throws Exception {
      LOG.info("service stop");
      TwillRunnerService runner = injector.getInstance(TwillRunnerService.class);
      runner.stop();
      context.clearAttributes();
      context.stop();
      server.stop();
      super.serviceStop();
    }

And here is how I look up all the live applications:

    @GET
    @Path("recover")
    public String recover() {
      LOG.info("service discover");
      Iterable<LiveInfo> iterable = twillRunner.lookupLive();
      Iterator<LiveInfo> iterator = iterable.iterator();
      JsonArray jsonArray = new JsonArray();
      while (iterator.hasNext()) {
        LiveInfo info = iterator.next();
        for (TwillController controller : info.getControllers()) {
          LOG.info("application {}, controller runid {}", info.getApplicationName(), controller.getRunId());
          jsonArray.add(new JsonPrimitive(String.format("application %s, controller runid %s",
              info.getApplicationName(), controller.getRunId())));
        }
      }
      twillRunner.stop();
      return jsonArray.toString();
    }

My ZooKeeper version is 3.4.6. Thanks!

Haosu Guo

------------------ Original ------------------
From: "Terence Yim";<cht...@gmail.com>;
Date: Tue, Aug 16, 2016 12:02 PM
To: "...the end"<549198...@qq.com>;
Cc: "dev"<dev@twill.apache.org>;
Subject: Re: A question about lookupLive of TwillRunner

Hi,

The ZK watches reflect the changes from ZK asynchronously, so there can be some delay. Would you mind attaching your code to show how you use it?

Terence

Sent from my iPhone

> On Aug 15, 2016, at 8:07 PM, ...the end <549198...@qq.com> wrote:
>
> Hi all,
>
> I'm using Twill, and it's a very helpful project for writing YARN applications.
> Now I've run into a problem.
>
> I want to start an HTTP server that accepts users' requests and controls their
> applications from the server. I start the TwillRunner when the server starts, and
> the runner can recover the applications that were started before the runner. But
> when I start some new applications or kill some applications, the lookupLive API
> and the lookup API sometimes do not show any change.
>
> I've read the source code of YarnTwillRunnerService.java. I saw that you
> use watchers to listen for changes in ZooKeeper, but I have no idea why it
> sometimes doesn't work. Is there something wrong with the way I use the
> TwillRunner?
>
> Hoping to hear from you soon.
> Thanks!
>
> Haosu Guo
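
Since the reply in this thread notes that ZK watches reflect changes asynchronously, a caller that needs to see a newly started application may have to poll for a bounded time instead of relying on a single lookup. The following is a minimal sketch of that idea using only the Java standard library. The `Supplier<Set<String>>` is an assumed stand-in for collecting application names from `lookupLive()`; `LiveAppWaiter`, `awaitApp`, and the timing parameters are hypothetical names for illustration and are not part of Twill.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical helper: waits a bounded time for an application name to appear in a lookup result. */
final class LiveAppWaiter {

    /**
     * Polls the supplier until it reports appName or the timeout elapses.
     * Returns true if the application was seen, false on timeout or interruption.
     */
    static boolean awaitApp(Supplier<Set<String>> liveApps, String appName,
                            long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (liveApps.get().contains(appName)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Assumed stand-in for lookupLive(): the app becomes visible about 200 ms after start.
        final long visibleAt = System.currentTimeMillis() + 200;
        Supplier<Set<String>> fakeLookup = () ->
            System.currentTimeMillis() >= visibleAt
                ? Collections.singleton("Parserwork-Test")
                : Collections.<String>emptySet();
        System.out.println(awaitApp(fakeLookup, "Parserwork-Test", 2_000, 50));
    }
}
```

A bounded poll like this only helps if the change is eventually consistent; if the new application never shows up, as in the log later in the thread, the timeout simply makes that visible to the caller.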