I have created https://issues.apache.org/jira/browse/TWILL-202 to limit the time spent waiting for resources from YARN, so that requests in the queue can be processed.
Thanks,
Yuliya

On Sat, Dec 24, 2016 at 11:24 PM, Yuliya <yul...@dremio.com> wrote:

> Thank you for the replies
>
> Comments inline
>
> > On Dec 24, 2016, at 10:38 PM, Terence Yim <cht...@gmail.com> wrote:
> >
> > Hi,
> >
> > 1. I see what you mean now. The reason why Twill currently waits for all
> > the requested containers to be up and running before changing the number
> > of containers again is mainly to provide a more deterministic state
> > transition for the runnable lifecycle, in case the application logic is
> > sensitive to the number of instances. However, I do agree that Twill can
> > provide a more flexible way to let the application decide whether
> > waiting is needed or not. Would you mind opening a JIRA for the
> > improvement?
>
> I will open a JIRA - thank you
>
> > 2. The TwillRunner is designed to survive process restart with the
> > ability to rediscover all the running Twill applications via ZooKeeper.
> > However, due to the nature of async operations in ZK, you might need to
> > call "lookup" a couple of times before all the necessary information is
> > synced up from ZK after the process restarts.
>
> Interesting - let me try - it does not look like this from the code, but I
> may be missing something
>
> > Terence
> >
> >> On Fri, Dec 23, 2016 at 11:18 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> >>
> >> Thank you very much for the reply
> >> Please see inline
> >>
> >>> On Fri, Dec 23, 2016 at 11:10 AM, Terence Yim <cht...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> 1. It really depends on how many resources your application needs.
> >>> Twill simply acts as a bridge between your app and YARN; however, the
> >>> YARN cluster itself needs to have enough resources (memory and vcores)
> >>> to run your application.
> >>
> >> I definitely agree that YARN should have capacity. What I am trying to
> >> say is that if I want to change my mind and resize a 2nd time before the
> >> 1st request was satisfied, I can not do it. What if I mistyped the number
> >> of requested containers - put 100 instead of 10 - and YARN will never
> >> have this capacity? If I change back to 10 it won't change it unless 100
> >> is satisfied.
> >>
> >>> 2. You should be able to do that through the TwillRunner.lookup method.
> >>> Do you mean you tried but it doesn't return anything?
> >>
> >> TwillRunner.lookup works ONLY if the application that uses
> >> TwillRunner.lookup (the YARN/Twill client, in other words) NEVER
> >> restarted - if it restarted, all the information is lost and I am not
> >> sure how to make TwillRunner obtain it again from the running cluster.
> >>
> >>> Terence
> >>>
> >>>> On Thu, Dec 22, 2016 at 2:20 PM, Yuliya Feldman <yul...@dremio.com> wrote:
> >>>>
> >>>> Hello there,
> >>>>
> >>>> I started using Twill recently and I came across a couple of issues I
> >>>> wanted to check on:
> >>>>
> >>>> 1. If I resize the YARN cluster to more capacity than it can handle, I
> >>>> can't resize down, as it did not satisfy the first request
> >>>>
> >>>> 2. If my application that spawns the Twill YARN cluster restarts
> >>>> (meaning I am losing YarnTwillRunnerService), I can not get hold of the
> >>>> TwillController after it, even if I know the runId and whatnot.
> >>>>
> >>>> Could anybody advise/confirm/deny on the issues I am seeing?
> >>>>
> >>>> Thanks in advance
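
For reference, the suggestion in the thread to call "lookup" a couple of times after a client restart could be sketched as a small retry loop. This is only an illustration under stated assumptions: `LookupRetry`, `retryUntilPresent`, the attempt count and the delay are hypothetical names and values, not part of Twill, and the sketch assumes the lookup returns null until the state has been synced from ZooKeeper. The lookup is passed in as a plain `Supplier` so the example runs without a cluster; in a real client the supplier would presumably wrap a call such as `runner.lookup(appName, runId)`.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public final class LookupRetry {
    private LookupRetry() {}

    /**
     * Calls the given lookup until it returns a non-null value or the
     * attempts run out, sleeping between attempts.
     * Hypothetical helper: not part of the Twill API.
     */
    public static <T> Optional<T> retryUntilPresent(
            Supplier<T> lookup, int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = lookup.get();
            if (result != null) {
                return Optional.of(result);
            }
            // Give the async ZooKeeper sync some time before asking again.
            if (attempt < maxAttempts) {
                Thread.sleep(delayMillis);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the real lookup (assumption): returns null on the
        // first two calls, then a value, to mimic state arriving late.
        AtomicInteger calls = new AtomicInteger();
        Supplier<String> fakeLookup =
                () -> calls.incrementAndGet() < 3 ? null : "controller";

        Optional<String> found = retryUntilPresent(fakeLookup, 5, 10L);
        System.out.println("found=" + found.isPresent()
                + ", calls=" + calls.get());
    }
}
```

A bounded number of attempts keeps a restarted client from blocking forever when the application is really gone, which is a different case from the state simply not having been synced yet.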