Trunk build seems stable now after the merge. A couple of yarn trunk builds also ran fine such as: https://builds.apache.org/job/Hadoop-Yarn-trunk/1169/ .
I've merged this into branch-2 so as to minimize YARN dev effort across branches as per the discussion thread of YARN-1197 merge. Thanks, Wangda On Wed, Sep 23, 2015 at 3:15 PM, Wangda Tan <wheele...@gmail.com> wrote: > Thanks Karthik, > > Doing end-to-end test as well as regression test is the top priority of > this feature. Will keep you updated. > > Regards, > Wangda > > On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <ka...@cloudera.com> > wrote: > >> I did skim through the patch before voting :) >> >> As I mentioned earlier, I understand vast majority of the code is net new, >> and the likelihood of breaking changes is low. Still, I would like for us >> to be more careful and run some end-to-end and regression tests before >> including this in a release. While I won't block the merge to branch-2, >> will be a -1 on cutting a release without due testing. >> >> At the risk of digressing, I also feel branch-2 should be in a releasable >> state more often than not. The only way to bring any cadence and >> predictability to our releases is by maintaining branch-2 in such a state. >> >> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wheele...@gmail.com> wrote: >> >> > Hi Karthik, >> > >> > Let me elaborate more to make you feel better of this change, don't be >> > scared by the size of the patch :) >> > >> > Common RM/Scheduler part: >> > - AbstractYarnScheduler new logic only. >> > - AppSchedulingInfo new logic only. >> > - RMContainer / RMNode state machine, new logic only >> > - SchedulerApplicationAttempt / Allocation, refactoring to existing >> > reservation logic so increase request reservation can reuse it, and >> > refactored to simply updating container token / pull container part so >> > increase/decrease/new-allocation can reuse same code. >> > >> > FairScheduler: >> > - Small change since we updated how to pull container updated token. I >> > believe it will be a straightforward change for you if you take a closer >> > look at it. >> > >> > CapacityScheduler: >> > - Most changes are separate logic or small refactorings, most complex >> > allocation logic stays within IncreaseContainerAllocator.java. >> > >> > Please let me know where you want to get more details of >> implementations. >> > >> > I strongly suggest you to take a glance at the diff, we have already >> worked >> > on the merge for the past one week, and we've paid a lot of extra time >> to >> > keep YARN-1197 sync with trunk in the past several months. After this >> merge >> > finished, a couple of weeks needed to finish end-to-end test and some >> other >> > extra tests, it won't affect our upcoming branch-2 release. >> > >> > I would not prefer to merge to trunk only, all people working on RM side >> > will be affected, we're very carefully avoid such divergence of RM in >> > trunk/branch-2. Since nobody wants to create two different patches for >> > every RM changes. And also, after this finished, other efforts can >> happen >> > in parallel such as YARN-4091. >> > >> > Let me know if you have any other questions/concerns. >> > >> > Thanks, >> > Wangda >> > >> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <ka...@cloudera.com> >> > wrote: >> > >> > > I am sorry, but merging a potentially disruptive change to branch-2 >> > without >> > > end-to-end tests seems too disruptive to me. >> > > >> > > I do agree with you on the potential inconvenience of having to post >> > > different patches for trunk and branch-2, but I would rather have that >> > > inconvenience than the risk of merging something that hasn't been >> > > thoroughly tested. >> > > >> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wheele...@gmail.com> >> wrote: >> > > >> > > > Hi Karthik, >> > > > >> > > > Thanks for comments! However, I think only merge to trunk may not >> work, >> > > > this patch involves thousands lines of code changes in scheduler >> side, >> > > only >> > > > putting that to trunk could lead to trunk/branch-2 totally >> incompatible >> > > for >> > > > resource manager. I think most of the code changes are new to >> scheduler >> > > > instead of modifying existed logic, they're not very tricky to me. >> And >> > > when >> > > > 2.8 will be released is not planned yet, at least we have a couple >> of >> > > > months to make sure this feature becomes usable and not cause >> existing >> > > > behavior regressions. >> > > > >> > > > Sounds good to you? >> > > > >> > > > Wangda >> > > > >> > > > >> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla < >> ka...@cloudera.com> >> > > > wrote: >> > > > >> > > > > +1 on merging to trunk. It would be nice to have some amount of >> > testing >> > > > > done before the merge, but I understand how merging to trunk would >> > > likely >> > > > > speed up the testing efforts. >> > > > > >> > > > > Let us not merge into branch-2 until after we have done a fair >> bit of >> > > > > testing, and are comfortable including it in a release. While the >> > code >> > > > > mostly appears to not mess with existing scheduling logic, I am >> > > concerned >> > > > > about regressions to existing scheduling behavior. >> > > > > >> > > > > >> > > > > >> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla < >> > ka...@cloudera.com> >> > > > > wrote: >> > > > > >> > > > > > By the way, for the purposes of merge vote, I believe a >> committer's >> > > > vote >> > > > > > is binding. So, Wangda and Zhihai's votes should be binding. :) >> > > > > > >> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <z...@cloudera.com> >> > > wrote: >> > > > > > >> > > > > >> +1 (non-binding) >> > > > > >> >> > > > > >> thanks >> > > > > >> Zhihai Xu >> > > > > >> >> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong < >> > xg...@hortonworks.com> >> > > > > >> wrote: >> > > > > >> >> > > > > >> > +1 Binding >> > > > > >> > >> > > > > >> > Thanks >> > > > > >> > >> > > > > >> > Xuan Gong >> > > > > >> > >> > > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du < >> j...@hortonworks.com >> > > >> > > > > wrote: >> > > > > >> > > >> > > > > >> > > +1. (Binding). >> > > > > >> > > >> > > > > >> > > Thanks, >> > > > > >> > > >> > > > > >> > > Junping >> > > > > >> > > ________________________________________ >> > > > > >> > > From: Wangda Tan <wheele...@gmail.com> >> > > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM >> > > > > >> > > To: yarn-dev@hadoop.apache.org >> > > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into >> > trunk >> > > > > >> > > >> > > > > >> > > +1 (non-binding), >> > > > > >> > > >> > > > > >> > > Thanks Jian starting this thread. This can minimize effort >> of >> > > > works >> > > > > >> > across branches. >> > > > > >> > > >> > > > > >> > > To clarify, this feature is end-to-end code completed, we >> > have >> > > > API, >> > > > > >> > rm/nm implementations patches committed, but we haven't >> tested >> > it >> > > > > >> > end-to-end. Filed YARN-4175 to create an example program to >> test >> > > it >> > > > > >> > end-to-end. >> > > > > >> > > >> > > > > >> > > Regards, >> > > > > >> > > Wangda >> > > > > >> > > >> > > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <j...@hortonworks.com >> > >> > > > wrote: >> > > > > >> > >> >> > > > > >> > >> Hi All, >> > > > > >> > >> >> > > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work ! >> > > > > >> > >> >> > > > > >> > >> I would like to call a vote to merge YARN-1197 container >> > resize >> > > > > into >> > > > > >> > trunk. >> > > > > >> > >> >> > > > > >> > >> Key idea: >> > > > > >> > >> This feature adds the ability for AM to change container >> > > resource >> > > > > >> size >> > > > > >> > at runtime. >> > > > > >> > >> >> > > > > >> > >> Details: >> > > > > >> > >> - This feature is tracked at >> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197 >> > > > > >> > >> - It’s currently developed at a separate branch: >> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197 >> > > > > >> > >> - A uber patch( >> > https://issues.apache.org/jira/browse/YARN-4157 >> > > ) >> > > > > >> > generated from YARN-1197 to run against trunk shows all unit >> > > tests >> > > > > have >> > > > > >> > passed. >> > > > > >> > >> - This feature now can work end-to-end. >> > > > > >> > >> - All the unresolved jiras under YARN-1197 will be the >> next >> > > > step. >> > > > > >> > >> >> > > > > >> > >> Thanks, >> > > > > >> > >> Wangda Tan & Meng Ding & Jian He >> > > > > >> > > >> > > > > >> > > >> > > > > >> > >> > > > > >> > >> > > > > >> >> > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> > >