Re: [DISCUSS] Merging YARN-5355 (Timeline Service v.2) to trunk

2017-08-21 Thread Sangjin Lee
Thanks Rohith for that update! Sounds good. Let us know when that testing
is complete too.

On Sat, Aug 19, 2017 at 3:34 AM, Rohith Sharma K S <
rohithsharm...@apache.org> wrote:

> Hi Sangjin
>
> Thanks for bringing up this point.
> We did a similar exercise with the current YARN-5355 branch today. Tests were
> validated against the default configuration and timeline server v.1.5 as well.
>
> All major YARN features, such as RM HA/Restart/work-preserving-restart, NM
> restart, scheduling, and sample MR/distributed shell job runs, were validated.
> Everything looks fine.
>
> This testing will be repeated once the YARN-5355 branch code is completely
> frozen, in a couple of days.
>
> Thanks & Regards
> Rohith Sharma K S
>
> On 18 August 2017 at 22:56, Sangjin Lee  wrote:
>
> > Kudos to Vrushali and the team for getting ready for this large and
> > important feature! I know a huge team effort went into this. I look
> > forward to seeing this merged.
> >
> > I'd like to ask for one piece of due diligence. Could you please inspect
> > rigorously to ensure that, when disabled, Timeline Service v.2 does not
> > impact other features in any way? We did a similar exercise when we had
> > the first drop, and it would be good to repeat that... Thanks!
> >
> > Sangjin
> >
> > On Wed, Aug 16, 2017 at 11:44 AM, Andrew Wang 
> > wrote:
> >
> > > Great, thanks Vrushali! Sounds good to me.
> > >
> > > I have a few procedural release notes comments I'll put on YARN-5355,
> > > to make sure we advertise this to our users appropriately.
> > >
> > > On Wed, Aug 16, 2017 at 11:32 AM, Vrushali Channapattan <
> > > vrushal...@gmail.com> wrote:
> > >
> > > > Hi Andrew,
> > > >
> > > > Thanks for your response!
> > > >
> > > > There have been no changes to existing APIs since alpha1.
> > > >
> > > > We at Twitter have tested the feature to demonstrate it works at what
> > > > we consider moderate scale, but this did not include the security-related
> > > > testing. The security testing is in progress at present by the Timeline
> > > > Service V2 team in the community and we think we will have more details
> > > > on this very soon.
> > > >
> > > > About the jiras under YARN-5355: Only 3 of those sub-tasks are what
> > > > we think of as "merge-blockers". The issues being targeted for merge
> > > > are in [link1] below. There are about 59 jiras of which 56 are completed.
> > > >
> > > > We plan to make a new umbrella jira after the merge to trunk. We will
> > > > then create a new branch with the new jira name and move these open
> > > > jiras under YARN-5355 as subtasks of that new umbrella jira.
> > > >
> > > > thanks
> > > > Vrushali
> > > > [link1] https://issues.apache.org/jira/projects/YARN/versions/12337991
> > > >
> > > >
> > > > On Wed, Aug 16, 2017 at 10:47 AM, Andrew Wang <
> > > > andrew.w...@cloudera.com> wrote:
> > > >
> > > >> Hi Vrushali,
> > > >>
> > > >> Glad to hear this major dev milestone is nearing completion!
> > > >>
> > > >> Repeating my request on other merge [DISCUSS] threads, could you
> > > >> comment on testing and API stability of this merge? Our timeline for
> > > >> beta1 is about a month out, so there's not much time to fix things
> > > >> beforehand.
> > > >>
> > > >> Looking at YARN-5355 there are also many unresolved subtasks. Should
> > > >> most of these be moved out to a new umbrella? I'm wondering what needs
> > > >> to be completed before sending the merge vote.
> > > >>
> > > >> Given that TSv2 is committed for 3.0.0 GA, I'm more willing to flex
> > > >> the beta1 release date for this feature than others. Hopefully that
> > > >> won't be necessary though :)
> > > >>
> > > >> Best,
> > > >> Andrew
> > > >>
> > > >> On Wed, Aug 16, 2017 at 10:26 AM, Vrushali Channapattan <
> > > >> vrushalic2...@gmail.com> wrote:
> > > >>
> > > >>> Looks like some of the hyperlinks appear messed up, my apologies,
> > > >>> resending the same email with hopefully better looking content:
> > > >>>
> > > >>> Hi All,
> > > >>>
> > > >>> I'd like to open a discussion for merging Timeline Service v2
> > > >>> (YARN-5355) to trunk in a few weeks.
> > > >>>
> > > >>> We have previously completed one merge onto trunk [1] and Timeline
> > > >>> Service v2 has been part of Hadoop release 3.0.0-alpha1.
> > > >>>
> > > >>> Since then, we have been working on extending the capabilities of
> > > >>> Timeline Service v2 in a feature branch [2]. There are a few related
> > > >>> issues pending that are being actively worked upon and tested. As soon
> > > >>> as they are resolved, we plan on starting a merge vote within the next
> > > >>> two weeks. The goal is to get this into hadoop3 beta.
> > > >>>
> > > >>> We have paid close attention to ensure that Timeline Service v2
> > > >>> does not impact existing functionality when disabled (by default).
> > > >>>
> > > 

Re: Branch merges and 3.0.0-beta1 scope

2017-08-21 Thread Wangda Tan
Andrew,

Thanks for your help in pushing this release.

Echoing what Vinod said, all contributors in these branches have put months
to years of time into working on these features. We don't have to decide
which features to exclude now, since we have 25 days until the planned
3.0-beta1 release.

The best approach to stabilizing a feature is to let people try it, instead
of waiting for the feature to become perfect. For features which can be
turned off, I think we should consider bringing them in if they are
end-to-end ready. I will try my best to help the effort to merge the
YARN-3926 branch to trunk before Sep 15, and I'm OK with moving to the next
release train if we fail to merge the feature before the release date.

Thanks,
Wangda


On Mon, Aug 21, 2017 at 2:22 PM, Vinod Kumar Vavilapalli  wrote:

> Steve,
>
> You can be strict & ruthless about the timelines. Anything that doesn’t
> get in by mid-September, as was originally planned, can move to the next
> release - whether it is feature work on branches or feature work on trunk.
>
> The problem I see here is that code & branches being worked on for a year
> are now (apparently) close to being done and we are telling them to hold
> for 7 more months - this is not a reasonable ask.
>
> If you are advocating for a 3.1 plan, I’m sure one of these branch
> ‘owners’ can volunteer. But this is how you get competing releases and
> split bandwidth.
>
> As for compatibility / testing etc, it seems like there is a belief that
> the current ‘scoped’ features are all tested well in these areas and so
> adding more is going to hurt the release. There is no way this is the
> reality: trunk has so many features that have been landing for years. The
> only way we can collectively work towards making this stable is by
> getting as many parties together as possible, each verifying stuff that
> they need, not by excluding specific features.
>
> +Vinod
>
> > This is one of those curse-of-cadence things: The higher your release
> > cadence, the less pressure to get "everything in". With a slower cadence,
> > more pressure to get stuff in, more pressure to hold up the release, slows
> > the cadence, gets even more stuff in, etc. etc.
> >
> > - Andrew has been working on the release for months, we all need to
> > appreciate how much hard work that is and has been, especially for what is
> > going to be a major release.
> >
> > - We know that things will be unstable in 3.0; Andrew's concern is about
> > making sure that the newest, unstablest (?) features can at least be
> > bypassed if there are problems. I think we should also call out in the
> > release notes what we think are the unstable bits where people need to use
> > caution (example: S3Guard in "authoritative" mode)
> >
> > - Anything related to wire compatibility has been problematic in the
> > past; I think it's essential that whatever packets get sent around are
> > going to be stable, so changes there need to be in, or at least the
> > payloads set up ready for the features. Same for new public APIs.
> >
> > - As for the rest, I don't know. I think we should be strict about it and
> > ruthless in assessing the feature's stability & consequences of postponing
> > the feature until a Hadoop 3.1 release in Jan/Feb, with a plan to ship then
> > and follow up with a 3.2 in the summer.
> >
> > Then: start planning that 3.1 release. Maybe I should put my hand up as
> > release manager for that one. Then everyone would realise how amenable
> > Andrew is being today.
> >
> >
> > One other thing: alongside the big branches, there's the eternal backlog
> > of small patches. We should organise spending a few days updating,
> > reviewing & merging them in
> >
> > -Steve
> >
> >
> > -
> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> 
>


Re: Branch merges and 3.0.0-beta1 scope

2017-08-21 Thread Vinod Kumar Vavilapalli
Steve,

You can be strict & ruthless about the timelines. Anything that doesn’t get in 
by mid-September, as was originally planned, can move to the next release - 
whether it is feature work on branches or feature work on trunk.

The problem I see here is that code & branches being worked on for a year are 
now (apparently) close to being done and we are telling them to hold for 7 more 
months - this is not a reasonable ask.

If you are advocating for a 3.1 plan, I’m sure one of these branch ‘owners’ can 
volunteer. But this is how you get competing releases and split bandwidth.

As for compatibility / testing etc, it seems like there is a belief that the
current ‘scoped’ features are all tested well in these areas and so adding more
is going to hurt the release. There is no way this is the reality: trunk has so
many features that have been landing for years. The only way we can
collectively work towards making this stable is by getting as many parties
together as possible, each verifying stuff that they need, not by excluding
specific features.

+Vinod

> This is one of those curse-of-cadence things: The higher your release 
> cadence, the less pressure to get "everything in". With a slower cadence, 
> more pressure to get stuff in, more pressure to hold up the release, slows 
> the cadence, gets even more stuff in, etc. etc.
> 
> - Andrew has been working on the release for months, we all need to 
> appreciate how much hard work that is and has been, especially for what is 
> going to be a major release.
> 
> - We know that things will be unstable in 3.0; Andrew's concern is about 
> making sure that the newest, unstablest (?) features can at least be bypassed 
> if there are problems. I think we should also call out in the release notes what we
> think are the unstable bits where people need to use caution (example: 
> S3Guard in "authoritative" mode)
> 
> - Anything related to wire compatibility has been problematic in the past; I 
> think it's essential that whatever packets get sent around are going to be 
> stable, so changes there need to be in, or at least the payloads set up ready 
> for the features. Same for new public APIs.
> 
> - As for the rest, I don't know. I think we should be strict about it and ruthless
> in assessing the feature's stability & consequences of postponing the feature 
> until a Hadoop 3.1 release in Jan/Feb, with a plan to ship then and follow up 
> with a 3.2 in the summer.
> 
> Then: start planning that 3.1 release. Maybe I should put my hand up as 
> release manager for that one. Then everyone would realise how amenable Andrew 
> is being today.
> 
> 
> One other thing: alongside the big branches, there's the eternal backlog of 
> small patches. We should organise spending a few days updating, reviewing & 
> merging them in
> 
> -Steve
> 
> 