Re: [DISCUSS] Making 2.10 the last minor 2.x release

2020-04-16 Thread Jonathan Hung
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung  wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka  wrote:
>
>> Hi folks,
>>
>> I am still seeing changes being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena  wrote:
>>
>>> Hi Jim,
>>> Thanks for catching this. I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.bren...@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka 
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena 
>>>>> wrote:
>>>>>
>>>>> > Hi Akira,
>>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the
>>>>> > INFRA people closed it as "won't do". And yes, the branch is
>>>>> > protected, so we can't delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka 
>>>>> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>>> > remove it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung 
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no
>>>>> > longer exists, please don't try to commit to it)
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Deleted old branch-2.10
>>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung wrote:
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>>>> > <shv.had...@gmail.com> wrote:
>>>>> >
>>>>> >

Re: [DISCUSS] Making 2.10 the last minor 2.x release

2020-04-16 Thread Jonathan Hung
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka  wrote:

> Hi folks,
>
> I am still seeing changes being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena  wrote:
>
>> Hi Jim,
>> Thanks for catching this. I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan 
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka 
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena 
>>>> wrote:
>>>>
>>>> > Hi Akira,
>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the
>>>> > INFRA people closed it as "won't do". And yes, the branch is
>>>> > protected, so we can't delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka 
>>>> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>> > remove it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung 
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1
>>>> > -> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>> > please don't try to commit to it)
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung wrote:
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>>> > <shv.had...@gmail.com> wrote:
>>>> >
>>>> > Hey guys,
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>> > from 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>> > minor 2.x release" confused people.
>>>> > It is in fact a wider matter that can be di

Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-12-09 Thread Jonathan Hung
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
don't try to commit to it)

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Deleted old branch-2.10
   - Renamed branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
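
For readers who want to see the git side of the procedure above, here is a minimal sketch. It is not the set of commands that was actually run: the thread does not record them, and deleting a protected ASF branch went through INFRA. The helper name `rename_branch_commands`, the remote name `origin`, and the commit message are assumptions for illustration. The function only builds the command list and runs nothing.

```python
from typing import List


def rename_branch_commands(old: str, new: str, version: str,
                           remote: str = "origin") -> List[List[str]]:
    """Build (but do not run) commands that replace branch `new` with `old`.

    Hypothetical helper: the remote name and the exact sequence are
    assumptions. It also assumes no local branch named `new` exists.
    """
    return [
        # Delete the old release branch on the remote.
        ["git", "push", remote, "--delete", new],
        # Check out the branch being renamed, then rename it locally.
        ["git", "checkout", old],
        ["git", "branch", "-m", old, new],
        # Set the Maven version on the renamed branch and commit it.
        ["mvn", "versions:set", f"-DnewVersion={version}"],
        ["git", "commit", "-a", "-m", f"Set version to {version}"],
        # Publish the renamed branch, then remove the old name remotely.
        ["git", "push", remote, new],
        ["git", "push", remote, "--delete", old],
    ]


if __name__ == "__main__":
    for cmd in rename_branch_commands("branch-2", "branch-2.10",
                                      "2.10.1-SNAPSHOT"):
        print(" ".join(cmd))
```

Printing the commands first, rather than executing them, makes it easy to review a destructive sequence like this before anyone runs it.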


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung  wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko 
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects to removing branch-2.10 please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung 
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang  wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software would become more compatible when more companies stress test
>>>> the same software and make improvements in trunk.  Some may be extra
>>>> cautious about moving up the version because of internal obligations to
>>>> keep things running.  Company obligations should not be the driving
>>>> force to maintain Hadoop branches.  There is no proper collaboration in
>>>> the community when every name-brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be healthier for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This
>>>> will give more confidence to move up the version than trying to fix n
>>>> permutations of breakage, like the Flash fixing the timeline.
>>>>
>>>> The Apache license states that there is no warranty of any kind for code
>>>> contributions.  Fewer community release processes should improve
>>>> software quality when eyes are on trunk, and help steer toward the same
>>>> end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>  wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a point/dot/maintenance (I'll use "point" from here on out)
>>>>> release? I have looked around and I can't find anything other than
>>>>> some compatibility documentation in 2.x that has since been removed in
>>>>> 3.x [1] [2]. I think this would help shape my opinion on whether or
>>>>> not to keep branch-2 alive.
>>>>> My current understanding is that we can't really break compatibility i

Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-12-04 Thread Jonathan Hung
FYI, starting the rename process, beginning with INFRA-19521.

Jonathan Hung


On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko 
wrote:

> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion, which
> is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects to removing branch-2.10 please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung 
> wrote:
>
>> Thanks for the detailed thoughts, everyone.
>>
>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>> releases. As for putting features into minor/patch releases, if we keep the
>> convention of putting new features only into minor releases, my assumption
>> is still that it's unlikely people will want to get them into branch-2
>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>> even really removed support for java 7 in branch-2 (much less java 8), so I
>> feel moving to java 11 would go along with a move to branch 3. And as you
>> mentioned, if people really want to use java 11 on branch-2, we can always
>> revive branch-2. But for now I think the convenience of not needing to port
>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>> potentially needing to revive branch-2.
>>
>> Jonathan Hung
>>
>>
>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang  wrote:
>>
>>> +1 for 2.10.x as last release for 2.x version.
>>>
>>> Software would become more compatible when more companies stress test
>>> the same software and make improvements in trunk.  Some may be extra
>>> cautious about moving up the version because of internal obligations to
>>> keep things running.  Company obligations should not be the driving
>>> force to maintain Hadoop branches.  There is no proper collaboration in
>>> the community when every name-brand company maintains its own Hadoop 2.x
>>> version.  I think it would be healthier for the community to reduce the
>>> branch forking and spend energy on trunk to harden the software.  This
>>> will give more confidence to move up the version than trying to fix n
>>> permutations of breakage, like the Flash fixing the timeline.
>>>
>>> The Apache license states that there is no warranty of any kind for code
>>> contributions.  Fewer community release processes should improve software
>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>
>>> regards,
>>> Eric
>>>
>>>
>>>
>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>  wrote:
>>>
>>>> Hello all,
>>>>
>>>> Is it written anywhere what the difference is between a minor release
>>>> and a point/dot/maintenance (I'll use "point" from here on out)
>>>> release? I have looked around and I can't find anything other than some
>>>> compatibility documentation in 2.x that has since been removed in 3.x
>>>> [1] [2]. I think this would help shape my opinion on whether or not to
>>>> keep branch-2 alive.
>>>> My current understanding is that we can't really break compatibility in
>>>> either a minor or point release. But the only mention of the difference
>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>> and Unstable tags, and how to deal with changing default configuration
>>>> values. So it seems like there really isn't a big official difference
>>>> between the two. In my mind, the functional difference between the two
>>>> is that the minor releases may have added features and rewrites, while
>>>> the point releases only have bug fixes. This might be an incorrect
>>>> understanding, but that's what I have gathered from watching the
>>>> releases over the last few years. Whether or not this is a correct
>>>> understanding, I think that this
>>>> needs to be documented somewhere, even if it is just a convent

Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-11-21 Thread Jonathan Hung
Thanks for the detailed thoughts, everyone.

Eric (Badger), my understanding is the same as yours re. minor vs patch
releases. As for putting features into minor/patch releases, if we keep the
convention of putting new features only into minor releases, my assumption
is still that it's unlikely people will want to get them into branch-2
(based on the 2.10.0 release process). For the java 11 issue, we haven't
even really removed support for java 7 in branch-2 (much less java 8), so I
feel moving to java 11 would go along with a move to branch 3. And as you
mentioned, if people really want to use java 11 on branch-2, we can always
revive branch-2. But for now I think the convenience of not needing to port
to both branch-2 and branch-2.10 (and below) outweighs the cost of
potentially needing to revive branch-2.
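
To make the minor-versus-patch convention concrete, here is a small sketch that classifies a new version relative to the previous one. It reflects only the convention discussed in this thread (new features go into minor releases, bug fixes into patch releases), not an official Hadoop policy, and the function names are made up for illustration.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'major.minor.patch', ignoring a suffix such as '-SNAPSHOT'."""
    core = version.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def release_kind(previous: str, new: str) -> str:
    """Classify `new` relative to `previous` as 'major', 'minor' or 'patch'."""
    old_v, new_v = parse_version(previous), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not newer than {previous}")
    if new_v[0] != old_v[0]:
        return "major"
    if new_v[1] != old_v[1]:
        return "minor"
    return "patch"


if __name__ == "__main__":
    # What branch-2 used to target versus what branch-2.10 targets.
    print(release_kind("2.10.0", "2.11.0-SNAPSHOT"))
    print(release_kind("2.10.0", "2.10.1-SNAPSHOT"))
```

Under this reading, keeping only branch-2.10 means the 2.x line continues to receive patch releases, while a future minor release would require reviving branch-2.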

Jonathan Hung


On Wed, Nov 20, 2019 at 10:50 AM Eric Yang  wrote:

> +1 for 2.10.x as last release for 2.x version.
>
> Software would become more compatible when more companies stress test the
> same software and make improvements in trunk.  Some may be extra cautious
> about moving up the version because of internal obligations to keep things
> running.  Company obligations should not be the driving force to maintain
> Hadoop branches.  There is no proper collaboration in the community when
> every name-brand company maintains its own Hadoop 2.x version.  I think it
> would be healthier for the community to reduce the branch forking and
> spend energy on trunk to harden the software.  This will give more
> confidence to move up the version than trying to fix n permutations of
> breakage, like the Flash fixing the timeline.
>
> The Apache license states that there is no warranty of any kind for code
> contributions.  Fewer community release processes should improve software
> quality when eyes are on trunk, and help steer toward the same end goals.
>
> regards,
> Eric
>
>
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>  wrote:
>
>> Hello all,
>>
>> Is it written anywhere what the difference is between a minor release and a
>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>> looked around and I can't find anything other than some compatibility
>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>> this would help shape my opinion on whether or not to keep branch-2 alive.
>> My current understanding is that we can't really break compatibility in
>> either a minor or point release. But the only mention of the difference
>> between minor and point releases is how to deal with Stable, Evolving, and
>> Unstable tags, and how to deal with changing default configuration values.
>> So it seems like there really isn't a big official difference between the
>> two. In my mind, the functional difference between the two is that the
>> minor releases may have added features and rewrites, while the point
>> releases only have bug fixes. This might be an incorrect understanding, but
>> that's what I have gathered from watching the releases over the last few
>> years. Whether or not this is a correct understanding, I think that this
>> needs to be documented somewhere, even if it is just a convention.
>>
>> Given my assumed understanding of minor vs point releases, here are the
>> pros/cons that I can think of for having a branch-2. Please add on or
>> correct me for anything you feel is missing or inadequate.
>> Pros:
>> - Features/rewrites/higher-risk patches are less likely to be put into
>>   2.10.x
>> - It is less necessary to move to 3.x
>>
>> Cons:
>> - Bug fixes are less likely to be put into 2.10.x
>> - An extra branch to maintain
>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>     patches to if they should go all the way back to 2.10.x
>> - It is less necessary to move to 3.x
>>
>> So on the one hand you get added stability in fewer features being
>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>> committed. In a perfect world, we wouldn't have to make this tradeoff. But
>> we don't live in a perfect world and committers will make mistakes either
>> because of lack of knowledge or simply because they made a mistake. If we
>> have a branch-2, committers will forget, not know to, or choose not to (for
>> whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>> we don't have a branch-2, committers who want their borderline risky
>> feature in the 2.x line will err on the side of putting it into branch-2.10
>> instead of proposing the creation of a branch-2. Cle

Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-11-18 Thread Jonathan Hung
Thanks Eric for the comments - regarding your concerns, I feel the pros
outweigh the cons. To me, the chances of patch releases on 2.10.x are much
higher than a new 2.11 minor release. (There didn't seem to be many people
outside of our company who expressed interest in getting new features to
branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the 2.10.0
release, there are 29 patches that have gone into branch-2 and 9 in
branch-2.10, so the two branches have already diverged quite a bit.
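
One way to measure this kind of divergence is sketched below. It is an illustration only; the thread does not say how the 29 and 9 were counted. `git rev-list --left-right --count A...B` prints two numbers: commits reachable only from A and commits reachable only from B. Note that a fix cherry-picked to both branches has different commit ids, so it is counted on both sides. The function names are made up.

```python
import subprocess
from typing import Tuple


def parse_left_right_count(output: str) -> Tuple[int, int]:
    """Parse the 'LEFT<tab>RIGHT' output of git rev-list --left-right --count."""
    left, right = output.split()
    return int(left), int(right)


def branch_divergence(a: str, b: str, repo: str = ".") -> Tuple[int, int]:
    """Return (commits only in `a`, commits only in `b`) for a local clone."""
    result = subprocess.run(
        ["git", "rev-list", "--left-right", "--count", f"{a}...{b}"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return parse_left_right_count(result.stdout)


if __name__ == "__main__":
    # Requires a clone in which both branches exist locally.
    only_a, only_b = branch_divergence("branch-2", "branch-2.10")
    print(f"only in branch-2: {only_a}, only in branch-2.10: {only_b}")
```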

In any case, we can always reverse this decision if we really need to, by
recreating branch-2. But this proposal would reduce a lot of confusion IMO.

Jonathan Hung


On Fri, Nov 15, 2019 at 11:41 AM epa...@apache.org 
wrote:

> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
>
>  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> jyhung2...@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there are many
> fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to
> branch-2.10.
>
> To do this, I propose we:
>
>   - Delete branch-2.10
>   - Rename branch-2 to branch-2.10
>   - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>


Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-11-14 Thread Jonathan Hung
Some other additional items we would need:

   - Change all fix-versions in YARN/HDFS/MAPREDUCE/HADOOP from 2.11.0 to
   2.10.1
   - Remove 2.11.0 as a version in these projects


Jonathan Hung


On Thu, Nov 14, 2019 at 6:51 PM Jonathan Hung  wrote:

> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a
> bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> minor release line in branch-2. Currently, the main issue is that there are
> many fixes going into branch-2 (the theoretical 2.11.0) that are not going
> into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to branch-2.10.
>
> To do this, I propose we:
>
>    - Delete branch-2.10
>    - Rename branch-2 to branch-2.10
>    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>


[DISCUSS] Making 2.10 the last minor 2.x release

2019-11-14 Thread Jonathan Hung
Hi folks,

Given the release of 2.10.0, and the fact that it's intended to be a bridge
release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
release line in branch-2. Currently, the main issue is that there are many
fixes going into branch-2 (the theoretical 2.11.0) that are not going into
branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
likely never see the light of day unless they are backported to branch-2.10.

To do this, I propose we:

   - Delete branch-2.10
   - Rename branch-2 to branch-2.10
   - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT

This way we get all the current branch-2 fixes into the 2.10.x release
line. Then the commit chain will look like: trunk -> branch-3.2 ->
branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8

Thoughts?

Jonathan Hung

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
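
As an illustration of what the proposed commit chain means for committers, here is a small sketch that lists the branches a fix has to be committed to so that it reaches a given release line. The chain is the one proposed in this message; the helper name `branches_for_fix` is made up for the example.

```python
from typing import List

# Commit chain proposed in this message, newest first.
COMMIT_CHAIN = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2.10", "branch-2.9", "branch-2.8",
]


def branches_for_fix(oldest: str, chain: List[str] = COMMIT_CHAIN) -> List[str]:
    """Return every branch from the head of the chain down to `oldest`."""
    if oldest not in chain:
        raise ValueError(f"{oldest} is not in the commit chain")
    return chain[: chain.index(oldest) + 1]


if __name__ == "__main__":
    print(branches_for_fix("branch-2.10"))
    # For comparison: the same fix with branch-2 still in the chain.
    old_chain = COMMIT_CHAIN[:3] + ["branch-2"] + COMMIT_CHAIN[3:]
    print(branches_for_fix("branch-2.10", old_chain))
```

With the proposed chain, a fix that should reach the 2.10.x line goes to four branches. With branch-2 still in the chain it would go to five.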


[ANNOUNCE] Apache Hadoop 2.10.0 release

2019-10-31 Thread Jonathan Hung
Hi all,

I am happy to announce that Apache Hadoop 2.10.0 has been released.

Apache Hadoop 2.10.0 is the first release in the Apache Hadoop 2.10 line.
The release details, including links to downloads, list of major features,
release notes, and changelog, are on the 2.10.0 announcement page [1]. You
can also download the release from the Downloads page [2].

- Major features: https://hadoop.apache.org/docs/r2.10.0/index.html
- Release notes:
http://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/release/2.10.0/RELEASENOTES.2.10.0.html
- Changelog:
http://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/release/2.10.0/CHANGES.2.10.0.html

Thanks!

[1] https://hadoop.apache.org/release/2.10.0.html
[2] https://hadoop.apache.org/releases.html

Jonathan


Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-29 Thread Jonathan Hung
+1 from me too. The vote passed, so I'll continue with the rest of the
release.

Thanks everyone!

Jonathan Hung


On Tue, Oct 29, 2019 at 1:40 PM Giovanni Matteo Fumarola <
giovanni.fumar...@gmail.com> wrote:

> +1 (non-binding).
>
> - Built from source on Ubuntu with OpenJDK 11.0.3
> - Verified signatures
> - Verified documentation
> - Set up a single-node cluster and ran basic yarn commands
> - Ran UTs for Yarn Router, Yarn Common, Yarn API, YARN NM and YARN RM.
>
> Thanks for putting this together, Jonathan.
>
> On Tue, Oct 29, 2019 at 8:47 AM Dinesh Chitlangia
>  wrote:
>
>> +1 (non-binding)
>>
>> - Verified signatures
>> - Verified documentation
>> - Built from sources on CentOS 7
>> - Tested with basic hdfs commands on a single node setup.
>>
>> Thanks for organizing the release, Jonathan.
>>
>> -Dinesh
>>
>>
>>
>> On Tue, Oct 29, 2019 at 9:45 AM epa...@apache.org 
>> wrote:
>>
>> > Compatibility testing has gone well for me.
>> >
>> > - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5
>> >   and 2.10.0
>> > - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0
>> >   and trunk
>> > - With one 4-node cluster running 2.10.0 and one 4-node cluster running
>> >   trunk, I ran a word count job in each cluster whose inputs and outputs
>> >   were from and to the opposite cluster.
>> > - I verified that HDFS replication works as expected in a trunk cluster
>> >   that has one 2.10.0 datanode.
>> >
>> >  Thanks,
>> > -Eric
>> >
>> >
>> > > On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung
>> > > <jyhung2...@gmail.com> wrote:
>> > > Hi folks,
>> > >
>> > > This is the second release candidate for the first release of Apache
>> > > Hadoop 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It
>> > > includes features such as:
>> > >
>> > > - User-defined resource types
>> > > - Native GPU support as a schedulable resource type
>> > > - Consistent reads from standby node
>> > > - Namenode port based selective encryption
>> > > - Improvements related to rolling upgrade support from 2.x to 3.x
>> > > - Cost based fair call queue
>> > >
>> > > The RC1 artifacts are at:
>> > > http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
>> > >
>> > > RC tag is release-2.10.0-RC1.
>> > >
>> > > The maven artifacts are hosted here:
>> > > https://repository.apache.org/content/repositories/orgapachehadoop-1243/
>> > >
>> > > My public key is available here:
>> > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> > >
>> > > The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm
>> > > PDT.
>> > >
>> > > Thanks,
>> > > Jonathan Hung
>> >
>> >
>> > -
>> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>> >
>> >
>>
>


Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-28 Thread Jonathan Hung
Thanks Eric! I sent out an RC1 earlier last week, not sure if you saw that.
The only diff between RC1 and RC0 is HDFS-14667. If RC1 looks good to you
then it'd be great to get your testing results on that thread.

Jonathan Hung


On Mon, Oct 28, 2019 at 1:06 PM epa...@apache.org  wrote:

> Compatibility testing has gone well for me.
>
> - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and
> 2.10.0
> - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and
> trunk
> - With one 4-node cluster running 2.10.0 and one 4-node cluster running
> trunk, I ran a word count job in each cluster whose inputs and outputs were
> from and to the opposite cluster.
> - I verified that HDFS replication works as expected in a trunk cluster
> that has one 2.10.0 datanode.
>
> Thanks,
> -Eric
>
> On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung <
> jyhung2...@gmail.com> wrote:
>
> Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and
> 2.10.0 clients and datanodes. Everything worked as expected.
>
> Jonathan Hung
>
>
> On Tue, Oct 22, 2019 at 3:04 PM Eric Badger 
> wrote:
>
> > Hi Jonathan,
> >
> > Thanks for putting this RC together. You stated that there are
> > improvements related to rolling upgrades from 2.x to 3.x and I know I
> > have seen multiple JIRAs getting committed to that effect. Could you
> > describe any tests that you have done to verify rolling upgrade
> > compatibility for 3.x servers talking to 2.x clients and vice versa?
> >
> > Thanks,
> >
> > Eric
> >
> > On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung 
> > wrote:
> >
> >> Thanks Konstantin and Zhankun. Unfortunately a feature slipped under our
> >> radar (HDFS-14667). Since this is the first release of a minor line, we
> >> would like to get it into 2.10.0.
> >>
> >> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1
> >> shortly.
> >>
> >> Jonathan Hung
> >>
> >>
> >> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang  wrote:
> >>
> >> > Thanks for the effort, Jonathan!
> >> >
> >> > +1 (non-binding) on RC0.
> >> >  - Set up a single node cluster with the binary tarball
> >> >  - Run Spark Pi and pySpark job
> >> >
> >> > BR,
> >> > Zhankun
> >> >
> >> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko <shv.had...@gmail.com>
> >> > wrote:
> >> >
> >> >> +1 on RC0.
> >> >> - Verified signatures
> >> >> - Built from sources
> >> >> - Ran unit tests for new features
> >> >> - Checked artifacts on Nexus, made sure the sources are present.
> >> >>
> >> >> Thanks
> >> >> --Konstantin
> >> >>
> >> >>
> >> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung 
> >> >> wrote:
> >> >>
> >> >> > Hi folks,
> >> >> >
> >> >> > This is the first release candidate for the first release of Apache
> >> >> > Hadoop 2.10 line. It contains 361 fixes/improvements since 2.9 [1].
> >> >> > It includes features such as:
> >> >> >
> >> >> > - User-defined resource types
> >> >> > - Native GPU support as a schedulable resource type
> >> >> > - Consistent reads from standby node
> >> >> > - Namenode port based selective encryption
> >> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
> >> >> >
> >> >> > The RC0 artifacts are at:
> >> >> > http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
> >> >> >
> >> >> > RC tag is release-2.10.0-RC0.
> >> >> >
> >> >> > The maven artifacts are hosted here:
> >> >> > https://repository.apache.org/content/repositories/orgapachehadoop-1241/
> >> >> >
> >> >> > My public key is available here:
> >> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >> >> >
> >> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at
> >> >> > 6:00 pm PDT.
> >> >> >
> >> >> > Thanks,
> >> >> > Jonathan Hung
> >> >> >
> >> >> > [1]
> >> >> > https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
> >> >> >
> >> >>
> >> >
> >>
> >
>
>
>


Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-26 Thread Jonathan Hung
Hi Eric, I took a quick look. Are you using
mapreduce.application.framework.path to run your MR jobs? If not, this
seems like expected behavior when the AM and tasks get launched on different
NMs with different locally installed Hadoop versions.
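
To illustrate the point above, here is a minimal sketch (a toy model, not
Hadoop's actual RPC code) of why mixed local installs can produce the
"client = 19, server = 21" error, and why a single shared framework archive
avoids it. The node names and the launch_task helper are hypothetical; the
version numbers 19 and 21 are taken from the log in the quoted message.

```python
class VersionMismatch(Exception):
    """Toy stand-in for the RPC$VersionMismatch error in the quoted log."""

    def __init__(self, protocol, client_version, server_version):
        super().__init__(
            f"Protocol {protocol} version mismatch. "
            f"(client = {client_version}, server = {server_version})"
        )


# Hypothetical two-node cluster: the umbilical protocol version spoken by
# the locally installed Hadoop on each NM (19 and 21 come from the log).
LOCAL_UMBILICAL_VERSION = {"nm-trunk": 21, "nm-2.10": 19}

PROTOCOL = "org.apache.hadoop.mapred.TaskUmbilicalProtocol"


def launch_task(am_node, task_node, framework_version=None):
    """Return (client, server) versions, or raise if they disagree.

    framework_version models one shared framework archive: when set, both
    the AM and the task use it instead of whatever is installed locally.
    """
    if framework_version is not None:
        server = client = framework_version
    else:
        server = LOCAL_UMBILICAL_VERSION[am_node]    # AM serves the protocol
        client = LOCAL_UMBILICAL_VERSION[task_node]  # task calls back to AM
    if client != server:
        raise VersionMismatch(PROTOCOL, client, server)
    return client, server
```

With the AM on "nm-trunk" and a task on "nm-2.10" the call raises; passing
framework_version=19 makes both sides agree regardless of the node.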

Jonathan Hung


On Sat, Oct 26, 2019 at 8:55 AM epa...@apache.org  wrote:

> I ran a few compatibility tests between 2.10.0 and 3.3.0 (trunk)
>
> Unfortunately, I ran into the following problem:
>
> Running with 2.10 RM and 3.3.0 (trunk) NM fails attempts with the
> following error:
>
> 2019-10-26 15:44:06,885 WARN [main] org.apache.hadoop.mapred.YarnChild:
> Exception running child :
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RPC$VersionMismatch):
> Protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol version mismatch.
> (client = 19, server = 21)
>
> The AM happened to launch on the 3.3.0 node.
>
> Is this a protobuf issue? I thought we addressed that?
>
> -Eric Payne
>
>
>
> On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung <
> jyhung2...@gmail.com> wrote:
>
>
>
>
>
> Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and
> 2.10.0 clients and datanodes. Everything worked as expected.
>
> Jonathan Hung
>
>
> On Tue, Oct 22, 2019 at 3:04 PM Eric Badger 
> wrote:
>
> > Hi Jonathan,
> >
> > Thanks for putting this RC together. You stated that there are
> > improvements related to rolling upgrades from 2.x to 3.x and I know I
> have
> > seen multiple JIRAs getting committed to that effect. Could you describe
> > any tests that you have done to verify rolling upgrade compatibility
> > for 3.x servers talking to 2.x clients and vice versa?
> >
> > Thanks,
> >
> > Eric
> >
> > On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung 
> > wrote:
> >
> >> Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar
> >> (HDFS-14667). Since this is the first of a minor release, we would like
> to
> >> get it into 2.10.0.
> >>
> >> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1
> >> shortly.
> >>
> >> Jonathan Hung
> >>
> >>
> >> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang  wrote:
> >>
> >> > Thanks for the effort, Jonathan!
> >> >
> >> > +1 (non-binding) on RC0.
> >> >  - Set up a single node cluster with the binary tarball
> >> >  - Run Spark Pi and pySpark job
> >> >
> >> > BR,
> >> > Zhankun
> >> >
> >> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko <
> shv.had...@gmail.com
> >> >
> >> > wrote:
> >> >
> >> >> +1 on RC0.
> >> >> - Verified signatures
> >> >> - Built from sources
> >> >> - Ran unit tests for new features
> >> >> - Checked artifacts on Nexus, made sure the sources are present.
> >> >>
> >> >> Thanks
> >> >> --Konstantin
> >> >>
> >> >>
> >> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung 
> >> >> wrote:
> >> >>
> >> >> > Hi folks,
> >> >> >
> >> >> > This is the first release candidate for the first release of Apache
> >> >> Hadoop
> >> >> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It
> >> includes
> >> >> > features such as:
> >> >> >
> >> >> > - User-defined resource types
> >> >> > - Native GPU support as a schedulable resource type
> >> >> > - Consistent reads from standby node
> >> >> > - Namenode port based selective encryption
> >> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
> >> >> >
> >> >> > The RC0 artifacts are at:
> >> >> http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
> >> >> >
> >> >> > RC tag is release-2.10.0-RC0.
> >> >> >
> >> >> > The maven artifacts are hosted here:
> >> >> >
> >> >>
> >>
> https://repository.apache.org/content/repositories/orgapachehadoop-1241/
> >> >> >
> >> >> > My public key is available here:
> >> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >> >> >
> >> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at
> >> 6:00 pm
> >> >> > PDT.
> >> >> >
> >> >> > Thanks,
> >> >> > Jonathan Hung
> >> >> >
> >> >> > [1]
> >> >> >
> >> >> >
> >> >>
> >>
> https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
> >> >> >
> >> >>
> >> >
> >>
> >
>


Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-25 Thread Jonathan Hung
Some more thoughts: for the javadoc issue, I think we can just support
building on Java 7.

For the release notes issue, I can work with the authors of the major
features to come up with release notes and update them before pushing them
to the site. The release notes in the published artifacts won't be up to
date, but I think that's fine.

I'll go ahead with this plan if there are no objections.

Jonathan Hung


On Fri, Oct 25, 2019 at 12:19 PM Jonathan Hung  wrote:

> Thanks for looking Erik.
>
> For the release notes, yeah I think it's because there's no release notes
> for the corresponding JIRAs. I've added details for these features to the
> index.md.vm file which should show up on the homepage for 2.10.0 (e.g.
> https://hadoop.apache.org/docs/r2.9.0/index.html). We could add release
> notes for these JIRAs, but that would require recreating the tar.gzs since
> the release notes are bundled in there.
>
> For the javadoc issue, I was able to repro this issue, seems it's because
> the org.apache.hadoop.yarn.client.ClientRMProxy import was removed in
> FederationProxyProviderUtil in YARN-7900 in branch-2 (but not in other
> branches). But it's referenced in javadocs in this file so it throws this
> error. Re-adding this import and building with java 8 allows it to succeed.
>
> I checked javadoc html for FederationProxyProviderUtil in the produced
> artifacts and it appears to be correct.
>
> I think we could easily overwrite the current RC1 artifacts with ones
> containing proper release notes. Not sure what to do about the javadoc
> issue though, that would require overwriting the release-2.10.0-RC1 tag
> which I don't want to do. What do others think?
>
> Jonathan Hung
>
>
> On Fri, Oct 25, 2019 at 9:21 AM Erik Krogen  wrote:
>
>> Thanks for putting this together, Jonathan!
>>
>> I noticed that the RELEASENOTES.md makes no mention of any of the major
>> features you mentioned in your email about the RC. Is this expected? I
>> guess it is caused by the lack of a release note on the JIRAs for those
>> features.
>>
>> I also noticed that building a distribution package (mvn -DskipTests
>> package -Pdist) fails on Java 8 (1.8.0_172) with a bunch of Javadoc errors.
>> It works fine on Java 7. Is this expected?
>>
>> Other verifications I performed:
>>
>> - Verified all signatures in RC1
>> - Verified all checksums in RC1
>> - Visually inspected contents of src tarball
>> - Built from source on Mac OSX 10.14.6 and RHEL7 (Java 8)
>>     - mvn -DskipTests package
>> - Visually inspected contents of binary tarball
>>
>> Thanks,
>> Erik
>>
>> --
>> *From:* Konstantin Shvachko 
>> *Sent:* Wednesday, October 23, 2019 6:10 PM
>> *To:* Jonathan Hung 
>> *Cc:* Hdfs-dev ; mapreduce-dev <
>> mapreduce-dev@hadoop.apache.org>; yarn-dev ;
>> Hadoop Common 
>> *Subject:* Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
>>
>> +1 on RC1
>>
>> - Verified signatures
>> - Verified maven artifacts on Nexus for sources
>> - Checked rat reports
>> - Checked documentation
>> - Checked packaging contents
>> - Built from sources on RHEL 7 box
>> - Ran unit tests for new HDFS features with Java 8
>>
>> Thanks,
>> --Konstantin
>>
>> On Tue, Oct 22, 2019 at 2:55 PM Jonathan Hung 
>> wrote:
>>
>> > Hi folks,
>> >
>> > This is the second release candidate for the first release of Apache
>> Hadoop
>> > 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
>> > features such as:
>> >
>> > - User-defined resource types
>> > - Native GPU support as a schedulable resource type
>> > - Consistent reads from standby node
>> > - Namenode port based selective encryption
>> > - Improvements related to rolling upgrade support from 2.x to 3.x
>> > - Cost based fair call queue
>> >
>> > The RC1 artifacts are at:
>> http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
>> >
>> > RC tag is release-2.10.0-RC1.
>> >
>> > The maven artifacts are hosted here:
>> >
>> https://repository.apache.org/content/repositories/orgapachehadoop-1243/

Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-25 Thread Jonathan Hung
Thanks for looking Erik.

For the release notes, yeah, I think it's because there are no release notes
for the corresponding JIRAs. I've added details for these features to the
index.md.vm file which should show up on the homepage for 2.10.0 (e.g.
https://hadoop.apache.org/docs/r2.9.0/index.html). We could add release
notes for these JIRAs, but that would require recreating the tar.gzs since
the release notes are bundled in there.

For the javadoc issue, I was able to repro it. It seems to be because
the org.apache.hadoop.yarn.client.ClientRMProxy import was removed in
FederationProxyProviderUtil in YARN-7900 in branch-2 (but not in other
branches). It is still referenced in javadocs in this file, so the build
throws this error. Re-adding the import lets the Java 8 build succeed.

I checked javadoc html for FederationProxyProviderUtil in the produced
artifacts and it appears to be correct.

I think we could easily overwrite the current RC1 artifacts with ones
containing proper release notes. Not sure what to do about the javadoc
issue though, that would require overwriting the release-2.10.0-RC1 tag
which I don't want to do. What do others think?

Jonathan Hung


On Fri, Oct 25, 2019 at 9:21 AM Erik Krogen  wrote:

> Thanks for putting this together, Jonathan!
>
> I noticed that the RELEASENOTES.md makes no mention of any of the major
> features you mentioned in your email about the RC. Is this expected? I
> guess it is caused by the lack of a release note on the JIRAs for those
> features.
>
> I also noticed that building a distribution package (mvn -DskipTests
> package -Pdist) fails on Java 8 (1.8.0_172) with a bunch of Javadoc errors.
> It works fine on Java 7. Is this expected?
>
> Other verifications I performed:
>
> - Verified all signatures in RC1
> - Verified all checksums in RC1
> - Visually inspected contents of src tarball
> - Built from source on Mac OSX 10.14.6 and RHEL7 (Java 8)
>     - mvn -DskipTests package
> - Visually inspected contents of binary tarball
>
> Thanks,
> Erik
>
> ------
> *From:* Konstantin Shvachko 
> *Sent:* Wednesday, October 23, 2019 6:10 PM
> *To:* Jonathan Hung 
> *Cc:* Hdfs-dev ; mapreduce-dev <
> mapreduce-dev@hadoop.apache.org>; yarn-dev ;
> Hadoop Common 
> *Subject:* Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
>
> +1 on RC1
>
> - Verified signatures
> - Verified maven artifacts on Nexus for sources
> - Checked rat reports
> - Checked documentation
> - Checked packaging contents
> - Built from sources on RHEL 7 box
> - Ran unit tests for new HDFS features with Java 8
>
> Thanks,
> --Konstantin
>
> On Tue, Oct 22, 2019 at 2:55 PM Jonathan Hung 
> wrote:
>
> > Hi folks,
> >
> > This is the second release candidate for the first release of Apache
> Hadoop
> > 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
> > features such as:
> >
> > - User-defined resource types
> > - Native GPU support as a schedulable resource type
> > - Consistent reads from standby node
> > - Namenode port based selective encryption
> > - Improvements related to rolling upgrade support from 2.x to 3.x
> > - Cost based fair call queue
> >
> > The RC1 artifacts are at:
> http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
> >
> > RC tag is release-2.10.0-RC1.
> >
> > The maven artifacts are hosted here:
> >
> https://repository.apache.org/content/repositories/orgapachehadoop-1243/
> >
> > My public key is available here:
> >
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm
> PDT.
> >
> > Thanks,
> > Jonathan Hung
> >
> > [1]
> >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.a

Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-23 Thread Jonathan Hung
Hi Eric, thanks for trying it out. We talked about this in today's YARN
community sync-up; summarizing here for everyone else:

I don't think it's worth delaying the 2.10.0 release further; we can
address this in a subsequent 2.10.x release. Wangda mentioned it might be
related to changes in the dominant resource calculator, but the root cause
remains to be seen.

Jonathan Hung


On Wed, Oct 23, 2019 at 9:02 AM epa...@apache.org  wrote:

> Hi Jonathan,
>
> Thanks very much for all of your work on this release.
>
> I have a concern about cross-queue (inter-queue) preemption in 2.10.
>
> In 2.8, on a 6 node pseudo-cluster, preempting from one queue to meet the
> needs of another queue seems to work as expected. However, in 2.10 on the
> same pseudo-cluster (with the same config properties), only one container
> was preempted for the AM and then nothing else.
>
> I don't know how the community feels about holding up the 2.10.0 release
> for this issue, but we need to get to the bottom of this before we can go
> to 2.10.x. I am still investigating.
>
> Thanks,
> -Eric
>
>
>
>
>  On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung <
> jyhung2...@gmail.com> wrote:
> > Hi folks,
> >
> > This is the second release candidate for the first release of Apache
> Hadoop
> > 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
> > features such as:
> >
> > - User-defined resource types
> > - Native GPU support as a schedulable resource type
> > - Consistent reads from standby node
> > - Namenode port based selective encryption
> > - Improvements related to rolling upgrade support from 2.x to 3.x
> > - Cost based fair call queue
> >
> > The RC1 artifacts are at:
> http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
> >
> > RC tag is release-2.10.0-RC1.
> >
> > The maven artifacts are hosted here:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1243/
> >
> > My public key is available here:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm
> PDT.
> >
> > Thanks,
> > Jonathan Hung
>


Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-22 Thread Jonathan Hung
Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and
2.10.0 clients and datanodes. Everything worked as expected.

Jonathan Hung


On Tue, Oct 22, 2019 at 3:04 PM Eric Badger 
wrote:

> Hi Jonathan,
>
> Thanks for putting this RC together. You stated that there are
> improvements related to rolling upgrades from 2.x to 3.x and I know I have
> seen multiple JIRAs getting committed to that effect. Could you describe
> any tests that you have done to verify rolling upgrade compatibility
> for 3.x servers talking to 2.x clients and vice versa?
>
> Thanks,
>
> Eric
>
> On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung 
> wrote:
>
>> Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar
>> (HDFS-14667). Since this is the first of a minor release, we would like to
>> get it into 2.10.0.
>>
>> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1
>> shortly.
>>
>> Jonathan Hung
>>
>>
>> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang  wrote:
>>
>> > Thanks for the effort, Jonathan!
>> >
>> > +1 (non-binding) on RC0.
>> >  - Set up a single node cluster with the binary tarball
>> >  - Run Spark Pi and pySpark job
>> >
>> > BR,
>> > Zhankun
>> >
>> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko
>> > wrote:
>> >
>> >> +1 on RC0.
>> >> - Verified signatures
>> >> - Built from sources
>> >> - Ran unit tests for new features
>> >> - Checked artifacts on Nexus, made sure the sources are present.
>> >>
>> >> Thanks
>> >> --Konstantin
>> >>
>> >>
>> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung 
>> >> wrote:
>> >>
>> >> > Hi folks,
>> >> >
>> >> > This is the first release candidate for the first release of Apache
>> >> Hadoop
>> >> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It
>> includes
>> >> > features such as:
>> >> >
>> >> > - User-defined resource types
>> >> > - Native GPU support as a schedulable resource type
>> >> > - Consistent reads from standby node
>> >> > - Namenode port based selective encryption
>> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
>> >> >
>> >> > The RC0 artifacts are at:
>> >> http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
>> >> >
>> >> > RC tag is release-2.10.0-RC0.
>> >> >
>> >> > The maven artifacts are hosted here:
>> >> >
>> >>
>> https://repository.apache.org/content/repositories/orgapachehadoop-1241/
>> >> >
>> >> > My public key is available here:
>> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> >> >
>> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at
>> 6:00 pm
>> >> > PDT.
>> >> >
>> >> > Thanks,
>> >> > Jonathan Hung
>> >> >
>> >> > [1]
>> >> >
>> >> >
>> >>
>> https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
>> >> >
>> >>
>> >
>>
>


[VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-22 Thread Jonathan Hung
Hi folks,

This is the second release candidate for the first release of Apache Hadoop
2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
features such as:

- User-defined resource types
- Native GPU support as a schedulable resource type
- Consistent reads from standby node
- Namenode port based selective encryption
- Improvements related to rolling upgrade support from 2.x to 3.x
- Cost based fair call queue

The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/

RC tag is release-2.10.0-RC1.

The maven artifacts are hosted here:
https://repository.apache.org/content/repositories/orgapachehadoop-1243/

My public key is available here:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.

Thanks,
Jonathan Hung

[1]
https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
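
For readers who want to see what the filter in [1] actually selects, here is
a small sketch that decodes the JQL from the link using only the Python
standard library. The helper name extract_jql is just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The [1] link from the vote email, copied verbatim.
FILTER_URL = (
    "https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20"
    "YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND"
    "%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C"
    "%202.9.1%2C%202.9.0)"
)


def extract_jql(url):
    """Return the percent-decoded value of the 'jql' query parameter."""
    query = urlsplit(url).query
    return parse_qs(query)["jql"][0]


if __name__ == "__main__":
    print(extract_jql(FILTER_URL))
```

Decoded, the query reads: project in (HDFS, YARN, HADOOP, MAPREDUCE) AND
resolution = Fixed AND fixVersion = 2.10.0 AND fixVersion not in (2.9.2,
2.9.1, 2.9.0). In other words, it counts fixes that landed in 2.10.0 but
were not already shipped in any 2.9.x release.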


Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-22 Thread Jonathan Hung
Thanks Konstantin and Zhankun. Unfortunately a feature slipped under our
radar (HDFS-14667). Since this is the first release of a minor line, we
would like to get it into 2.10.0.

HDFS-14667 has been committed to branch-2.10.0, and I will be rolling an RC1
shortly.

Jonathan Hung


On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang  wrote:

> Thanks for the effort, Jonathan!
>
> +1 (non-binding) on RC0.
>  - Set up a single node cluster with the binary tarball
>  - Run Spark Pi and pySpark job
>
> BR,
> Zhankun
>
> On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko 
> wrote:
>
>> +1 on RC0.
>> - Verified signatures
>> - Built from sources
>> - Ran unit tests for new features
>> - Checked artifacts on Nexus, made sure the sources are present.
>>
>> Thanks
>> --Konstantin
>>
>>
>> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung 
>> wrote:
>>
>> > Hi folks,
>> >
>> > This is the first release candidate for the first release of Apache
>> Hadoop
>> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It includes
>> > features such as:
>> >
>> > - User-defined resource types
>> > - Native GPU support as a schedulable resource type
>> > - Consistent reads from standby node
>> > - Namenode port based selective encryption
>> > - Improvements related to rolling upgrade support from 2.x to 3.x
>> >
>> > The RC0 artifacts are at:
>> http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
>> >
>> > RC tag is release-2.10.0-RC0.
>> >
>> > The maven artifacts are hosted here:
>> >
>> https://repository.apache.org/content/repositories/orgapachehadoop-1241/
>> >
>> > My public key is available here:
>> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> >
>> > The vote will run for 5 weekdays, until Wednesday, October 23 at 6:00 pm
>> > PDT.
>> >
>> > Thanks,
>> > Jonathan Hung
>> >
>> > [1]
>> >
>> >
>> https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
>> >
>>
>


[VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-16 Thread Jonathan Hung
Hi folks,

This is the first release candidate for the first release of Apache Hadoop
2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It includes
features such as:

- User-defined resource types
- Native GPU support as a schedulable resource type
- Consistent reads from standby node
- Namenode port based selective encryption
- Improvements related to rolling upgrade support from 2.x to 3.x

The RC0 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC0/

RC tag is release-2.10.0-RC0.

The maven artifacts are hosted here:
https://repository.apache.org/content/repositories/orgapachehadoop-1241/

My public key is available here:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

The vote will run for 5 weekdays, until Wednesday, October 23 at 6:00 pm
PDT.

Thanks,
Jonathan Hung

[1]
https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
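
As a quick check of the "5 weekdays" voting window, here is a short sketch
that counts forward from the send date while skipping Saturdays and Sundays.
It assumes the thread dates shown in the archive (2019-10-16 for RC0 and
2019-10-22 for RC1) and ignores holidays and the time of day; add_weekdays is
a hypothetical helper name.

```python
from datetime import date, timedelta


def add_weekdays(start, n):
    """Return the date n weekdays (Mon-Fri) after start."""
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


if __name__ == "__main__":
    for label, sent in (("RC0", date(2019, 10, 16)), ("RC1", date(2019, 10, 22))):
        print(label, "vote closes on", add_weekdays(sent, 5).isoformat())
```

This gives October 23 for RC0 and October 29 for RC1, matching the deadlines
stated in the two vote emails.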


Re: [DISCUSS] Hadoop 2.10.0 release plan

2019-10-16 Thread Jonathan Hung
I've moved all jiras with target version 2.10.0 to 2.10.1. I've also
created branch-2.10 and branch-2.10.0; please commit any 2.10.x bug fixes
to branch-2.10.

I'll send out a vote thread for 2.10.0-RC0 shortly.

Jonathan Hung


On Fri, Oct 11, 2019 at 10:32 AM Jonathan Hung  wrote:

> Edit: seems a 2.10.0 blocker was reopened (HDFS-14305). I'll continue
> watching this jira and start the release once this is resolved.
>
> Jonathan Hung
>
>
> On Thu, Oct 10, 2019 at 5:13 PM Jonathan Hung 
> wrote:
>
>> Hi folks, as of now all 2.10.0 blockers have been resolved [1]. So I'll
>> start the release process soon (cutting branches, updating target versions,
>> etc).
>>
>> [1] https://issues.apache.org/jira/issues/?filter=12346975
>>
>> Jonathan Hung
>>
>>
>> On Mon, Aug 26, 2019 at 10:19 AM Jonathan Hung 
>> wrote:
>>
>>> Hi folks,
>>>
>>> As discussed previously (e.g. [1], [2]) we'd like to do a 2.10.0 release
>>> soon. Some features/big-items we're targeting for this release:
>>>
>>>- YARN resource types/GPU support (YARN-8200
>>><https://issues.apache.org/jira/browse/YARN-8200>)
>>>- Selective wire encryption (HDFS-13541
>>><https://issues.apache.org/jira/browse/HDFS-13541>)
>>>- Rolling upgrade support from 2.x to 3.x (e.g. HDFS-14509
>>><https://issues.apache.org/jira/browse/HDFS-14509>)
>>>
>>> Per [3] sounds like there's concern around upgrading dependencies as
>>> well.
>>>
>>> We created a public jira filter here (
>>> https://issues.apache.org/jira/issues/?filter=12346975) marking all
>>> blockers for 2.10.0 release. If you have other jiras that should be 2.10.0
>>> blockers, please mark "Target Version/s" as "2.10.0" and add label
>>> "release-blocker" so we can track it through this filter.
>>>
>>> We're targeting a release at end of September.
>>>
>>> Please share any thoughts you have about this. Thanks!
>>>
>>> [1]
>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29461.html
>>> [2]
>>> https://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg21293.html
>>> [3]
>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg33440.html
>>>
>>>
>>> Jonathan Hung
>>>
>>


Re: [DISCUSS] Hadoop 2.10.0 release plan

2019-10-11 Thread Jonathan Hung
Edit: seems a 2.10.0 blocker was reopened (HDFS-14305). I'll continue
watching this jira and start the release once this is resolved.

Jonathan Hung


On Thu, Oct 10, 2019 at 5:13 PM Jonathan Hung  wrote:

> Hi folks, as of now all 2.10.0 blockers have been resolved [1]. So I'll
> start the release process soon (cutting branches, updating target versions,
> etc).
>
> [1] https://issues.apache.org/jira/issues/?filter=12346975
>
> Jonathan Hung
>
>
> On Mon, Aug 26, 2019 at 10:19 AM Jonathan Hung 
> wrote:
>
>> Hi folks,
>>
>> As discussed previously (e.g. [1], [2]) we'd like to do a 2.10.0 release
>> soon. Some features/big-items we're targeting for this release:
>>
>>- YARN resource types/GPU support (YARN-8200
>><https://issues.apache.org/jira/browse/YARN-8200>)
>>- Selective wire encryption (HDFS-13541
>><https://issues.apache.org/jira/browse/HDFS-13541>)
>>- Rolling upgrade support from 2.x to 3.x (e.g. HDFS-14509
>><https://issues.apache.org/jira/browse/HDFS-14509>)
>>
>> Per [3] sounds like there's concern around upgrading dependencies as well.
>>
>> We created a public jira filter here (
>> https://issues.apache.org/jira/issues/?filter=12346975) marking all
>> blockers for 2.10.0 release. If you have other jiras that should be 2.10.0
>> blockers, please mark "Target Version/s" as "2.10.0" and add label
>> "release-blocker" so we can track it through this filter.
>>
>> We're targeting a release at end of September.
>>
>> Please share any thoughts you have about this. Thanks!
>>
>> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29461.html
>> [2]
>> https://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg21293.html
>> [3] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg33440.html
>>
>>
>> Jonathan Hung
>>
>


Re: [DISCUSS] Hadoop 2.10.0 release plan

2019-10-10 Thread Jonathan Hung
Hi folks, as of now all 2.10.0 blockers have been resolved [1]. So I'll
start the release process soon (cutting branches, updating target versions,
etc).

[1] https://issues.apache.org/jira/issues/?filter=12346975

Jonathan Hung


On Mon, Aug 26, 2019 at 10:19 AM Jonathan Hung  wrote:

> Hi folks,
>
> As discussed previously (e.g. [1], [2]) we'd like to do a 2.10.0 release
> soon. Some features/big-items we're targeting for this release:
>
>- YARN resource types/GPU support (YARN-8200
><https://issues.apache.org/jira/browse/YARN-8200>)
>- Selective wire encryption (HDFS-13541
><https://issues.apache.org/jira/browse/HDFS-13541>)
>- Rolling upgrade support from 2.x to 3.x (e.g. HDFS-14509
><https://issues.apache.org/jira/browse/HDFS-14509>)
>
> Per [3] sounds like there's concern around upgrading dependencies as well.
>
> We created a public jira filter here (
> https://issues.apache.org/jira/issues/?filter=12346975) marking all
> blockers for 2.10.0 release. If you have other jiras that should be 2.10.0
> blockers, please mark "Target Version/s" as "2.10.0" and add label
> "release-blocker" so we can track it through this filter.
>
> We're targeting a release at end of September.
>
> Please share any thoughts you have about this. Thanks!
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29461.html
> [2]
> https://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg21293.html
> [3] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg33440.html
>
>
> Jonathan Hung
>


Re: Incompatible changes between branch-2.8 and branch-2.9

2019-09-24 Thread Jonathan Hung
- I've created YARN-9855 and uploaded patches to fix YARN-6616 in
branch-2.8 and branch-2.7.
- For YARN-6050, I'm not sure either. Robert/Wangda, can you comment on
YARN-6050 compatibility?
- For YARN-7813, I'm not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy. It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13).
The difficulties I see are that we would have to leave id 12 unused in
2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
would still work IIUC.
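
To make the field-id reasoning above concrete, here is a toy sketch of
protobuf-style wire decoding. It is not the protobuf library and not the real
yarn_protos.proto: it handles only varint fields, assumes unpacked repeated
encoding, and the schemas, the id 5 field, and the name field_added_in_2_9
are hypothetical. Only intraQueuePreemptionDisabled and the ids 12 and 13
come from the discussion.

```python
def encode_varint(value):
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(field_id, value):
    """Encode one varint field: key is (field_id << 3) | wire type 0."""
    return encode_varint(field_id << 3) + encode_varint(value)


def decode_varint(buf, pos):
    """Decode a varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode(buf, schema):
    """Decode buf with schema {field_id: (name, 'optional' | 'repeated')}."""
    msg, pos = {}, 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        value, pos = decode_varint(buf, pos)
        field_id = key >> 3
        if field_id not in schema:
            continue  # unknown ids are skipped, not treated as errors
        name, label = schema[field_id]
        if label == "repeated":
            msg.setdefault(name, []).append(value)
        else:
            msg[name] = value  # optional: the last occurrence wins
    return msg


# Hypothetical schemas modelling the three release lines discussed above.
SCHEMA_2_8_4 = {12: ("intraQueuePreemptionDisabled", "optional")}
SCHEMA_2_8_6 = {13: ("intraQueuePreemptionDisabled", "optional")}  # id 12 unused
SCHEMA_2_9 = {
    12: ("field_added_in_2_9", "optional"),
    13: ("intraQueuePreemptionDisabled", "optional"),
}
```

In this model, a message written by 2.8.4 (id 12 set to 1) is misread by a
2.9 reader as field_added_in_2_9, while a 2.8.6 reader simply skips it and
falls back to the default. A message written by 2.8.6 (id 13) is read
correctly by 2.9. Separately, if a writer sends several values for an id that
the reader declares optional, the reader keeps only the last one, which is
the concern Eric raised for YARN-6050.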

Jonathan Hung


On Tue, Sep 24, 2019 at 2:52 PM Eric Badger 
wrote:

> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   - This seems reasonable. Since we haven't released anything, it should
> be no issue to change the 2.7/2.8 protobuf field to have the same value as
> 2.9+
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeated over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.
>
> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
> to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>   - I'm ok with this, but it should be noted that the upgrade from
> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> upgrade. So this would cause some pain to anybody with clusters on those
> versions.
>
> Eric
>
> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung 
> wrote:
>
>> Sorry, let me edit my first point. We can just create addendums for
>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>> correct id 28. We don’t need to revert YARN-6616 from these branches
>> completely.
>>
>> Jonathan
>>
>> 
>> From: Jonathan Hung 
>> Sent: Tuesday, September 24, 2019 11:38 AM
>> To: Eric Badger
>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>
>> Hi Eric, thanks for the investigation.
>>
>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>
>> Jonathan Hung
>>
>>
>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger wrote:
>> We (Verizon Media) are currently moving towards upgrading our clusters from
>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>> this process, we have found multiple incompatible changes in protobufs
>> between branch-2.8 and branch-2. These incompatibilities were all
>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
>> incompatibilities from 3 separate commits. All of the incompatibilities are
>> in yarn_protos.proto
>>
>>
>> I would like to discuss how to fix these incompatible changes. Otherwise,
>> rolling upgrades will not be supported between branch-2.8 (or below) and
>> branch-2.9 (or beyond). We could revert the incompatible changes, but then
>> the new releases would be incompatible with the releases that have these

Re: Incompatible changes between branch-2.8 and branch-2.9

2019-09-24 Thread Jonathan Hung
Sorry, let me edit my first point. We can just create addendums for YARN-6616 
in branch-2.7 and branch-2.8 to edit the submitTime field to the correct id 28. 
We don’t need to revert YARN-6616 from these branches completely.
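
To illustrate why the field id matters here, the following is a minimal sketch, assuming only the standard protobuf wire format, in which every field is prefixed by a varint key equal to (field_number << 3) | wire_type. It hand-encodes submitTime the way a writer built from the unfixed branch-2.8 schema would (id 26) and shows what a reader that expects it at id 28 finds. The helper names and the timestamp are illustrative and not taken from Hadoop or the protobuf library, and the sketch does not model what a real generated parser does with a field whose wire type it does not expect.

```python
# Protobuf wire-format sketch: key = (field_number << 3) | wire_type.
VARINT = 0  # int64, bool, ...
LEN = 2     # strings, bytes, embedded messages


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf, pos):
    """Return (value, next_pos) for the varint starting at buf[pos]."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def encode_int64_field(field_number, value):
    """Encode one int64 field: varint key followed by varint value."""
    return encode_varint((field_number << 3) | VARINT) + encode_varint(value)


def read_fields(buf):
    """Map field_number -> (wire_type, raw value); VARINT and LEN only."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == VARINT:
            value, pos = decode_varint(buf, pos)
        elif wire_type == LEN:
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError("wire type not modeled in this sketch")
        fields[number] = (wire_type, value)
    return fields


submit_time_ms = 1569340980000  # arbitrary example value

# Writer on the unfixed branch-2.8 schema: submitTime is field 26.
old_bytes = encode_int64_field(26, submit_time_ms)
seen = read_fields(old_bytes)

# Reader on the 2.9+ schema: submitTime is field 28, and field 26 is
# expected to be a length-delimited AppTimeoutsMapProto.
submit_time_as_read = seen.get(28, (VARINT, 0))[1]  # absent -> default 0
id26_wire_type = seen[26][0]                        # VARINT, not LEN

# With the addendum (submitTime written at id 28) the value is found.
fixed = read_fields(encode_int64_field(28, submit_time_ms))
```

Under these assumptions the 2.9+ reader finds nothing at id 28 and falls back to the default, while id 26 carries a varint where a length-delimited value is expected. Writing submitTime at id 28 on every branch removes the mismatch.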

Jonathan


From: Jonathan Hung 
Sent: Tuesday, September 24, 2019 11:38 AM
To: Eric Badger
Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
Subject: Re: Incompatible changes between branch-2.8 and branch-2.9

Hi Eric, thanks for the investigation.

  *   For YARN-6616, for branch-2.8 and below, it was only committed to 
2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can 
revert YARN-6616 from branch-2.7 and branch-2.8.
  *   For YARN-6050, there's a bit here: 
https://developers.google.com/protocol-buffers/docs/proto that says "optional 
is compatible with repeated", so I think we should be OK there.
  *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5 to 
a 2.9+ version will be an issue. One option could be to move the 
intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then 
users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to release 
this), then upgrade from 2.8.6 to 2.9+.

Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger  
wrote:
We (Verizon Media) are currently moving towards upgrading our clusters from
our internal fork of branch-2.8 to an internal fork of branch-2. During
this process, we have found multiple incompatible changes in protobufs
between branch-2.8 and branch-2. These incompatibilities were all
introduced between branch-2.8 and branch-2.9. I did a git diff over all
.proto files across the branch-2.8 and branch-2.9 and found 3 instances of
incompatibilities from 3 separate commits. All of the incompatibilities are
in yarn_protos.proto


I would like to discuss how to fix these incompatible changes. Otherwise,
rolling upgrades will not be supported between branch-2.8 (or below) and
branch-2.9 (or beyond). We could revert the incompatible changes, but then
the new releases would be incompatible with the releases that have these
incompatible changes. If we do nothing, then rolling upgrades won't work
between 2.8- and 2.9+.


Thanks,


Eric


---


git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')


https://issues.apache.org/jira/browse/YARN-6616

   - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch

@@ -211,7 +245,20 @@ message ApplicationReportProto {
   optional PriorityProto priority = 23;
   optional string appNodeLabelExpression = 24;
   optional string amNodeLabelExpression = 25;
-  optional int64 submitTime = 26;
+  repeated AppTimeoutsMapProto appTimeouts = 26;
+  optional int64 launchTime = 27;
+  optional int64 submitTime = 28;


https://issues.apache.org/jira/browse/YARN-6050

   - Trunk and branch-2 patches both change the protobuf type in the same
   way.

@@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
   optional LogAggregationContextProto log_aggregation_context = 14;
   optional ReservationIdProto reservation_id = 15;
   optional string node_label_expression = 16;
-  optional ResourceRequestProto am_container_resource_request = 17;
+  repeated ResourceRequestProto am_container_resource_request = 17;
+  repeated ApplicationTimeoutMapProto application_timeouts = 18;


https://issues.apache.org/jira/browse/YARN-7813

   - Trunk (applied through branch-3.1) and branch-3.0 (applied through
   branch-2.9) patches differ from branch-2.8 patch

@@ -425,7 +501,21 @@ message QueueInfoProto {
   optional string defaultNodeLabelExpression = 9;
   optional QueueStatisticsProto queueStatistics = 10;
   optional bool preemptionDisabled = 11;
-  optional bool intraQueuePreemptionDisabled = 12;
+  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
+  optional bool intraQueuePreemptionDisabled = 13;
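
The three findings above can be reproduced mechanically. The following is a rough sketch, assuming the field tables are transcribed by hand from the diffs in this message (only the changed fields are listed); the find_conflicts helper and its "id reused" / "label changed" labels are made up for illustration and are not part of any Hadoop or protobuf tooling.

```python
# Field tables transcribed from the diffs: number -> (label, type, name).
BRANCH_2_8 = {
    "ApplicationReportProto": {
        26: ("optional", "int64", "submitTime"),
    },
    "ApplicationSubmissionContextProto": {
        17: ("optional", "ResourceRequestProto",
             "am_container_resource_request"),
    },
    "QueueInfoProto": {
        12: ("optional", "bool", "intraQueuePreemptionDisabled"),
    },
}

BRANCH_2_9 = {
    "ApplicationReportProto": {
        26: ("repeated", "AppTimeoutsMapProto", "appTimeouts"),
        27: ("optional", "int64", "launchTime"),
        28: ("optional", "int64", "submitTime"),
    },
    "ApplicationSubmissionContextProto": {
        17: ("repeated", "ResourceRequestProto",
             "am_container_resource_request"),
        18: ("repeated", "ApplicationTimeoutMapProto",
             "application_timeouts"),
    },
    "QueueInfoProto": {
        12: ("repeated", "QueueConfigurationsMapProto",
             "queueConfigurationsMap"),
        13: ("optional", "bool", "intraQueuePreemptionDisabled"),
    },
}


def find_conflicts(old, new):
    """Flag field numbers whose definition differs between two schemas."""
    conflicts = []
    for message in sorted(old):
        for number, old_field in sorted(old[message].items()):
            new_field = new.get(message, {}).get(number)
            if new_field is None or new_field == old_field:
                continue  # removed or unchanged: not flagged by this sketch
            _, old_type, old_name = old_field
            _, new_type, new_name = new_field
            if (old_type, old_name) != (new_type, new_name):
                kind = "id reused"      # same number, different field
            else:
                kind = "label changed"  # e.g. optional -> repeated
            conflicts.append((message, number, kind, old_name, new_name))
    return conflicts


conflicts = find_conflicts(BRANCH_2_8, BRANCH_2_9)
for message, number, kind, old_name, new_name in conflicts:
    print(f"{message} field {number}: {kind} ({old_name} -> {new_name})")
```

With these tables the check reports one entry per JIRA: id 26 reused in ApplicationReportProto (YARN-6616), the label of id 17 changed in ApplicationSubmissionContextProto (YARN-6050), and id 12 reused in QueueInfoProto (YARN-7813).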


Re: Incompatible changes between branch-2.8 and branch-2.9

2019-09-24 Thread Jonathan Hung
Hi Eric, thanks for the investigation.

   - For YARN-6616, for branch-2.8 and below, it was only committed to
   2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
   revert YARN-6616 from branch-2.7 and branch-2.8.
   - For YARN-6050, there's a bit here:
   https://developers.google.com/protocol-buffers/docs/proto that says
   "optional is compatible with repeated", so I think we should be OK there.
   - For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
   to a 2.9+ version will be an issue. One option could be to move the
   intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
   users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
   release this), then upgrade from 2.8.6 to 2.9+.
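
On the YARN-6050 point, the following is a small sketch of what "optional is compatible with repeated" means on the wire, assuming the behaviour described in the protobuf encoding guide: when a singular embedded-message field appears more than once, the parser merges the occurrences, so later scalar values replace earlier ones. The inner field numbers 1 and 2 are hypothetical and do not reflect the real ResourceRequestProto layout, and only scalar inner fields are modeled.

```python
VARINT, LEN = 0, 2  # the two protobuf wire types this sketch handles


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf, pos):
    """Return (value, next_pos) for the varint starting at buf[pos]."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def varint_field(number, value):
    return encode_varint((number << 3) | VARINT) + encode_varint(value)


def message_field(number, payload):
    key = encode_varint((number << 3) | LEN)
    return key + encode_varint(len(payload)) + payload


def iter_fields(buf):
    """Yield (field_number, wire_type, raw value) for each field in buf."""
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == VARINT:
            value, pos = decode_varint(buf, pos)
        elif wire_type == LEN:
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError("wire type not modeled in this sketch")
        yield number, wire_type, value


def read_as_repeated(buf, number):
    """2.9+ style reader: every occurrence becomes a list element."""
    return [dict((n, v) for n, _, v in iter_fields(payload))
            for f, wt, payload in iter_fields(buf)
            if f == number and wt == LEN]


def read_as_optional(buf, number):
    """2.8 style reader: occurrences are merged into one message."""
    merged = {}
    for f, wt, payload in iter_fields(buf):
        if f == number and wt == LEN:
            for n, _, v in iter_fields(payload):
                merged[n] = v  # a later scalar replaces an earlier one
    return merged


# A 2.9+ client sends two AM container resource requests in field 17.
first = varint_field(1, 10) + varint_field(2, 4)
second = varint_field(1, 20)
wire = message_field(17, first) + message_field(17, second)

as_repeated = read_as_repeated(wire, 17)
as_optional = read_as_optional(wire, 17)
```

In this model the old reader neither fails nor sees both requests: it ends up with a single message in which the second request's values override the first and any field set only in the first carries over. Whether that is acceptable for YARN-6050 depends on how the ResourceManager uses the field.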


Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger
 wrote:

> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> ---
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>    optional PriorityProto priority = 23;
>    optional string appNodeLabelExpression = 24;
>    optional string amNodeLabelExpression = 25;
> -  optional int64 submitTime = 26;
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
> +  optional int64 launchTime = 27;
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>    optional LogAggregationContextProto log_aggregation_context = 14;
>    optional ReservationIdProto reservation_id = 15;
>    optional string node_label_expression = 16;
> -  optional ResourceRequestProto am_container_resource_request = 17;
> +  repeated ResourceRequestProto am_container_resource_request = 17;
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>    optional string defaultNodeLabelExpression = 9;
>    optional QueueStatisticsProto queueStatistics = 10;
>    optional bool preemptionDisabled = 11;
> -  optional bool intraQueuePreemptionDisabled = 12;
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
> +  optional bool intraQueuePreemptionDisabled = 13;
>


Re: [VOTE] Merge YARN-8200 to branch-2 and branch-3.0

2019-08-29 Thread Jonathan Hung
Thanks all, +1 from me too.

There's three binding +1, two non-binding +1, and no -1 so I'll merge
YARN-8200 to branch-2 shortly. I'll skip branch-3.0 since it's EOL as
others have mentioned.

Jonathan Hung


On Tue, Aug 27, 2019 at 11:49 AM Konstantin Shvachko 
wrote:

> +1 for the merge.
>
> We probably should not bother with branch-3.0 merge since it's been voted
> EOL.
>
> Thanks,
> --Konstantin
>
> On Thu, Aug 22, 2019 at 4:43 PM Jonathan Hung 
> wrote:
>
>> Hi folks,
>>
>> As per [1], starting a vote to merge YARN-8200 (and YARN-8200.branch3)
>> feature branch to branch-2 (and branch-3.0).
>>
>> Vote runs for 7 days, to Thursday, Aug 29 5:00PM PDT.
>>
>> Thanks.
>>
>> [1]
>>
>> http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201908.mbox/%3cCAHzWLgcX7f5Tr3q=csrqgysvpdf7mh-iu17femgx89dhr+1...@mail.gmail.com%3e
>>
>> Jonathan Hung
>>
>


[DISCUSS] Hadoop 2.10.0 release plan

2019-08-26 Thread Jonathan Hung
Hi folks,

As discussed previously (e.g. [1], [2]) we'd like to do a 2.10.0 release
soon. Some features/big-items we're targeting for this release:

   - YARN resource types/GPU support (YARN-8200
   <https://issues.apache.org/jira/browse/YARN-8200>)
   - Selective wire encryption (HDFS-13541
   <https://issues.apache.org/jira/browse/HDFS-13541>)
   - Rolling upgrade support from 2.x to 3.x (e.g. HDFS-14509
   <https://issues.apache.org/jira/browse/HDFS-14509>)

Per [3] sounds like there's concern around upgrading dependencies as well.

We created a public jira filter here (
https://issues.apache.org/jira/issues/?filter=12346975) marking all
blockers for 2.10.0 release. If you have other jiras that should be 2.10.0
blockers, please mark "Target Version/s" as "2.10.0" and add label
"release-blocker" so we can track it through this filter.

We're targeting a release at end of September.

Please share any thoughts you have about this. Thanks!

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29461.html
[2]
https://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg21293.html
[3] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg33440.html


Jonathan Hung


[VOTE] Merge YARN-8200 to branch-2 and branch-3.0

2019-08-22 Thread Jonathan Hung
Hi folks,

As per [1], starting a vote to merge YARN-8200 (and YARN-8200.branch3)
feature branch to branch-2 (and branch-3.0).

Vote runs for 7 days, to Thursday, Aug 29 5:00PM PDT.

Thanks.

[1]
http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201908.mbox/%3cCAHzWLgcX7f5Tr3q=csrqgysvpdf7mh-iu17femgx89dhr+1...@mail.gmail.com%3e

Jonathan Hung


Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

2019-08-21 Thread Jonathan Hung
Reviving this thread: we tested a YARN rolling upgrade (RU) from a cluster
running 2.7.4 to one running branch-2 + YARN-8200. We ran some simple MR/Spark
jobs concurrently with the RM/NM upgrades and did not see any issues.

If there are no other concerns, I'll continue with a vote.

Jonathan Hung


On Thu, Apr 18, 2019 at 5:12 PM Jonathan Hung  wrote:

> Sorry for the delay, had to deprioritize this. Hoping to get to this next
> week.
>
> Jonathan
>
> --
> *From:* Jim Brennan 
> *Sent:* Thursday, April 18, 2019 7:28 AM
> *To:* Jonathan Hung
> *Cc:* yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
> *Subject:* Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2
>
> Hi Jonathan,
>
>> Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an
>> issue, but we’ll try it out and report back.
>
>
> Any update on this?
> Jim
>
>
> On Wed, Apr 3, 2019 at 2:16 AM Jonathan Hung  wrote:
>
>> Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an
>> issue, but we’ll try it out and report back.
>>
>> Jonathan
>>
>> ------
>> *From:* Jim Brennan 
>> *Sent:* Tuesday, April 2, 2019 9:17 AM
>> *To:* Jonathan Hung
>> *Cc:* yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
>> *Subject:* Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2
>>
>> Thanks for working on this!
>> One concern for us is support for a rolling upgrade.  If we are running a
>> cluster based on branch-2.8, will we be able to do a rolling upgrade (no
>> cluster down-time) to a branch containing these changes?  Have you tested
>> rolling upgrades?
>>
>> Thanks.
>> Jim
>>
>> On Fri, Mar 29, 2019 at 2:14 PM Jonathan Hung 
>> wrote:
>>
>>> Hello devs,
>>>
>>> Starting a discuss thread to merge resource types/native GPU scheduling
>>> support to branch-3.0 and branch-2. The resource types work was done in
>>> trunk~branch-3.0 and GPU support done in trunk~branch-3.1, so the proposal
>>> is to merge GPU support into branch-3.0 and both resource types/GPU support
>>> to branch-2.
>>>
>>> Internally we've been running resource types/GPU support off a fork of
>>> branch-2.9.0 in a > 300 node GPU cluster for a few months which has worked
>>> well. Also for completeness we verified that everything going into branch-2
>>> also exists in branch-3.0.
>>>
>>> The specific list of patches to merge is in feature branch
>>> YARN-8200.branch3 (for branch-3.0) and feature branch YARN-8200 (for
>>> branch-2). Full patches containing the YARN-8200.branch3 -> branch-3.0 diff
>>> and YARN-8200 -> branch-2 diff have been posted to YARN-8200 jira.
>>>
>>> If there's no issues from the community I'll start a merge vote next week.
>>> Thanks.
>>>
>>> Jonathan Hung
>>>
>>


Re: [VOTE] Mark 2.6, 2.7, 3.0 release lines EOL

2019-08-20 Thread Jonathan Hung
+1. Thanks!

Jonathan Hung


On Tue, Aug 20, 2019 at 8:03 PM Wangda Tan  wrote:

> Hi all,
>
> This is a vote thread to mark the 2.7 and all earlier release lines, as well
> as 3.0, EOL. This is based on the discussion in [1].
>
> This discussion runs for 7 days and will conclude on Aug 28 Wed.
>
> Please feel free to share your thoughts.
>
> Thanks,
> Wangda
>
> [1]
>
> http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201908.mbox/%3cCAD++eC=ou-tit1faob-dbecqe6ht7ede7t1dyra2p1yinpe...@mail.gmail.com%3e
>


Re: [DISCUSS] Hadoop 2019 Release Planning

2019-08-12 Thread Jonathan Hung
Hi Wangda, Thanks for starting the discussion. We would also like to
release 2.10.0 which was discussed previously
<https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html> and
at various contributor meetups. I'm interested in being release manager for
that.

Thanks,

Jonathan Hung


On Fri, Aug 9, 2019 at 7:59 PM Wangda Tan  wrote:

> Hi all,
>
> Hope this email finds you well
>
> I want to hear your thoughts about what should be the release plan for
> 2019.
>
> In 2018, we released:
> - 1 maintenance release of 2.6
> - 3 maintenance releases of 2.7
> - 3 maintenance releases of 2.8
> - 3 releases of 2.9
> - 4 releases of 3.0
> - 2 releases of 3.1
>
> Total 16 releases in 2018.
>
> In 2019, so far we have had only two releases:
> - 1 maintenance release of 3.1
> - 1 minor release of 3.2.
>
> However, the community has put a lot of effort into stabilizing features of
> various release branches.
> There are:
> - 217 fixed patches in 3.1.3 [1]
> - 388 fixed patches in 3.2.1 [2]
> - 1172 fixed patches in 3.3.0 [3] (OMG!)
>
> I think it is time to do maintenance releases of 3.1/3.2 and do a minor
> release for 3.3.0.
>
> In addition, I saw community discussion to do a 2.8.6 release for security
> fixes.
>
> Any other releases? I think there're release plans for Ozone as well. And
> please add your thoughts.
>
> Volunteers welcome! If you are interested in running a release as Release
> Manager (or co-Release Manager), please respond to this email thread so we
> can coordinate.
>
> Thanks,
> Wangda Tan
>
> [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND
> fixVersion = 3.1.3
> [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND
> fixVersion = 3.2.1
> [3] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND
> fixVersion = 3.3.0
>


Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

2019-04-18 Thread Jonathan Hung
Sorry for the delay, had to deprioritize this. Hoping to get to this next week.

Jonathan


From: Jim Brennan 
Sent: Thursday, April 18, 2019 7:28 AM
To: Jonathan Hung
Cc: yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

Hi Jonathan,

> Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an
> issue, but we’ll try it out and report back.

Any update on this?
Jim


On Wed, Apr 3, 2019 at 2:16 AM Jonathan Hung wrote:
Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an 
issue, but we’ll try it out and report back.

Jonathan


From: Jim Brennan
Sent: Tuesday, April 2, 2019 9:17 AM
To: Jonathan Hung
Cc: yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

Thanks for working on this!
One concern for us is support for a rolling upgrade.  If we are running a 
cluster based on branch-2.8, will we be able to do a rolling upgrade (no 
cluster down-time) to a branch containing these changes?  Have you tested 
rolling upgrades?

Thanks.
Jim

On Fri, Mar 29, 2019 at 2:14 PM Jonathan Hung wrote:
Hello devs,

Starting a discuss thread to merge resource types/native GPU scheduling
support to branch-3.0 and branch-2. The resource types work was done in
trunk~branch-3.0 and GPU support done in trunk~branch-3.1, so the proposal
is to merge GPU support into branch-3.0 and both resource types/GPU support
to branch-2.

Internally we've been running resource types/GPU support off a fork of
branch-2.9.0 in a > 300 node GPU cluster for a few months which has worked
well. Also for completeness we verified that everything going into branch-2
also exists in branch-3.0.

The specific list of patches to merge is in feature branch
YARN-8200.branch3 (for branch-3.0) and feature branch YARN-8200 (for
branch-2). Full patches containing the YARN-8200.branch3 -> branch-3.0 diff
and YARN-8200 -> branch-2 diff have been posted to YARN-8200 jira.

If there's no issues from the community I'll start a merge vote next week.
Thanks.

Jonathan Hung


Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

2019-04-03 Thread Jonathan Hung
Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an 
issue, but we’ll try it out and report back.

Jonathan


From: Jim Brennan 
Sent: Tuesday, April 2, 2019 9:17 AM
To: Jonathan Hung
Cc: yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

Thanks for working on this!
One concern for us is support for a rolling upgrade.  If we are running a 
cluster based on branch-2.8, will we be able to do a rolling upgrade (no 
cluster down-time) to a branch containing these changes?  Have you tested 
rolling upgrades?

Thanks.
Jim

On Fri, Mar 29, 2019 at 2:14 PM Jonathan Hung wrote:
Hello devs,

Starting a discuss thread to merge resource types/native GPU scheduling
support to branch-3.0 and branch-2. The resource types work was done in
trunk~branch-3.0 and GPU support done in trunk~branch-3.1, so the proposal
is to merge GPU support into branch-3.0 and both resource types/GPU support
to branch-2.

Internally we've been running resource types/GPU support off a fork of
branch-2.9.0 in a > 300 node GPU cluster for a few months which has worked
well. Also for completeness we verified that everything going into branch-2
also exists in branch-3.0.

The specific list of patches to merge is in feature branch
YARN-8200.branch3 (for branch-3.0) and feature branch YARN-8200 (for
branch-2). Full patches containing the YARN-8200.branch3 -> branch-3.0 diff
and YARN-8200 -> branch-2 diff have been posted to YARN-8200 jira.

If there's no issues from the community I'll start a merge vote next week.
Thanks.

Jonathan Hung


[DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

2019-03-29 Thread Jonathan Hung
Hello devs,

Starting a discuss thread to merge resource types/native GPU scheduling
support to branch-3.0 and branch-2. The resource types work was done in
trunk~branch-3.0 and GPU support done in trunk~branch-3.1, so the proposal
is to merge GPU support into branch-3.0 and both resource types/GPU support
to branch-2.

Internally we've been running resource types/GPU support off a fork of
branch-2.9.0 in a > 300 node GPU cluster for a few months which has worked
well. Also for completeness we verified that everything going into branch-2
also exists in branch-3.0.

The specific list of patches to merge is in feature branch
YARN-8200.branch3 (for branch-3.0) and feature branch YARN-8200 (for
branch-2). Full patches containing the YARN-8200.branch3 -> branch-3.0 diff
and YARN-8200 -> branch-2 diff have been posted to YARN-8200 jira.

If there's no issues from the community I'll start a merge vote next week.
Thanks.

Jonathan Hung


Re: [VOTE] Moving branch-2 precommit/nightly test builds to java 8

2019-02-07 Thread Jonathan Hung
My non-binding +1 to finish. This vote passes with 6 binding +1, 3
non-binding +1, and no vetoes. We will make the changes as part
of HADOOP-15711, please follow there.

Thanks all!

Jonathan Hung


On Tue, Feb 5, 2019 at 11:38 PM Akira Ajisaka  wrote:

> +1
>
> -Akira
>
> On Wed, Feb 6, 2019 at 9:13 AM Wangda Tan  wrote:
> >
> > +1, make sense to me.
> >
> > On Tue, Feb 5, 2019 at 3:29 PM Konstantin Shvachko wrote:
> >
> > > +1 Makes sense to me.
> > >
> > > Thanks,
> > > --Konst
> > >
> > > > On Mon, Feb 4, 2019 at 6:14 PM Jonathan Hung wrote:
> > >
> > > > Hello,
> > > >
> > > > Starting a vote based on the discuss thread [1] for moving branch-2
> > > > precommit/nightly test builds to openjdk8. After this change, the test
> > > > phase for precommit builds [2] and branch-2 nightly build [3] will run on
> > > > openjdk8. To maintain source compatibility, these builds will still run
> > > > their compile phase for branch-2 on openjdk7 as they do now (in addition to
> > > > compiling on openjdk8).
> > > >
> > > > Vote will run for three business days until Thursday Feb 7 6:00PM PDT.
> > > >
> > > > [1]
> > > > https://lists.apache.org/thread.html/7e6fb28fc67560f83a2eb62752df35a8d58d86b2a3df4cacb5d738ca@%3Ccommon-dev.hadoop.apache.org%3E
> > > >
> > > > [2]
> > > > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HADOOP-Build/
> > > > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HDFS-Build/
> > > > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/
> > > > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/
> > > >
> > > > [3]
> > > > https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/
> > > >
> > > > Jonathan Hung
> > > >
> > >
>


[VOTE] Moving branch-2 precommit/nightly test builds to java 8

2019-02-04 Thread Jonathan Hung
Hello,

Starting a vote based on the discuss thread [1] for moving branch-2
precommit/nightly test builds to openjdk8. After this change, the test
phase for precommit builds [2] and branch-2 nightly build [3] will run on
openjdk8. To maintain source compatibility, these builds will still run
their compile phase for branch-2 on openjdk7 as they do now (in addition to
compiling on openjdk8).

Vote will run for three business days until Thursday Feb 7 6:00PM PDT.

[1]
https://lists.apache.org/thread.html/7e6fb28fc67560f83a2eb62752df35a8d58d86b2a3df4cacb5d738ca@%3Ccommon-dev.hadoop.apache.org%3E

[2]
https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HADOOP-Build/
https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HDFS-Build/
https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/
https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/

[3]
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/

Jonathan Hung


Re: [VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-01 Thread Jonathan Hung
+1. Thanks Wangda.

Jonathan Hung


On Fri, Feb 1, 2019 at 2:25 PM Dinesh Chitlangia <
dchitlan...@hortonworks.com> wrote:

> +1 (non binding), thanks Wangda for organizing this.
>
> Regards,
> Dinesh
>
>
>
> On 2/1/19, 5:24 PM, "Wangda Tan"  wrote:
>
> Hi all,
>
> Based on the positive feedback in the thread [1], this is a vote thread to
> start a new subproject named "hadoop-submarine", which follows the release
> process already established for Ozone.
>
> The vote runs for the usual 7 days and ends on Feb 8th at 5 PM PDT.
>
> Thanks,
> Wangda Tan
>
> [1]
>
> https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
>
>
>


Re: [DISCUSS] Making submarine to different release model like Ozone

2019-01-31 Thread Jonathan Hung
+1. This is important for improving the deep learning on Hadoop story.
There has recently been a lot of momentum for this, and decoupling
submarine/hadoop will help it continue.

Jonathan Hung


On Thu, Jan 31, 2019 at 11:04 AM Wangda Tan  wrote:

> Hi devs,
>
> Since we started the Submarine-related effort last year, we have received a
> lot of feedback. Several companies (such as Netease, China Mobile, etc.) are
> trying to deploy Submarine to their Hadoop clusters along with big data
> workloads. LinkedIn also has a strong interest in contributing a Submarine
> TonY (https://github.com/linkedin/TonY) runtime to allow users to use the
> same interface.
>
> From what I can see, there are several issues with putting Submarine under
> the yarn-applications directory and keeping the same release cycle as Hadoop:
>
> 1) We started the 3.2.0 release in Sep 2018, but the release was done in Jan
> 2019. Because of unpredictable blockers and security issues, it got
> delayed a lot. We need to iterate on Submarine fast at this point.
>
> 2) We also see a lot of requirements to use Submarine on older Hadoop
> releases such as 2.x. Many companies may not upgrade Hadoop to 3.x in a
> short time, but the requirement to run deep learning is urgent to them. We
> should decouple Submarine from Hadoop version.
>
> And why do we want to keep it within Hadoop? First, Submarine includes some
> innovative parts, such as enhancements to the user experience for YARN
> services/containerization support, which we can add back to Hadoop later
> to address common requirements. In addition to that, we have a big overlap
> in the community developing and using it.
>
> There are several proposals we went through during the Ozone merge-to-trunk
> discussion:
>
> https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
>
> I propose to adopt the Ozone model: the same master branch, a different
> release cycle, and a different release branch. It is a great example of the
> agile releases we can do (2 Ozone releases after Oct 2018) with less
> overhead to set up CI, projects, etc.
>
> *Links:*
> - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> - Design doc
>   <https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit>
> - User doc
>   <https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html>
>   (3.2.0 release)
> - Blogposts, {Submarine} : Running deep learning workloads on Apache Hadoop
>   <https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/>,
>   (Chinese Translation: Link <https://www.jishuwen.com/d/2Vpu>)
> - Talks: Strata Data Conf NY
>   <https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289>
>
> Thoughts?
>
> Thanks,
> Wangda Tan
>


Re: [VOTE - 2] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

2018-12-18 Thread Jonathan Hung
+1!

Jonathan Hung


On Sat, Dec 15, 2018 at 8:26 AM Zhe Zhang  wrote:

> +1
>
> Thanks for addressing concerns from the previous vote.
>
> On Fri, Dec 14, 2018 at 6:24 PM Konstantin Shvachko 
> wrote:
>
> > Hi Hadoop developers,
> >
> > I would like to propose to merge to trunk the feature branch HDFS-12943 for
> > Consistent Reads from Standby Node. The feature is intended to scale read
> > RPC workloads. On large clusters reads comprise 95% of all RPCs to the
> > NameNode. We should be able to accommodate higher overall RPC workloads (up
> > to 4x by some estimates) by adding multiple ObserverNodes.
> >
> > The main functionality has been implemented see sub-tasks of HDFS-12943.
> > We followed up with the test plan. Testing was done on two independent
> > clusters (see HDFS-14058 and HDFS-14059) with security enabled.
> > We ran standard HDFS commands, MR jobs, admin commands including manual
> > failover.
> > We know of one cluster running this feature in production.
> >
> > Since the previous vote we addressed Daryn's concern (see HDFS-13873),
> > added documentation for the new feature, and fixed a few other jiras.
> >
> > I attached a unified patch to the umbrella jira for the review.
> > Please vote on this thread. The vote will run for 7 days until Wed Dec 21.
> >
> > Thanks,
> > --Konstantin
> >
> --
> Zhe Zhang
> Apache Hadoop Committer
> http://zhe-thoughts.github.io/about/ | @oldcap
>


[DISCUSS] Merging YARN-8200 to branch-2

2018-12-18 Thread Jonathan Hung
Hi folks,

Starting a thread to discuss merging YARN-8200 (resource profiles/GPU
support) to branch-2.

For resource types, we have ported YARN-4081~YARN-7137 (as part of
YARN-3926 umbrella).
For GPU support, we have ported the native non-docker GPU support related
items in YARN-6223.
For both of these, we have also ported miscellaneous fixes for issues we
encountered internally.

One potential issue I see is that some of the resource types commits did not
make it to branch-3.0. Most of the GPU-specific commits did not make it to
branch-3.0 either.

We have deployed these two features internally on top of a branch-2.9 fork
on a 100 node GPU cluster which is running deep learning workloads, and it
is working well.

Before the holidays/after new years we will work on cleaning up the feature
branch (YARN-8200), e.g. filing tickets on branch-2 specific bug fixes,
rebasing on latest branch-2, syncing any bug fixes in our internal fork
which did not make it to the feature branch, etc. Assuming no objections,
once it's ready we will start a vote to merge.

Thanks,
Jonathan Hung


Re: [VOTE] Release Apache Hadoop 3.1.0 (RC0)

2018-03-28 Thread Jonathan Hung
Hi Wangda, thanks for handling this release.

+1 (non-binding)

- verified binary checksum
- launched single node RM
- verified refreshQueues functionality
  - verified capacity scheduler conf mutation disabled in this case
- verified capacity scheduler conf mutation with leveldb storage
  - verified refreshQueues mutation is disabled in this case


Jonathan Hung

On Thu, Mar 22, 2018 at 9:10 AM, Wangda Tan  wrote:

> Thanks @Bharat for the quick check, the previously staged repository has
> some issues. I re-deployed jars to nexus.
>
> Here's the new repo (1087)
>
> https://repository.apache.org/content/repositories/orgapachehadoop-1087/
>
> Other artifacts remain same, no additional code changes.
>
> On Wed, Mar 21, 2018 at 11:54 PM, Bharat Viswanadham <
> bviswanad...@hortonworks.com> wrote:
>
> > Hi Wangda,
> > The Maven artifact repository does not have all the Hadoop jars. (It is
> > missing many, such as hadoop-hdfs, hadoop-client, etc.)
> > https://repository.apache.org/content/repositories/orgapachehadoop-1086/
> >
> >
> > Thanks,
> > Bharat
> >
> >
> > On 3/21/18, 11:44 PM, "Wangda Tan"  wrote:
> >
> > Hi folks,
> >
> > Thanks to the many who helped with this release since Dec 2017 [1].
> > We've
> > created RC0 for Apache Hadoop 3.1.0. The artifacts are available
> here:
> >
> > http://people.apache.org/~wangda/hadoop-3.1.0-RC0/
> >
> > The RC tag in git is release-3.1.0-RC0.
> >
> > The maven artifacts are available via repository.apache.org at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1086/
> >
> > This vote will run 7 days (5 weekdays), ending on Mar 28 at 11:59 pm
> > Pacific.
> >
> > 3.1.0 contains 727 [2] fixed JIRA issues since 3.0.0. Notable
> additions
> > include the first class GPU/FPGA support on YARN, Native services,
> > Support
> > rich placement constraints in YARN, S3-related enhancements, allow
> HDFS
> > block replicas to be provided by an external storage system, etc.
> >
> > We’d like to use this as a starting release for 3.1.x [1], depending
> > on how
> > it goes, get it stabilized and potentially use a 3.1.1 in several
> > weeks as
> > the stable release.
> >
> > We have done testing with a pseudo cluster and distributed shell job.
> > My +1
> > to start.
> >
> > Best,
> > Wangda/Vinod
> >
> > [1]
> > https://lists.apache.org/thread.html/b3fb3b6da8b6357a68513a6dfd104b
> > c9e19e559aedc5ebedb4ca08c8@%3Cyarn-dev.hadoop.apache.org%3E
> > [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.0)
> > AND fixVersion not in (3.0.0, 3.0.0-beta1) AND status = Resolved
> > ORDER BY fixVersion ASC
> >
> >
> >
>


Re: [VOTE] Release Apache Hadoop 2.7.5 (RC1)

2017-12-13 Thread Jonathan Hung
Thanks Konstantin for working on this.

+1 (non-binding)
- Downloaded binary and verified md5
- Deployed RM HA and tested failover




Jonathan Hung

On Wed, Dec 13, 2017 at 11:02 AM, Eric Payne  wrote:

> Thanks for the hard work on this release, Konstantin.
> +1 (binding)
> - Built from source
> - Verified that refreshing of queues works as expected.
>
> - Verified can run multiple users in a single queue
> - Ran terasort test
> - Verified that cross-queue preemption works as expected
> Thanks. Eric Payne
>
>   From: Konstantin Shvachko 
>  To: "common-...@hadoop.apache.org" ; "
> hdfs-...@hadoop.apache.org" ; "
> mapreduce-dev@hadoop.apache.org" ; "
> yarn-...@hadoop.apache.org" 
>  Sent: Thursday, December 7, 2017 9:22 PM
>  Subject: [VOTE] Release Apache Hadoop 2.7.5 (RC1)
>
> Hi everybody,
>
> I updated CHANGES.txt and fixed documentation links.
> Also committed  MAPREDUCE-6165, which fixes a consistently failing test.
>
> This is RC1 for the next dot release of the Apache Hadoop 2.7 line. The
> previous one, 2.7.4, was released August 4, 2017.
> Release 2.7.5 includes critical bug fixes and optimizations. See more
> details in Release Note:
> http://home.apache.org/~shv/hadoop-2.7.5-RC1/releasenotes.html
>
> The RC1 is available at: http://home.apache.org/~shv/hadoop-2.7.5-RC1/
>
> Please give it a try and vote on this thread. The vote will run for 5 days
> ending 12/13/2017.
>
> My up to date public key is available from:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Thanks,
> --Konstantin
>
>
>
>


Re: [VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-12 Thread Jonathan Hung
Thanks Andrew for the huge effort.

+1 (non-binding)
- Downloaded binary tarball and verified md5
- Ran RM HA and verified manual failover
- Verified add/remove/update scheduler configuration API (CLI/REST) works
for leveldb/zookeeper backend
- Verified scheduler configuration changes persisted on restart/failover
- Verified "yarn rmadmin -refreshQueues" works when scheduler configuration
API disabled, and does not work when scheduler configuration API enabled


Jonathan Hung

On Tue, Dec 12, 2017 at 5:44 PM, Junping Du  wrote:

> Thanks Andrew for pushing the new RC for 3.0.0. I was out last week and
> only just got the chance to validate the new RC.
>
> Basically, I found two critical issues with the same rolling upgrade
> scenario in which HADOOP-15059 was found previously:
> HDFS-12920: we changed the value format for some HDFS configurations, and
> old versions of the MR client don't understand it when fetching these
> configurations. A quick workaround is to add the old values (without time
> units) in hdfs-site.xml to override the new default values, but this
> generates many annoying warnings. I have already posted my suggested fix
> on the JIRA for more discussion.
> The other one is YARN-7646. After working around HDFS-12920, we hit the
> issue that an old-version MR AppMaster cannot communicate with the new
> version of the YARN RM. This could be related to resource profile changes
> on the YARN side, but the root cause is still under investigation.
>
> The first issue may not be a blocker, given that we can work around it
> without a code change. I am not sure yet whether we can work around the
> second issue. If not, we may have to fix it, or compromise by withdrawing
> support for rolling upgrade or by no longer calling it a stable release.
>
>
> Thanks,
>
> Junping
>
> 
> From: Robert Kanter 
> Sent: Tuesday, December 12, 2017 3:10 PM
> To: Arun Suresh
> Cc: Andrew Wang; Lei Xu; Wei-Chiu Chuang; Ajay Kumar; Xiao Chen; Aaron T.
> Myers; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
> Subject: Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
>
> +1 (binding)
>
> + Downloaded the binary release
> + Deployed on a 3 node cluster on CentOS 7.3
> + Ran some MR jobs, clicked around the UI, etc
> + Ran some CLI commands (yarn logs, etc)
>
> Good job everyone on Hadoop 3!
>
>
> - Robert
>
> On Tue, Dec 12, 2017 at 1:56 PM, Arun Suresh  wrote:
>
> > +1 (binding)
> >
> > - Verified signatures of the source tarball.
> > - built from source - using the docker build environment.
> > - set up a pseudo-distributed test cluster.
> > - ran basic HDFS commands
> > - ran some basic MR jobs
> >
> > Cheers
> > -Arun
> >
> > On Tue, Dec 12, 2017 at 1:52 PM, Andrew Wang 
> > wrote:
> >
> > > Hi everyone,
> > >
> > > As a reminder, this vote closes tomorrow at 12:31pm, so please give it
> a
> > > whack if you have time. There are already enough binding +1s to pass
> this
> > > vote, but it'd be great to get additional validation.
> > >
> > > Thanks to everyone who's voted thus far!
> > >
> > > Best,
> > > Andrew
> > >
> > >
> > >
> > > On Tue, Dec 12, 2017 at 11:08 AM, Lei Xu  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > * Verified src tarball and bin tarball, verified md5 of each.
> > > > * Build source with -Pdist,native
> > > > * Started a pseudo cluster
> > > > * Run ec -listPolicies / -getPolicy / -setPolicy on /  , and run hdfs
> > > > dfs put/get/cat on "/" with XOR-2-1 policy.
> > > >
> > > > Thanks Andrew for this great effort!
> > > >
> > > > Best,
> > > >
> > > >
> > > > On Tue, Dec 12, 2017 at 9:55 AM, Andrew Wang <
> andrew.w...@cloudera.com
> > >
> > > > wrote:
> > > > > Hi Wei-Chiu,
> > > > >
> > > > > The patchprocess directory is left over from the create-release
> > > process,
> > > > > and it looks empty to me. We should still file a create-release
> JIRA
> > to
> > > > fix
> > > > > this, but I think this is not a blocker. Would you agree?
> > > > >
> > > > > Best,
> > > > > Andrew
> > > > >
> > > > > On Tue, Dec 12, 2017 at 9:44 AM, Wei-Chiu Chuang <
> > weic...@cloudera.com
> > > >
> > > > > wrote:
> > > > >
> > > > >> 

Re: [VOTE] Release Apache Hadoop 2.9.0 (RC3)

2017-11-14 Thread Jonathan Hung
Thanks Arun/Subru for working on this.

+1 (non-binding)
- Deployed RM HA on two nodes
- Tested manual failover
- Tested configuration mutation API with zk and leveldb backing store (also
ensuring configuration updates persisted on failover/restart), with queue
addition/removal/update
- Tested "yarn rmadmin -refreshQueues" enabled when configuration mutation
API disabled (and vice-versa)
- Tested queue admin configuration mutation policy



Jonathan Hung

On Mon, Nov 13, 2017 at 4:10 PM, Arun Suresh  wrote:

> Hi Folks,
>
> Apache Hadoop 2.9.0 is the first release of the Hadoop 2.9 line and will be
> the starting release for the Apache Hadoop 2.9.x line. It includes 30 new
> features with 500+ subtasks, 407 improvements, and 790 bug fixes since
> 2.8.2.
>
> More information about the 2.9.0 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
>
> New RC is available at: https://home.apache.org/~asuresh/hadoop-2.9.0-RC3/
>
> The RC tag in git is: release-2.9.0-RC3, and the latest commit id is:
> 756ebc8394e473ac25feac05fa493f6d612e6c50.
>
> The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1068/
>
> We are carrying over the votes from the previous RC given that the delta is
> the license fix.
>
> Given the above - we are also going to stick with the original deadline for
> the vote : ending on Friday 17th November 2017 2pm PT time.
>
> Thanks,
> -Arun/Subru
>


Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)

2017-11-07 Thread Jonathan Hung
Thanks Arun and Subru for working on this!

+1 (non-binding) pending YARN-7453.

1) Setup RM HA
2) Verified leveldb/zookeeper scheduler configuration API works via REST/CLI
3) Verified configuration changes persist across restart
4) yarn rmadmin -refreshQueues works when scheduler configuration API
disabled (and vice-versa)


Jonathan Hung

On Tue, Nov 7, 2017 at 2:56 PM, Eric Badger  wrote:

> +1 (non-binding) pending the issue that Sunil/Rohith pointed out
>
> - Verified all hashes and checksums
> - Built from source on macOS 10.12.6, Java 1.8.0u65
> - Deployed a pseudo cluster
> - Ran some example jobs
>
> Thanks,
>
> Eric
>
> On Tue, Nov 7, 2017 at 4:03 PM, Wangda Tan  wrote:
>
>> Sunil / Rohith,
>>
>> Could you check whether your configs are the same as the configs Jonathan
>> posted?
>> https://issues.apache.org/jira/browse/YARN-7453?focusedCommentId=16242693&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16242693
>>
>> And could you check whether the issue still reproduces with Jonathan's
>> configs?
>>
>> Thanks,
>> Wangda
>>
>>
>> On Tue, Nov 7, 2017 at 1:52 PM, Arun Suresh  wrote:
>>
>> > Thanks for testing Rohith and Sunil
>> >
>> > Can you please confirm if it is not a config issue at your end ?
>> > We (both Jonathan and myself) just tried testing this on a fresh cluster
>> > (both automatic and manual) and we are not able to reproduce this. I've
>> > updated the YARN-7453 <https://issues.apache.org/jira/browse/YARN-7453>
>> > JIRA
>> > with details of testing.
>> >
>> > Cheers
>> > -Arun/Subru
>> >
>> > On Tue, Nov 7, 2017 at 3:17 AM, Rohith Sharma K S <
>> > rohithsharm...@apache.org
>> > > wrote:
>> >
>> > > Thanks Sunil for confirmation. Btw, I have raised YARN-7453
>> > > <https://issues.apache.org/jira/browse/YARN-7453> JIRA to track this
>> > > issue.
>> > >
>> > > - Rohith Sharma K S
>> > >
>> > > On 7 November 2017 at 16:44, Sunil G  wrote:
>> > >
>> > >> Hi Subru and Arun.
>> > >>
>> > >> Thanks for driving 2.9 release. Great work!
>> > >>
>> > >> I installed cluster built from source.
>> > >> - Ran few MR jobs with application priority enabled. Runs fine.
>> > >> - Accessed new UI and it also seems fine.
>> > >>
>> > >> However I am also getting same issue as Rohith reported.
>> > >> - Started an HA cluster
>> > >> - Pushed RM to standby
>> > >> - Pushed back RM to active then seeing an exception.
>> > >>
>> > >> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
>> > >> at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
>> > >> at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
>> > >>
>> > >> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>> > >> at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
>> > >> at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
>> > >>
>> > >> Will check and post more details,
>> > >>
>> > >> - Sunil
>> > >>
>> > >>
>> > >> On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S <
>> > >> rohithsharm...@apache.org>
>> > >> wrote:
>> > >>
>> > >> > Thanks Subru/Arun for the great work!
>> > >> >
>> > >> > Downloaded source and built from it. Deployed RM HA non-secured
>> > cluster
>> > >> > along with new YARN UI and ATSv2.
>> > >> >
>> > >> > I am facing a basic RM HA switch issue after the first successful
>> > >> > start. *Is anyone else facing this issue?*
>> > >> >
>> > >> >

Re: [VOTE] Merge feature branch YARN-5734 (API based scheduler configuration) to trunk, branch-3.0, branch-2

2017-10-09 Thread Jonathan Hung
Thanks for the votes and discussion. It is now past Monday Oct 9 11:00AM
PDT so the vote has ended. There were 4 +1 and no -1, so vote passes. This
feature will be merged to trunk, branch-3.0, and branch-2 shortly (16
subtasks).

Thanks everyone!




Jonathan Hung

On Mon, Oct 9, 2017 at 9:18 AM, Xuan Gong  wrote:

> +1 (binding)
>
> Xuan Gong
>
>
> >
> >On Mon, Oct 2, 2017 at 11:09 AM, Jonathan Hung 
> >wrote:
> >
> >> Hi all,
> >>
> >> From discussion at [1], I'd like to start a vote to merge feature branch
> >> YARN-5734 to trunk, branch-3.0, and branch-2. Vote will be 7 days,
> >>ending
> >> Monday Oct 9 at 11:00AM PDT.
> >>
> >> This branch adds a framework to the scheduler to allow scheduler
> >> configuration mutation on the fly, including a REST and CLI interface,
> >>and
> >> an interface for the scheduler configuration backing store. Currently
> >>the
> >> capacity scheduler implements this framework.
> >>
> >> Umbrella is here (YARN-5734
> >> <https://issues.apache.org/jira/browse/YARN-5734>), jenkins build is
> >>here
> >> (
> >> YARN-7241 <https://issues.apache.org/jira/browse/YARN-7241>). All
> >>required
> >> tasks for this feature are committed. Since this feature changes RM
> >>only,
> >> we have tested this on a local RM setup with a suite of configuration
> >> changes with no issue so far.
> >>
> >> Key points:
> >> - The feature is turned off by default, and must be explicitly
> >>configured
> >> to turn on. When turned off, the behavior reverts back to the original
> >>file
> >> based mechanism for changing scheduler configuration (i.e. yarn rmadmin
> >> -refreshQueues).
> >> - The framework was designed in a way to be extendable to other
> >>schedulers
> >> (most notably FairScheduler).
> >> - A pluggable ACL policy (YARN-5949
> >> <https://issues.apache.org/jira/browse/YARN-5949>) allows admins
> >> fine-grained control for who can change what configurations.
> >> - The configuration storage backend is also pluggable. Currently an
> >> in-memory, leveldb, and zookeeper implementation are supported.
> >>
> >> There were 15 subtasks completed for this feature.
> >>
> >> Huge thanks to everyone who helped with reviews, commits, guidance, and
> >> technical discussion/design, including Carlo Curino, Xuan Gong, Subru
> >> Krishnan, Min Shen, Konstantin Shvachko, Carl Steinbach, Wangda Tan,
> >>Vinod
> >> Kumar Vavilapalli, Suja Viswesan, Zhe Zhang, Ye Zhou.
> >>
> >> [1]
> >> http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201709.mbox/%
> >> 3CCAHzWLgfEAgczjcEOUCg-03ma3ROtO=pkec9dpggyx9rzf3n...@mail.gmail.com%3E
> >>
> >> Jonathan Hung
> >>
>
>


[VOTE] Merge feature branch YARN-5734 (API based scheduler configuration) to trunk, branch-3.0, branch-2

2017-10-02 Thread Jonathan Hung
Hi all,

From discussion at [1], I'd like to start a vote to merge feature branch
YARN-5734 to trunk, branch-3.0, and branch-2. Vote will be 7 days, ending
Monday Oct 9 at 11:00AM PDT.

This branch adds a framework to the scheduler to allow scheduler
configuration mutation on the fly, including a REST and CLI interface, and
an interface for the scheduler configuration backing store. Currently the
capacity scheduler implements this framework.

Umbrella is here (YARN-5734
<https://issues.apache.org/jira/browse/YARN-5734>), jenkins build is here (
YARN-7241 <https://issues.apache.org/jira/browse/YARN-7241>). All required
tasks for this feature are committed. Since this feature changes RM only,
we have tested this on a local RM setup with a suite of configuration
changes with no issue so far.

Key points:
- The feature is turned off by default, and must be explicitly configured
to turn on. When turned off, the behavior reverts back to the original file
based mechanism for changing scheduler configuration (i.e. yarn rmadmin
-refreshQueues).
- The framework was designed in a way to be extendable to other schedulers
(most notably FairScheduler).
- A pluggable ACL policy (YARN-5949
<https://issues.apache.org/jira/browse/YARN-5949>) allows admins
fine-grained control for who can change what configurations.
- The configuration storage backend is also pluggable. Currently an
in-memory, leveldb, and zookeeper implementation are supported.
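
To illustrate what a pluggable configuration store could look like, here is a minimal sketch in Java. It is not the YARN-5734 code: the names `ConfStore` and `InMemoryConfStore` and their methods are hypothetical, and only an in-memory variant is shown. A leveldb or zookeeper backend would implement the same interface but persist the updates.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; these types are hypothetical, not the YARN-5734 classes. */
class ConfStoreSketch {

    /** Hypothetical backing-store interface for scheduler configuration. */
    interface ConfStore {
        /** Applies a batch of key/value updates to the stored configuration. */
        void apply(Map<String, String> updates);

        /** Returns a copy of the currently stored configuration. */
        Map<String, String> snapshot();
    }

    /** In-memory variant: updates are lost on restart (assumed behaviour). */
    static class InMemoryConfStore implements ConfStore {
        private final Map<String, String> conf = new HashMap<>();

        @Override
        public void apply(Map<String, String> updates) {
            conf.putAll(updates);
        }

        @Override
        public Map<String, String> snapshot() {
            return new HashMap<>(conf);
        }
    }

    public static void main(String[] args) {
        ConfStore store = new InMemoryConfStore();
        store.apply(Map.of("yarn.scheduler.capacity.root.a.capacity", "50"));
        System.out.println(store.snapshot());
    }
}
```

Keeping the scheduler behind one small interface like this is what would let the backend be swapped without touching the mutation logic.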

There were 15 subtasks completed for this feature.

Huge thanks to everyone who helped with reviews, commits, guidance, and
technical discussion/design, including Carlo Curino, Xuan Gong, Subru
Krishnan, Min Shen, Konstantin Shvachko, Carl Steinbach, Wangda Tan, Vinod
Kumar Vavilapalli, Suja Viswesan, Zhe Zhang, Ye Zhou.

[1]
http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201709.mbox/%3CCAHzWLgfEAgczjcEOUCg-03ma3ROtO=pkec9dpggyx9rzf3n...@mail.gmail.com%3E

Jonathan Hung


Re: [DISCUSS] Merging API-based scheduler configuration to trunk/branch-2

2017-09-29 Thread Jonathan Hung
Thanks Andrew and Larry for the feedback. I was hoping to start a merge
vote early next week, because of the 2.9 deadline. (I suppose meeting this
deadline depends on the outcome of this DISCUSS thread.) Appreciate any
questions you have on the JIRA.

To answer your questions Larry:
*Is this feature extending the existing YARN RM REST API?*
Yes, this feature adds another endpoint to the YARN RM REST API, for users
to send their configuration change requests.
*When it isn't enabled what is the API behavior?*
When the feature is disabled and the API is called, no change is made; the
call returns HTTP 400 (Bad Request).
*Does it implement the trusted proxy pattern for proxies to be able to
impersonate users and most importantly to dictate what proxies would be
allowed to impersonate an admin for this API - which I assume will be
required?*
Right now there's a pluggable policy which controls which users can make
which configuration changes (see YARN-5949). The default policy is to only
allow YARN admins (i.e. users in yarn.admin.acl) to make changes. There's
also an implementation of a more relaxed policy which allows admins of
queues to make configuration modifications to their own queue. Not sure if
this answers your question.
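
As a rough illustration of the two policies described above, here is a sketch in Java. The interface and method names are hypothetical and do not come from YARN-5949; the sketch also assumes that YARN admins remain allowed under the relaxed policy.

```java
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names are hypothetical, not the YARN-5949 API. */
class MutationPolicySketch {

    /** Decides whether a user may change the configuration of a queue. */
    interface MutationPolicy {
        boolean isAllowed(String user, String queue);
    }

    /** Default policy: only YARN admins may make changes. */
    static MutationPolicy adminOnly(Set<String> yarnAdmins) {
        return (user, queue) -> yarnAdmins.contains(user);
    }

    /**
     * Relaxed policy: admins of a queue may also modify their own queue.
     * Assumption: YARN admins keep full access under this policy.
     */
    static MutationPolicy queueAdmin(Set<String> yarnAdmins,
                                     Map<String, Set<String>> queueAdmins) {
        return (user, queue) -> yarnAdmins.contains(user)
            || queueAdmins.getOrDefault(queue, Set.of()).contains(user);
    }

    public static void main(String[] args) {
        MutationPolicy relaxed = queueAdmin(
            Set.of("alice"), Map.of("root.a", Set.of("bob")));
        System.out.println(relaxed.isAllowed("bob", "root.a"));
        System.out.println(relaxed.isAllowed("bob", "root.b"));
    }
}
```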

Thanks,

Jonathan Hung

On Fri, Sep 29, 2017 at 12:01 PM, larry mccay  wrote:

> Hi Jonathan -
>
> Thank you for bringing this up for discussion!
>
> I would personally like to see a specific security review of features like
> this - especially ones that allow for remote access to configuration.
> I'll take a look at the JIRA and see whether I can come up with any
> concerns or questions and I would urge others to give it a pass from a
> security perspective as well.
>
> In addition, here are a couple of questions off the top of my head:
>
> Is this feature extending the existing YARN RM REST API?
> When it isn't enabled what is the API behavior?
> Does it implement the trusted proxy pattern for proxies to be able to
> impersonate users and most importantly to dictate what proxies would be
> allowed to impersonate an admin for this API - which I assume will be
> required?
>
> --larry
>
> On Fri, Sep 29, 2017 at 2:44 PM, Andrew Wang 
> wrote:
>
>> Hi Jonathan,
>>
>> I'm okay with putting this into branch-3.0 for GA if it can be merged
>> within the next two weeks. Even though beta1 has slipped by a month, I
>> want
>> to stick to the targeted GA date of Nov 1st as much as possible. Of
>> course,
>> let's not sacrifice quality or stability for speed; if something's not
>> ready, let's defer it to 3.1.0.
>>
>> Subru, have you been able to review this feature from the 2.9.0
>> perspective? It'd add confidence if you think it's immediately ready for
>> merging to branch-2 for 2.9.0.
>>
>> Thanks,
>> Andrew
>>
>> On Thu, Sep 28, 2017 at 11:32 AM, Jonathan Hung 
>> wrote:
>>
>> > Hi everyone,
>> >
>> > Starting this thread to discuss merging API-based scheduler
>> configuration
>> > to trunk/branch-2. The feature adds the framework for allowing users to
>> > modify scheduler configuration via REST or CLI using a configurable
>> backend
>> > (leveldb/zk are currently supported), and adds capacity scheduler
>> support
>> > for this. The umbrella JIRA is YARN-5734. All the required work for this
>> > feature is done and committed to branch YARN-5734, and a full diff has
>> been
>> > generated at YARN-7241.
>> >
>> > Regarding compatibility, this feature is configurable and turned off by
>> > default.
>> >
>> > The feature has been tested locally on a couple RMs (since it is an RM
>> > only change), with queue addition/removal/updates tested on single RM
>> > (leveldb) and two RMs (zk). Also we verified the original configuration
>> > update mechanism (via refreshQueues) is unaffected when the feature is
>> > off/not configured.
>> >
>> > Our original plan was to merge this to trunk (which is what the
>> YARN-7241
>> > diff is based on), and port to branch-2 before the 2.9 release. @Andrew,
>> > what are your thoughts on also merging this to branch-3.0?
>> >
>> > Thanks!
>> >
>> > Jonathan Hung
>> >
>>
>
>


[DISCUSS] Merging API-based scheduler configuration to trunk/branch-2

2017-09-28 Thread Jonathan Hung
Hi everyone,

Starting this thread to discuss merging API-based scheduler configuration
to trunk/branch-2. The feature adds the framework for allowing users to
modify scheduler configuration via REST or CLI using a configurable backend
(leveldb/zk are currently supported), and adds capacity scheduler support
for this. The umbrella JIRA is YARN-5734. All the required work for this
feature is done and committed to branch YARN-5734, and a full diff has been
generated at YARN-7241.

Regarding compatibility, this feature is configurable and turned off by
default.

The feature has been tested locally on a couple RMs (since it is an RM only
change), with queue addition/removal/updates tested on single RM (leveldb)
and two RMs (zk). Also we verified the original configuration update
mechanism (via refreshQueues) is unaffected when the feature is off/not
configured.

Our original plan was to merge this to trunk (which is what the YARN-7241
diff is based on), and port to branch-2 before the 2.9 release. @Andrew,
what are your thoughts on also merging this to branch-3.0?

Thanks!

Jonathan Hung


Re: [DISCUSS] Looking to a 2.9.0 release

2017-09-05 Thread Jonathan Hung
Hi Subru,

Thanks for starting the discussion. We are targeting merging YARN-5734
(API-based scheduler configuration) to branch-2 before the release of
2.9.0, since the feature is close to complete. Regarding the requirements
for merge,

1. API compatibility - this feature adds new APIs, does not modify any
existing ones.
2. Turning feature off - using the feature is configurable and is turned
off by default.
3. Stability/testing - this is an RM-only change, so we plan on deploying
this feature to a test RM and verifying configuration changes for capacity
scheduler. (Right now fair scheduler is not supported.)
4. Deployment - we want to get this feature in to 2.9.0 since we want to
use this feature and 2.9 version in our next upgrade.
5. Timeline - we have one main blocker which we are planning to resolve by
end of week. The rest of the month will be testing then a merge vote on the
last week of Sept.

Please let me know if you have any concerns. Thanks!


Jonathan Hung

On Wed, Jul 26, 2017 at 11:23 AM, J. Rottinghuis 
wrote:

> Thanks Vrushali for being entirely open as to the current status of ATSv2.
> I appreciate that we want to ensure things are tested at scale, and as you
> said we are working on that right now on our clusters.
> We have tested the feature to demonstrate it works at what we consider
> moderate scale.
>
> I think the criteria for including this feature in the 2.9 release should
> be if it can be safely turned off and not cause impact to anybody not using
> the new feature. The confidence for this is high for timeline service v2.
>
> Therefore, I think timeline service v2 should definitely be part of 2.9.
> That is the big draw for us to work on stabilizing a 2.9 release rather
> than just going to 2.8 and back-porting things ourselves.
>
> Thanks,
>
> Joep
>
> On Tue, Jul 25, 2017 at 11:39 PM, Vrushali Channapattan <
> vrushalic2...@gmail.com> wrote:
>
> > Thanks Subru for initiating this discussion.
> >
> > Wanted to share some thoughts in the context of Timeline Service v2. The
> > current status of this module is that we are ramping up for a second
> merge
> > to trunk. We still have a few merge blocker jiras outstanding, which we
> > think we will finish soon.
> >
> > While we have done some testing, we are yet to test at scale. Given all
> > this, we were thinking of initially targeting a beta release vehicle
> rather
> > than a stable release.
> >
> > As such, timeline service v2 has a branch-2 branch called
> > YARN-5355-branch-2 in case anyone wants to try it out. Timeline service
> v2
> > can be turned off and should not affect the cluster.
> >
> > thanks
> > Vrushali
> >
> >
> >
> >
> >
> > On Mon, Jul 24, 2017 at 1:26 PM, Subru Krishnan 
> wrote:
> >
> > > Folks,
> > >
> > > With the release for 2.8, we would like to look ahead to 2.9 release as
> > > there are many features/improvements in branch-2 (about 1062 commits),
> > that
> > > are in need of a release vehicle.
> > >
> > > Here's our first cut of the proposal from the YARN side:
> > >
> > >1. Scheduler improvements (decoupling allocation from node
> heartbeat,
> > >allocation ID, concurrency fixes, LightResource etc).
> > >2. Timeline Service v2
> > >3. Opportunistic containers
> > >4. Federation
> > >
> > > We would like to hear a formal list from HDFS & Hadoop (& MapReduce if
> > any)
> > > and will update the Roadmap wiki accordingly.
> > >
> > > Considering our familiarity with the above mentioned YARN features, we
> > > would like to volunteer as the co-RMs for 2.9.0.
> > >
> > > We want to keep the timeline at 8-12 weeks to keep the release
> pragmatic.
> > >
> > > Feedback?
> > >
> > > -Subru/Arun
> > >
> >
>


Re: Branch merges and 3.0.0-beta1 scope

2017-08-25 Thread Jonathan Hung
Hi Andrew,

Thanks for starting the discussion - we have a feature YARN-5734 for API
based scheduler configuration that I feel is pretty close to merge (also "a
few weeks"). It's almost completely code and API additions and we were
careful to design it so that it's compatible (feature is also turned off by
default). Hoping to get this in before 3.0.0-GA. Just wanted to send this
note so that we are not caught off guard by this feature.

Thanks!


Jonathan Hung

On Fri, Aug 25, 2017 at 11:06 AM, Wangda Tan  wrote:

> Resource profile is similar to TSv2, the feature is:
> - Alpha feature, we will not freeze new added APIs. And all added APIs are
> explicitly marked to @Unstable.
> - Allow rolling upgrade from branch-2.
> - Touched existing code, but we have, and will continue tests to make sure
> changes are safe.
>
> Discussed with Andrew offline, we decided to not put this to beta1 since
> beta1 is not far away. But we want to put it before GA if sufficient tests
> are done.
>
> Thanks,
> Wangda
>
>
>
> On Fri, Aug 25, 2017 at 10:54 AM, Rohith Sharma K S <
> rohithsharm...@apache.org> wrote:
>
> > On 25 August 2017 at 22:39, Andrew Wang 
> wrote:
> >
> > > Hi Rohith,
> > >
> > > Given that we're advertising TSv2 as an alpha feature, I think we're
> > > allowed to break compatibility. Let's make sure this is clear in the
> > > release notes and documentation.
> > >
> >
> > > That said, with TSv2 phase 2, is the API going to be frozen? The
> umbrella
> > > JIRA refers to "TSv2 alpha2" which indicated to me it was still
> > alpha-level
> > > quality and stability.
> > >
> > Yes, we have decided to freeze the APIs. I do not think we will make any
> > compatibility-breaking changes in the future.
> >
> >
> >
> > >
> > > Best,
> > > Andrew
> > >
> >
>


[jira] [Created] (MAPREDUCE-6885) JobHistory event handler thread should not die if exception thrown

2017-05-05 Thread Jonathan Hung (JIRA)
Jonathan Hung created MAPREDUCE-6885:


 Summary: JobHistory event handler thread should not die if 
exception thrown
 Key: MAPREDUCE-6885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jonathan Hung


If eventHandlingThread handles an event which causes it to throw an exception 
(e.g. if it is unable to flush an event to HDFS), the thread dies. This thread 
is responsible for moving job history files to mapreduce.jobhistory.done-dir, 
so once it has died, the files will not be moved there.

We should catch these exceptions so that the thread can still move these files 
when the job is complete.
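
The proposed behaviour can be sketched as follows. This is a simplified stand-in, not the actual JobHistoryEventHandler code: the `Event` interface and `ResilientEventLoop` class are hypothetical, and it assumes that recording the failure and continuing is acceptable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustrative sketch only; not the actual JobHistoryEventHandler code. */
class ResilientEventLoop {

    /** Hypothetical event type; handling may fail, e.g. on a failed flush. */
    interface Event {
        void handle() throws Exception;
    }

    private final BlockingQueue<Event> queue = new LinkedBlockingQueue<>();
    final List<String> failures = new ArrayList<>();
    int handled = 0;

    void submit(Event e) {
        queue.add(e);
    }

    /** Drains the queue; a failing event is recorded and skipped. */
    void drain() {
        Event e;
        while ((e = queue.poll()) != null) {
            try {
                e.handle();
                handled++;
            } catch (Exception ex) {
                // Assumption: record and continue, so later events
                // (such as moving history files) are still processed.
                failures.add(ex.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        ResilientEventLoop loop = new ResilientEventLoop();
        loop.submit(() -> { throw new java.io.IOException("flush failed"); });
        loop.submit(() -> System.out.println("moving history files"));
        loop.drain();
        System.out.println("handled=" + loop.handled
            + " failures=" + loop.failures);
    }
}
```

The key point is that the try/catch sits inside the loop, so one bad event does not terminate processing of the events that follow it.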



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-6860) User intermediate-done-dir permissions should use history file permissions configuration

2017-03-09 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung resolved MAPREDUCE-6860.
--
Resolution: Not A Bug

> User intermediate-done-dir permissions should use history file permissions 
> configuration
> 
>
> Key: MAPREDUCE-6860
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6860
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>    Reporter: Jonathan Hung
>
> Currently {{JobHistoryEventHandler}} creates the user intermediate-done-dir 
> directory here:
> {noformat}
> doneDirPrefixPath =
>     FileContext.getFileContext(conf).makeQualified(new Path(userDoneDirStr));
> mkdir(doneDirFS, doneDirPrefixPath, new FsPermission(
>     JobHistoryUtils.HISTORY_INTERMEDIATE_USER_DIR_PERMISSIONS));
> {noformat}
> which is hardcoded to 770. But the permissions of the summary, history, and 
> conf files under this user dir are configurable via 
> {{mapreduce.jobhistory.intermediate-done-dir.file.permission}}. So if the 
> configured permissions grant read/write/execute to "other" users, they will 
> still not have access to these files due to the 770 permission on the user dir.
> I see two options here:
> # Reuse {{mapreduce.jobhistory.intermediate-done-dir.file.permission}} as the 
> permissions for the user dir
> # Create a new config for the user dir permissions, using 770 as the default
> The latter makes more sense to me.






[jira] [Created] (MAPREDUCE-6860) User intermediate-done-dir permissions should use history file permissions configuration

2017-03-07 Thread Jonathan Hung (JIRA)
Jonathan Hung created MAPREDUCE-6860:


 Summary: User intermediate-done-dir permissions should use history 
file permissions configuration
 Key: MAPREDUCE-6860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jonathan Hung


Currently {{JobHistoryEventHandler}} creates the user intermediate-done-dir 
directory here:
{noformat}
doneDirPrefixPath =
    FileContext.getFileContext(conf).makeQualified(new Path(userDoneDirStr));
mkdir(doneDirFS, doneDirPrefixPath, new FsPermission(
    JobHistoryUtils.HISTORY_INTERMEDIATE_USER_DIR_PERMISSIONS));
{noformat}
The directory permission is hardcoded to 770, but the permissions of the 
summary, history, and conf files under this user dir are configurable via 
{{mapreduce.jobhistory.intermediate-done-dir.file.permission}}. So even if the 
configured permissions grant read/write/execute access to "other" users, those 
users still cannot access these files because of the 770 permission on the 
user dir.
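
To illustrate why the directory mode wins, here is a simplified model in plain 
Java (not Hadoop code). It assumes that an "other" user needs the execute bit 
on the parent directory plus the read bit on the file, and it ignores ACLs, 
superusers, and ancestor directories. The class and method names are 
hypothetical.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

final class OtherAccessCheck {
    /**
     * Simplified check: can an "other" user read a file, given the
     * symbolic modes of the file and of its parent directory?
     */
    static boolean otherCanRead(String dirMode, String fileMode) {
        Set<PosixFilePermission> dir = PosixFilePermissions.fromString(dirMode);
        Set<PosixFilePermission> file = PosixFilePermissions.fromString(fileMode);
        // Traversing the directory needs its execute bit; reading needs the file's read bit.
        return dir.contains(PosixFilePermission.OTHERS_EXECUTE)
            && file.contains(PosixFilePermission.OTHERS_READ);
    }

    public static void main(String[] args) {
        // User dir at 770, file at 777: the directory blocks "other" users.
        System.out.println(otherCanRead("rwxrwx---", "rwxrwxrwx"));
        // User dir at 775, file at 777: "other" users can reach and read the file.
        System.out.println(otherCanRead("rwxrwxr-x", "rwxrwxrwx"));
    }
}
```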

I see two options here:
# Reuse {{mapreduce.jobhistory.intermediate-done-dir.file.permission}} as the 
permissions for the user dir
# Create a new config for the user dir permissions, using 770 as the default
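
The second option could look roughly like the sketch below. It uses 
java.util.Properties as a stand-in for the Hadoop configuration, and the key 
name is a placeholder chosen for illustration, not an existing Hadoop 
property.

```java
import java.util.Properties;

final class UserDirPermissionConfig {
    // Hypothetical key name, used only for this sketch.
    static final String USER_DIR_PERMISSION_KEY =
        "example.jobhistory.intermediate-user-done-dir.permission";
    // Default matches the currently hardcoded value.
    static final String DEFAULT_USER_DIR_PERMISSION = "770";

    /** Returns the user dir mode as an int, parsed from an octal string. */
    static int userDirMode(Properties conf) {
        String value =
            conf.getProperty(USER_DIR_PERMISSION_KEY, DEFAULT_USER_DIR_PERMISSION).trim();
        int mode = Integer.parseInt(value, 8);
        if (mode < 0 || mode > 0777) {
            throw new IllegalArgumentException("Invalid permission: " + value);
        }
        return mode;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Key not set: falls back to 770.
        System.out.println(Integer.toOctalString(userDirMode(conf)));

        conf.setProperty(USER_DIR_PERMISSION_KEY, "775");
        // Key set: the configured value is used.
        System.out.println(Integer.toOctalString(userDirMode(conf)));
    }
}
```

Keeping 770 as the default means existing deployments see no behavior change 
unless they set the new key.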


