Re: [DISCUSS] Making 2.10 the last minor 2.x release
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung

On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes are being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena wrote:
>>
>>> Hi Jim,
>>> Thanx for catching, I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <james.bren...@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2
>>>> (https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena wrote:
>>>>>
>>>>> > Hi Akira
>>>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>>>> > But the INFRA people closed as wont do and yes, the branch is protected,
>>>>> > we can’t delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you remove it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>>> > don't try to commit to it)
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> > - Verified everything in old branch-2.10 was in old branch-2
>>>>> > - Delete old branch-2.10
>>>>> > - Rename branch-2 to (new) branch-2.10
>>>>> > - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> > - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung wrote:
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.had...@gmail.com> wrote:
Re: [DISCUSS] Making 2.10 the last minor 2.x release
Makes sense. I've cherry-picked the commits in branch-2 that were missed in branch-2.10.

Jonathan Hung

On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka wrote:

> Hi folks,
>
> I am still seeing some changes are being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena wrote:
>
>> Hi Jim,
>> Thanx for catching, I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan wrote:
>>
>>> It looks like QBT tests are still being run on branch-2
>>> (https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena wrote:
>>>>
>>>> > Hi Akira
>>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>>> > But the INFRA people closed as wont do and yes, the branch is protected,
>>>> > we can’t delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you remove it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>> > don't try to commit to it)
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> > - Verified everything in old branch-2.10 was in old branch-2
>>>> > - Delete old branch-2.10
>>>> > - Rename branch-2 to (new) branch-2.10
>>>> > - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> > - Renamed fix versions from 2.11.0 to 2.10.1
>>>> > - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung wrote:
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.had...@gmail.com> wrote:
>>>> >
>>>> > Hey guys,
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be di
Re: [DISCUSS] Making 2.10 the last minor 2.x release
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please don't try to commit to it)

Completed procedure:

- Verified everything in old branch-2.10 was in old branch-2
- Deleted old branch-2.10
- Renamed branch-2 to (new) branch-2.10
- Set version in new branch-2.10 to 2.10.1-SNAPSHOT
- Renamed fix versions from 2.11.0 to 2.10.1
- Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE

Jonathan Hung

On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects removing branch-2.10 please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch releases. As for putting features into minor/patch releases, if we keep the convention of putting new features only into minor releases, my assumption is still that it's unlikely people will want to get them into branch-2 (based on the 2.10.0 release process). For the java 11 issue, we haven't even really removed support for java 7 in branch-2 (much less java 8), so I feel moving to java 11 would go along with a move to branch 3. And as you mentioned, if people really want to use java 11 on branch-2, we can always revive branch-2. But for now I think the convenience of not needing to port to both branch-2 and branch-2.10 (and below) outweighs the cost of potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software would become more compatible when more companies stress test the same software and making improvements in trunk. Some may be extra caution on moving up the version because obligation internally to keep things running. Company obligation should not be the driving force to maintain Hadoop branches. There is no proper collaboration in the community when every name brand company maintains its own Hadoop 2.x version. I think it would be more healthy for the community to reduce the branch forking and spend energy on trunk to harden the software. This will give more confidence to move up the version than trying to fix n permutations breakage like Flash fixing the timeline.
>>>>
>>>> Apache license stated, there is no warranty of any kind for code contributions. Fewer community release process should improve software quality when eyes are on trunk, and help steering toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release and a point/dot/maintenance (I'll use "point" from here on out) release? I have looked around and I can't find anything other than some compatibility documentation in 2.x that has since been removed in 3.x [1] [2]. I think this would help shape my opinion on whether or not to keep branch-2 alive. My current understanding is that we can't really break compatibility i
Re: [DISCUSS] Making 2.10 the last minor 2.x release
FYI, starting the rename process, beginning with INFRA-19521.

Jonathan Hung

On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko wrote:

> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion, which
> is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects removing branch-2.10 please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung wrote:
>
>> Thanks for the detailed thoughts, everyone.
>>
>> Eric (Badger), my understanding is the same as yours re. minor vs patch releases. As for putting features into minor/patch releases, if we keep the convention of putting new features only into minor releases, my assumption is still that it's unlikely people will want to get them into branch-2 (based on the 2.10.0 release process). For the java 11 issue, we haven't even really removed support for java 7 in branch-2 (much less java 8), so I feel moving to java 11 would go along with a move to branch 3. And as you mentioned, if people really want to use java 11 on branch-2, we can always revive branch-2. But for now I think the convenience of not needing to port to both branch-2 and branch-2.10 (and below) outweighs the cost of potentially needing to revive branch-2.
>>
>> Jonathan Hung
>>
>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang wrote:
>>
>>> +1 for 2.10.x as last release for 2.x version.
>>>
>>> Software would become more compatible when more companies stress test the same software and making improvements in trunk. Some may be extra caution on moving up the version because obligation internally to keep things running. Company obligation should not be the driving force to maintain Hadoop branches. There is no proper collaboration in the community when every name brand company maintains its own Hadoop 2.x version. I think it would be more healthy for the community to reduce the branch forking and spend energy on trunk to harden the software. This will give more confidence to move up the version than trying to fix n permutations breakage like Flash fixing the timeline.
>>>
>>> Apache license stated, there is no warranty of any kind for code contributions. Fewer community release process should improve software quality when eyes are on trunk, and help steering toward the same end goals.
>>>
>>> regards,
>>> Eric
>>>
>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger wrote:
>>>
>>>> Hello all,
>>>>
>>>> Is it written anywhere what the difference is between a minor release and a point/dot/maintenance (I'll use "point" from here on out) release? I have looked around and I can't find anything other than some compatibility documentation in 2.x that has since been removed in 3.x [1] [2]. I think this would help shape my opinion on whether or not to keep branch-2 alive. My current understanding is that we can't really break compatibility in either a minor or point release. But the only mention of the difference between minor and point releases is how to deal with Stable, Evolving, and Unstable tags, and how to deal with changing default configuration values. So it seems like there really isn't a big official difference between the two. In my mind, the functional difference between the two is that the minor releases may have added features and rewrites, while the point releases only have bug fixes. This might be an incorrect understanding, but that's what I have gathered from watching the releases over the last few years. Whether or not this is a correct understanding, I think that this needs to be documented somewhere, even if it is just a convent
Re: [DISCUSS] Making 2.10 the last minor 2.x release
Thanks for the detailed thoughts, everyone.

Eric (Badger), my understanding is the same as yours re. minor vs patch releases. As for putting features into minor/patch releases, if we keep the convention of putting new features only into minor releases, my assumption is still that it's unlikely people will want to get them into branch-2 (based on the 2.10.0 release process). For the java 11 issue, we haven't even really removed support for java 7 in branch-2 (much less java 8), so I feel moving to java 11 would go along with a move to branch 3. And as you mentioned, if people really want to use java 11 on branch-2, we can always revive branch-2. But for now I think the convenience of not needing to port to both branch-2 and branch-2.10 (and below) outweighs the cost of potentially needing to revive branch-2.

Jonathan Hung

On Wed, Nov 20, 2019 at 10:50 AM Eric Yang wrote:

> +1 for 2.10.x as last release for 2.x version.
>
> Software would become more compatible when more companies stress test the same software and making improvements in trunk. Some may be extra caution on moving up the version because obligation internally to keep things running. Company obligation should not be the driving force to maintain Hadoop branches. There is no proper collaboration in the community when every name brand company maintains its own Hadoop 2.x version. I think it would be more healthy for the community to reduce the branch forking and spend energy on trunk to harden the software. This will give more confidence to move up the version than trying to fix n permutations breakage like Flash fixing the timeline.
>
> Apache license stated, there is no warranty of any kind for code contributions. Fewer community release process should improve software quality when eyes are on trunk, and help steering toward the same end goals.
>
> regards,
> Eric
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger wrote:
>
>> Hello all,
>>
>> Is it written anywhere what the difference is between a minor release and a point/dot/maintenance (I'll use "point" from here on out) release? I have looked around and I can't find anything other than some compatibility documentation in 2.x that has since been removed in 3.x [1] [2]. I think this would help shape my opinion on whether or not to keep branch-2 alive. My current understanding is that we can't really break compatibility in either a minor or point release. But the only mention of the difference between minor and point releases is how to deal with Stable, Evolving, and Unstable tags, and how to deal with changing default configuration values. So it seems like there really isn't a big official difference between the two. In my mind, the functional difference between the two is that the minor releases may have added features and rewrites, while the point releases only have bug fixes. This might be an incorrect understanding, but that's what I have gathered from watching the releases over the last few years. Whether or not this is a correct understanding, I think that this needs to be documented somewhere, even if it is just a convention.
>>
>> Given my assumed understanding of minor vs point releases, here are the pros/cons that I can think of for having a branch-2. Please add on or correct me for anything you feel is missing or inadequate.
>>
>> Pros:
>> - Features/rewrites/higher-risk patches are less likely to be put into 2.10.x
>> - It is less necessary to move to 3.x
>>
>> Cons:
>> - Bug fixes are less likely to be put into 2.10.x
>> - An extra branch to maintain
>> - Committers have an extra branch (5 vs 4 total branches) to commit patches to if they should go all the way back to 2.10.x
>> - It is less necessary to move to 3.x
>>
>> So on the one hand you get added stability in fewer features being committed to 2.10.x, but then on the other you get fewer bug fixes being committed. In a perfect world, we wouldn't have to make this tradeoff. But we don't live in a perfect world and committers will make mistakes either because of lack of knowledge or simply because they made a mistake. If we have a branch-2, committers will forget, not know to, or choose not to (for whatever reason) commit valid bug fixes back all the way to branch-2.10. If we don't have a branch-2, committers who want their borderline risky feature in the 2.x line will err on the side of putting it into branch-2.10 instead of proposing the creation of a branch-2. Cle
Re: [DISCUSS] Making 2.10 the last minor 2.x release
Thanks Eric for the comments - regarding your concerns, I feel the pros outweigh the cons. To me, the chances of patch releases on 2.10.x are much higher than a new 2.11 minor release. (There didn't seem to be many people outside of our company who expressed interest in getting new features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.

In any case, we can always reverse this decision if we really need to, by recreating branch-2. But this proposal would reduce a lot of confusion IMO.

Jonathan Hung

On Fri, Nov 15, 2019 at 11:41 AM epa...@apache.org wrote:

> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <jyhung2...@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there's many
> fixes going into branch-2 (the theoretical 2.11.0) that's not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to
> branch-2.10.
>
> To do this, I propose we:
>
> - Delete branch-2.10
> - Rename branch-2 to branch-2.10
> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
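The commit chain discussed in this thread implies that a fix aimed at an older release line has to land on every branch ahead of it. The sketch below illustrates that rule. The helper name `branches_to_commit` is hypothetical, and the position of `branch-2` in `OLD_CHAIN` (between `branch-3.1` and `branch-2.10`) is an assumption inferred from the discussion, not something the thread spells out.

```python
# Commit chain after the rename, newest line first (as written in the thread).
NEW_CHAIN = ["trunk", "branch-3.2", "branch-3.1",
             "branch-2.10", "branch-2.9", "branch-2.8"]

# Assumed chain before the rename: branch-2 sat just ahead of branch-2.10.
OLD_CHAIN = ["trunk", "branch-3.2", "branch-3.1", "branch-2",
             "branch-2.10", "branch-2.9", "branch-2.8"]


def branches_to_commit(oldest_target, chain):
    """Return every branch a fix must land on, from the head of the
    chain down to (and including) oldest_target."""
    if oldest_target not in chain:
        raise ValueError(f"{oldest_target} is not in the commit chain")
    return chain[: chain.index(oldest_target) + 1]


if __name__ == "__main__":
    # Backporting a fix as far as the 2.10.x line, with and without branch-2.
    print(branches_to_commit("branch-2.10", NEW_CHAIN))
    print(branches_to_commit("branch-2.10", OLD_CHAIN))
```

Under these assumptions, a fix that should reach 2.10.x touches four branches with the new chain and five with the old one, which matches the "5 vs 4 total branches" count quoted elsewhere in the thread.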
Re: [DISCUSS] Making 2.10 the last minor 2.x release
Some additional items we would need:

- Mark all fix-versions in YARN/HDFS/MAPREDUCE/HADOOP from 2.11.0 to 2.10.1
- Remove 2.11.0 as a version in these projects

Jonathan Hung

On Thu, Nov 14, 2019 at 6:51 PM Jonathan Hung wrote:

> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a
> bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> minor release line in branch-2. Currently, the main issue is that there's
> many fixes going into branch-2 (the theoretical 2.11.0) that's not going
> into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to branch-2.10.
>
> To do this, I propose we:
>
> - Delete branch-2.10
> - Rename branch-2 to branch-2.10
> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
[DISCUSS] Making 2.10 the last minor 2.x release
Hi folks,

Given the release of 2.10.0, and the fact that it's intended to be a bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor release line in branch-2. Currently, the main issue is that there are many fixes going into branch-2 (the theoretical 2.11.0) that are not going into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will likely never see the light of day unless they are backported to branch-2.10.

To do this, I propose we:

- Delete branch-2.10
- Rename branch-2 to branch-2.10
- Set version in the new branch-2.10 to 2.10.1-SNAPSHOT

This way we get all the current branch-2 fixes into the 2.10.x release line. Then the commit chain will look like: trunk -> branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8

Thoughts?

Jonathan Hung

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
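For illustration, the three proposed steps could be written out as a command plan. This is only a sketch under stated assumptions: the remote name `origin`, the helper `rename_plan`, and the use of `mvn versions:set` for the version bump are not taken from the thread, and the script only prints the commands rather than running them. Deleting or renaming a protected branch may also need to go through INFRA rather than a plain `git push`.

```python
def rename_plan(old_branch, new_branch, new_version, remote="origin"):
    """Return shell commands that replace new_branch with old_branch.

    Hypothetical helper: the commands are an illustrative assumption,
    not the procedure the release manager actually ran.
    """
    return [
        # 1. Delete the existing release branch on the remote.
        f"git push {remote} --delete {new_branch}",
        # 2. Rename the development branch locally, publish it under the
        #    new name, then remove the old name from the remote.
        f"git branch -m {old_branch} {new_branch}",
        f"git push {remote} {new_branch}",
        f"git push {remote} --delete {old_branch}",
        # 3. Set the Maven version on the renamed branch.
        f"mvn versions:set -DnewVersion={new_version}",
    ]


if __name__ == "__main__":
    for cmd in rename_plan("branch-2", "branch-2.10", "2.10.1-SNAPSHOT"):
        print(cmd)
```

Printing the plan instead of executing it keeps the sketch safe to run anywhere; the order matters because the old `branch-2.10` has to be gone before `branch-2` can take its name.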
[ANNOUNCE] Apache Hadoop 2.10.0 release
Hi all,

I am happy to announce that Apache Hadoop 2.10.0 has been released.

Apache Hadoop 2.10.0 is the first release in the Apache Hadoop 2.10 line.

The release details, including links to downloads, list of major features, release notes, and changelog, are on the 2.10.0 announcement page [1]. You can also download the release from the Downloads page [2].

- Major features: https://hadoop.apache.org/docs/r2.10.0/index.html
- Release notes: http://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/release/2.10.0/RELEASENOTES.2.10.0.html
- Changelog: http://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/release/2.10.0/CHANGES.2.10.0.html

Thanks!

[1] https://hadoop.apache.org/release/2.10.0.html
[2] https://hadoop.apache.org/releases.html

Jonathan
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
+1 from me too. The vote passed, so I'll continue with the rest of the release. Thanks everyone!

Jonathan Hung

On Tue, Oct 29, 2019 at 1:40 PM Giovanni Matteo Fumarola <giovanni.fumar...@gmail.com> wrote:

> +1 (non-binding).
>
> - Built from source on Ubuntu with OpenJDK 11.0.3
> - Verified signatures
> - Verified documentation
> - Set up a single node cluster and ran basic yarn commands
> - Ran UTs for Yarn Router, Yarn Common, Yarn API, YARN NM and YARN RM.
>
> Thanks for putting this together, Jonathan.
>
> On Tue, Oct 29, 2019 at 8:47 AM Dinesh Chitlangia wrote:
>
>> +1 (non-binding)
>>
>> - Verified signatures
>> - Verified documentation
>> - Built from sources on CentOS 7
>> - Tested with basic hdfs commands on a single node setup.
>>
>> Thanks for organizing the release, Jonathan.
>>
>> -Dinesh
>>
>> On Tue, Oct 29, 2019 at 9:45 AM epa...@apache.org wrote:
>>
>> > Compatibility testing has gone well for me.
>> >
>> > - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 2.10.0
>> > - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and trunk
>> > - With one 4-node cluster running 2.10.0 and one 4-node cluster running trunk, I ran a word count job in each cluster whose inputs and outputs were from and to the opposite cluster.
>> > - I verified that HDFS replication works as expected in a trunk cluster that has one 2.10.0 datanode.
>> >
>> > Thanks,
>> > -Eric
>> >
>> > > On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung <jyhung2...@gmail.com> wrote:
>> > > Hi folks,
>> > >
>> > > This is the second release candidate for the first release of Apache Hadoop
>> > > 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
>> > > features such as:
>> > >
>> > > - User-defined resource types
>> > > - Native GPU support as a schedulable resource type
>> > > - Consistent reads from standby node
>> > > - Namenode port based selective encryption
>> > > - Improvements related to rolling upgrade support from 2.x to 3.x
>> > > - Cost based fair call queue
>> > >
>> > > The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
>> > >
>> > > RC tag is release-2.10.0-RC1.
>> > >
>> > > The maven artifacts are hosted here:
>> > > https://repository.apache.org/content/repositories/orgapachehadoop-1243/
>> > >
>> > > My public key is available here:
>> > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> > >
>> > > The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.
>> > >
>> > > Thanks,
>> > > Jonathan Hung
>> >
>> > -
>> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)
Thanks Eric! I sent out an RC1 earlier last week, not sure if you saw that. The only diff between RC1 and RC0 is HDFS-14667. If RC1 looks good to you then it'd be great to get your testing results on that thread.

Jonathan Hung

On Mon, Oct 28, 2019 at 1:06 PM epa...@apache.org wrote:

> Compatibility testing has gone well for me.
>
> - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 2.10.0
> - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and trunk
> - With one 4-node cluster running 2.10.0 and one 4-node cluster running trunk, I ran a word count job in each cluster whose inputs and outputs were from and to the opposite cluster.
> - I verified that HDFS replication works as expected in a trunk cluster that has one 2.10.0 datanode.
>
> Thanks,
> -Eric
>
> On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung <jyhung2...@gmail.com> wrote:
>
> Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and
> 2.10.0 clients and datanodes. Everything worked as expected.
>
> Jonathan Hung
>
> On Tue, Oct 22, 2019 at 3:04 PM Eric Badger wrote:
>
> > Hi Jonathan,
> >
> > Thanks for putting this RC together. You stated that there are
> > improvements related to rolling upgrades from 2.x to 3.x and I know I have
> > seen multiple JIRAs getting committed to that effect. Could you describe
> > any tests that you have done to verify rolling upgrade compatibility
> > for 3.x servers talking to 2.x clients and vice versa?
> >
> > Thanks,
> >
> > Eric
> >
> > On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung wrote:
> >
> >> Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar
> >> (HDFS-14667). Since this is the first of a minor release, we would like to
> >> get it into 2.10.0.
> >>
> >> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1
> >> shortly.
> >>
> >> Jonathan Hung
> >>
> >> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang wrote:
> >>
> >> > Thanks for the effort, Jonathan!
> >> >
> >> > +1 (non-binding) on RC0.
> >> > - Set up a single node cluster with the binary tarball
> >> > - Run Spark Pi and pySpark job
> >> >
> >> > BR,
> >> > Zhankun
> >> >
> >> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko <shv.had...@gmail.com> wrote:
> >> >
> >> >> +1 on RC0.
> >> >> - Verified signatures
> >> >> - Built from sources
> >> >> - Ran unit tests for new features
> >> >> - Checked artifacts on Nexus, made sure the sources are present.
> >> >>
> >> >> Thanks
> >> >> --Konstantin
> >> >>
> >> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung wrote:
> >> >>
> >> >> > Hi folks,
> >> >> >
> >> >> > This is the first release candidate for the first release of Apache Hadoop
> >> >> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It includes
> >> >> > features such as:
> >> >> >
> >> >> > - User-defined resource types
> >> >> > - Native GPU support as a schedulable resource type
> >> >> > - Consistent reads from standby node
> >> >> > - Namenode port based selective encryption
> >> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
> >> >> >
> >> >> > The RC0 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
> >> >> >
> >> >> > RC tag is release-2.10.0-RC0.
> >> >> >
> >> >> > The maven artifacts are hosted here:
> >> >> > https://repository.apache.org/content/repositories/orgapachehadoop-1241/
> >> >> >
> >> >> > My public key is available here:
> >> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >> >> >
> >> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at 6:00 pm
> >> >> > PDT.
> >> >> >
> >> >> > Thanks,
> >> >> > Jonathan Hung
> >> >> >
> >> >> > [1]
> >> >> > https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
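The JIRA link in reference [1] of the RC0 mail is URL-encoded, which makes the underlying filter hard to read. A small sketch using Python's standard `urllib.parse` recovers the JQL; the helper name `decode_jql` is illustrative and not part of any JIRA tooling.

```python
from urllib.parse import parse_qs, urlsplit

# The link from reference [1], split across lines only for readability.
JIRA_URL = (
    "https://issues.apache.org/jira/issues/?jql="
    "project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)"
    "%20AND%20resolution%20%3D%20Fixed"
    "%20AND%20fixVersion%20%3D%202.10.0"
    "%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)"
)


def decode_jql(url):
    """Extract and percent-decode the `jql` query parameter of a JIRA URL."""
    query = urlsplit(url).query
    return parse_qs(query)["jql"][0]


if __name__ == "__main__":
    print(decode_jql(JIRA_URL))
```

Decoded, the filter selects issues in the four Hadoop projects that are resolved as Fixed with fix version 2.10.0 and that were not already shipped in 2.9.0, 2.9.1, or 2.9.2.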
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)
Hi Eric, I took a quick look, are you using mapreduce.application.framework.path to run your MR jobs? If not, this seems like expected behavior if AM and tasks get launched on different NMs with different locally installed hadoop versions?

Jonathan Hung

On Sat, Oct 26, 2019 at 8:55 AM epa...@apache.org wrote:

> I ran a few compatibility tests between 2.10.0 and 3.3.0 (trunk)
>
> Unfortunately, I ran into the following problem:
>
> Running with 2.10 RM and 3.3.0 (trunk) NM fails attempts with the
> following error:
>
> 2019-10-26 15:44:06,885 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RPC$VersionMismatch): Protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol version mismatch. (client = 19, server = 21)
>
> The AM happened to launch on the 3.3.0 node.
>
> Is this a protobuf issue? I thought we addressed that?
>
> -Eric Payne
>
> On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung <jyhung2...@gmail.com> wrote:
>
> Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and
> 2.10.0 clients and datanodes. Everything worked as expected.
>
> Jonathan Hung
>
> On Tue, Oct 22, 2019 at 3:04 PM Eric Badger wrote:
>
> > Hi Jonathan,
> >
> > Thanks for putting this RC together. You stated that there are
> > improvements related to rolling upgrades from 2.x to 3.x and I know I have
> > seen multiple JIRAs getting committed to that effect. Could you describe
> > any tests that you have done to verify rolling upgrade compatibility
> > for 3.x servers talking to 2.x clients and vice versa?
> >
> > Thanks,
> >
> > Eric
> >
> > On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung wrote:
> >
> >> Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar
> >> (HDFS-14667). Since this is the first of a minor release, we would like to
> >> get it into 2.10.0.
> >>
> >> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1
> >> shortly.
> >>
> >> Jonathan Hung
> >>
> >> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang wrote:
> >>
> >> > Thanks for the effort, Jonathan!
> >> >
> >> > +1 (non-binding) on RC0.
> >> > - Set up a single node cluster with the binary tarball
> >> > - Run Spark Pi and pySpark job
> >> >
> >> > BR,
> >> > Zhankun
> >> >
> >> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko <shv.had...@gmail.com> wrote:
> >> >
> >> >> +1 on RC0.
> >> >> - Verified signatures
> >> >> - Built from sources
> >> >> - Ran unit tests for new features
> >> >> - Checked artifacts on Nexus, made sure the sources are present.
> >> >>
> >> >> Thanks
> >> >> --Konstantin
> >> >>
> >> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung wrote:
> >> >>
> >> >> > Hi folks,
> >> >> >
> >> >> > This is the first release candidate for the first release of Apache Hadoop
> >> >> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It includes
> >> >> > features such as:
> >> >> >
> >> >> > - User-defined resource types
> >> >> > - Native GPU support as a schedulable resource type
> >> >> > - Consistent reads from standby node
> >> >> > - Namenode port based selective encryption
> >> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
> >> >> >
> >> >> > The RC0 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
> >> >> >
> >> >> > RC tag is release-2.10.0-RC0.
> >> >> >
> >> >> > The maven artifacts are hosted here:
> >> >> > https://repository.apache.org/content/repositories/orgapachehadoop-1241/
> >> >> >
> >> >> > My public key is available here:
> >> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >> >> >
> >> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at 6:00 pm
> >> >> > PDT.
> >> >> >
> >> >> > Thanks,
> >> >> > Jonathan Hung
> >> >> >
> >> >> > [1]
> >> >> > https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
Some more thoughts: for the javadoc issue, I think we can just support building on java 7. For the release notes issue, I can work with the authors of the major features to come up with release notes and update them before pushing it to site. The release notes in the published artifacts won't be up to date, but I think that's fine. I'll go ahead with this plan if no objections. Jonathan Hung On Fri, Oct 25, 2019 at 12:19 PM Jonathan Hung wrote: > Thanks for looking Erik. > > For the release notes, yeah I think it's because there's no release notes > for the corresponding JIRAs. I've added details for these features to the > index.md.vm file which should show up on the homepage for 2.10.0 (e.g. > https://hadoop.apache.org/docs/r2.9.0/index.html). We could add release > notes for these JIRAs, but that would require recreating the tar.gzs since > the release notes are bundled in there. > > For the javadoc issue, I was able to repro this issue, seems it's because > the org.apache.hadoop.yarn.client.ClientRMProxy import was removed in > FederationProxyProviderUtil in YARN-7900 in branch-2 (but not in other > branches). But it's referenced in javadocs in this file so it throws this > error. Re-adding this import and building with java 8 allows it to succeed. > > I checked javadoc html for FederationProxyProviderUtil in the produced > artifacts and it appears to be correct. > > I think we could easily overwrite the current RC1 artifacts with ones > containing proper release notes. Not sure what to do about the javadoc > issue though, that would require overwriting the release-2.10.0-RC1 tag > which I don't want to do. What do others think? > > Jonathan Hung > > > On Fri, Oct 25, 2019 at 9:21 AM Erik Krogen wrote: > >> Thanks for putting this together, Jonathan! >> >> I noticed that the RELEASENOTES.md makes no mention of any of the major >> features you mentioned in your email about the RC. Is this expected? 
I >> guess it is caused by the lack of a release note on the JIRAs for those >> features. >> >> I also noticed that building a distribution package (mvn -DskipTests >> package -Pdist) fails on Java 8 (1.8.0_172) with a bunch of Javadoc errors. >> It works fine on Java 7. Is this expected? >> >> Other verifications I performed: >> >>- Verified all signatures in RC1 >>- Verified all checksums in RC1 >>- Visually inspected contents of src tarball >>- Built from source on Mac OSX 10.14.6 and RHEL7 (Java 8) >> - mvn -DskipTests package >>- Visually inspected contents of binary tarball >> >> Thanks, >> Erik >> >> -- >> *From:* Konstantin Shvachko >> *Sent:* Wednesday, October 23, 2019 6:10 PM >> *To:* Jonathan Hung >> *Cc:* Hdfs-dev ; mapreduce-dev < >> mapreduce-dev@hadoop.apache.org>; yarn-dev ; >> Hadoop Common >> *Subject:* Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1) >> >> +1 on RC1 >> >> - Verified signatures >> - Verified maven artifacts on Nexus for sources >> - Checked rat reports >> - Checked documentation >> - Checked packaging contents >> - Built from sources on RHEL 7 box >> - Ran unit tests for new HDFS features with Java 8 >> >> Thanks, >> --Konstantin >> >> On Tue, Oct 22, 2019 at 2:55 PM Jonathan Hung >> wrote: >> >> > Hi folks, >> > >> > This is the second release candidate for the first release of Apache >> Hadoop >> > 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. 
>> > It includes features such as:
>> >
>> > - User-defined resource types
>> > - Native GPU support as a schedulable resource type
>> > - Consistent reads from standby node
>> > - Namenode port based selective encryption
>> > - Improvements related to rolling upgrade support from 2.x to 3.x
>> > - Cost based fair call queue
>> >
>> > The RC1 artifacts are at:
>> > http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
>> >
>> > RC tag is release-2.10.0-RC1.
>> >
>> > The maven artifacts are hosted here:
>> > https://repository.apache.org/content/repositories/orgapachehadoop-1243/
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
Thanks for looking Erik. For the release notes, yeah I think it's because there's no release notes for the corresponding JIRAs. I've added details for these features to the index.md.vm file which should show up on the homepage for 2.10.0 (e.g. https://hadoop.apache.org/docs/r2.9.0/index.html). We could add release notes for these JIRAs, but that would require recreating the tar.gzs since the release notes are bundled in there. For the javadoc issue, I was able to repro this issue, seems it's because the org.apache.hadoop.yarn.client.ClientRMProxy import was removed in FederationProxyProviderUtil in YARN-7900 in branch-2 (but not in other branches). But it's referenced in javadocs in this file so it throws this error. Re-adding this import and building with java 8 allows it to succeed. I checked javadoc html for FederationProxyProviderUtil in the produced artifacts and it appears to be correct. I think we could easily overwrite the current RC1 artifacts with ones containing proper release notes. Not sure what to do about the javadoc issue though, that would require overwriting the release-2.10.0-RC1 tag which I don't want to do. What do others think? Jonathan Hung On Fri, Oct 25, 2019 at 9:21 AM Erik Krogen wrote: > Thanks for putting this together, Jonathan! > > I noticed that the RELEASENOTES.md makes no mention of any of the major > features you mentioned in your email about the RC. Is this expected? I > guess it is caused by the lack of a release note on the JIRAs for those > features. > > I also noticed that building a distribution package (mvn -DskipTests > package -Pdist) fails on Java 8 (1.8.0_172) with a bunch of Javadoc errors. > It works fine on Java 7. Is this expected? 
>
> Other verifications I performed:
>
>    - Verified all signatures in RC1
>    - Verified all checksums in RC1
>    - Visually inspected contents of src tarball
>    - Built from source on Mac OSX 10.14.6 and RHEL7 (Java 8)
>       - mvn -DskipTests package
>    - Visually inspected contents of binary tarball
>
> Thanks,
> Erik
>
> ------
> *From:* Konstantin Shvachko
> *Sent:* Wednesday, October 23, 2019 6:10 PM
> *To:* Jonathan Hung
> *Cc:* Hdfs-dev; mapreduce-dev <mapreduce-dev@hadoop.apache.org>; yarn-dev; Hadoop Common
> *Subject:* Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
>
> +1 on RC1
>
> - Verified signatures
> - Verified maven artifacts on Nexus for sources
> - Checked rat reports
> - Checked documentation
> - Checked packaging contents
> - Built from sources on RHEL 7 box
> - Ran unit tests for new HDFS features with Java 8
>
> Thanks,
> --Konstantin
>
> On Tue, Oct 22, 2019 at 2:55 PM Jonathan Hung wrote:
>
> > Hi folks,
> >
> > This is the second release candidate for the first release of Apache Hadoop
> > 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
> > features such as:
> >
> > - User-defined resource types
> > - Native GPU support as a schedulable resource type
> > - Consistent reads from standby node
> > - Namenode port based selective encryption
> > - Improvements related to rolling upgrade support from 2.x to 3.x
> > - Cost based fair call queue
> >
> > The RC1 artifacts are at:
> > http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
> >
> > RC tag is release-2.10.0-RC1.
> >
> > The maven artifacts are hosted here:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1243/
> >
> > My public key is available here:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.
> >
> > Thanks,
> > Jonathan Hung
> >
> > [1]
> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.a
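For anyone repeating the "verified all checksums" step from the list above, here is a minimal Python sketch. It assumes the published digest is a plain SHA-512 hex string, which the thread does not state; the helper names are made up for illustration. Signature verification is a separate step that still needs gpg and the KEYS file.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    # Assumption: expected_hex holds only hex digits plus optional whitespace.
    normalized = "".join(expected_hex.split()).lower()
    return sha512_of(path) == normalized
```

If the digest file uses another layout (for example a "SHA512 (file) = ..." line), the hex part would have to be extracted before calling `checksum_matches`.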
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
Hi Eric, thanks for trying it out. We talked about this in today's YARN community sync up, summarizing here for everyone else: I don't think it's worth delaying the 2.10.0 release further, we can address this in a subsequent 2.10.x release. Wangda mentioned it might be related to changes in dominant resource calculator, but root cause remains to be seen. Jonathan Hung On Wed, Oct 23, 2019 at 9:02 AM epa...@apache.org wrote: > Hi Jonathan, > > Thanks very much for all of your work on this release. > > I have a concern about cross-queue (inter-queue) preemption in 2.10. > > In 2.8, on a 6 node pseudo-cluster, preempting from one queue to meet the > needs of another queue seems to work as expected. However, 2.10 in the same > pseudo-cluster (with the same config properties), only one container was > preempted for the AM and then nothing else. > > I don't know how the community feels about holding up the 2.10.0 release > for this issue, but we need to get to the bottom of this before we can go > to 2.10.x. I am still investigating. > > Thanks, > -Eric > > > > > On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung < > jyhung2...@gmail.com> wrote: > > Hi folks, > > > > This is the second release candidate for the first release of Apache > Hadoop > > 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes > > features such as: > > > > - User-defined resource types > > - Native GPU support as a schedulable resource type > > - Consistent reads from standby node > > - Namenode port based selective encryption > > - Improvements related to rolling upgrade support from 2.x to 3.x > > - Cost based fair call queue > > > > The RC1 artifacts are at: > http://home.apache.org/~jhung/hadoop-2.10.0-RC1/ > > > > RC tag is release-2.10.0-RC1. 
> > > > The maven artifacts are hosted here: > > https://repository.apache.org/content/repositories/orgapachehadoop-1243/ > > > > My public key is available here: > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS > > > > The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm > PDT. > > > > Thanks, > > Jonathan Hung >
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)
Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and 2.10.0 clients and datanodes. Everything worked as expected. Jonathan Hung On Tue, Oct 22, 2019 at 3:04 PM Eric Badger wrote: > Hi Jonathan, > > Thanks for putting this RC together. You stated that there are > improvements related to rolling upgrades from 2.x to 3.x and I know I have > seen multiple JIRAs getting committed to that effect. Could you describe > any tests that you have done to verify rolling upgrade compatibility > for 3.x servers talking to 2.x clients and vice versa? > > Thanks, > > Eric > > On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung > wrote: > >> Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar >> (HDFS-14667). Since this is the first of a minor release, we would like to >> get it into 2.10.0. >> >> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1 >> shortly. >> >> Jonathan Hung >> >> >> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang wrote: >> >> > Thanks for the effort, Jonathan! >> > >> > +1 (non-binding) on RC0. >> > - Set up a single node cluster with the binary tarball >> > - Run Spark Pi and pySpark job >> > >> > BR, >> > Zhankun >> > >> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko > > >> > wrote: >> > >> >> +1 on RC0. >> >> - Verified signatures >> >> - Built from sources >> >> - Ran unit tests for new features >> >> - Checked artifacts on Nexus, made sure the sources are present. >> >> >> >> Thanks >> >> --Konstantin >> >> >> >> >> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung >> >> wrote: >> >> >> >> > Hi folks, >> >> > >> >> > This is the first release candidate for the first release of Apache >> >> Hadoop >> >> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. 
It >> includes >> >> > features such as: >> >> > >> >> > - User-defined resource types >> >> > - Native GPU support as a schedulable resource type >> >> > - Consistent reads from standby node >> >> > - Namenode port based selective encryption >> >> > - Improvements related to rolling upgrade support from 2.x to 3.x >> >> > >> >> > The RC0 artifacts are at: >> >> http://home.apache.org/~jhung/hadoop-2.10.0-RC0/ >> >> > >> >> > RC tag is release-2.10.0-RC0. >> >> > >> >> > The maven artifacts are hosted here: >> >> > >> >> >> https://repository.apache.org/content/repositories/orgapachehadoop-1241/ >> >> > >> >> > My public key is available here: >> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS >> >> > >> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at >> 6:00 pm >> >> > PDT. >> >> > >> >> > Thanks, >> >> > Jonathan Hung >> >> > >> >> > [1] >> >> > >> >> > >> >> >> https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0) >> >> > >> >> >> > >> >
[VOTE] Release Apache Hadoop 2.10.0 (RC1)
Hi folks,

This is the second release candidate for the first release of the Apache Hadoop 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes features such as:

- User-defined resource types
- Native GPU support as a schedulable resource type
- Consistent reads from standby node
- Namenode port based selective encryption
- Improvements related to rolling upgrade support from 2.x to 3.x
- Cost based fair call queue

The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/

RC tag is release-2.10.0-RC1.

The maven artifacts are hosted here:
https://repository.apache.org/content/repositories/orgapachehadoop-1243/

My public key is available here:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.

Thanks,
Jonathan Hung

[1] https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
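The voting window can be sanity-checked with a little date arithmetic. This is a minimal sketch, assuming "5 weekdays" means five Monday-to-Friday days counted after the day the vote opened, and ignoring the time of day and holidays. The helper name is made up for illustration; the opening dates come from the "On Tue, Oct 22, 2019" and "On Wed, Oct 16, 2019" headers quoted elsewhere in this thread.

```python
from datetime import date, timedelta


def add_weekdays(start, weekdays):
    """Advance `start` by the given number of Monday-to-Friday days."""
    current = start
    remaining = weekdays
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


# RC1 was announced on Tuesday, October 22, 2019.
rc1_close = add_weekdays(date(2019, 10, 22), 5)
print(rc1_close.strftime("%A, %B %d"))  # prints "Tuesday, October 29"
```

The same helper applied to the RC0 announcement date (October 16) gives Wednesday, October 23, which matches the RC0 vote email.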
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)
Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar (HDFS-14667). Since this is the first of a minor release, we would like to get it into 2.10.0. HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1 shortly. Jonathan Hung On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang wrote: > Thanks for the effort, Jonathan! > > +1 (non-binding) on RC0. > - Set up a single node cluster with the binary tarball > - Run Spark Pi and pySpark job > > BR, > Zhankun > > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko > wrote: > >> +1 on RC0. >> - Verified signatures >> - Built from sources >> - Ran unit tests for new features >> - Checked artifacts on Nexus, made sure the sources are present. >> >> Thanks >> --Konstantin >> >> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung >> wrote: >> >> > Hi folks, >> > >> > This is the first release candidate for the first release of Apache >> Hadoop >> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It includes >> > features such as: >> > >> > - User-defined resource types >> > - Native GPU support as a schedulable resource type >> > - Consistent reads from standby node >> > - Namenode port based selective encryption >> > - Improvements related to rolling upgrade support from 2.x to 3.x >> > >> > The RC0 artifacts are at: >> http://home.apache.org/~jhung/hadoop-2.10.0-RC0/ >> > >> > RC tag is release-2.10.0-RC0. >> > >> > The maven artifacts are hosted here: >> > >> https://repository.apache.org/content/repositories/orgapachehadoop-1241/ >> > >> > My public key is available here: >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS >> > >> > The vote will run for 5 weekdays, until Wednesday, October 23 at 6:00 pm >> > PDT. 
>> > >> > Thanks, >> > Jonathan Hung >> > >> > [1] >> > >> > >> https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0) >> > >> >
[VOTE] Release Apache Hadoop 2.10.0 (RC0)
Hi folks,

This is the first release candidate for the first release of the Apache Hadoop 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It includes features such as:

- User-defined resource types
- Native GPU support as a schedulable resource type
- Consistent reads from standby node
- Namenode port based selective encryption
- Improvements related to rolling upgrade support from 2.x to 3.x

The RC0 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC0/

RC tag is release-2.10.0-RC0.

The maven artifacts are hosted here:
https://repository.apache.org/content/repositories/orgapachehadoop-1241/

My public key is available here:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

The vote will run for 5 weekdays, until Wednesday, October 23 at 6:00 pm PDT.

Thanks,
Jonathan Hung

[1] https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
Re: [DISCUSS] Hadoop 2.10.0 release plan
I've moved all jiras with target version 2.10.0 to 2.10.1. Also I've created branch-2.10 and branch-2.10.0, please commit any 2.10.x bug fixes to branch-2.10. I'll send out a vote thread for 2.10.0-RC0 shortly. Jonathan Hung On Fri, Oct 11, 2019 at 10:32 AM Jonathan Hung wrote: > Edit: seems a 2.10.0 blocker was reopened (HDFS-14305). I'll continue > watching this jira and start the release once this is resolved. > > Jonathan Hung > > > On Thu, Oct 10, 2019 at 5:13 PM Jonathan Hung > wrote: > >> Hi folks, as of now all 2.10.0 blockers have been resolved [1]. So I'll >> start the release process soon (cutting branches, updating target versions, >> etc). >> >> [1] https://issues.apache.org/jira/issues/?filter=12346975 >> >> Jonathan Hung >> >> >> On Mon, Aug 26, 2019 at 10:19 AM Jonathan Hung >> wrote: >> >>> Hi folks, >>> >>> As discussed previously (e.g. [1], [2]) we'd like to do a 2.10.0 release >>> soon. Some features/big-items we're targeting for this release: >>> >>>- YARN resource types/GPU support (YARN-8200 >>><https://issues.apache.org/jira/browse/YARN-8200>) >>>- Selective wire encryption (HDFS-13541 >>><https://issues.apache.org/jira/browse/HDFS-13541>) >>>- Rolling upgrade support from 2.x to 3.x (e.g. HDFS-14509 >>><https://issues.apache.org/jira/browse/HDFS-14509>) >>> >>> Per [3] sounds like there's concern around upgrading dependencies as >>> well. >>> >>> We created a public jira filter here ( >>> https://issues.apache.org/jira/issues/?filter=12346975) marking all >>> blockers for 2.10.0 release. If you have other jiras that should be 2.10.0 >>> blockers, please mark "Target Version/s" as "2.10.0" and add label >>> "release-blocker" so we can track it through this filter. >>> >>> We're targeting a release at end of September. >>> >>> Please share any thoughts you have about this. Thanks! 
>>> >>> [1] >>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29461.html >>> [2] >>> https://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg21293.html >>> [3] >>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg33440.html >>> >>> >>> Jonathan Hung >>> >>
Re: [DISCUSS] Hadoop 2.10.0 release plan
Edit: seems a 2.10.0 blocker was reopened (HDFS-14305). I'll continue watching this jira and start the release once this is resolved. Jonathan Hung On Thu, Oct 10, 2019 at 5:13 PM Jonathan Hung wrote: > Hi folks, as of now all 2.10.0 blockers have been resolved [1]. So I'll > start the release process soon (cutting branches, updating target versions, > etc). > > [1] https://issues.apache.org/jira/issues/?filter=12346975 > > Jonathan Hung > > > On Mon, Aug 26, 2019 at 10:19 AM Jonathan Hung > wrote: > >> Hi folks, >> >> As discussed previously (e.g. [1], [2]) we'd like to do a 2.10.0 release >> soon. Some features/big-items we're targeting for this release: >> >>- YARN resource types/GPU support (YARN-8200 >><https://issues.apache.org/jira/browse/YARN-8200>) >>- Selective wire encryption (HDFS-13541 >><https://issues.apache.org/jira/browse/HDFS-13541>) >>- Rolling upgrade support from 2.x to 3.x (e.g. HDFS-14509 >><https://issues.apache.org/jira/browse/HDFS-14509>) >> >> Per [3] sounds like there's concern around upgrading dependencies as well. >> >> We created a public jira filter here ( >> https://issues.apache.org/jira/issues/?filter=12346975) marking all >> blockers for 2.10.0 release. If you have other jiras that should be 2.10.0 >> blockers, please mark "Target Version/s" as "2.10.0" and add label >> "release-blocker" so we can track it through this filter. >> >> We're targeting a release at end of September. >> >> Please share any thoughts you have about this. Thanks! >> >> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29461.html >> [2] >> https://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg21293.html >> [3] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg33440.html >> >> >> Jonathan Hung >> >
Re: [DISCUSS] Hadoop 2.10.0 release plan
Hi folks, as of now all 2.10.0 blockers have been resolved [1]. So I'll start the release process soon (cutting branches, updating target versions, etc). [1] https://issues.apache.org/jira/issues/?filter=12346975 Jonathan Hung On Mon, Aug 26, 2019 at 10:19 AM Jonathan Hung wrote: > Hi folks, > > As discussed previously (e.g. [1], [2]) we'd like to do a 2.10.0 release > soon. Some features/big-items we're targeting for this release: > >- YARN resource types/GPU support (YARN-8200 ><https://issues.apache.org/jira/browse/YARN-8200>) >- Selective wire encryption (HDFS-13541 ><https://issues.apache.org/jira/browse/HDFS-13541>) >- Rolling upgrade support from 2.x to 3.x (e.g. HDFS-14509 ><https://issues.apache.org/jira/browse/HDFS-14509>) > > Per [3] sounds like there's concern around upgrading dependencies as well. > > We created a public jira filter here ( > https://issues.apache.org/jira/issues/?filter=12346975) marking all > blockers for 2.10.0 release. If you have other jiras that should be 2.10.0 > blockers, please mark "Target Version/s" as "2.10.0" and add label > "release-blocker" so we can track it through this filter. > > We're targeting a release at end of September. > > Please share any thoughts you have about this. Thanks! > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29461.html > [2] > https://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg21293.html > [3] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg33440.html > > > Jonathan Hung >
Re: Incompatible changes between branch-2.8 and branch-2.9
- I've created YARN-9855 and uploaded patches to fix YARN-6616 in branch-2.8 and branch-2.7. - For YARN-6050, not sure either. Robert/Wangda, can you comment on YARN-6050 compatibility? - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be incompatible with this strategy? It should be OK to remove/add optional fields (removing the field with id 12, and adding the field with id 13). The difficulties I see here are, we would have to leave id 12 blank in 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5 would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade would still work IIUC. Jonathan Hung On Tue, Sep 24, 2019 at 2:52 PM Eric Badger wrote: > * For YARN-6616, for branch-2.8 and below, it was only committed to > 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can > revert YARN-6616 from branch-2.7 and branch-2.8. > - This seems reasonable. Since we haven't released anything, it should > be no issue to change the 2.7/2.8 protobuf field to have the same value as > 2.9+ > > * For YARN-6050, there's a bit here: > https://developers.google.com/protocol-buffers/docs/proto that says > "optional is compatible with repeated", so I think we should be OK there. > - Optional is compatible with repeatable over the wire such that > protobuf won't blow up, but does that actually mean that it's compatible in > this case? If it's expecting an optional and gets a repeated, it's going to > drop everything except for the last value. I don't know enough about > YARN-6050 to say if this will be ok or not. > > * For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5 > to a 2.9+ version will be an issue. One option could be to move the > intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then > users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to > release this), then upgrade from 2.8.6 to 2.9+. 
> - I'm ok with this, but it should be noted that the upgrade from > 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling > upgrade. So this would cause some pain to anybody with clusters on those > versions. > > Eric > > On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung > wrote: > >> Sorry, let me edit my first point. We can just create addendums for >> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the >> correct id 28. We don’t need to revert YARN-6616 from these branches >> completely. >> >> Jonathan >> >> >> From: Jonathan Hung >> Sent: Tuesday, September 24, 2019 11:38 AM >> To: Eric Badger >> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev >> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9 >> >> Hi Eric, thanks for the investigation. >> >> * For YARN-6616, for branch-2.8 and below, it was only committed to >> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can >> revert YARN-6616 from branch-2.7 and branch-2.8. >> * For YARN-6050, there's a bit here: >> https://developers.google.com/protocol-buffers/docs/proto that says >> "optional is compatible with repeated", so I think we should be OK there. >> * For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or >> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the >> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then >> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to >> release this), then upgrade from 2.8.6 to 2.9+. >> >> Jonathan Hung >> >> >> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger >> >> wrote: >> We (Verizon Media) are currently moving towards upgrading our clusters >> from >> our internal fork of branch-2.8 to an internal fork of branch-2. During >> this process, we have found multiple incompatible changes in protobufs >> between branch-2.8 and branch-2. These incompatibilities were all >> introduced between branch-2.8 and branch-2.9. 
I did a git diff over all >> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of >> incompatibilities from 3 separate commits. All of the incompatibilities >> are >> in yarn_protos.proto >> >> >> I would like to discuss how to fix these incompatible changes. Otherwise, >> rolling upgrades will not be supported between branch-2.8 (or below) and >> branch-2.9 (or beyond). We could revert the incompatible changes, but then >> the new releases would be incompatible with the releases that have these &
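On the optional-versus-repeated question for YARN-6050: the protobuf language guide linked in this thread says that a reader expecting an optional field takes the last value when the field is a primitive type and merges all elements when it is a message type. Since am_container_resource_request is a message type, the merge rule would seem to be the relevant one. The sketch below only mimics that rule with plain Python values; it does not use the protobuf library, the dict merge is a shallow approximation (real merging recurses into nested messages and concatenates repeated subfields), and the field names in the usage lines are hypothetical.

```python
def read_as_optional(values, is_message):
    """Mimic how a reader expecting a singular field treats repeated input.

    `values` is the list a newer writer put into the repeated field.
    """
    if not values:
        return None  # field absent on the wire
    if not is_message:
        return values[-1]  # primitive type: the last value wins
    merged = {}
    for value in values:
        # Message type: later occurrences are merged over earlier ones.
        merged.update(value)
    return merged


# Hypothetical field names, for illustration only.
requests = [
    {"priority": 1, "resource_name": "*"},
    {"priority": 2},
]
print(read_as_optional(requests, is_message=True))
print(read_as_optional([10, 20, 30], is_message=False))
```

Either way the older reader does not fail to parse, but it sees a single value, so whether this counts as compatible depends on how YARN-6050 uses the extra requests.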
Re: Incompatible changes between branch-2.8 and branch-2.9
Sorry, let me edit my first point. We can just create addendums for YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the correct id 28. We don’t need to revert YARN-6616 from these branches completely.

Jonathan

From: Jonathan Hung
Sent: Tuesday, September 24, 2019 11:38 AM
To: Eric Badger
Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
Subject: Re: Incompatible changes between branch-2.8 and branch-2.9

Hi Eric, thanks for the investigation.

* For YARN-6616, for branch-2.8 and below, it was only committed to 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can revert YARN-6616 from branch-2.7 and branch-2.8.
* For YARN-6050, there's a bit here: https://developers.google.com/protocol-buffers/docs/proto that says "optional is compatible with repeated", so I think we should be OK there.
* For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5 to a 2.9+ version will be an issue. One option could be to move the intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to release this), then upgrade from 2.8.6 to 2.9+.

Jonathan Hung

On Tue, Sep 24, 2019 at 9:23 AM Eric Badger wrote:

We (Verizon Media) are currently moving towards upgrading our clusters from our internal fork of branch-2.8 to an internal fork of branch-2. During this process, we have found multiple incompatible changes in protobufs between branch-2.8 and branch-2. These incompatibilities were all introduced between branch-2.8 and branch-2.9. I did a git diff over all .proto files across the branch-2.8 and branch-2.9 and found 3 instances of incompatibilities from 3 separate commits. All of the incompatibilities are in yarn_protos.proto

I would like to discuss how to fix these incompatible changes. Otherwise, rolling upgrades will not be supported between branch-2.8 (or below) and branch-2.9 (or beyond). We could revert the incompatible changes, but then the new releases would be incompatible with the releases that have these incompatible changes. If we do nothing, then rolling upgrades won't work between 2.8- and 2.9+.

Thanks,

Eric

---

git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')

https://issues.apache.org/jira/browse/YARN-6616
- Trunk patch (applied through branch-2.9) differs from branch-2.8 patch

@@ -211,7 +245,20 @@ message ApplicationReportProto {
   optional PriorityProto priority = 23;
   optional string appNodeLabelExpression = 24;
   optional string amNodeLabelExpression = 25;
-  optional int64 submitTime = 26;
+  repeated AppTimeoutsMapProto appTimeouts = 26;
+  optional int64 launchTime = 27;
+  optional int64 submitTime = 28;

https://issues.apache.org/jira/browse/YARN-6050
- Trunk and branch-2 patches both change the protobuf type in the same way.

@@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
   optional LogAggregationContextProto log_aggregation_context = 14;
   optional ReservationIdProto reservation_id = 15;
   optional string node_label_expression = 16;
-  optional ResourceRequestProto am_container_resource_request = 17;
+  repeated ResourceRequestProto am_container_resource_request = 17;
+  repeated ApplicationTimeoutMapProto application_timeouts = 18;

https://issues.apache.org/jira/browse/YARN-7813
- Trunk (applied through branch-3.1) and branch-3.0 (applied through branch-2.9) patches differ from branch-2.8 patch

@@ -425,7 +501,21 @@ message QueueInfoProto {
   optional string defaultNodeLabelExpression = 9;
   optional QueueStatisticsProto queueStatistics = 10;
   optional bool preemptionDisabled = 11;
-  optional bool intraQueuePreemptionDisabled = 12;
+  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
+  optional bool intraQueuePreemptionDisabled = 13;
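For readers less familiar with the protobuf wire format, the YARN-6616 clash above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helpers below are a hand-written subset of the wire format (varint fields only), not the protobuf library, and the names BRANCH_2_8, BRANCH_2_9, check and the timestamp value are made up for the example. The field numbers and types are the ones in the diff. What a real parser does once it sees the mismatch is not modeled here.

```python
# Hand-written subset of the protobuf wire format, for illustration only.
VARINT, LENGTH_DELIMITED = 0, 2  # protobuf wire types


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos):
    """Decode a varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_int_field(field_number, value):
    """Encode an int64-style field: key = (number << 3) | wire type."""
    return encode_varint((field_number << 3) | VARINT) + encode_varint(value)


def read_fields(data):
    """Return (field_number, wire_type, value); assumes varint fields only."""
    fields, pos = [], 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        value, pos = decode_varint(data, pos)
        fields.append((key >> 3, key & 0x7, value))
    return fields


# Field number -> (name, expected wire type), copied from the diff above.
BRANCH_2_8 = {26: ("submitTime", VARINT)}
BRANCH_2_9 = {
    26: ("appTimeouts", LENGTH_DELIMITED),
    27: ("launchTime", VARINT),
    28: ("submitTime", VARINT),
}


def check(data, schema):
    """Describe how a reader using `schema` would classify each field."""
    report = []
    for number, wire_type, _ in read_fields(data):
        if number not in schema:
            report.append(f"field {number}: unknown")
        else:
            name, expected = schema[number]
            status = "ok" if wire_type == expected else "wire type mismatch"
            report.append(f"field {number} ({name}): {status}")
    return report


# A branch-2.8 writer puts submitTime in field 26 (hypothetical timestamp).
from_2_8 = encode_int_field(26, 1569340980000)
# A branch-2.9 writer puts submitTime in field 28.
from_2_9 = encode_int_field(28, 1569340980000)

print(check(from_2_8, BRANCH_2_9))
print(check(from_2_9, BRANCH_2_8))
```

In both directions the reader never sees the value under the name submitTime: the 2.9 reader finds a varint where it expects appTimeouts, and the 2.8 reader finds a field number it does not know. That is the practical meaning of "incompatible" in this thread.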
Re: Incompatible changes between branch-2.8 and branch-2.9
Hi Eric, thanks for the investigation.

- For YARN-6616, for branch-2.8 and below, it was only committed to 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can revert YARN-6616 from branch-2.7 and branch-2.8.
- For YARN-6050, there's a bit here: https://developers.google.com/protocol-buffers/docs/proto that says "optional is compatible with repeated", so I think we should be OK there.
- For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5 to a 2.9+ version will be an issue. One option could be to move the intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to release this), then upgrade from 2.8.6 to 2.9+.

Jonathan Hung

On Tue, Sep 24, 2019 at 9:23 AM Eric Badger wrote:

> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
> Thanks,
>
> Eric
>
> ---
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
> https://issues.apache.org/jira/browse/YARN-6616
> - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>    optional PriorityProto priority = 23;
>    optional string appNodeLabelExpression = 24;
>    optional string amNodeLabelExpression = 25;
> -  optional int64 submitTime = 26;
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
> +  optional int64 launchTime = 27;
> +  optional int64 submitTime = 28;
>
> https://issues.apache.org/jira/browse/YARN-6050
> - Trunk and branch-2 patches both change the protobuf type in the same way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>    optional LogAggregationContextProto log_aggregation_context = 14;
>    optional ReservationIdProto reservation_id = 15;
>    optional string node_label_expression = 16;
> -  optional ResourceRequestProto am_container_resource_request = 17;
> +  repeated ResourceRequestProto am_container_resource_request = 17;
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
> https://issues.apache.org/jira/browse/YARN-7813
> - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>   branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>    optional string defaultNodeLabelExpression = 9;
>    optional QueueStatisticsProto queueStatistics = 10;
>    optional bool preemptionDisabled = 11;
> -  optional bool intraQueuePreemptionDisabled = 12;
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
> +  optional bool intraQueuePreemptionDisabled = 13;
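The YARN-6050 point ("optional is compatible with repeated") can also be seen at the byte level. The sketch below is an assumption-laden illustration, not the protobuf library: it hand-encodes length-delimited fields only, and the payloads b"req-1" and b"req-2" are placeholders standing in for serialized ResourceRequestProto messages. It shows that an optional message field and a one-element repeated field with the same number (17 here, from the diff) produce identical bytes, and that a repeated field is just the same key written several times.

```python
# Hand-written subset of the protobuf wire format, for illustration only.
LENGTH_DELIMITED = 2  # wire type used for embedded messages


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos):
    """Decode a varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_message_field(field_number, payload):
    """Encode one embedded-message occurrence: key, length, payload."""
    key = (field_number << 3) | LENGTH_DELIMITED
    return encode_varint(key) + encode_varint(len(payload)) + payload


def collect(data, field_number):
    """Return every payload stored under field_number, in wire order.

    Assumes the buffer holds length-delimited fields only.
    """
    payloads, pos = [], 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        length, pos = decode_varint(data, pos)
        if key >> 3 == field_number:
            payloads.append(data[pos:pos + length])
        pos += length
    return payloads


# branch-2.8 writer: optional am_container_resource_request = 17
optional_bytes = encode_message_field(17, b"req-1")

# branch-2.9 writer: repeated am_container_resource_request = 17
repeated_one = encode_message_field(17, b"req-1")
repeated_two = repeated_one + encode_message_field(17, b"req-2")

print(optional_bytes == repeated_one)
print(collect(repeated_two, 17))
```

A reader that declares the field as repeated sees each occurrence as a list element. According to the protobuf language guide linked above, a reader that declares it optional and receives several occurrences of a message-typed field merges them, so the bytes still parse in both directions; whether the merged result is what the application wants is a separate question that this sketch does not address.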
Re: [VOTE] Merge YARN-8200 to branch-2 and branch-3.0
Thanks all, +1 from me too. There's three binding +1, two non-binding +1, and no -1 so I'll merge YARN-8200 to branch-2 shortly. I'll skip branch-3.0 since it's EOL as others have mentioned. Jonathan Hung On Tue, Aug 27, 2019 at 11:49 AM Konstantin Shvachko wrote: > +1 for the merge. > > We probably should not bother with branch-3.0 merge since it's been voted > EOL. > > Thanks, > --Konstantin > > On Thu, Aug 22, 2019 at 4:43 PM Jonathan Hung > wrote: > >> Hi folks, >> >> As per [1], starting a vote to merge YARN-8200 (and YARN-8200.branch3) >> feature branch to branch-2 (and branch-3.0). >> >> Vote runs for 7 days, to Thursday, Aug 29 5:00PM PDT. >> >> Thanks. >> >> [1] >> >> http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201908.mbox/%3cCAHzWLgcX7f5Tr3q=csrqgysvpdf7mh-iu17femgx89dhr+1...@mail.gmail.com%3e >> >> Jonathan Hung >> >
[DISCUSS] Hadoop 2.10.0 release plan
Hi folks, As discussed previously (e.g. [1], [2]) we'd like to do a 2.10.0 release soon. Some features/big-items we're targeting for this release: - YARN resource types/GPU support (YARN-8200 <https://issues.apache.org/jira/browse/YARN-8200>) - Selective wire encryption (HDFS-13541 <https://issues.apache.org/jira/browse/HDFS-13541>) - Rolling upgrade support from 2.x to 3.x (e.g. HDFS-14509 <https://issues.apache.org/jira/browse/HDFS-14509>) Per [3] sounds like there's concern around upgrading dependencies as well. We created a public jira filter here ( https://issues.apache.org/jira/issues/?filter=12346975) marking all blockers for 2.10.0 release. If you have other jiras that should be 2.10.0 blockers, please mark "Target Version/s" as "2.10.0" and add label "release-blocker" so we can track it through this filter. We're targeting a release at end of September. Please share any thoughts you have about this. Thanks! [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29461.html [2] https://www.mail-archive.com/mapreduce-dev@hadoop.apache.org/msg21293.html [3] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg33440.html Jonathan Hung
[VOTE] Merge YARN-8200 to branch-2 and branch-3.0
Hi folks, As per [1], starting a vote to merge YARN-8200 (and YARN-8200.branch3) feature branch to branch-2 (and branch-3.0). Vote runs for 7 days, to Thursday, Aug 29 5:00PM PDT. Thanks. [1] http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201908.mbox/%3cCAHzWLgcX7f5Tr3q=csrqgysvpdf7mh-iu17femgx89dhr+1...@mail.gmail.com%3e Jonathan Hung
Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2
Reviving this thread: we tested YARN RU starting with a cluster running 2.7.4, to running branch-2 + YARN-8200. Ran some simple MR/Spark jobs concurrently with the RM/NM upgrades and did not see any issues. If no other concerns I'll continue with a vote. Jonathan Hung On Thu, Apr 18, 2019 at 5:12 PM Jonathan Hung wrote: > Sorry for the delay, had to deprioritize this. Hoping to get to this next > week. > > Jonathan > > -- > *From:* Jim Brennan > *Sent:* Thursday, April 18, 2019 7:28 AM > *To:* Jonathan Hung > *Cc:* yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org > *Subject:* Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2 > > Hi Jonathan, > > Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an >> issue, but we’ll try it out and report back. > > > Any update on this? > Jim > > > On Wed, Apr 3, 2019 at 2:16 AM Jonathan Hung wrote: > >> Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an >> issue, but we’ll try it out and report back. >> >> Jonathan >> >> ------ >> *From:* Jim Brennan >> *Sent:* Tuesday, April 2, 2019 9:17 AM >> *To:* Jonathan Hung >> *Cc:* yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org >> *Subject:* Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2 >> >> Thanks for working on this! >> One concern for us is support for a rolling upgrade. If we are running a >> cluster based on branch-2.8, will we be able to do a rolling upgrade (no >> cluster down-time) to a branch containing these changes? Have you tested >> rolling upgrades? >> >> Thanks. >> Jim >> >> On Fri, Mar 29, 2019 at 2:14 PM Jonathan Hung >> wrote: >> >>> Hello devs, >>> >>> Starting a discuss thread to merge resource types/native GPU scheduling >>> support to branch-3.0 and branch-2. The resource types work was done in >>> trunk~branch-3.0 and GPU support done in trunk~branch-3.1, so the >>> proposal >>> is to merge GPU support into branch-3.0 and both resource types/GPU >>> support >>> to branch-2. 
>>> >>> Internally we've been running resource types/GPU support off a fork of >>> branch-2.9.0 in a > 300 node GPU cluster for a few months which has >>> worked >>> well. Also for completeness we verified that everything going into >>> branch-2 >>> also exists in branch-3.0. >>> >>> The specific list of patches to merge is in feature branch >>> YARN-8200.branch3 (for branch-3.0) and feature branch YARN-8200 (for >>> branch-2). Full patches containing the YARN-8200.branch3 -> branch-3.0 >>> diff >>> and YARN-8200 -> branch-2 diff have been posted to YARN-8200 jira. >>> >>> If there's no issues from the community I'll start a merge vote next >>> week. >>> Thanks. >>> >>> Jonathan Hung >>> >>
Re: [VOTE] Mark 2.6, 2.7, 3.0 release lines EOL
+1. Thanks! Jonathan Hung On Tue, Aug 20, 2019 at 8:03 PM Wangda Tan wrote: > Hi all, > > This is a vote thread to mark any versions smaller than 2.7 (inclusive), > and 3.0 EOL. This is based on discussions of [1] > > This discussion runs for 7 days and will conclude on Aug 28 Wed. > > Please feel free to share your thoughts. > > Thanks, > Wangda > > [1] > > http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201908.mbox/%3cCAD++eC=ou-tit1faob-dbecqe6ht7ede7t1dyra2p1yinpe...@mail.gmail.com%3e > , >
Re: [DISCUSS] Hadoop 2019 Release Planning
Hi Wangda, Thanks for starting the discussion. We would also like to release 2.10.0 which was discussed previously <https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html> and at various contributor meetups. I'm interested in being release manager for that. Thanks, Jonathan Hung On Fri, Aug 9, 2019 at 7:59 PM Wangda Tan wrote: > Hi all, > > Hope this email finds you well > > I want to hear your thoughts about what should be the release plan for > 2019. > > In 2018, we released: > - 1 maintenance release of 2.6 > - 3 maintenance releases of 2.7 > - 3 maintenance releases of 2.8 > - 3 releases of 2.9 > - 4 releases of 3.0 > - 2 releases of 3.1 > > Total 16 releases in 2018. > > In 2019, by far we only have two releases: > - 1 maintenance release of 3.1 > - 1 minor release of 3.2. > > However, the community put a lot of efforts to stabilize features of > various release branches. > There're: > - 217 fixed patches in 3.1.3 [1] > - 388 fixed patches in 3.2.1 [2] > - 1172 fixed patches in 3.3.0 [3] (OMG!) > > I think it is the time to do maintenance releases of 3.1/3.2 and do a minor > release for 3.3.0. > > In addition, I saw community discussion to do a 2.8.6 release for security > fixes. > > Any other releases? I think there're release plans for Ozone as well. And > please add your thoughts. > > Volunteers welcome! If you have interests to run a release as Release > Manager (or co-Resource Manager), please respond to this email thread so we > can coordinate. > > Thanks, > Wangda Tan > > [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND > fixVersion = 3.1.3 > [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND > fixVersion = 3.2.1 > [3] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND > fixVersion = 3.3.0 >
Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2
Sorry for the delay, had to deprioritize this. Hoping to get to this next week.

Jonathan

From: Jim Brennan
Sent: Thursday, April 18, 2019 7:28 AM
To: Jonathan Hung
Cc: yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

Hi Jonathan,

> Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an
> issue, but we’ll try it out and report back.

Any update on this?
Jim

On Wed, Apr 3, 2019 at 2:16 AM Jonathan Hung <jyhung2...@gmail.com> wrote:

Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an issue, but we’ll try it out and report back.

Jonathan

From: Jim Brennan <james.bren...@verizonmedia.com>
Sent: Tuesday, April 2, 2019 9:17 AM
To: Jonathan Hung
Cc: yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

Thanks for working on this!
One concern for us is support for a rolling upgrade. If we are running a cluster based on branch-2.8, will we be able to do a rolling upgrade (no cluster down-time) to a branch containing these changes? Have you tested rolling upgrades?

Thanks.
Jim

On Fri, Mar 29, 2019 at 2:14 PM Jonathan Hung <jyhung2...@gmail.com> wrote:

Hello devs,

Starting a discuss thread to merge resource types/native GPU scheduling support to branch-3.0 and branch-2. The resource types work was done in trunk~branch-3.0 and GPU support done in trunk~branch-3.1, so the proposal is to merge GPU support into branch-3.0 and both resource types/GPU support to branch-2.

Internally we've been running resource types/GPU support off a fork of branch-2.9.0 in a > 300 node GPU cluster for a few months which has worked well. Also for completeness we verified that everything going into branch-2 also exists in branch-3.0.

The specific list of patches to merge is in feature branch YARN-8200.branch3 (for branch-3.0) and feature branch YARN-8200 (for branch-2). Full patches containing the YARN-8200.branch3 -> branch-3.0 diff and YARN-8200 -> branch-2 diff have been posted to YARN-8200 jira.

If there's no issues from the community I'll start a merge vote next week.
Thanks.

Jonathan Hung
Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2
Hi Jim, we have not tested rolling upgrade. I don’t foresee this being an issue, but we’ll try it out and report back.

Jonathan

From: Jim Brennan
Sent: Tuesday, April 2, 2019 9:17 AM
To: Jonathan Hung
Cc: yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2

Thanks for working on this!
One concern for us is support for a rolling upgrade. If we are running a cluster based on branch-2.8, will we be able to do a rolling upgrade (no cluster down-time) to a branch containing these changes? Have you tested rolling upgrades?

Thanks.
Jim

On Fri, Mar 29, 2019 at 2:14 PM Jonathan Hung <jyhung2...@gmail.com> wrote:

Hello devs,

Starting a discuss thread to merge resource types/native GPU scheduling support to branch-3.0 and branch-2. The resource types work was done in trunk~branch-3.0 and GPU support done in trunk~branch-3.1, so the proposal is to merge GPU support into branch-3.0 and both resource types/GPU support to branch-2.

Internally we've been running resource types/GPU support off a fork of branch-2.9.0 in a > 300 node GPU cluster for a few months which has worked well. Also for completeness we verified that everything going into branch-2 also exists in branch-3.0.

The specific list of patches to merge is in feature branch YARN-8200.branch3 (for branch-3.0) and feature branch YARN-8200 (for branch-2). Full patches containing the YARN-8200.branch3 -> branch-3.0 diff and YARN-8200 -> branch-2 diff have been posted to YARN-8200 jira.

If there's no issues from the community I'll start a merge vote next week.
Thanks.

Jonathan Hung
[DISCUSS] Merging YARN-8200 to branch-3.0 and branch-2
Hello devs, Starting a discuss thread to merge resource types/native GPU scheduling support to branch-3.0 and branch-2. The resource types work was done in trunk~branch-3.0 and GPU support done in trunk~branch-3.1, so the proposal is to merge GPU support into branch-3.0 and both resource types/GPU support to branch-2. Internally we've been running resource types/GPU support off a fork of branch-2.9.0 in a > 300 node GPU cluster for a few months which has worked well. Also for completeness we verified that everything going into branch-2 also exists in branch-3.0. The specific list of patches to merge is in feature branch YARN-8200.branch3 (for branch-3.0) and feature branch YARN-8200 (for branch-2). Full patches containing the YARN-8200.branch3 -> branch-3.0 diff and YARN-8200 -> branch-2 diff have been posted to YARN-8200 jira. If there's no issues from the community I'll start a merge vote next week. Thanks. Jonathan Hung
Re: [VOTE] Moving branch-2 precommit/nightly test builds to java 8
My non-binding +1 to finish. This vote passes with 6 binding +1, 3 non-binding +1, and no vetoes. We will make the changes as part of HADOOP-15711, please follow there. Thanks all! Jonathan Hung On Tue, Feb 5, 2019 at 11:38 PM Akira Ajisaka wrote: > +1 > > -Akira > > On Wed, Feb 6, 2019 at 9:13 AM Wangda Tan wrote: > > > > +1, make sense to me. > > > > On Tue, Feb 5, 2019 at 3:29 PM Konstantin Shvachko > > > wrote: > > > > > +1 Makes sense to me. > > > > > > Thanks, > > > --Konst > > > > > > On Mon, Feb 4, 2019 at 6:14 PM Jonathan Hung > wrote: > > > > > > > Hello, > > > > > > > > Starting a vote based on the discuss thread [1] for moving branch-2 > > > > precommit/nightly test builds to openjdk8. After this change, the > test > > > > phase for precommit builds [2] and branch-2 nightly build [3] will > run on > > > > openjdk8. To maintain source compatibility, these builds will still > run > > > > their compile phase for branch-2 on openjdk7 as they do now (in > addition > > > to > > > > compiling on openjdk8). > > > > > > > > Vote will run for three business days until Thursday Feb 7 6:00PM > PDT. > > > > > > > > [1] > > > > > > > > > > > > https://lists.apache.org/thread.html/7e6fb28fc67560f83a2eb62752df35a8d58d86b2a3df4cacb5d738ca@%3Ccommon-dev.hadoop.apache.org%3E > > > > > > > > [2] > > > > > > > > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HADOOP-Build/ > > > > > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HDFS-Build/ > > > > > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/ > > > > > > > > > > > > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/ > > > > > > > > [3] > > > > > > > > > > > > https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/ > > > > > > > > Jonathan Hung > > > > > > > >
[VOTE] Moving branch-2 precommit/nightly test builds to java 8
Hello, Starting a vote based on the discuss thread [1] for moving branch-2 precommit/nightly test builds to openjdk8. After this change, the test phase for precommit builds [2] and branch-2 nightly build [3] will run on openjdk8. To maintain source compatibility, these builds will still run their compile phase for branch-2 on openjdk7 as they do now (in addition to compiling on openjdk8). Vote will run for three business days until Thursday Feb 7 6:00PM PDT. [1] https://lists.apache.org/thread.html/7e6fb28fc67560f83a2eb62752df35a8d58d86b2a3df4cacb5d738ca@%3Ccommon-dev.hadoop.apache.org%3E [2] https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HADOOP-Build/ https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HDFS-Build/ https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/ https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/ [3] https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/ Jonathan Hung
Re: [VOTE] Propose to start new Hadoop sub project "submarine"
+1. Thanks Wangda. Jonathan Hung On Fri, Feb 1, 2019 at 2:25 PM Dinesh Chitlangia < dchitlan...@hortonworks.com> wrote: > +1 (non binding), thanks Wangda for organizing this. > > Regards, > Dinesh > > > > On 2/1/19, 5:24 PM, "Wangda Tan" wrote: > > Hi all, > > According to positive feedbacks from the thread [1] > > This is vote thread to start a new subproject named "hadoop-submarine" > which follows the release process already established for ozone. > > The vote runs for usual 7 days, which ends at Feb 8th 5 PM PDT. > > Thanks, > Wangda Tan > > [1] > > https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E > > >
Re: [DISCUSS] Making submarine to different release model like Ozone
+1. This is important for improving the deep learning on hadoop story. There's recently a lot of momentum for this, and decoupling submarine/hadoop will help it continue. Jonathan Hung On Thu, Jan 31, 2019 at 11:04 AM Wangda Tan wrote: > Hi devs, > > Since we started submarine-related effort last year, we received a lot of > feedbacks, several companies (such as Netease, China Mobile, etc.) are > trying to deploy Submarine to their Hadoop cluster along with big data > workloads. Linkedin also has big interests to contribute a Submarine TonY ( > https://github.com/linkedin/TonY) runtime to allow users to use the same > interface. > > From what I can see, there're several issues of putting Submarine under > yarn-applications directory and have same release cycle with Hadoop: > > 1) We started 3.2.0 release at Sep 2018, but the release is done at Jan > 2019. Because of non-predictable blockers and security issues, it got > delayed a lot. We need to iterate submarine fast at this point. > > 2) We also see a lot of requirements to use Submarine on older Hadoop > releases such as 2.x. Many companies may not upgrade Hadoop to 3.x in a > short time, but the requirement to run deep learning is urgent to them. We > should decouple Submarine from Hadoop version. > > And why we wanna to keep it within Hadoop? First, Submarine included some > innovation parts such as enhancements of user experiences for YARN > services/containerization support which we can add it back to Hadoop later > to address common requirements. In addition to that, we have a big overlap > in the community developing and using it. > > There're several proposals we have went through during Ozone merge to trunk > discussion: > > https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E > > I propose to adopt Ozone model: which is the same master branch, different > release cycle, and different release branch. 
It is a great example to show > agile release we can do (2 Ozone releases after Oct 2018) with less > overhead to setup CI, projects, etc. > > *Links:* > - JIRA: https://issues.apache.org/jira/browse/YARN-8135 > - Design doc > < > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit > > > - User doc > < > https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html > > > (3.2.0 > release) > - Blogposts, {Submarine} : Running deep learning workloads on Apache Hadoop > < > https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/ > >, > (Chinese Translation: Link <https://www.jishuwen.com/d/2Vpu>) > - Talks: Strata Data Conf NY > < > https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289 > > > > Thoughts? > > Thanks, > Wangda Tan >
Re: [VOTE - 2] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby
+1! Jonathan Hung On Sat, Dec 15, 2018 at 8:26 AM Zhe Zhang wrote: > +1 > > Thanks for addressing concerns from the previous vote. > > On Fri, Dec 14, 2018 at 6:24 PM Konstantin Shvachko > wrote: > > > Hi Hadoop developers, > > > > I would like to propose to merge to trunk the feature branch HDFS-12943 > for > > Consistent Reads from Standby Node. The feature is intended to scale read > > RPC workloads. On large clusters reads comprise 95% of all RPCs to the > > NameNode. We should be able to accommodate higher overall RPC workloads > (up > > to 4x by some estimates) by adding multiple ObserverNodes. > > > > The main functionality has been implemented see sub-tasks of HDFS-12943. > > We followed up with the test plan. Testing was done on two independent > > clusters (see HDFS-14058 and HDFS-14059) with security enabled. > > We ran standard HDFS commands, MR jobs, admin commands including manual > > failover. > > We know of one cluster running this feature in production. > > > > Since the previous vote we addressed Daryn's concern (see HDFS-13873), > > added documentation for the new feature, and fixed a few other jiras. > > > > I attached a unified patch to the umbrella jira for the review. > > Please vote on this thread. The vote will run for 7 days until Wed Dec > 21. > > > > Thanks, > > --Konstantin > > > -- > Zhe Zhang > Apache Hadoop Committer > http://zhe-thoughts.github.io/about/ | @oldcap >
[DISCUSS] Merging YARN-8200 to branch-2
Hi folks, Starting a thread to discuss merging YARN-8200 (resource profiles/GPU support) to branch-2. For resource types, we have ported YARN-4081~YARN-7137 (as part of YARN-3926 umbrella). For GPU support, we have ported the native non-docker GPU support related items in YARN-6223. For both of these, we have also ported miscellaneous fixes for issues we encountered internally. Some potential issues I see are, some of the resource types commits did not make it to branch-3.0. Also most of the GPU-specific commits did not make it to branch-3.0 either. We have deployed these two features internally on top of a branch-2.9 fork on a 100 node GPU cluster which is running deep learning workloads, and it is working well. Before the holidays/after new years we will work on cleaning up the feature branch (YARN-8200), e.g. filing tickets on branch-2 specific bug fixes, rebasing on latest branch-2, syncing any bug fixes in our internal fork which did not make it to the feature branch, etc. Assuming no objections, once it's ready we will start a vote to merge. Thanks, Jonathan Hung
Re: [VOTE] Release Apache Hadoop 3.1.0 (RC0)
Hi Wangda, thanks for handling this release. +1 (non-binding) - verified binary checksum - launched single node RM - verified refreshQueues functionality - verified capacity scheduler conf mutation disabled in this case - verified capacity scheduler conf mutation with leveldb storage - verified refreshQueues mutation is disabled in this case Jonathan Hung On Thu, Mar 22, 2018 at 9:10 AM, Wangda Tan wrote: > Thanks @Bharat for the quick check, the previously staged repository has > some issues. I re-deployed jars to nexus. > > Here's the new repo (1087) > > https://repository.apache.org/content/repositories/orgapachehadoop-1087/ > > Other artifacts remain same, no additional code changes. > > On Wed, Mar 21, 2018 at 11:54 PM, Bharat Viswanadham < > bviswanad...@hortonworks.com> wrote: > > > Hi Wangda, > > Maven Artifact repositories is not having all Hadoop jars. (It is missing > > many like hadoop-hdfs, hadoop-client etc.,) > > https://repository.apache.org/content/repositories/orgapachehadoop-1086/ > > > > > > Thanks, > > Bharat > > > > > > On 3/21/18, 11:44 PM, "Wangda Tan" wrote: > > > > Hi folks, > > > > Thanks to the many who helped with this release since Dec 2017 [1]. > > We've > > created RC0 for Apache Hadoop 3.1.0. The artifacts are available > here: > > > > http://people.apache.org/~wangda/hadoop-3.1.0-RC0/ > > > > The RC tag in git is release-3.1.0-RC0. > > > > The maven artifacts are available via repository.apache.org at > > https://repository.apache.org/content/repositories/ > > orgapachehadoop-1086/ > > > > This vote will run 7 days (5 weekdays), ending on Mar 28 at 11:59 pm > > Pacific. > > > > 3.1.0 contains 727 [2] fixed JIRA issues since 3.0.0. Notable > additions > > include the first class GPU/FPGA support on YARN, Native services, > > Support > > rich placement constraints in YARN, S3-related enhancements, allow > HDFS > > block replicas to be provided by an external storage system, etc. 
> > > > We’d like to use this as a starting release for 3.1.x [1], depending > > on how > > it goes, get it stabilized and potentially use a 3.1.1 in several > > weeks as > > the stable release. > > > > We have done testing with a pseudo cluster and distributed shell job. > > My +1 > > to start. > > > > Best, > > Wangda/Vinod > > > > [1] > > https://lists.apache.org/thread.html/b3fb3b6da8b6357a68513a6dfd104b > > c9e19e559aedc5ebedb4ca08c8@%3Cyarn-dev.hadoop.apache.org%3E > > [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in > > (3.1.0) > > AND fixVersion not in (3.0.0, 3.0.0-beta1) AND status = Resolved > ORDER > > BY > > fixVersion ASC > > > > > > >
Re: [VOTE] Release Apache Hadoop 2.7.5 (RC1)
Thanks Konstantin for working on this. +1 (non-binding) - Downloaded binary and verified md5 - Deployed RM HA and tested failover Jonathan Hung On Wed, Dec 13, 2017 at 11:02 AM, Eric Payne wrote: > Thanks for the hard work on this release, Konstantin. > +1 (binding) > - Built from source > - Verified that refreshing of queues works as expected. > > - Verified can run multiple users in a single queue > - Ran terasort test > - Verified that cross-queue preemption works as expected > Thanks. Eric Payne > > From: Konstantin Shvachko > To: "common-...@hadoop.apache.org" ; " > hdfs-...@hadoop.apache.org" ; " > mapreduce-dev@hadoop.apache.org" ; " > yarn-...@hadoop.apache.org" > Sent: Thursday, December 7, 2017 9:22 PM > Subject: [VOTE] Release Apache Hadoop 2.7.5 (RC1) > > Hi everybody, > > I updated CHANGES.txt and fixed documentation links. > Also committed MAPREDUCE-6165, which fixes a consistently failing test. > > This is RC1 for the next dot release of Apache Hadoop 2.7 line. The > previous one 2.7.4 was release August 4, 2017. > Release 2.7.5 includes critical bug fixes and optimizations. See more > details in Release Note: > http://home.apache.org/~shv/hadoop-2.7.5-RC1/releasenotes.html > > The RC0 is available at: http://home.apache.org/~shv/hadoop-2.7.5-RC1/ > > Please give it a try and vote on this thread. The vote will run for 5 days > ending 12/13/2017. > > My up to date public key is available from: > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS > > Thanks, > --Konstantin > > > >
Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
Thanks Andrew for the huge effort. +1 (non-binding) - Downloaded binary tarball and verified md5 - Ran RM HA and verified manual failover - Verified add/remove/update scheduler configuration API (CLI/REST) works for leveldb/zookeeper backend - Verified scheduler configuration changes persisted on restart/failover - Verified "yarn rmadmin -refreshQueues" works when scheduler configuration API disabled, and does not work when scheduler configuration API enabled Jonathan Hung On Tue, Dec 12, 2017 at 5:44 PM, Junping Du wrote: > Thanks Andrew for pushing new RC for 3.0.0. I was out last week, just get > chance to validate new RC now. > > Basically, I found two critical issues with the same rolling upgrade > scenario as where HADOOP-15059 get found previously: > HDFS-12920, we changed value format for some hdfs configurations that old > version MR client doesn't understand when fetching these configurations. > Some quick workarounds are to add old value (without time unit) in > hdfs-site.xml to override new default values but will generate many > annoying warnings. I provided my fix suggestions on the JIRA already for > more discussion. > The other one is YARN-7646. After we workaround HDFS-12920, will hit the > issue that old version MR AppMaster cannot communicate with new version of > YARN RM - could be related to resource profile changes from YARN side but > root cause are still in investigation. > > The first issue may not belong to a blocker given we can workaround this > without code change. I am not sure if we can workaround 2nd issue so far. > If not, we may have to fix this or compromise with withdrawing support of > rolling upgrade or calling it a stable release. > > > Thanks, > > Junping > > > From: Robert Kanter > Sent: Tuesday, December 12, 2017 3:10 PM > To: Arun Suresh > Cc: Andrew Wang; Lei Xu; Wei-Chiu Chuang; Ajay Kumar; Xiao Chen; Aaron T. 
> Myers; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; > yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org > Subject: Re: [VOTE] Release Apache Hadoop 3.0.0 RC1 > > +1 (binding) > > + Downloaded the binary release > + Deployed on a 3 node cluster on CentOS 7.3 > + Ran some MR jobs, clicked around the UI, etc > + Ran some CLI commands (yarn logs, etc) > > Good job everyone on Hadoop 3! > > > - Robert > > On Tue, Dec 12, 2017 at 1:56 PM, Arun Suresh wrote: > > > +1 (binding) > > > > - Verified signatures of the source tarball. > > - built from source - using the docker build environment. > > - set up a pseudo-distributed test cluster. > > - ran basic HDFS commands > > - ran some basic MR jobs > > > > Cheers > > -Arun > > > > On Tue, Dec 12, 2017 at 1:52 PM, Andrew Wang > > wrote: > > > > > Hi everyone, > > > > > > As a reminder, this vote closes tomorrow at 12:31pm, so please give it > a > > > whack if you have time. There are already enough binding +1s to pass > this > > > vote, but it'd be great to get additional validation. > > > > > > Thanks to everyone who's voted thus far! > > > > > > Best, > > > Andrew > > > > > > > > > > > > On Tue, Dec 12, 2017 at 11:08 AM, Lei Xu wrote: > > > > > > > +1 (binding) > > > > > > > > * Verified src tarball and bin tarball, verified md5 of each. > > > > * Build source with -Pdist,native > > > > * Started a pseudo cluster > > > > * Run ec -listPolicies / -getPolicy / -setPolicy on / , and run hdfs > > > > dfs put/get/cat on "/" with XOR-2-1 policy. > > > > > > > > Thanks Andrew for this great effort! > > > > > > > > Best, > > > > > > > > > > > > On Tue, Dec 12, 2017 at 9:55 AM, Andrew Wang < > andrew.w...@cloudera.com > > > > > > > wrote: > > > > > Hi Wei-Chiu, > > > > > > > > > > The patchprocess directory is left over from the create-release > > > process, > > > > > and it looks empty to me. We should still file a create-release > JIRA > > to > > > > fix > > > > > this, but I think this is not a blocker. 
Would you agree? > > > > > > > > > > Best, > > > > > Andrew > > > > > > > > > > On Tue, Dec 12, 2017 at 9:44 AM, Wei-Chiu Chuang < > > weic...@cloudera.com > > > > > > > > > wrote: > > > > > > > > > >>
Re: [VOTE] Release Apache Hadoop 2.9.0 (RC3)
Thanks Arun/Subru for working on this.

+1 (non-binding)
- Deployed RM HA on two nodes
- Tested manual failover
- Tested configuration mutation API with zk and leveldb backing store (also ensuring configuration updates persisted on failover/restart), with queue addition/removal/update
- Tested that "yarn rmadmin -refreshQueues" is enabled when the configuration mutation API is disabled (and vice-versa)
- Tested queue admin configuration mutation policy

Jonathan Hung

On Mon, Nov 13, 2017 at 4:10 PM, Arun Suresh wrote:

> Hi Folks,
>
> Apache Hadoop 2.9.0 is the first release of the Hadoop 2.9 line and will be the
> starting release for the Apache Hadoop 2.9.x line - it includes 30 New Features
> with 500+ subtasks, 407 Improvements, and 790 Bug fixes, all newly fixed issues
> since 2.8.2.
>
> More information about the 2.9.0 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
>
> New RC is available at:
> https://home.apache.org/~asuresh/hadoop-2.9.0-RC3/
>
> The RC tag in git is: release-2.9.0-RC3, and the latest commit id is:
> 756ebc8394e473ac25feac05fa493f6d612e6c50.
>
> The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1068/
>
> We are carrying over the votes from the previous RC given that the delta is
> the license fix.
>
> Given the above - we are also going to stick with the original deadline for
> the vote: ending on Friday 17th November 2017 2pm PT.
>
> Thanks,
> -Arun/Subru
Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)
Thanks Arun and Subru for working on this! +1 (non-binding) pending YARN-7453. 1) Setup RM HA 2) Verified leveldb/zookeeper scheduler configuration API works via REST/CLI 3) Verified configuration changes persist across restart 4) yarn rmadmin -refreshQueues works when scheduler configuration API disabled (and vice-versa) Jonathan Hung On Tue, Nov 7, 2017 at 2:56 PM, Eric Badger wrote: > +1 (non-binding) pending the issue that Sunil/Rohith pointed out > > - Verified all hashes and checksums > - Built from source on macOS 10.12.6, Java 1.8.0u65 > - Deployed a pseudo cluster > - Ran some example jobs > > Thanks, > > Eric > > On Tue, Nov 7, 2017 at 4:03 PM, Wangda Tan wrote: > >> Sunil / Rohith, >> >> Could you check if your configs are same as Jonathan posted configs? >> https://issues.apache.org/jira/browse/YARN-7453?focusedComme >> ntId=16242693&page=com.atlassian.jira.plugin.system. >> issuetabpanels:comment-tabpanel#comment-16242693 >> >> And could you try if using Jonathan's configs can still reproduce the >> issue? >> >> Thanks, >> Wangda >> >> >> On Tue, Nov 7, 2017 at 1:52 PM, Arun Suresh wrote: >> >> > Thanks for testing Rohith and Sunil >> > >> > Can you please confirm if it is not a config issue at your end ? >> > We (both Jonathan and myself) just tried testing this on a fresh cluster >> > (both automatic and manual) and we are not able to reproduce this. I've >> > updated the YARN-7453 <https://issues.apache.org/jira/browse/YARN-7453> >> > JIRA >> > with details of testing. >> > >> > Cheers >> > -Arun/Subru >> > >> > On Tue, Nov 7, 2017 at 3:17 AM, Rohith Sharma K S < >> > rohithsharm...@apache.org >> > > wrote: >> > >> > > Thanks Sunil for confirmation. Btw, I have raised YARN-7453 >> > > <https://issues.apache.org/jira/browse/YARN-7453> JIRA to track this >> > > issue. >> > > >> > > - Rohith Sharma K S >> > > >> > > On 7 November 2017 at 16:44, Sunil G wrote: >> > > >> > >> Hi Subru and Arun. >> > >> >> > >> Thanks for driving 2.9 release. Great work! 
>> > >> >> > >> I installed cluster built from source. >> > >> - Ran few MR jobs with application priority enabled. Runs fine. >> > >> - Accessed new UI and it also seems fine. >> > >> >> > >> However I am also getting same issue as Rohith reported. >> > >> - Started an HA cluster >> > >> - Pushed RM to standby >> > >> - Pushed back RM to active then seeing an exception. >> > >> >> > >> org.apache.hadoop.ha.ServiceFailedException: RM could not >> transition to >> > >> Active >> > >> at >> > >> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyE >> > >> lectorBasedElectorServic >> > >> e.becomeActive(ActiveStandbyElectorBasedElectorService.java:146) >> > >> at >> > >> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(Activ >> > >> eStandbyElector.java:894 >> > >> ) >> > >> >> > >> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: >> > >> KeeperErrorCode = NoAuth >> > >> at >> > >> org.apache.zookeeper.KeeperException.create(KeeperException. >> java:113) >> > >> at org.apache.zookeeper.ZooKeeper >> .multiInternal(ZooKeeper.java: >> > >> 949) >> > >> >> > >> Will check and post more details, >> > >> >> > >> - Sunil >> > >> >> > >> >> > >> On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S < >> > >> rohithsharm...@apache.org> >> > >> wrote: >> > >> >> > >> > Thanks Subru/Arun for the great work! >> > >> > >> > >> > Downloaded source and built from it. Deployed RM HA non-secured >> > cluster >> > >> > along with new YARN UI and ATSv2. >> > >> > >> > >> > I am facing basic RM HA switch issue after first time successful >> > start. >> > >> > *Can >> > >> > anyone else is facing this issue?* >> > >> > >> > >> >
Re: [VOTE] Merge feature branch YARN-5734 (API based scheduler configuration) to trunk, branch-3.0, branch-2
Thanks for the votes and discussion. It is now past Monday Oct 9 11:00AM PDT so the vote has ended. There were 4 +1 and no -1, so vote passes. This feature will be merged to trunk, branch-3.0, and branch-2 shortly (16 subtasks). Thanks everyone! Jonathan Hung On Mon, Oct 9, 2017 at 9:18 AM, Xuan Gong wrote: > +1 (binding) > > Xuan Gong > > > > > >On Mon, Oct 2, 2017 at 11:09 AM, Jonathan Hung > >wrote: > > > >> Hi all, > >> > >> From discussion at [1], I'd like to start a vote to merge feature branch > >> YARN-5734 to trunk, branch-3.0, and branch-2. Vote will be 7 days, > >>ending > >> Monday Oct 9 at 11:00AM PDT. > >> > >> This branch adds a framework to the scheduler to allow scheduler > >> configuration mutation on the fly, including a REST and CLI interface, > >>and > >> an interface for the scheduler configuration backing store. Currently > >>the > >> capacity scheduler implements this framework. > >> > >> Umbrella is here (YARN-5734 > >> <https://issues.apache.org/jira/browse/YARN-5734>), jenkins build is > >>here > >> ( > >> YARN-7241 <https://issues.apache.org/jira/browse/YARN-7241>). All > >>required > >> tasks for this feature are committed. Since this feature changes RM > >>only, > >> we have tested this on a local RM setup with a suite of configuration > >> changes with no issue so far. > >> > >> Key points: > >> - The feature is turned off by default, and must be explicitly > >>configured > >> to turn on. When turned off, the behavior reverts back to the original > >>file > >> based mechanism for changing scheduler configuration (i.e. yarn rmadmin > >> -refreshQueues). > >> - The framework was designed in a way to be extendable to other > >>schedulers > >> (most notably FairScheduler). > >> - A pluggable ACL policy (YARN-5949 > >> <https://issues.apache.org/jira/browse/YARN-5949>) allows admins > >> fine-grained control for who can change what configurations. > >> - The configuration storage backend is also pluggable. 
Currently an > >> in-memory, leveldb, and zookeeper implementation are supported. > >> > >> There were 15 subtasks completed for this feature. > >> > >> Huge thanks to everyone who helped with reviews, commits, guidance, and > >> technical discussion/design, including Carlo Curino, Xuan Gong, Subru > >> Krishnan, Min Shen, Konstantin Shvachko, Carl Steinbach, Wangda Tan, > >>Vinod > >> Kumar Vavilapalli, Suja Viswesan, Zhe Zhang, Ye Zhou. > >> > >> [1] > >> http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201709.mbox/% > >> 3CCAHzWLgfEAgczjcEOUCg-03ma3ROtO=pkec9dpggyx9rzf3n...@mail.gmail.com%3E > >> > >> Jonathan Hung > >> > >
[VOTE] Merge feature branch YARN-5734 (API based scheduler configuration) to trunk, branch-3.0, branch-2
Hi all,

From discussion at [1], I'd like to start a vote to merge feature branch YARN-5734 to trunk, branch-3.0, and branch-2. Vote will be 7 days, ending Monday Oct 9 at 11:00AM PDT.

This branch adds a framework to the scheduler to allow scheduler configuration mutation on the fly, including a REST and CLI interface, and an interface for the scheduler configuration backing store. Currently the capacity scheduler implements this framework.

Umbrella is here (YARN-5734 <https://issues.apache.org/jira/browse/YARN-5734>), jenkins build is here (YARN-7241 <https://issues.apache.org/jira/browse/YARN-7241>). All required tasks for this feature are committed. Since this feature changes RM only, we have tested this on a local RM setup with a suite of configuration changes with no issue so far.

Key points:
- The feature is turned off by default, and must be explicitly configured to turn on. When turned off, the behavior reverts back to the original file based mechanism for changing scheduler configuration (i.e. yarn rmadmin -refreshQueues).
- The framework was designed in a way to be extendable to other schedulers (most notably FairScheduler).
- A pluggable ACL policy (YARN-5949 <https://issues.apache.org/jira/browse/YARN-5949>) allows admins fine-grained control for who can change what configurations.
- The configuration storage backend is also pluggable. Currently in-memory, leveldb, and zookeeper implementations are supported.

There were 15 subtasks completed for this feature.

Huge thanks to everyone who helped with reviews, commits, guidance, and technical discussion/design, including Carlo Curino, Xuan Gong, Subru Krishnan, Min Shen, Konstantin Shvachko, Carl Steinbach, Wangda Tan, Vinod Kumar Vavilapalli, Suja Viswesan, Zhe Zhang, Ye Zhou.

[1] http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201709.mbox/%3CCAHzWLgfEAgczjcEOUCg-03ma3ROtO=pkec9dpggyx9rzf3n...@mail.gmail.com%3E

Jonathan Hung
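Editor's sketch: the pluggable ACL policy mentioned in the key points can be pictured as a small interface with interchangeable implementations. The Java below is a minimal illustration, not code from the YARN-5734 branch. The names `MutationPolicy`, `adminOnly`, and `queueAdmin` are hypothetical, and the two behaviors are modeled on the description given in the related DISCUSS thread (an admin-only default based on yarn.admin.acl, plus a relaxed policy that lets queue admins change their own queue).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: none of these names come from the YARN-5734 branch. */
class MutationPolicySketch {

    /** Hypothetical plug-in point: may this user change configuration for this queue? */
    interface MutationPolicy {
        boolean isAllowed(String user, String queue);
    }

    /** Admin-only policy: the queue is ignored, only membership in the admin set matters. */
    static MutationPolicy adminOnly(Set<String> yarnAdmins) {
        return (user, queue) -> yarnAdmins.contains(user);
    }

    /** Relaxed policy: YARN admins may change anything, queue admins only their own queue. */
    static MutationPolicy queueAdmin(Set<String> yarnAdmins,
                                     Map<String, Set<String>> queueAdmins) {
        return (user, queue) -> yarnAdmins.contains(user)
                || queueAdmins.getOrDefault(queue, Collections.<String>emptySet()).contains(user);
    }

    public static void main(String[] args) {
        Set<String> admins = new HashSet<>(Collections.singleton("yarn-admin"));
        Map<String, Set<String>> queueAdmins = new HashMap<>();
        queueAdmins.put("root.analytics", new HashSet<>(Collections.singleton("alice")));

        MutationPolicy strict = adminOnly(admins);
        MutationPolicy relaxed = queueAdmin(admins, queueAdmins);

        System.out.println(strict.isAllowed("alice", "root.analytics"));  // false
        System.out.println(relaxed.isAllowed("alice", "root.analytics")); // true
        System.out.println(relaxed.isAllowed("alice", "root.default"));   // false
    }
}
```

Keeping the check behind one interface is what would let an operator swap policies through configuration without touching the scheduler itself; how the real feature wires this up is not shown here.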
Re: [DISCUSS] Merging API-based scheduler configuration to trunk/branch-2
Thanks Andrew and Larry for the feedback.

I was hoping to start a merge vote early next week, because of the 2.9 deadline. (I suppose meeting this deadline depends on the outcome of this DISCUSS thread.) Appreciate any questions you have on the JIRA.

To answer your questions Larry:

*Is this feature extending the existing YARN RM REST API?*
Yes, this feature adds another endpoint to the YARN RM REST API, for users to send their configuration change requests.

*When it isn't enabled what is the API behavior?*
When the feature is disabled and the API is called, no change is made and the call returns HTTP 400 (Bad Request).

*Does it implement the trusted proxy pattern for proxies to be able to impersonate users and most importantly to dictate what proxies would be allowed to impersonate an admin for this API - which I assume will be required?*
Right now there's a pluggable policy which controls which users can make which configuration changes (see YARN-5949). The default policy is to only allow YARN admins (i.e. users in yarn.admin.acl) to make changes. There's also an implementation of a more relaxed policy which allows admins of queues to make configuration modifications to their own queue. Not sure if this answers your question.

Thanks,

Jonathan Hung

On Fri, Sep 29, 2017 at 12:01 PM, larry mccay wrote:

> Hi Jonathan -
>
> Thank you for bringing this up for discussion!
>
> I would personally like to see a specific security review of features like
> this - especially ones that allow for remote access to configuration.
> I'll take a look at the JIRA and see whether I can come up with any
> concerns or questions and I would urge others to give it a pass from a
> security perspective as well.
>
> In addition, here are a couple questions off the top of my head:
>
> Is this feature extending the existing YARN RM REST API?
> When it isn't enabled what is the API behavior?
> Does it implement the trusted proxy pattern for proxies to be able to > impersonate users and most importantly to dictate what proxies would be > allowed to impersonate an admin for this API - which I assume will be > required? > > --larry > > On Fri, Sep 29, 2017 at 2:44 PM, Andrew Wang > wrote: > >> Hi Jonathan, >> >> I'm okay with putting this into branch-3.0 for GA if it can be merged >> within the next two weeks. Even though beta1 has slipped by a month, I >> want >> to stick to the targeted GA data of Nov 1st as much as possible. Of >> course, >> let's not sacrifice quality or stability for speed; if something's not >> ready, let's defer it to 3.1.0. >> >> Subru, have you been able to review this feature from the 2.9.0 >> perspective? It'd add confidence if you think it's immediately ready for >> merging to branch-2 for 2.9.0. >> >> Thanks, >> Andrew >> >> On Thu, Sep 28, 2017 at 11:32 AM, Jonathan Hung >> wrote: >> >> > Hi everyone, >> > >> > Starting this thread to discuss merging API-based scheduler >> configuration >> > to trunk/branch-2. The feature adds the framework for allowing users to >> > modify scheduler configuration via REST or CLI using a configurable >> backend >> > (leveldb/zk are currently supported), and adds capacity scheduler >> support >> > for this. The umbrella JIRA is YARN-5734. All the required work for this >> > feature is done and committed to branch YARN-5734, and a full diff has >> been >> > generated at YARN-7241. >> > >> > Regarding compatibility, this feature is configurable and turned off by >> > default. >> > >> > The feature has been tested locally on a couple RMs (since it is an RM >> > only change), with queue addition/removal/updates tested on single RM >> > (leveldb) and two RMs (zk). Also we verified the original configuration >> > update mechanism (via refreshQueues) is unaffected when the feature is >> > off/not configured. 
>> > >> > Our original plan was to merge this to trunk (which is what the >> YARN-7241 >> > diff is based on), and port to branch-2 before the 2.9 release. @Andrew, >> > what are your thoughts on also merging this to branch-3.0? >> > >> > Thanks! >> > >> > Jonathan Hung >> > >> > >
[DISCUSS] Merging API-based scheduler configuration to trunk/branch-2
Hi everyone,

Starting this thread to discuss merging API-based scheduler configuration to trunk/branch-2. The feature adds the framework for allowing users to modify scheduler configuration via REST or CLI using a configurable backend (leveldb/zk are currently supported), and adds capacity scheduler support for this. The umbrella JIRA is YARN-5734. All the required work for this feature is done and committed to branch YARN-5734, and a full diff has been generated at YARN-7241.

Regarding compatibility, this feature is configurable and turned off by default.

The feature has been tested locally on a couple RMs (since it is an RM only change), with queue addition/removal/updates tested on single RM (leveldb) and two RMs (zk). Also we verified the original configuration update mechanism (via refreshQueues) is unaffected when the feature is off/not configured.

Our original plan was to merge this to trunk (which is what the YARN-7241 diff is based on), and port to branch-2 before the 2.9 release. @Andrew, what are your thoughts on also merging this to branch-3.0?

Thanks!

Jonathan Hung
Re: [DISCUSS] Looking to a 2.9.0 release
Hi Subru,

Thanks for starting the discussion. We are targeting merging YARN-5734 (API-based scheduler configuration) to branch-2 before the release of 2.9.0, since the feature is close to complete. Regarding the requirements for merge,

1. API compatibility - this feature adds new APIs and does not modify any existing ones.
2. Turning feature off - using the feature is configurable and is turned off by default.
3. Stability/testing - this is an RM-only change, so we plan on deploying this feature to a test RM and verifying configuration changes for capacity scheduler. (Right now fair scheduler is not supported.)
4. Deployment - we want to get this feature into 2.9.0 since we want to use this feature and the 2.9 version in our next upgrade.
5. Timeline - we have one main blocker which we are planning to resolve by end of week. The rest of the month will be testing, then a merge vote in the last week of Sept.

Please let me know if you have any concerns. Thanks!

Jonathan Hung

On Wed, Jul 26, 2017 at 11:23 AM, J. Rottinghuis wrote:

> Thanks Vrushali for being entirely open as to the current status of ATSv2.
> I appreciate that we want to ensure things are tested at scale, and as you
> said we are working on that right now on our clusters.
> We have tested the feature to demonstrate it works at what we consider
> moderate scale.
>
> I think the criteria for including this feature in the 2.9 release should
> be if it can be safely turned off and not cause impact to anybody not using
> the new feature. The confidence for this is high for timeline service v2.
>
> Therefore, I think timeline service v2 should definitely be part of 2.9.
> That is the big draw for us to work on stabilizing a 2.9 release rather
> than just going to 2.8 and back-porting things ourselves.
>
> Thanks,
>
> Joep
>
> On Tue, Jul 25, 2017 at 11:39 PM, Vrushali Channapattan <vrushalic2...@gmail.com> wrote:
>
> > Thanks Subru for initiating this discussion.
> > > > Wanted to share some thoughts in the context of Timeline Service v2. The > > current status of this module is that we are ramping up for a second > merge > > to trunk. We still have a few merge blocker jiras outstanding, which we > > think we will finish soon. > > > > While we have done some testing, we are yet to test at scale. Given all > > this, we were thinking of initially targeting a beta release vehicle > rather > > than a stable release. > > > > As such, timeline service v2 has branch-2 branch called as > > YARN-5355-branch-2 in case anyone wants to try it out. Timeline service > v2 > > can be turned off and should not affect the cluster. > > > > thanks > > Vrushali > > > > > > > > > > > > On Mon, Jul 24, 2017 at 1:26 PM, Subru Krishnan > wrote: > > > > > Folks, > > > > > > With the release for 2.8, we would like to look ahead to 2.9 release as > > > there are many features/improvements in branch-2 (about 1062 commits), > > that > > > are in need of a release vechile. > > > > > > Here's our first cut of the proposal from the YARN side: > > > > > >1. Scheduler improvements (decoupling allocation from node > heartbeat, > > >allocation ID, concurrency fixes, LightResource etc). > > >2. Timeline Service v2 > > >3. Opportunistic containers > > >4. Federation > > > > > > We would like to hear a formal list from HDFS & Hadoop (& MapReduce if > > any) > > > and will update the Roadmap wiki accordingly. > > > > > > Considering our familiarity with the above mentioned YARN features, we > > > would like to volunteer as the co-RMs for 2.9.0. > > > > > > We want to keep the timeline at 8-12 weeks to keep the release > pragmatic. > > > > > > Feedback? > > > > > > -Subru/Arun > > > > > >
Re: Branch merges and 3.0.0-beta1 scope
Hi Andrew,

Thanks for starting the discussion - we have a feature YARN-5734 for API based scheduler configuration that I feel is pretty close to merge (also "a few weeks"). It's almost completely code and API additions and we were careful to design it so that it's compatible (feature is also turned off by default). Hoping to get this in before 3.0.0-GA. Just wanted to send this note so that we are not caught off guard by this feature.

Thanks!

Jonathan Hung

On Fri, Aug 25, 2017 at 11:06 AM, Wangda Tan wrote:

> Resource profile is similar to TSv2, the feature is:
> - Alpha feature, we will not freeze new added APIs. And all added APIs are
>   explicitly marked to @Unstable.
> - Allow rolling upgrade from branch-2.
> - Touched existing code, but we have, and will continue tests to make sure
>   changes are safe.
>
> Discussed with Andrew offline, we decided to not put this to beta1 since
> beta1 is not far away. But we want to put it before GA if sufficient tests
> are done.
>
> Thanks,
> Wangda
>
> On Fri, Aug 25, 2017 at 10:54 AM, Rohith Sharma K S <rohithsharm...@apache.org> wrote:
>
> > On 25 August 2017 at 22:39, Andrew Wang wrote:
> >
> > > Hi Rohith,
> > >
> > > Given that we're advertising TSv2 as an alpha feature, I think we're
> > > allowed to break compatibility. Let's make sure this is clear in the
> > > release notes and documentation.
> > >
> > > That said, with TSv2 phase 2, is the API going to be frozen? The umbrella
> > > JIRA refers to "TSv2 alpha2" which indicated to me it was still alpha-level
> > > quality and stability.
> > >
> > YES, We have decided to freeze API's. I do not think we make any
> > compatibility break in future.
> >
> > > Best,
> > > Andrew
[jira] [Created] (MAPREDUCE-6885) JobHistory event handler thread should not die if exception thrown
Jonathan Hung created MAPREDUCE-6885:

Summary: JobHistory event handler thread should not die if exception thrown
Key: MAPREDUCE-6885
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6885
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Jonathan Hung

If eventHandlingThread handles an event which causes it to throw an exception (e.g. if it is unable to flush an event to HDFS), the thread dies. This thread is responsible for moving job history files to mapreduce.jobhistory.done-dir; if an exception is thrown, the files will never be moved there. We should catch these exceptions so that the thread can still move these files when the job is complete.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
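Editor's sketch: the fix proposed in this issue amounts to catching exceptions inside the event loop so that one failing event does not end the thread. The Java below is a minimal, self-contained illustration of that pattern, not the actual JobHistoryEventHandler code. The class `ResilientEventLoop`, the `Handler` interface, and the `STOP` sentinel are hypothetical names, and the simulated IOException stands in for a failed flush to HDFS.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustrative stand-in for an event-handling thread; not Hadoop's actual class. */
class ResilientEventLoop implements Runnable {

    /** Hypothetical handler; handle() may throw, e.g. when a flush fails. */
    interface Handler {
        void handle(String event) throws Exception;
    }

    /** Sentinel event that asks the loop to exit cleanly. */
    static final String STOP = "__stop__";

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Handler handler;

    ResilientEventLoop(Handler handler) {
        this.handler = handler;
    }

    void submit(String event) {
        queue.add(event);
    }

    @Override
    public void run() {
        while (true) {
            String event;
            try {
                event = queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            if (STOP.equals(event)) {
                return;
            }
            try {
                handler.handle(event);
            } catch (Exception e) {
                // Log and keep going: later events (such as the one that moves
                // history files when the job completes) must still be processed.
                System.err.println("Failed to handle " + event + ": " + e.getMessage());
            }
        }
    }

    /** Runs one failing event followed by one good event; returns the events handled. */
    static List<String> demo() {
        List<String> handled = new ArrayList<>();
        ResilientEventLoop loop = new ResilientEventLoop(event -> {
            if (event.startsWith("bad")) {
                throw new IOException("simulated flush failure");
            }
            handled.add(event);
        });
        Thread thread = new Thread(loop, "eventHandlingThread");
        thread.start();
        loop.submit("bad-flush");
        loop.submit("job-finished");
        loop.submit(STOP);
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [job-finished]
    }
}
```

Without the inner try/catch, the exception from "bad-flush" would propagate out of run() and "job-finished" would never be handled, which mirrors the failure described above.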
[jira] [Resolved] (MAPREDUCE-6860) User intermediate-done-dir permissions should use history file permissions configuration
[ https://issues.apache.org/jira/browse/MAPREDUCE-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hung resolved MAPREDUCE-6860.
--
Resolution: Not A Bug

> User intermediate-done-dir permissions should use history file permissions configuration
>
> Key: MAPREDUCE-6860
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6860
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Jonathan Hung
>
> Currently {{JobHistoryEventHandler}} creates the user intermediate-done-dir directory here:
> {noformat}
> doneDirPrefixPath =
>     FileContext.getFileContext(conf).makeQualified(new Path(userDoneDirStr));
> mkdir(doneDirFS, doneDirPrefixPath, new FsPermission(
>     JobHistoryUtils.HISTORY_INTERMEDIATE_USER_DIR_PERMISSIONS));
> {noformat}
> which is hardcoded to 770. But the permissions of the summary, history, and conf files under this user dir are configurable via {{mapreduce.jobhistory.intermediate-done-dir.file.permission}}. So if the configured permissions grant read/write/execute access to "other" users, they will still not have access to these files due to the 770 permission on the user dir.
> I see two options here:
> # Reuse {{mapreduce.jobhistory.intermediate-done-dir.file.permission}} as the permissions for the user dir
> # Create a new config for the user dir permissions, using 770 as the default
> The latter makes more sense to me.
[jira] [Created] (MAPREDUCE-6860) User intermediate-done-dir permissions should use history file permissions configuration
Jonathan Hung created MAPREDUCE-6860:

Summary: User intermediate-done-dir permissions should use history file permissions configuration
Key: MAPREDUCE-6860
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6860
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Jonathan Hung

Currently {{JobHistoryEventHandler}} creates the user intermediate-done-dir directory here:
{noformat}
doneDirPrefixPath =
    FileContext.getFileContext(conf).makeQualified(new Path(userDoneDirStr));
mkdir(doneDirFS, doneDirPrefixPath, new FsPermission(
    JobHistoryUtils.HISTORY_INTERMEDIATE_USER_DIR_PERMISSIONS));
{noformat}
which is hardcoded to 770. But the permissions of the summary, history, and conf files under this user dir are configurable via {{mapreduce.jobhistory.intermediate-done-dir.file.permission}}. So if the configured permissions grant read/write/execute access to "other" users, they will still not have access to these files due to the 770 permission on the user dir.
I see two options here:
# Reuse {{mapreduce.jobhistory.intermediate-done-dir.file.permission}} as the permissions for the user dir
# Create a new config for the user dir permissions, using 770 as the default
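Editor's sketch: the second option in this issue (a separate setting for the user directory, defaulting to 770) and the reason the hardcoded 770 blocks "other" users can both be shown with plain octal arithmetic. The Java below is an illustration under stated assumptions, not Hadoop code. The configuration key `example.user-done-dir.permission` is invented for this sketch (the issue does not name one), a plain `Map` stands in for the Hadoop configuration object, and the access check is a simplified POSIX-style model that only considers the "other" bits.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; the config key below is invented for this sketch. */
class UserDirPermissionSketch {

    /** Hypothetical key for the user dir permissions; not a real Hadoop property. */
    static final String USER_DIR_PERMISSION_KEY = "example.user-done-dir.permission";

    /** Default taken from the issue text: the user dir is currently created as 770. */
    static final String DEFAULT_USER_DIR_PERMISSION = "770";

    /** Reads the octal permission string from the config map, falling back to 770. */
    static int userDirMode(Map<String, String> conf) {
        String octal = conf.getOrDefault(USER_DIR_PERMISSION_KEY, DEFAULT_USER_DIR_PERMISSION);
        return Integer.parseInt(octal, 8);
    }

    /**
     * Simplified model: an "other" user can read a file only if the parent
     * directory grants "other" execute (traverse) and the file grants "other" read.
     */
    static boolean otherCanRead(int dirMode, int fileMode) {
        boolean canTraverseDir = (dirMode & 1) != 0;  // lowest bit: other-execute
        boolean canReadFile = (fileMode & 4) != 0;    // other-read bit
        return canTraverseDir && canReadFile;
    }

    public static void main(String[] args) {
        int worldReadableFile = Integer.parseInt("777", 8);

        Map<String, String> defaults = new HashMap<>();
        System.out.println(otherCanRead(userDirMode(defaults), worldReadableFile)); // false

        Map<String, String> relaxed = new HashMap<>();
        relaxed.put(USER_DIR_PERMISSION_KEY, "775");
        System.out.println(otherCanRead(userDirMode(relaxed), worldReadableFile));  // true
    }
}
```

With the default of 770 the file's own 777 permissions make no difference, which is the mismatch the issue describes; a dedicated setting would keep today's behavior by default while letting an operator open the directory deliberately.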