Re: [VOTE] create ozone-dev and ozone-issues mailing lists
+1. Thanks, Marton, for the consensus and action plan!

-Rohith Sharma K S

On Sun, 27 Oct 2019 at 13:55, Elek, Marton wrote:

> As discussed earlier in the thread of "Hadoop-Ozone repository mailing
> list configurations" [1] I suggested to solve the current
> misconfiguration problem with creating separated mailing lists
> (dev/issues) for Hadoop Ozone.
>
> It would have some additional benefit: for example it would make easier
> to follow the Ozone development and future plans.
>
> Here I am starting a new vote thread (open for at least 72 hours) to
> collect more feedback about this.
>
> Please express your opinion / vote.
>
> Thanks a lot,
> Marton
>
> [1] https://lists.apache.org/thread.html/dc66a30f48a744534e748c418bf7ab6275896166ca5ade11560ebaef@%3Chdfs-dev.hadoop.apache.org%3E
Re: [Discuss] Hadoop-Ozone repository mailing list configurations
+ common/yarn and mapreduce/submarine

It looks like the submarine repository has the same issue!

On Mon, 21 Oct 2019 at 09:30, Rohith Sharma K S wrote:

> Folks,
>
> In Hadoop world, any mailing list has its own purposes as below
> 1. hdfs/common/yarn/mapreduce-*dev* mailing list is meant for developer
> discussion purpose.
> 2. hdfs/common/yarn/mapreduce-*issues* mailing list used for comments
> made in the issues.
>
> It appears Hadoop-Ozone repository configured *hdfs-dev* mailing list
> for *hdfs-issues* list also. As a result hdfs-dev mailing list is
> bombarded with every comment made in hadoop-ozone repository.
>
> Could it be fixed?
>
> -Rohith Sharma K S
Re: [ANNOUNCE] Apache Hadoop 3.2.1 release
Updated Twitter message:

Apache Hadoop 3.2.1 is released: https://s.apache.org/96r4h
Announcement: https://s.apache.org/jhnpe
Overview: https://s.apache.org/tht6a
Changes: https://s.apache.org/pd6of
Release notes: https://s.apache.org/ta50b

Thanks to our community of developers, operators, and users.

-Rohith Sharma K S

On Wed, 25 Sep 2019 at 14:15, Sunil Govindan wrote:

> Here the link of Overview URL is old.
> We should ideally use https://hadoop.apache.org/release/3.2.1.html
>
> Thanks
> Sunil
>
> On Wed, Sep 25, 2019 at 2:10 PM Rohith Sharma K S <
> rohithsharm...@apache.org> wrote:
>
>> Can someone help to post this in twitter account?
>>
>> Apache Hadoop 3.2.1 is released: https://s.apache.org/mzdb6
>> Overview: https://s.apache.org/tht6a
>> Changes: https://s.apache.org/pd6of
>> Release notes: https://s.apache.org/ta50b
>>
>> Thanks to our community of developers, operators, and users.
>>
>> -Rohith Sharma K S
>>
>> On Wed, 25 Sep 2019 at 13:44, Rohith Sharma K S <
>> rohithsharm...@apache.org> wrote:
>>
>>> Hi all,
>>>
>>> It gives us great pleasure to announce that the Apache Hadoop
>>> community has voted to release Apache Hadoop 3.2.1.
>>>
>>> Apache Hadoop 3.2.1 is the stable release of Apache Hadoop 3.2 line,
>>> which includes 493 fixes since Hadoop 3.2.0 release:
>>>
>>> - For major changes included in Hadoop 3.2 line, please refer Hadoop
>>> 3.2.1 main page[1].
>>> - For more details about fixes in 3.2.1 release, please read
>>> CHANGELOG[2] and RELEASENOTES[3].
>>>
>>> The release news is posted on the Hadoop website too, you can go to the
>>> downloads section directly[4].
>>>
>>> Thank you all for contributing to the Apache Hadoop!
>>>
>>> Cheers,
>>> Rohith Sharma K S
>>>
>>> [1] https://hadoop.apache.org/docs/r3.2.1/index.html
>>> [2] https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/CHANGELOG.3.2.1.html
>>> [3] https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/RELEASENOTES.3.2.1.html
>>> [4] https://hadoop.apache.org
Re: [ANNOUNCE] Apache Hadoop 3.2.1 release
Updated announcement

Hi all,

It gives us great pleasure to announce that the Apache Hadoop community has voted to release Apache Hadoop 3.2.1.

Apache Hadoop 3.2.1 is the stable release of the Apache Hadoop 3.2 line, which includes 493 fixes since the Hadoop 3.2.0 release:

- For major changes included in the Hadoop 3.2 line, please refer to the Hadoop 3.2.1 main page [1].
- For more details about fixes in the 3.2.1 release, please read the CHANGELOG [2] and RELEASENOTES [3].

The release news is posted on the Hadoop website too; you can go to the downloads section directly [4]. This announcement itself is also up on the website [0].

Thank you all for contributing to Apache Hadoop!

Cheers,
Rohith Sharma K S

[0] Announcement: https://hadoop.apache.org/release/3.2.1.html
[1] Overview of major changes: https://hadoop.apache.org/docs/r3.2.1/index.html
[2] Detailed change-log: https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/CHANGELOG.3.2.1.html
[3] Detailed release-notes: https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/RELEASENOTES.3.2.1.html
[4] Project Home: https://hadoop.apache.org

On Wed, 25 Sep 2019 at 13:44, Rohith Sharma K S wrote:

> Hi all,
>
> It gives us great pleasure to announce that the Apache Hadoop
> community has voted to release Apache Hadoop 3.2.1.
>
> Apache Hadoop 3.2.1 is the stable release of Apache Hadoop 3.2 line, which
> includes 493 fixes since Hadoop 3.2.0 release:
>
> - For major changes included in Hadoop 3.2 line, please refer Hadoop 3.2.1
> main page[1].
> - For more details about fixes in 3.2.1 release, please read CHANGELOG[2]
> and RELEASENOTES[3].
>
> The release news is posted on the Hadoop website too, you can go to the
> downloads section directly[4].
>
> Thank you all for contributing to the Apache Hadoop!
>
> Cheers,
> Rohith Sharma K S
>
> [1] https://hadoop.apache.org/docs/r3.2.1/index.html
> [2] https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/CHANGELOG.3.2.1.html
> [3] https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/RELEASENOTES.3.2.1.html
> [4] https://hadoop.apache.org
Re: [ANNOUNCE] Apache Hadoop 3.2.1 release
Can someone help post this on the Twitter account?

Apache Hadoop 3.2.1 is released: https://s.apache.org/mzdb6
Overview: https://s.apache.org/tht6a
Changes: https://s.apache.org/pd6of
Release notes: https://s.apache.org/ta50b

Thanks to our community of developers, operators, and users.

-Rohith Sharma K S

On Wed, 25 Sep 2019 at 13:44, Rohith Sharma K S wrote:

> Hi all,
>
> It gives us great pleasure to announce that the Apache Hadoop
> community has voted to release Apache Hadoop 3.2.1.
>
> Apache Hadoop 3.2.1 is the stable release of Apache Hadoop 3.2 line, which
> includes 493 fixes since Hadoop 3.2.0 release:
>
> - For major changes included in Hadoop 3.2 line, please refer Hadoop 3.2.1
> main page[1].
> - For more details about fixes in 3.2.1 release, please read CHANGELOG[2]
> and RELEASENOTES[3].
>
> The release news is posted on the Hadoop website too, you can go to the
> downloads section directly[4].
>
> Thank you all for contributing to the Apache Hadoop!
>
> Cheers,
> Rohith Sharma K S
>
> [1] https://hadoop.apache.org/docs/r3.2.1/index.html
> [2] https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/CHANGELOG.3.2.1.html
> [3] https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/RELEASENOTES.3.2.1.html
> [4] https://hadoop.apache.org
[ANNOUNCE] Apache Hadoop 3.2.1 release
Hi all,

It gives us great pleasure to announce that the Apache Hadoop community has voted to release Apache Hadoop 3.2.1.

Apache Hadoop 3.2.1 is the stable release of the Apache Hadoop 3.2 line, which includes 493 fixes since the Hadoop 3.2.0 release:

- For major changes included in the Hadoop 3.2 line, please refer to the Hadoop 3.2.1 main page [1].
- For more details about fixes in the 3.2.1 release, please read the CHANGELOG [2] and RELEASENOTES [3].

The release news is posted on the Hadoop website too; you can go to the downloads section directly [4].

Thank you all for contributing to Apache Hadoop!

Cheers,
Rohith Sharma K S

[1] https://hadoop.apache.org/docs/r3.2.1/index.html
[2] https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/CHANGELOG.3.2.1.html
[3] https://hadoop.apache.org/docs/r3.2.1/hadoop-project-dist/hadoop-common/release/3.2.1/RELEASENOTES.3.2.1.html
[4] https://hadoop.apache.org
Re: [VOTE] Release Apache Hadoop 3.2.1 - RC0
Thanks to all who helped verify and vote on the 3.2.1 release! I am concluding the vote for 3.2.1 RC0.

Summary of votes for hadoop-3.2.1-RC0:

7 binding +1s, from:
Sunil Govindan, Brahma Reddy Battula, Steve Loughran, Elek, Marton, Weiwei Yang, Naganarasimha Garla, Rohith Sharma K S

10 non-binding +1s, from:
runlin zhang, Thomas Marquardt, Santosh Marella, Anil Sadineni, Jeffrey Rodriguez, zhankun tang, Ayush Saxena, Dinesh Chitlangia, Prabhu Josep, Abhishek Modi

1 non-binding +0, from:
Masatake Iwasaki

and *no -1s*.

So I am glad to announce that the vote for 3.2.1 RC0 passes.

Thanks to everyone listed above who tried the release candidate and voted, and to all who helped with the 3.2.1 release effort in any way.

I'll push the release bits and send out an announcement for 3.2.1 soon.

Thanks,
Rohith Sharma K S

On Thu, 19 Sep 2019 at 14:34, Abhishek Modi wrote:

> Hi Rohith,
>
> Thanks for driving this release.
>
> +1 (binding)
>
> - built from the source on windows machine.
> - created a pseudo cluster.
> - ran PI job.
> - checked basic metrics with ATSv2 enabled.
>
> On Thu, Sep 19, 2019 at 12:30 PM Sunil Govindan wrote:
>
>> Hi Rohith
>>
>> Thanks for putting this together, appreciate the same.
>>
>> +1 (binding)
>>
>> - verified signature
>> - brought up a cluster from the tar ball
>> - Ran some basic MR jobs
>> - RM UI seems fine (old and new)
>>
>> Thanks
>> Sunil
>>
>> On Wed, Sep 11, 2019 at 12:56 PM Rohith Sharma K S <
>> rohithsharm...@apache.org> wrote:
>>
>>> Hi folks,
>>>
>>> I have put together a release candidate (RC0) for Apache Hadoop 3.2.1.
>>>
>>> The RC is available at:
>>> http://home.apache.org/~rohithsharmaks/hadoop-3.2.1-RC0/
>>>
>>> The RC tag in git is release-3.2.1-RC0:
>>> https://github.com/apache/hadoop/tree/release-3.2.1-RC0
>>>
>>> The maven artifacts are staged at
>>> https://repository.apache.org/content/repositories/orgapachehadoop-1226/
>>>
>>> You can find my public key at:
>>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>>
>>> This vote will run for 7 days(5 weekdays), ending on 18th Sept at 11:59 pm
>>> PST.
>>>
>>> I have done testing with a pseudo cluster and distributed shell job. My +1
>>> to start.
>>>
>>> Thanks & Regards
>>> Rohith Sharma K S
>
> --
> Regards,
> Abhishek Modi
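The tally in the vote summary can be sketched in a few lines of Python. This is only an illustration, not part of the release tooling: the function names are hypothetical, and the pass rule assumed here is the usual ASF convention for release votes (at least three binding +1s and more binding +1s than binding -1s).

```python
from collections import Counter


def tally(votes):
    """Count votes by (value, binding).

    `votes` is an iterable of (value, binding) pairs, where value is
    +1, 0 or -1 and binding is True for a binding vote.
    """
    counts = Counter()
    for value, binding in votes:
        counts[(value, binding)] += 1
    return counts


def release_vote_passes(votes, min_binding_plus_ones=3):
    """Apply the assumed rule: enough binding +1s, and more of them than binding -1s."""
    counts = tally(votes)
    binding_plus = counts[(1, True)]
    binding_minus = counts[(-1, True)]
    return binding_plus >= min_binding_plus_ones and binding_plus > binding_minus


# Counts taken from the summary: 7 binding +1s, 10 non-binding +1s, one non-binding +0.
rc0_votes = [(1, True)] * 7 + [(1, False)] * 10 + [(0, False)]
print(release_vote_passes(rc0_votes))
```

Non-binding votes are counted but do not affect the outcome in this sketch.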
Re: [VOTE] Release Apache Hadoop 3.2.1 - RC0
Inline comments

On Thu, 19 Sep 2019 at 11:51, Rohith Sharma K S wrote:

> Thanks Brahma for voting and bringing this to my attention!
>
> On Thu, 19 Sep 2019 at 11:28, Brahma Reddy Battula wrote:
>
>> Rohith
>> Thanks for driving the release
>>
>> +1 (Binding).
>>
>> --Built from the source
>> --Installed pseudo cluster
>> --Verified Basic hdfs shell command
>> --Ran Pi jobs
>> --Browsed the UI
>>
>> *Rolling Upgrade:*
>> Following issue could have been merged.With out this, need to disable
>> token till rolling upgrade finalised. (Since one of main rolling upgrade
>> issue already merged (HDFS-13596)).
>> https://issues.apache.org/jira/browse/HDFS-14509
>
> This issue marked as blocker for 2.10 and still open!. Can anyone HDFS
> folks confirms this whether is is blocker for *hadoop-3.2.1* release?

>>>> IMO, it is not a blocker for the 3.2.1 release, and I haven't heard any feedback on this from the HDFS folks. Hence I am moving forward with the 3.2.1 release activities. I will be closing the voting thread today.
Re: [VOTE] Release Apache Hadoop 3.2.1 - RC0
Thanks Brahma for voting and bringing this to my attention!

On Thu, 19 Sep 2019 at 11:28, Brahma Reddy Battula wrote:

> Rohith
> Thanks for driving the release
>
> +1 (Binding).
>
> --Built from the source
> --Installed pseudo cluster
> --Verified Basic hdfs shell command
> --Ran Pi jobs
> --Browsed the UI
>
> *Rolling Upgrade:*
> Following issue could have been merged.With out this, need to disable
> token till rolling upgrade finalised. (Since one of main rolling upgrade
> issue already merged (HDFS-13596)).
> https://issues.apache.org/jira/browse/HDFS-14509

This issue is marked as a blocker for 2.10 and is still open. Can any of the HDFS folks confirm whether it is a blocker for the *hadoop-3.2.1* release?

-Rohith Sharma K S
Re: [VOTE] Release Apache Hadoop 3.2.1 - RC0
Thanks Steve for the detailed verification. Inline comment below.

On Wed, 18 Sep 2019 at 20:34, Steve Loughran wrote:

> +1 binding.
>
> One caveat: warn people that guava is now at 27.0 -and that if you run
> with an older version of Guava things will inevitably break.

Could you please suggest what process to follow if I want to add this to the release notes? Should I withdraw RC0 and create RC1 with an updated release note in the corresponding JIRA so that the release script picks it up? Or is there another way?
Re: [VOTE] Release Hadoop-3.1.3-RC0
+1 (binding)

- Verified sha512 for all the artifacts.
- Built from source and installed a pseudo cluster. Ran sample MR jobs and distributed shell.

-Rohith Sharma K S
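For readers who want to repeat the sha512 check, here is a minimal Python sketch. It is not the procedure used in this vote, and the helper names are hypothetical. It assumes the published checksum text contains the digest as one contiguous 128-character hex string (for example `sha512sum` output or a `SHA512 (file) = <hex>` line); other layouts, such as digests split into space-separated groups, are not handled.

```python
import hashlib
import re


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extract_hex_digest(checksum_text):
    """Find a contiguous 128-hex-digit digest in the checksum text.

    Assumption: the digest is not broken up by whitespace or newlines.
    """
    match = re.search(r"\b[0-9a-fA-F]{128}\b", checksum_text)
    if match is None:
        raise ValueError("no contiguous SHA-512 hex digest found")
    return match.group(0).lower()


def verify_artifact(artifact_path, checksum_text):
    """Return True if the artifact's SHA-512 matches the published digest."""
    return sha512_of(artifact_path) == extract_hex_digest(checksum_text)
```

A typical call would read the `.sha512` file next to the downloaded tarball and pass its text as `checksum_text`. Checking the release signature against the KEYS file is a separate step that this sketch does not cover.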
[VOTE] Release Apache Hadoop 3.2.1 - RC0
Hi folks,

I have put together a release candidate (RC0) for Apache Hadoop 3.2.1.

The RC is available at:
http://home.apache.org/~rohithsharmaks/hadoop-3.2.1-RC0/

The RC tag in git is release-3.2.1-RC0:
https://github.com/apache/hadoop/tree/release-3.2.1-RC0

The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1226/

You can find my public key at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

This vote will run for 7 days (5 weekdays), ending on 18th Sept at 11:59 pm PST.

I have done testing with a pseudo cluster and distributed shell job. My +1 to start.

Thanks & Regards
Rohith Sharma K S
3.2.1 branch is closed for commits //Re: [DISCUSS] Hadoop-3.2.1 release proposal
I have created the branch *branch-3.2.1* for the release, so the 3.2.1 branch is closed for commits. I will be creating the RC on this branch.

Kindly use *branch-3.2* for any commits and set "*Fix Version/s*" to *3.2.2*.

-Rohith Sharma K S

On Sat, 7 Sep 2019 at 08:39, Rohith Sharma K S wrote:

> Hi Folks
>
> Given all the blockers/critical issues[1] are resolved, I will be cutting
> the branch-3.2.1 sooner.
> Thanks all for your support in pushing the JIRAs to closure.
>
> [1] https://s.apache.org/7yjh5
>
> -Rohith Sharma K S
>
> On Thu, 29 Aug 2019 at 11:21, Rohith Sharma K S wrote:
>
>> [Update]
>> Ramping down all the critical/blockers https://s.apache.org/7yjh5, left
>> with three issues!
>>
>> YARN-9785 - Solution discussion is going on, hopefully should be able to
>> rap up solution sooner.
>> HADOOP-15998 - To be committed.
>> YARN-9796 - Patch available, to be committed.
>>
>> I am closely monitoring for these issues, and will update once these are
>> fixed.
>>
>> -Rohith Sharma K S
>>
>> On Wed, 21 Aug 2019 at 13:42, Bibinchundatt wrote:
>>
>>> Hi Rohith
>>>
>>> Thank you for initiating this
>>>
>>> Few critical/blocker jira's we could consider
>>>
>>> YARN-9714
>>> YARN-9642
>>> YARN-9640
>>>
>>> Regards
>>> Bibin
>>> -Original Message-
>>> From: Rohith Sharma K S [mailto:rohithsharm...@apache.org]
>>> Sent: 21 August 2019 11:42
>>> To: Wei-Chiu Chuang
>>> Cc: Hdfs-dev ; yarn-dev <yarn-...@hadoop.apache.org>; mapreduce-dev
>>> <mapreduce-dev@hadoop.apache.org>; Hadoop Common
>>> <common-...@hadoop.apache.org>
>>> Subject: Re: [DISCUSS] Hadoop-3.2.1 release proposal
>>>
>>> On Tue, 20 Aug 2019 at 22:28, Wei-Chiu Chuang wrote:
>>>
>>> > Hi Rohith,
>>> > Thanks for initiating this.
>>> > I want to bring up one blocker issue: HDFS-13596
>>> > <https://issues.apache.org/jira/browse/HDFS-13596> (NN restart fails
>>> > after RollingUpgrade from 2.x to 3.x)
>>> >
>>> > This should be a blocker for all active Hadoop 3.x releases: 3.3.0,
>>> > 3.2.1, 3.1.3. Hopefully we can get this fixed within this week.
>>> > Additionally, HDFS-14396
>>> > <https://issues.apache.org/jira/browse/HDFS-14396> (Failed to load
>>> > image from FSImageFile when downgrade from 3.x to 2.x).Probably not a
>>> > blocker but nice to have.
>>>
>>> >>>> Please set target version so that I don't miss in blockers/critical
>>> list for 3.2.1 https://s.apache.org/7yjh5.
>>>
>>> > On Tue, Aug 20, 2019 at 3:22 AM Rohith Sharma K S <
>>> > rohithsharm...@apache.org> wrote:
>>> >
>>> >> Hello folks,
>>> >>
>>> >> It's been more than six month Hadoop-3.2.0 is released i.e 16th Jan,2019.
>>> >> We have several important fixes landed in branch-3.2 (around 48
>>> >> blockers/critical https://s.apache.org/ozd6o).
>>> >>
>>> >> I am planning to do a maintenance release of 3.2.1 in next few weeks
>>> >> i.e around 1st week of September.
>>> >>
>>> >> So far I don't see any blockers/critical in 3.2.1. I see few pending
>>> >> issues on 3.2.1 are https://s.apache.org/ni6v7.
>>> >>
>>> >> *Proposal*:
>>> >> Code Freezing Date: 30th August 2019
>>> >> Release Date : 7th Sept 2019
>>> >>
>>> >> Please let me know if you have any thoughts or comments on this plan.
>>> >>
>>> >> Thanks & Regards
>>> >> Rohith Sharma K S
Re: [DISCUSS] Hadoop-3.2.1 release proposal
Hi Folks,

Given that all the blocker/critical issues [1] are resolved, I will be cutting branch-3.2.1 soon.
Thanks all for your support in pushing the JIRAs to closure.

[1] https://s.apache.org/7yjh5

-Rohith Sharma K S

On Thu, 29 Aug 2019 at 11:21, Rohith Sharma K S wrote:

> [Update]
> Ramping down all the critical/blockers https://s.apache.org/7yjh5, left
> with three issues!
>
> YARN-9785 - Solution discussion is going on, hopefully should be able to
> rap up solution sooner.
> HADOOP-15998 - To be committed.
> YARN-9796 - Patch available, to be committed.
>
> I am closely monitoring for these issues, and will update once these are
> fixed.
>
> -Rohith Sharma K S
>
> On Wed, 21 Aug 2019 at 13:42, Bibinchundatt wrote:
>
>> Hi Rohith
>>
>> Thank you for initiating this
>>
>> Few critical/blocker jira's we could consider
>>
>> YARN-9714
>> YARN-9642
>> YARN-9640
>>
>> Regards
>> Bibin
>> -Original Message-
>> From: Rohith Sharma K S [mailto:rohithsharm...@apache.org]
>> Sent: 21 August 2019 11:42
>> To: Wei-Chiu Chuang
>> Cc: Hdfs-dev ; yarn-dev <yarn-...@hadoop.apache.org>; mapreduce-dev
>> <mapreduce-dev@hadoop.apache.org>; Hadoop Common
>> <common-...@hadoop.apache.org>
>> Subject: Re: [DISCUSS] Hadoop-3.2.1 release proposal
>>
>> On Tue, 20 Aug 2019 at 22:28, Wei-Chiu Chuang wrote:
>>
>> > Hi Rohith,
>> > Thanks for initiating this.
>> > I want to bring up one blocker issue: HDFS-13596
>> > <https://issues.apache.org/jira/browse/HDFS-13596> (NN restart fails
>> > after RollingUpgrade from 2.x to 3.x)
>> >
>> > This should be a blocker for all active Hadoop 3.x releases: 3.3.0,
>> > 3.2.1, 3.1.3. Hopefully we can get this fixed within this week.
>> > Additionally, HDFS-14396
>> > <https://issues.apache.org/jira/browse/HDFS-14396> (Failed to load
>> > image from FSImageFile when downgrade from 3.x to 2.x).Probably not a
>> > blocker but nice to have.
>>
>> >>>> Please set target version so that I don't miss in blockers/critical
>> list for 3.2.1 https://s.apache.org/7yjh5.
>>
>> > On Tue, Aug 20, 2019 at 3:22 AM Rohith Sharma K S <
>> > rohithsharm...@apache.org> wrote:
>> >
>> >> Hello folks,
>> >>
>> >> It's been more than six month Hadoop-3.2.0 is released i.e 16th Jan,2019.
>> >> We have several important fixes landed in branch-3.2 (around 48
>> >> blockers/critical https://s.apache.org/ozd6o).
>> >>
>> >> I am planning to do a maintenance release of 3.2.1 in next few weeks
>> >> i.e around 1st week of September.
>> >>
>> >> So far I don't see any blockers/critical in 3.2.1. I see few pending
>> >> issues on 3.2.1 are https://s.apache.org/ni6v7.
>> >>
>> >> *Proposal*:
>> >> Code Freezing Date: 30th August 2019
>> >> Release Date : 7th Sept 2019
>> >>
>> >> Please let me know if you have any thoughts or comments on this plan.
>> >>
>> >> Thanks & Regards
>> >> Rohith Sharma K S
Re: [VOTE] Moving Submarine to a separate Apache project proposal
+1. Great to see Submarine's progress. I am interested in participating in this project; please include me as well.

-Rohith Sharma K S

On Sun, 1 Sep 2019 at 10:49, Wangda Tan wrote:

> Hi all,
>
> As we discussed in the previous thread [1],
>
> I just moved the spin-off proposal to CWIKI and completed all TODO parts.
>
> https://cwiki.apache.org/confluence/display/HADOOP/Submarine+Project+Spin-Off+to+TLP+Proposal
>
> If you have interests to learn more about this. Please review the proposal
> let me know if you have any questions/suggestions for the proposal. This
> will be sent to board post voting passed. (And please note that the
> previous voting thread [2] to move Submarine to a separate Github repo is a
> necessary effort to move Submarine to a separate Apache project but not
> sufficient so I sent two separate voting thread.)
>
> Please let me know if I missed anyone in the proposal, and reply if you'd
> like to be included in the project.
>
> This voting runs for 7 days and will be concluded at Sep 7th, 11 PM PDT.
>
> Thanks,
> Wangda Tan
>
> [1] https://lists.apache.org/thread.html/4a2210d567cbc05af92c12aa6283fd09b857ce209d537986ed800029@%3Cyarn-dev.hadoop.apache.org%3E
> [2] https://lists.apache.org/thread.html/6e94469ca105d5a15dc63903a541bd21c7ef70b8bcff475a16b5ed73@%3Cyarn-dev.hadoop.apache.org%3E
Re: [DISCUSS] Hadoop-3.2.1 release proposal
[Update]
We are ramping down the critical/blocker list (https://s.apache.org/7yjh5) and are left with three issues:

YARN-9785 - Solution discussion is ongoing; hopefully we can wrap it up soon.
HADOOP-15998 - To be committed.
YARN-9796 - Patch available, to be committed.

I am closely monitoring these issues and will send an update once they are fixed.

-Rohith Sharma K S

On Wed, 21 Aug 2019 at 13:42, Bibinchundatt wrote:

> Hi Rohith
>
> Thank you for initiating this
>
> Few critical/blocker jira's we could consider
>
> YARN-9714
> YARN-9642
> YARN-9640
>
> Regards
> Bibin
> -Original Message-
> From: Rohith Sharma K S [mailto:rohithsharm...@apache.org]
> Sent: 21 August 2019 11:42
> To: Wei-Chiu Chuang
> Cc: Hdfs-dev ; yarn-dev <yarn-...@hadoop.apache.org>; mapreduce-dev
> <mapreduce-dev@hadoop.apache.org>; Hadoop Common
> <common-...@hadoop.apache.org>
> Subject: Re: [DISCUSS] Hadoop-3.2.1 release proposal
>
> On Tue, 20 Aug 2019 at 22:28, Wei-Chiu Chuang wrote:
>
> > Hi Rohith,
> > Thanks for initiating this.
> > I want to bring up one blocker issue: HDFS-13596
> > <https://issues.apache.org/jira/browse/HDFS-13596> (NN restart fails
> > after RollingUpgrade from 2.x to 3.x)
> >
> > This should be a blocker for all active Hadoop 3.x releases: 3.3.0,
> > 3.2.1, 3.1.3. Hopefully we can get this fixed within this week.
> > Additionally, HDFS-14396
> > <https://issues.apache.org/jira/browse/HDFS-14396> (Failed to load
> > image from FSImageFile when downgrade from 3.x to 2.x).Probably not a
> > blocker but nice to have.
>
> >>>> Please set target version so that I don't miss in blockers/critical
> list for 3.2.1 https://s.apache.org/7yjh5.
>
> > On Tue, Aug 20, 2019 at 3:22 AM Rohith Sharma K S <
> > rohithsharm...@apache.org> wrote:
> >
> >> Hello folks,
> >>
> >> It's been more than six month Hadoop-3.2.0 is released i.e 16th Jan,2019.
> >> We have several important fixes landed in branch-3.2 (around 48
> >> blockers/critical https://s.apache.org/ozd6o).
> >>
> >> I am planning to do a maintenance release of 3.2.1 in next few weeks
> >> i.e around 1st week of September.
> >>
> >> So far I don't see any blockers/critical in 3.2.1. I see few pending
> >> issues on 3.2.1 are https://s.apache.org/ni6v7.
> >>
> >> *Proposal*:
> >> Code Freezing Date: 30th August 2019
> >> Release Date : 7th Sept 2019
> >>
> >> Please let me know if you have any thoughts or comments on this plan.
> >>
> >> Thanks & Regards
> >> Rohith Sharma K S
Re: [DISCUSS] Hadoop-3.2.1 release proposal
On Tue, 20 Aug 2019 at 22:28, Wei-Chiu Chuang wrote:

> Hi Rohith,
> Thanks for initiating this.
> I want to bring up one blocker issue: HDFS-13596
> <https://issues.apache.org/jira/browse/HDFS-13596> (NN restart fails
> after RollingUpgrade from 2.x to 3.x)
>
> This should be a blocker for all active Hadoop 3.x releases: 3.3.0, 3.2.1,
> 3.1.3. Hopefully we can get this fixed within this week.
> Additionally, HDFS-14396
> <https://issues.apache.org/jira/browse/HDFS-14396> (Failed to load image
> from FSImageFile when downgrade from 3.x to 2.x).Probably not a blocker but
> nice to have.

>>>> Please set the target version so that I don't miss these in the blockers/critical list for 3.2.1: https://s.apache.org/7yjh5.

> On Tue, Aug 20, 2019 at 3:22 AM Rohith Sharma K S <
> rohithsharm...@apache.org> wrote:
>
>> Hello folks,
>>
>> It's been more than six month Hadoop-3.2.0 is released i.e 16th Jan,2019.
>> We have several important fixes landed in branch-3.2 (around 48
>> blockers/critical https://s.apache.org/ozd6o).
>>
>> I am planning to do a maintenance release of 3.2.1 in next few weeks i.e
>> around 1st week of September.
>>
>> So far I don't see any blockers/critical in 3.2.1. I see few pending
>> issues on 3.2.1 are https://s.apache.org/ni6v7.
>>
>> *Proposal*:
>> Code Freezing Date: 30th August 2019
>> Release Date : 7th Sept 2019
>>
>> Please let me know if you have any thoughts or comments on this plan.
>>
>> Thanks & Regards
>> Rohith Sharma K S
[DISCUSS] Hadoop-3.2.1 release proposal
Hello folks,

It has been more than six months since Hadoop 3.2.0 was released (16th Jan 2019). Several important fixes have landed in branch-3.2 (around 48 blocker/critical issues: https://s.apache.org/ozd6o).

I am planning to do a maintenance release of 3.2.1 in the next few weeks, around the first week of September.

So far I don't see any blocker/critical issues in 3.2.1. The few pending issues on 3.2.1 are listed at https://s.apache.org/ni6v7.

*Proposal*:
Code Freezing Date: 30th August 2019
Release Date: 7th Sept 2019

Please let me know if you have any thoughts or comments on this plan.

Thanks & Regards
Rohith Sharma K S
Re: [DISCUSS] Hadoop 2019 Release Planning
Hi Wangda,

Thanks for initiating this. I would like to nominate myself for 3.2.1 Release Management.

-Rohith Sharma K S

On Sat, 10 Aug 2019 at 08:29, Wangda Tan wrote:

> Hi all,
>
> Hope this email finds you well
>
> I want to hear your thoughts about what should be the release plan for
> 2019.
>
> In 2018, we released:
> - 1 maintenance release of 2.6
> - 3 maintenance releases of 2.7
> - 3 maintenance releases of 2.8
> - 3 releases of 2.9
> - 4 releases of 3.0
> - 2 releases of 3.1
>
> Total 16 releases in 2018.
>
> In 2019, by far we only have two releases:
> - 1 maintenance release of 3.1
> - 1 minor release of 3.2.
>
> However, the community put a lot of efforts to stabilize features of
> various release branches.
> There're:
> - 217 fixed patches in 3.1.3 [1]
> - 388 fixed patches in 3.2.1 [2]
> - 1172 fixed patches in 3.3.0 [3] (OMG!)
>
> I think it is the time to do maintenance releases of 3.1/3.2 and do a minor
> release for 3.3.0.
>
> In addition, I saw community discussion to do a 2.8.6 release for security
> fixes.
>
> Any other releases? I think there're release plans for Ozone as well. And
> please add your thoughts.
>
> Volunteers welcome! If you have interests to run a release as Release
> Manager (or co-Resource Manager), please respond to this email thread so we
> can coordinate.
>
> Thanks,
> Wangda Tan
>
> [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND
> fixVersion = 3.1.3
> [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND
> fixVersion = 3.2.1
> [3] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution = Fixed AND
> fixVersion = 3.3.0
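The three footnoted queries differ only in the fix version, so they can be generated rather than typed out. The Python sketch below is illustrative only: the function names are hypothetical, and the search URL assumes the JIRA issue navigator at issues.apache.org accepts the query through a `jql` URL parameter.

```python
from urllib.parse import urlencode

# Projects named in the footnoted queries.
PROJECTS = ("YARN", "HADOOP", "MAPREDUCE", "HDFS")


def fixed_issues_jql(fix_version, projects=PROJECTS):
    """Build the JQL for issues resolved as Fixed in one release."""
    project_list = ", ".join(projects)
    return (
        f"project in ({project_list}) AND resolution = Fixed "
        f"AND fixVersion = {fix_version}"
    )


def search_url(jql, base="https://issues.apache.org/jira/issues/"):
    """URL-encode the query. Assumption: the navigator reads a `jql` parameter."""
    return base + "?" + urlencode({"jql": jql})


for version in ("3.1.3", "3.2.1", "3.3.0"):
    print(search_url(fixed_issues_jql(version)))
```

The issue counts quoted in the thread (217, 388, 1172) are a snapshot from the date of the email; rerunning the queries later would likely return different totals.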
Re: [VOTE] Release Apache Hadoop Submarine 0.2.0 - RC0
+1 (binding)

Verified the basics by installing a cluster.

-Rohith Sharma K S

On Thu, 6 Jun 2019 at 18:53, Zhankun Tang wrote:

> Hi folks,
>
> Thanks to all of you who have contributed in this submarine 0.2.0 release.
> We now have a release candidate (RC0) for Apache Hadoop Submarine 0.2.0.
>
> The Artifacts for this Submarine-0.2.0 RC0 are available here:
>
> https://home.apache.org/~ztang/submarine-0.2.0-rc0/
>
> It's RC tag in git is "submarine-0.2.0-RC0".
>
> The maven artifacts are available via repository.apache.org at
> https://repository.apache.org/content/repositories/orgapachehadoop-1221/
>
> This vote will run 7 days (5 weekdays), ending on 13th June at 11:59 pm
> PST.
>
> The highlights of this release.
>
> 1. Linkedin's TonY runtime support in Submarine
>
> 2. PyTorch enabled in Submarine with both YARN native service runtime
> (single node) and TonY runtime
>
> 3. Support uber jar of Submarine to submit the job
>
> 4. The YAML file to describe a job
>
> 5. The Notebook support (by Apache Zeppelin Submarine interpreter)
>
> Thanks to Sunil, Wangda, Xun, Zac, Keqiu, Szilard for helping me in
> preparing the release.
>
> I have done a few testing with my pseudo cluster. My +1 (non-binding) to
> start.
>
> Regards,
> Zhankun
Re: [ANNOUNCE] Eric Badger is now a committer!
Congrats Eric!

-Rohith Sharma K S

On Tue, 5 Mar 2019 at 22:50, Eric Payne wrote:

> It is my pleasure to announce that Eric Badger has accepted an invitation
> to become a Hadoop Core committer.
>
> Congratulations, Eric! This is well-deserved!
>
> -Eric Payne
Re: [VOTE] Propose to start new Hadoop sub project "submarine"
+1

On Sat, Feb 2, 2019, 3:54 AM Wangda Tan wrote:

> Hi all,
>
> According to positive feedbacks from the thread [1]
>
> This is vote thread to start a new subproject named "hadoop-submarine"
> which follows the release process already established for ozone.
>
> The vote runs for usual 7 days, which ends at Feb 8th 5 PM PDT.
>
> Thanks,
> Wangda Tan
>
> [1] https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
Re: [DISCUSS] Making submarine to different release model like Ozone
+1. A few interested ML/DL folks from Bangalore asked about a Submarine release for trying out TensorFlow on YARN. We told them to wait for a release since they were not ready to use trunk. I see that an agile release cycle for Submarine would bring a lot of added value.

-Rohith Sharma K S

On Fri, 1 Feb 2019 at 00:34, Wangda Tan wrote:

> Hi devs,
>
> Since we started submarine-related effort last year, we received a lot of
> feedbacks, several companies (such as Netease, China Mobile, etc.) are
> trying to deploy Submarine to their Hadoop cluster along with big data
> workloads. Linkedin also has big interests to contribute a Submarine TonY
> (https://github.com/linkedin/TonY) runtime to allow users to use the same
> interface.
>
> From what I can see, there're several issues of putting Submarine under
> yarn-applications directory and have same release cycle with Hadoop:
>
> 1) We started 3.2.0 release at Sep 2018, but the release is done at Jan
> 2019. Because of non-predictable blockers and security issues, it got
> delayed a lot. We need to iterate submarine fast at this point.
>
> 2) We also see a lot of requirements to use Submarine on older Hadoop
> releases such as 2.x. Many companies may not upgrade Hadoop to 3.x in a
> short time, but the requirement to run deep learning is urgent to them. We
> should decouple Submarine from Hadoop version.
>
> And why we wanna to keep it within Hadoop? First, Submarine included some
> innovation parts such as enhancements of user experiences for YARN
> services/containerization support which we can add it back to Hadoop later
> to address common requirements. In addition to that, we have a big overlap
> in the community developing and using it.
>
> There're several proposals we have went through during Ozone merge to trunk
> discussion:
> https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
>
> I propose to adopt Ozone model: which is the same master branch, different
> release cycle, and different release branch. It is a great example to show
> agile release we can do (2 Ozone releases after Oct 2018) with less
> overhead to setup CI, projects, etc.
>
> *Links:*
> - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> - Design doc
> <https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit>
> - User doc
> <https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html>
> (3.2.0 release)
> - Blogposts, {Submarine} : Running deep learning workloads on Apache Hadoop
> <https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/>,
> (Chinese Translation: Link <https://www.jishuwen.com/d/2Vpu>)
> - Talks: Strata Data Conf NY
> <https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289>
>
> Thoughts?
>
> Thanks,
> Wangda Tan
Re: [VOTE] Release Apache Hadoop 3.1.2 - RC1
+1 (binding) - Built from source with -Dhbase.profile=2.0 and installed an RM HA non-secure cluster with 2 nodes. Configured ATSv2 in the cluster. - Ran sample jobs with MR and DS. - Verified UI2 for the flow activity page and data validation. - Verified the ATSv2 REST APIs. -Rohith Sharma K S On Tue, 29 Jan 2019 at 11:50, Sunil G wrote: > Hi Folks, > > On behalf of Wangda, we have an RC1 for Apache Hadoop 3.1.2. > > The artifacts are available here: > http://home.apache.org/~sunilg/hadoop-3.1.2-RC1/ > > The RC tag in git is release-3.1.2-RC1: > https://github.com/apache/hadoop/commits/release-3.1.2-RC1 > > The maven artifacts are available via repository.apache.org at > https://repository.apache.org/content/repositories/orgapachehadoop-1215 > > This vote will run 5 days from now. > > 3.1.2 contains 325 [1] fixed JIRA issues since 3.1.1. > > We have done testing with a pseudo cluster and distributed shell job. > > My +1 to start. > > Best, > Wangda Tan and Sunil Govindan > > [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.2) > ORDER BY priority DESC >
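The release emails in this archive cite the same shape of JIRA query for counting fixed issues. A minimal sketch of generating that query per release follows; the helper name and the default project tuple are hypothetical, and the query text simply mirrors footnote [1] above.

```python
# Projects named in footnote [1] of the release vote email.
PROJECTS = ("YARN", "HADOOP", "MAPREDUCE", "HDFS")


def fixed_issues_jql(fix_version: str, projects=PROJECTS) -> str:
    """Build the JQL string used to list issues fixed in one release.

    Hypothetical helper: it only formats text and does not call JIRA.
    """
    project_list = ", ".join(projects)
    return (
        f"project in ({project_list}) "
        f"AND fixVersion in ({fix_version}) "
        "ORDER BY priority DESC"
    )


if __name__ == "__main__":
    print(fixed_issues_jql("3.1.2"))
```

Pasting the resulting string into the JIRA issue search should reproduce the list behind the count quoted in the vote email, assuming the fixVersion values have not been edited since.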
Re: [VOTE] Release Apache Hadoop 3.1.2 - RC0
@Wangda Tan I have pushed the changes to branch-3.1 and branch-3.1.2, and verified the functionality of the hadoop-3.1.2 branch build. Should a new RC be created, or can the same RC continue with the updated repositories? On Mon, 28 Jan 2019 at 10:54, Rohith Sharma K S wrote: > -1, I found an issue in ATSv2 initialization in NodeManager. This causes > none of the ATSv2 events published from NodeManager. I have created > YARN-9242 for tracking the same. > > -Rohith Sharma K S > > On Fri, 25 Jan 2019 at 11:40, Wangda Tan wrote: > >> Hi folks, >> >> With tons of help from Sunil, we have created RC0 for Apache Hadoop 3.1.2. >> The artifacts are available here: >> >> *http://home.apache.org/~sunilg/hadoop-3.1.2-RC0/ >> <http://home.apache.org/~sunilg/hadoop-3.1.2-RC0/>* >> >> The RC tag in git is release-3.1.2-RC0: >> https://github.com/apache/hadoop/commits/release-3.1.2-RC0 >> >> The maven artifacts are available via repository.apache.org at >> *https://repository.apache.org/content/repositories/orgapachehadoop-1212/ >> <https://repository.apache.org/content/repositories/orgapachehadoop-1212/ >> >* >> >> This vote will run 5 days from now. >> >> 3.1.2 contains 325 [1] fixed JIRA issues since 3.1.1. >> >> I have done testing with a pseudo cluster and distributed shell job. My +1 >> to start. >> >> Best, >> Wangda Tan and Sunil Govind >> >> [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.2) >> ORDER BY priority DESC >> >
Re: [VOTE] Release Apache Hadoop 3.1.2 - RC0
-1, I found an issue in ATSv2 initialization in NodeManager. As a result, none of the ATSv2 events are published from NodeManager. I have created YARN-9242 to track it. -Rohith Sharma K S On Fri, 25 Jan 2019 at 11:40, Wangda Tan wrote: > Hi folks, > > With tons of help from Sunil, we have created RC0 for Apache Hadoop 3.1.2. > The artifacts are available here: > > *http://home.apache.org/~sunilg/hadoop-3.1.2-RC0/ > <http://home.apache.org/~sunilg/hadoop-3.1.2-RC0/>* > > The RC tag in git is release-3.1.2-RC0: > https://github.com/apache/hadoop/commits/release-3.1.2-RC0 > > The maven artifacts are available via repository.apache.org at > *https://repository.apache.org/content/repositories/orgapachehadoop-1212/ > <https://repository.apache.org/content/repositories/orgapachehadoop-1212/ > >* > > This vote will run 5 days from now. > > 3.1.2 contains 325 [1] fixed JIRA issues since 3.1.1. > > I have done testing with a pseudo cluster and distributed shell job. My +1 > to start. > > Best, > Wangda Tan and Sunil Govind > > [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.2) > ORDER BY priority DESC >
Re: [VOTE] Release Apache Hadoop 3.2.0 - RC1
+1 (binding) Apologies for late vote! - Built from source for hbase profile=2.0 and deployed RM HA cluster along with ATSv2.0 enabled. - Ran sample jobs and verified all ATSv2.0 data are publishing. - Verified UI2 for ATSv2.0 data! -Rohith Sharma K S On Wed, 16 Jan 2019 at 07:00, Sunil G wrote: > Thanks everyone for helping to vote this release! > > With 7 binding votes, 10 non-binding votes and no veto, this vote stands > passed, > I'm going to work on staging the release. > > Thanks, > Sunil > > On Tue, Jan 15, 2019 at 9:59 PM Weiwei Yang wrote: > > > +1 (binding) > > > > - Setup a cluster, run teragen/terasort jobs > > - Verified general readability of documentation (titles/navigations) > > - Run some simple yarn commands: app/applicationattempt/container > > - Checked restful APIs: RM cluster/metrics/scheduler/nodes, NM > > node/apps/container > > - Verified simple failover scenario > > - Submitted distributed shell apps with affinity/anti-affinity > constraints > > - Configured conf based node attribute provider, alter attribute values > > and verified the change > > - Verified CLI add/list/remove node-attributes, submitted app with simple > > node-attribute constraint > > > > -- > > Weiwei > > >
Re: [Vote] Merge discussion for Node attribute support feature YARN-3409
+1 for merge -Rohith Sharma K S On Wed, 5 Sep 2018 at 18:01, Naganarasimha Garla < naganarasimha...@apache.org> wrote: > Hi All, > Thanks for feedback folks, based on the positive response starting > a Vote thread for merging YARN-3409 to master. > > Regards, > + Naga & Sunil > > On Wed, 5 Sep 2018 2:51 am Wangda Tan, wrote: > > > +1 for the merge, it gonna be a great addition to 3.2.0 release. Thanks > to > > everybody for pushing this feature to complete. > > > > Best, > > Wangda > > > > On Tue, Sep 4, 2018 at 8:25 AM Bibinchundatt > > wrote: > > > >> +1 for merge. Fetaure would be a good addition to 3.2 release. > >> > >> -- > >> Bibin A Chundatt > >> M: +91-9742095715 > >> E: bibin.chund...@huawei.com<mailto:bibin.chund...@huawei.com> > >> 2012实验室-印研IT&Cloud BU分部 > >> 2012 Laboratories-IT&Cloud BU Branch Dept. > >> From:Naganarasimha Garla > >> To:common-...@hadoop.apache.org,Hdfs-dev,yarn-...@hadoop.apache.org, > >> mapreduce-dev@hadoop.apache.org, > >> Date:2018-08-29 20:00:44 > >> Subject:[Discuss] Merge discussion for Node attribute support feature > >> YARN-3409 > >> > >> Hi All, > >> > >> We would like to hear your thoughts on merging “Node Attributes Support > in > >> YARN” branch (YARN-3409) [2] into trunk in a few weeks. The goal is to > get > >> it in for HADOOP 3.2. > >> > >> *Major work happened in this branch* > >> > >> YARN-6858. Attribute Manager to store and provide node attributes in RM > >> YARN-7871. Support Node attributes reporting from NM to RM( distributed > >> node attributes) > >> YARN-7863. Modify placement constraints to support node attributes > >> YARN-7875. Node Attribute store for storing and recovering attributes > >> > >> *Detailed Design:* > >> > >> Please refer [1] for detailed design document. > >> > >> *Testing Efforts:* > >> > >> We did detailed tests for the feature in the last few weeks. > >> This feature will be enabled only when Node Attributes constraints are > >> specified through SchedulingRequest from AM. 
> >> Manager implementation will help to store and recover Node Attributes. > >> This > >> works with existing placement constraints. > >> > >> *Regarding to API stability:* > >> > >> All newly added @Public APIs are @Unstable. > >> > >> Documentation jira [3] could help to provide detailed configuration > >> details. This feature works from end-to-end and we tested this in our > >> local > >> cluster. Branch code is run against trunk and tracked via [4]. > >> > >> We would love to get your thoughts before opening a voting thread. > >> > >> Special thanks to a team of folks who worked hard and contributed > towards > >> this efforts including design discussion / patch / reviews, etc.: Weiwei > >> Yang, Bibin Chundatt, Wangda Tan, Vinod Kumar Vavilappali, Konstantinos > >> Karanasos, Arun Suresh, Varun Saxena, Devaraj Kavali, Lei Guo, Chong > Chen. > >> > >> [1] : > >> > >> > https://issues.apache.org/jira/secure/attachment/12937633/Node-Attributes-Requirements-Design-doc_v2.pdf > >> [2] : https://issues.apache.org/jira/browse/YARN-3409 > >> [3] : https://issues.apache.org/jira/browse/YARN-7865 > >> [4] : https://issues.apache.org/jira/browse/YARN-8718 > >> > >> Thanks, > >> + Naga & Sunil Govindan > >> > > >
Re: [VOTE] Release Apache Hadoop 3.1.1 - RC0
+1 (binding) - Built from source and deployed 2 node RM HA non-secure cluster - ATSv2 enabled for profile hbase.profile=2.0. Enabled both ats 1.5 and 2.0. - Ran sample MR and DS applications - Verified for RM HA, Work preserving restart, NM work preserving restart. - Verified jobs for application priority and application timeout. - Accessed UI2 for all web pages. Verified for UI2+ATSv2 integration web pages. -Rohith Sharma K S On Fri, 3 Aug 2018 at 00:14, Wangda Tan wrote: > Hi folks, > > I've created RC0 for Apache Hadoop 3.1.1. The artifacts are available here: > > http://people.apache.org/~wangda/hadoop-3.1.1-RC0/ > > The RC tag in git is release-3.1.1-RC0: > https://github.com/apache/hadoop/commits/release-3.1.1-RC0 > > The maven artifacts are available via repository.apache.org at > https://repository.apache.org/content/repositories/orgapachehadoop-1139/ > > You can find my public key at > http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS > > This vote will run 5 days from now. > > 3.1.1 contains 435 [1] fixed JIRA issues since 3.1.0. > > I have done testing with a pseudo cluster and distributed shell job. My +1 > to start. > > Best, > Wangda Tan > > [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.1) > ORDER BY priority DESC >
Re: [VOTE] reset/force push to clean up inadvertent merge commit pushed to trunk
+1 On 5 July 2018 at 14:37, Subru Krishnan wrote: > Folks, > > There was a merge commit accidentally pushed to trunk, you can find the > details in the mail thread [1]. > > I have raised an INFRA ticket [2] to reset/force push to clean up trunk. > > Can we have a quick vote for INFRA sign-off to proceed as this is blocking > all commits? > > Thanks, > Subru > > [1] > http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201807.mbox/% > 3CCAHqguubKBqwfUMwhtJuSD7X1Bgfro_P6FV%2BhhFhMMYRaxFsF9Q% > 40mail.gmail.com%3E > [2] https://issues.apache.org/jira/browse/INFRA-16727 >
[jira] [Created] (MAPREDUCE-7106) RM Recovery is delayed much if ATS1/1.5 server is not running.
Rohith Sharma K S created MAPREDUCE-7106: Summary: RM Recovery is delayed much if ATS1/1.5 server is not running. Key: MAPREDUCE-7106 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7106 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S It is observed that if the ATS1/1.5 daemon is not running, RM recovery is delayed until the timeline client times out for each application. By default, each timeout takes around 5 minutes. If there are many completed applications, the total time the RM waits is *(number of completed applications in a cluster * 5 minutes)*, which makes the RM appear hung. The primary reason for this behavior is YARN-3044 and YARN-4129, which refactored the existing system metrics publisher. This refactoring made the appFinished event synchronous, whereas it was asynchronous earlier. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
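To make the arithmetic in the report above concrete, here is a minimal sketch (not Hadoop code) of how the wait grows when every completed application blocks on a timeline client timeout. The 5-minute figure is the approximate default quoted in the report; the function name and the sample application counts are hypothetical.

```python
from datetime import timedelta

# Approximate per-application timeline client timeout quoted in the report.
TIMELINE_CLIENT_TIMEOUT = timedelta(minutes=5)


def estimated_recovery_delay(
    completed_apps: int,
    per_app_timeout: timedelta = TIMELINE_CLIENT_TIMEOUT,
) -> timedelta:
    """Worst-case extra RM recovery time when each appFinished publish is
    synchronous and waits for the timeline client to time out."""
    if completed_apps < 0:
        raise ValueError("completed_apps must be non-negative")
    return completed_apps * per_app_timeout


if __name__ == "__main__":
    # Hypothetical numbers of completed applications kept in the state store.
    for apps in (1, 12, 1000):
        print(f"{apps} completed apps -> {estimated_recovery_delay(apps)}")
```

The linear growth is the point: with asynchronous publishing, as before the refactoring, the timeouts would not sit on the recovery path at all.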
Re: [VOTE] Release Apache Hadoop 2.8.4 (RC0)
+1 (binding) - Downloaded source and built from it. Installed 2 node RM HA cluster. - Verified for RM HA, RM Restart, work preserving restart. - Ran sample MR jobs and Distributed shell with HA scenario. -Rohith Sharma K S On 8 May 2018 at 23:11, 俊平堵 wrote: > Hi all, > I've created the first release candidate (RC0) for Apache Hadoop > 2.8.4. This is our next maint release to follow up 2.8.3. It includes 77 > important fixes and improvements. > > The RC artifacts are available at: > http://home.apache.org/~junping_du/hadoop-2.8.4-RC0 > > The RC tag in git is: release-2.8.4-RC0 > > The maven artifacts are available via repository.apache.org< > http://repository.apache.org> at: > https://repository.apache.org/content/repositories/orgapachehadoop-1118 > > Please try the release and vote; the vote will run for the usual 5 > working days, ending on 5/14/2018 PST time. > > Thanks, > > Junping >
Re: [VOTE] Release Apache Hadoop 3.1.0 (RC1)
+1 (binding) * Downloaded source, built from source with -Dhbase.profile=2.0. * Installed RM HA cluster integrated with ATSv2. Installed HBase-2.0-beta1 for ATSv2 back end. Scheduler is configured with 2 level queue hierarchy. * Ran sample jobs such as MR/Distributed shell and verified for ** ATSv2 Timeline Reader REST API's. Validated for data published in ATSv2. ** YARN UI2 accessed for Flow Activity pages. Navigated inside flow activity page for all other info. ** YARN old UI also verified for all the pages ** RM HA/Restart/work-preserving-restart are tested while running a job. ** NM restart scenarios are verified. ** Application timeout and application priority feature is tested. Thanks & Regards Rohith Sharma K S On 30 March 2018 at 09:45, Wangda Tan wrote: > Hi folks, > > Thanks to the many who helped with this release since Dec 2017 [1]. We've > created RC1 for Apache Hadoop 3.1.0. The artifacts are available here: > > http://people.apache.org/~wangda/hadoop-3.1.0-RC1 > > The RC tag in git is release-3.1.0-RC1. Last git commit SHA is > 16b70619a24cdcf5d3b0fcf4b58ca77238ccbe6d > > The maven artifacts are available via repository.apache.org at > https://repository.apache.org/content/repositories/orgapachehadoop-1090/ > This vote will run 5 days, ending on Apr 3 at 11:59 pm Pacific. > > 3.1.0 contains 766 [2] fixed JIRA issues since 3.0.0. Notable additions > include the first class GPU/FPGA support on YARN, Native services, Support > rich placement constraints in YARN, S3-related enhancements, allow HDFS > block replicas to be provided by an external storage system, etc. > > For 3.1.0 RC0 vote discussion, please see [3]. > > We’d like to use this as a starting release for 3.1.x [1], depending on how > it goes, get it stabilized and potentially use a 3.1.1 in several weeks as > the stable release. > > We have done testing with a pseudo cluster: > - Ran distributed job. > - GPU scheduling/isolation. 
> - Placement constraints (intra-application anti-affinity) by using > distributed shell. > > My +1 to start. > > Best, > Wangda/Vinod > > [1] > https://lists.apache.org/thread.html/b3fb3b6da8b6357a68513a6dfd104b > c9e19e559aedc5ebedb4ca08c8@%3Cyarn-dev.hadoop.apache.org%3E > [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.0) > AND fixVersion not in (3.0.0, 3.0.0-beta1) AND status = Resolved ORDER BY > fixVersion ASC > [3] > https://lists.apache.org/thread.html/b3a7dc075b7329fd660f65b48237d7 > 2d4061f26f83547e41d0983ea6@%3Cyarn-dev.hadoop.apache.org%3E >
[jira] [Resolved] (MAPREDUCE-7055) MR jobs are failing with Could not find or load main class for MRAppMaster
[ https://issues.apache.org/jira/browse/MAPREDUCE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S resolved MAPREDUCE-7055. -- Resolution: Resolved Closing this JIRA as Resolved since YARN-7677 is reverted. > MR jobs are failing with Could not find or load main class for MRAppMaster > -- > > Key: MAPREDUCE-7055 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7055 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 3.1.0, 3.2.0 > Reporter: Rohith Sharma K S >Priority: Blocker > Attachments: app-logs.zip, conf.zip > > > It is observed that MR jobs are failing with *Error: Could not find or load > main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster" even though > HADOOP_MAPRED_HOME is set in mapred-site.xml > Tried building tar.gz in branch-3.0 and seems works fine with same > configurations. But in branch-3.1 and trunk, it is failing. I got > launch_container.sh for both and compared classpath exported before > launching AM. Both classpath entries are same, but AM launching is failing > with above mentioned error. > Its better to confirm as 3.1 release is going to happen soon. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7055) MR jobs are failing with Could not find or load main class for MRAppMaster
Rohith Sharma K S created MAPREDUCE-7055: Summary: MR jobs are failing with Could not find or load main class for MRAppMaster Key: MAPREDUCE-7055 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7055 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Sharma K S It is observed that MR jobs are failing with *Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster* even though HADOOP_MAPRED_HOME is set in mapred-site.xml. Building the tar.gz from branch-3.0 seems to work fine with the same configuration, but in branch-3.1 and trunk it fails. I got launch_container.sh for both and compared the classpath exported before launching the AM. Both classpath entries are the same, but the AM launch fails with the above-mentioned error. It's better to confirm, as the 3.1 release is going to happen soon. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
Re: Apache Hadoop 3.0.1 Release plan
I have reduced the priority for YARN-5742 as we have work around. Also, I have removed target version since we do not have plans to handle this in coming releases. -Rohith Sharma K S On 8 February 2018 at 22:30, Lei Xu wrote: > Hi, Brahma > > Thanks for reminder. YARN-5742 does not look like a blocker to me. I > will create a RC right after HADOOP-14060. > > On Thu, Feb 8, 2018 at 7:35 AM, Kihwal Lee wrote: > > HADOOP-14060 is a blocker. Daryn will add more detail to the jira or to > > this thread. > > > > On Thu, Feb 8, 2018 at 7:01 AM, Brahma Reddy Battula < > brbapa...@gmail.com> > > wrote: > >> > >> Hi Eddy, > >> > >> HDFS-12990 got committed to 3.0.1,can we have RC for 3.0.1 (only > >> YARN-5742 > >> blocker is open ) ? > >> > >> > >> On Sat, Feb 3, 2018 at 12:40 AM, Chris Douglas > >> wrote: > >> > >> > On Fri, Feb 2, 2018 at 10:22 AM, Arpit Agarwal > >> > > >> > wrote: > >> > > Do you plan to roll an RC with an uncommitted fix? That isn't the > >> > > right > >> > approach. > >> > > >> > The fix will be committed to the release branch. We'll vote on the > >> > release, and if it receives a majority of +1 votes then it becomes > >> > 3.0.1. That's how the PMC decides how to move forward. In this case, > >> > that will also resolve whether or not it can be committed to trunk. > >> > > >> > If this logic is unpersuasive, then we can require a 2/3 majority to > >> > replace the codebase. Either way, the PMC will vote to define the > >> > consensus view when it is not emergent. > >> > > >> > > This issue has good visibility and enough discussion. > >> > > >> > Yes, it has. We always prefer consensus to voting, but when discussion > >> > reveals that complete consensus is impossible, we still need a way > >> > forward. This is rare, and usually reserved for significant changes > >> > (like merging YARN). Frankly, it's embarrassing to resort to it here, > >> > but here we are. 
> >> > > >> > > If there is a binding veto in effect then the change must be > >> > > abandoned. > >> > Else you should be able to proceed with committing. However, 3.0.0 > must > >> > be > >> > called out as an abandoned release if we commit it. > >> > > >> > This is not accurate. A binding veto from any committer halts > >> > progress, but the PMC sets the direction of the project. That includes > >> > making decisions that are not universally accepted. -C > >> > > >> > > On 2/1/18, 3:01 PM, "Lei Xu" wrote: > >> > > > >> > > Sounds good to me, ATM. > >> > > > >> > > On Thu, Feb 1, 2018 at 2:34 PM, Aaron T. Myers > >> > wrote: > >> > > > Hey Anu, > >> > > > > >> > > > My feeling on HDFS-12990 is that we've discussed it quite a > bit > >> > already and > >> > > > it doesn't seem at this point like either side is going to > >> > > budge. > >> > I'm > >> > > > certainly happy to have a phone call about it, but I don't > >> > > expect > >> > that we'd > >> > > > make much progress. > >> > > > > >> > > > My suggestion is that we simply include the patch posted to > >> > HDFS-12990 in > >> > > > the 3.0.1 RC and call this issue out clearly in the subsequent > >> > VOTE thread > >> > > > for the 3.0.1 release. Eddy, are you up for that? > >> > > > > >> > > > Best, > >> > > > Aaron > >> > > > > >> > > > On Thu, Feb 1, 2018 at 1:13 PM, Lei Xu > wrote: > >> > > >> > >> > > >> +Xiao > >> > > >> > >> > > >> My understanding is that we will have this for 3.0.1. Xiao, > >> > could > >> > > >> you give your inputs here? > >> > > >> > >> > > >> On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer < > >> > aengin...@hortonworks.com> > >> > > >> wrote:
Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
+1 (binding) - built from source and deployed 3 node cluster - installed RM HA cluster along with ATSv2 enabled and new YARN UI. - verified for -- RM HA switch / RM Restart / RM Work preserving restart -- NM work preserving restart -- Ran sample MR jobs and Distributed shell along with multiple RM and NM switch - verified for ATSv2 entities data, REST API's validation as HBase-1.2.6 as back end. - verified for priority. timeout feature of RM. - verified for new YARN UI and pages along with atsv2 integration Thanks & Regards Rohith Sharma K S On 9 December 2017 at 02:01, Andrew Wang wrote: > Hi all, > > Let me start, as always, by thanking the efforts of all the contributors > who contributed to this release, especially those who jumped on the issues > found in RC0. > > I've prepared RC1 for Apache Hadoop 3.0.0. This release incorporates 302 > fixed JIRAs since the previous 3.0.0-beta1 release. > > You can find the artifacts here: > > http://home.apache.org/~wang/3.0.0-RC1/ > > I've done the traditional testing of building from the source tarball and > running a Pi job on a single node cluster. I also verified that the shaded > jars are not empty. > > Found one issue that create-release (probably due to the mvn deploy change) > didn't sign the artifacts, but I fixed that by calling mvn one more time. > Available here: > > https://repository.apache.org/content/repositories/orgapachehadoop-1075/ > > This release will run the standard 5 days, closing on Dec 13th at 12:31pm > Pacific. My +1 to start. > > Best, > Andrew >
Re: [VOTE] Release Apache Hadoop 2.7.5 (RC1)
+1 (binding) Built from source and deployed non-secure 3 node cluster. Verified for - RM HA/RM Restart/RM work preserving restart - Ran sample MR and Distributed shell jobs. Thanks & Regards Rohith Sharma K S On 8 December 2017 at 08:52, Konstantin Shvachko wrote: > Hi everybody, > > I updated CHANGES.txt and fixed documentation links. > Also committed MAPREDUCE-6165, which fixes a consistently failing test. > > This is RC1 for the next dot release of Apache Hadoop 2.7 line. The > previous one 2.7.4 was released August 4, 2017. > Release 2.7.5 includes critical bug fixes and optimizations. See more > details in Release Note: > http://home.apache.org/~shv/hadoop-2.7.5-RC1/releasenotes.html > > The RC1 is available at: http://home.apache.org/~shv/hadoop-2.7.5-RC1/ > > Please give it a try and vote on this thread. The vote will run for 5 days > ending 12/13/2017. > > My up to date public key is available from: > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS > > Thanks, > --Konstantin >
Re: [VOTE] Release Apache Hadoop 2.8.3 (RC0)
+1 (Binding) Built from source and deployed non-secure 3 node cluster Verified for - RM HA/RM Restart/RM work preserving restart - NM work preserving restart - Ran sample MR and Distributed shell jobs. Thanks & Regards Rohith Sharma K S On 5 December 2017 at 15:28, Junping Du wrote: > Hi all, > I've created the first release candidate (RC0) for Apache Hadoop > 2.8.3. This is our next maint release to follow up 2.8.2. It includes 79 > important fixes and improvements. > > The RC artifacts are available at: http://home.apache.org/~ > junping_du/hadoop-2.8.3-RC0 > > The RC tag in git is: release-2.8.3-RC0 > > The maven artifacts are available via repository.apache.org at: > https://repository.apache.org/content/repositories/orgapachehadoop-1072 > > Please try the release and vote; the vote will run for the usual 5 > working days, ending on 12/12/2017 PST time. > > Thanks, > > Junping >
Re: [VOTE] Merge Absolute resource configuration support in Capacity Scheduler (YARN-5881) to trunk
+1 On Nov 30, 2017 7:26 AM, "Sunil G" wrote: > Hi All, > > > Based on the discussion at [1], I'd like to start a vote to merge feature > branch > > YARN-5881 to trunk. Vote will run for 7 days, ending Wednesday Dec 6 at > 6:00PM PDT. > > > This branch adds support to configure queue capacity as absolute resource > in > > capacity scheduler. This will help admins who want fine control of > resources of queues. > > > Feature development is done at YARN-5881 [2], jenkins build is here > (YARN-7510 [3]). > > All required tasks for this feature are committed. This feature changes > RM’s Capacity Scheduler only, > > and we did extensive tests for the feature in the last couple of months > including performance tests. > > > Key points: > > - The feature is turned off by default, and have to configure absolute > resource to enable same. > > - Detailed documentation about how to use this feature is done as part of > [4]. > > - No major performance degradation is observed with this branch work. SLS > and UT performance > > tests are done. > > > There were 11 subtasks completed for this feature. > > > Huge thanks to everyone who helped with reviews, commits, guidance, and > > technical discussion/design, including Wangda Tan, Vinod Vavilapalli, > Rohith Sharma K S, Eric Payne . > > > [1] : > http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201711.mbox/% > 3CCACYiTuhKhF1JCtR7ZFuZSEKQ4sBvN_n_tV5GHsbJ3YeyJP%2BP4Q% > 40mail.gmail.com%3E > > [2] : https://issues.apache.org/jira/browse/YARN-5881 > > [3] : https://issues.apache.org/jira/browse/YARN-7510 > > [4] : https://issues.apache.org/jira/browse/YARN-7533 > > > Regards > > Sunil and Wangda >
Re: [VOTE] Release Apache Hadoop 3.0.0 RC0
Hi Vinod/Allen bq. We need to figure out if this V1 TimelineService should even be support given ATSv2. ATSv2 is in alpha phase. We should continue to support Timeline Service V1 till we have the detailed entity level ACLs in V2. And also there are proposal to upgrade/migration paths from TSv1 to TSv2. bq. If ATSv1 isn’t replaced by ATSv2, then why is it marked deprecated? Ideally it should not be. Can you point out where it is marked as deprecated? If it is in historyserver daemon start, that change made very long back when timeline server added. Thanks & Regards Rohith Sharma K S On 26 November 2017 at 03:28, Allen Wittenauer wrote: > > > On Nov 21, 2017, at 2:16 PM, Vinod Kumar Vavilapalli > wrote: > > > >>> - $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start historyserver doesn't > even work. Not just deprecated in favor of timelineserver as was advertised. > >> > >> This works for me in trunk and the bash code doesn’t appear to > have changed in a very long time. Probably something local to your > install. (I do notice that the deprecation message says “starting” which > is awkward when the stop command is given though.) Also: is the > deprecation message even true at this point? > > > > > > Sorry, I mischaracterized the problem. > > > > The real issue is that I cannot use this command line when the MapReduce > JobHistoryServer is already started on the same machine. > > The specific string is: > > hadoop-${HADOOP_IDENT_STRING}-${HADOOP_SUBCMD}.pid > > More specifically, the pid handling code will conflict if the > following are true: > > * same machine (obviously) > * same subcommand name > * same HADOOP_IDENT_USER: which by default is the user > name of whatever starts it… but was designed to be overridden way back in > hadoop 0.X. > > … which means for most production setups, this is probably not > real a problem. > > > > So, it looks like in shell-scripts, there can ever be only one daemon of > a given name, irrespective of which daemon scripts are invoked. 
> > Correct. Naming multiple, different daemons the same thing is > extremely anti-user. In fact, I thought this was originally about the > “other” history server. > > > > > We need to figure out two things here > > (a) The behavior of this command. Clearly, it will conflict with the > MapReduce JHS - only one of them can be started on the same node. > > … by the same user, by default. Started by a different user or > different HADOOP_IDENT_USER, it will come up just fine. > > > (b) We need to figure out if this V1 TimelineService should even be > support given ATSv2. > > If ATSv1 isn’t replaced by ATSv2, then why is it marked deprecated? > > > On Nov 22, 2017, at 9:45 AM, Brahma Reddy Battula > wrote: > > > > 1) Change the name > > 2) Create PID based on the CLASS Name, here applicationhistoryserver and > jobhistoryserver > > 3) Use same as branch-2.9..i.e suffixing with mapred or yarn > > > > > > @allen, any thoughts on this..? > > Using the classname works in this instance, but just as we saw > with the router daemons, people tend to use the same class names when > building different components. It also means that if different daemons can > be started in different ways from the same class dependent upon options, > this conflict will still exist. Also, with dynamic commands, it is very > possible to run the same daemon from multiple start points. > > As part of this discussion, I think it’s important to recognize: > > a) This is likely to be primarily impacting developers. > b) We’re talking about two daemons where one has been deprecated. > c) Calling two different daemons “history server” is just awful from an > end user perspective. > d) There is already a work around in place if one absolutely needs to run > both on the same node as the same user, just as people do with datanode and > nodemanager today. > > > > - > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org > >
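The PID file clash discussed in this thread can be illustrated with a small sketch. It only mirrors the naming pattern quoted above (`hadoop-${HADOOP_IDENT_STRING}-${HADOOP_SUBCMD}.pid`); the function name, the `/tmp` directory, and the sample identity strings are assumptions for illustration and are not the actual shell code.

```python
import posixpath


def pid_file_path(ident_string: str, subcommand: str, pid_dir: str = "/tmp") -> str:
    """Build a PID file path following the pattern quoted in the thread:
    hadoop-${HADOOP_IDENT_STRING}-${HADOOP_SUBCMD}.pid
    (pid_dir is a hypothetical location)."""
    return posixpath.join(pid_dir, f"hadoop-{ident_string}-{subcommand}.pid")


# Two different daemons that share the subcommand name "historyserver" and
# are started under the same identity string map to the same PID file.
mapred_history = pid_file_path("hadoop", "historyserver")
yarn_history = pid_file_path("hadoop", "historyserver")
collides = mapred_history == yarn_history

# Starting one of them under a different identity string (the workaround
# mentioned in the thread) yields a distinct PID file.
yarn_history_other_ident = pid_file_path("yarn", "historyserver")

if __name__ == "__main__":
    print("same identity collides:", collides)
    print("different identity:", yarn_history_other_ident)
```

This is why the clash only shows up when both conditions hold on one machine: the same subcommand name and the same identity string.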
Re: [DISCUSS] Merge Absolute resource configuration support in Capacity Scheduler (YARN-5881) to trunk
+1, thanks Sunil for working on this feature! -Rohith Sharma K S On 24 November 2017 at 23:19, Sunil G wrote: > Hi All, > > We would like to bring up the discussion of merging “absolute min/max > resources support in capacity scheduler” branch (YARN-5881) [2] into trunk > in a few weeks. The goal is to get it in for Hadoop 3.1. > > *Major work happened in this branch* > >- YARN-6471. Support to add min/max resource configuration for a queue >- YARN-7332. Compute effectiveCapacity per each resource vector >- YARN-7411. Inter-Queue preemption's computeFixpointAllocation need to >handle absolute resources. > > *Regarding design details* > > Please refer [1] for detailed design document. > > *Regarding to testing:* > > We did extensive tests for the feature in the last couple of months. > Comparing to latest trunk. > > - For SLS benchmark: We didn't see observable performance gap from > simulated test based on 8K nodes SLS traces (1 PB memory). We got 3k+ > containers allocated per second. > > - For microbenchmark: We use performance test cases added by YARN 6775, it > did not show much performance regression comparing to trunk. > > *YARN-5881* <https://issues.apache.org/jira/browse/YARN-5881> > > #ResourceTypes = 2. Avg of fastest 20: 55294.52 > #ResourceTypes = 2. Avg of fastest 20: 55401.66 > > *trunk* > #ResourceTypes = 2. Avg of fastest 20: 55865.92 > #ResourceTypes = 2. Avg of fastest 20: 55096.418 > > *Regarding to API stability:* > > All newly added @Public APIs are @Unstable. > > Documentation jira [3] could help to provide detailed configuration > details. This feature works from end-to-end and we are running this in our > development cluster for last couple of months and undergone good amount of > testing. Branch code is run against trunk and tracked via [4]. > > We would love to get your thoughts before opening a voting thread. 
> > Special thanks to a team of folks who worked hard and contributed towards > this efforts including design discussion / patch / reviews, etc.: Wangda > Tan, Vinod Kumar Vavilappali, Rohith Sharma K S. > > [1] : > https://issues.apache.org/jira/secure/attachment/ > 12855984/YARN-5881.Support.Absolute.Min.Max.Resource.In. > Capacity.Scheduler.design-doc.v1.pdf > [2] : https://issues.apache.org/jira/browse/YARN-5881 > > [3] : https://issues.apache.org/jira/browse/YARN-7533 > > [4] : https://issues.apache.org/jira/browse/YARN-7510 > > Thanks, > > Sunil G and Wangda Tan >
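As a quick check of the microbenchmark figures quoted above, the two runs per branch can be averaged and compared. This is a small sketch using only the four numbers from the mail; the units and whether higher is better are not stated in the thread, so the result is read only as a relative difference.

```python
# "Avg of fastest 20" figures quoted in the thread (#ResourceTypes = 2).
# Units are not stated in the mail.
branch_runs = [55294.52, 55401.66]   # YARN-5881 branch
trunk_runs = [55865.92, 55096.418]   # trunk


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


branch_avg = mean(branch_runs)
trunk_avg = mean(trunk_runs)

# Relative difference between the two averages, as a fraction of trunk.
relative_gap = (trunk_avg - branch_avg) / trunk_avg

# Run-to-run spread on trunk alone, for comparison.
trunk_spread = max(trunk_runs) - min(trunk_runs)

print(f"branch avg: {branch_avg:.2f}")
print(f"trunk avg:  {trunk_avg:.2f}")
print(f"relative gap: {relative_gap:.2%}")
print(f"trunk run-to-run spread: {trunk_spread:.2f}")
```

The averages differ by roughly 0.24%, which is smaller than the difference between the two trunk runs themselves, consistent with the statement that no significant regression was observed.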
[jira] [Created] (MAPREDUCE-7014) Fix java doc errors
Rohith Sharma K S created MAPREDUCE-7014: Summary: Fix java doc errors Key: MAPREDUCE-7014 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7014 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Sharma K S Trunk compilation fails with Java Doc errors. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
Re: [VOTE] Release Apache Hadoop 2.9.0 (RC3)
+1(binding) * Downloaded source and verified for checksum. Built from source and deployed RM HA Non-Secure cluster along with Atsv2 and yarn-ui enabled! * Tested for ** RM HA/Restart/Work-preserving restart and NM work preserving restart scenarios ** Application priority and Application Timeout features are verified ** Ran sample MR and Distributed jobs to test HA/work-preserving restart scenarios/priority/lifetime features ** Verified for ATSv2 data using REST queries. ** Accessed new yarn-ui for data validations. All the pages and links are verified. * Also integrated JCarder tool to verify all above cases to ensure no deadlock cycles are found in basic scenarios. -Rohith Sharma K S On 14 November 2017 at 05:40, Arun Suresh wrote: > Hi Folks, > > Apache Hadoop 2.9.0 is the first release of Hadoop 2.9 line and will be the > starting release for Apache Hadoop 2.9.x line - it includes 30 New Features > with 500+ subtasks, 407 Improvements, 790 Bug fixes new fixed issues since > 2.8.2. > > More information about the 2.9.0 release plan can be found here: > *https://cwiki.apache.org/confluence/display/HADOOP/ > Roadmap#Roadmap-Version2.9 > <https://cwiki.apache.org/confluence/display/HADOOP/ > Roadmap#Roadmap-Version2.9>* > > New RC is available at: *https://home.apache.org/~ > asuresh/hadoop-2.9.0-RC3/ > <https://home.apache.org/~asuresh/hadoop-2.9.0-RC3/>* > > The RC tag in git is: release-2.9.0-RC3, and the latest commit id is: > 756ebc8394e473ac25feac05fa493f6d612e6c50. > > The maven artifacts are available via repository.apache.org at: > <https://www.google.com/url?q=https%3A%2F%2Frepository. > apache.org%2Fcontent%2Frepositories%2Forgapachehadoop-1066&sa=D& > sntz=1&usg=AFQjCNFcern4uingMV_sEreko_zeLlgdlg>*https:// > repository.apache.org/content/repositories/orgapachehadoop-1068/ > <https://repository.apache.org/content/repositories/orgapachehadoop-1068/ > >* > > We are carrying over the votes from the previous RC given that the delta is > the license fix. 
> > Given the above - we are also going to stick with the original deadline for > the vote : ending on Friday 17th November 2017 2pm PT time. > > Thanks, > -Arun/Subru >
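For anyone repeating the "verified for checksum" step, a minimal sketch follows. It assumes a SHA-256 digest and a digest string copied by hand from the published checksum file; the helper names `sha256_of` and `matches` are hypothetical, and the release may publish other digest types or file formats.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Usage would be along the lines of `matches("hadoop-2.9.0-src.tar.gz", digest_from_checksum_file)`, where both the tarball name and the digest variable are placeholders. This checks integrity only; verifying the release signature is a separate step.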
Re: [VOTE] Release Apache Hadoop 2.9.0 (RC2)
+1(binding) * Downloaded source and verified for checksum. Built from source and deployed RM HA Non-Secure cluster along with Atsv2 and yarn-ui enabled! * Tested for ** RM HA/Restart/Work-preserving restart and NM work preserving restart scenarios ** Application priority and Application Timeout features are verified ** Ran sample MR and Distributed jobs to test HA/work-preserving restart scenarios/priority/lifetime features ** Verified for ATSv2 data using REST queries. ** Accessed new yarn-ui for data validations. All the pages and links are verified. * Also integrated JCarder tool to verify all above cases to ensure no deadlock cycles are found in basic scenarios. - Rohith Sharma K S On 13 November 2017 at 03:01, Subru Krishnan wrote: > Hi Folks, > > Apache Hadoop 2.9.0 is the first release of Hadoop 2.9 line and will be the > starting release for Apache Hadoop 2.9.x line - it includes 30 New Features > with 500+ subtasks, 407 Improvements, 790 Bug fixes new fixed issues since > 2.8.2. > > More information about the 2.9.0 release plan can be found here: > *https://cwiki.apache.org/confluence/display/HADOOP/ > Roadmap#Roadmap-Version2.9 > <https://cwiki.apache.org/confluence/display/HADOOP/ > Roadmap#Roadmap-Version2.9>* > > New RC is available at: http://home.apache.org/~asuresh/hadoop-2.9.0-RC2/ > <http://www.google.com/url?q=http%3A%2F%2Fhome.apache.org% > 2F~asuresh%2Fhadoop-2.9.0-RC1%2F&sa=D&sntz=1&usg= > AFQjCNE7BF35IDIMZID3hPqiNglWEVsTpg> > > The RC tag in git is: release-2.9.0-RC2, and the latest commit id is: > 1eb05c1dd48fbc9e4b375a76f2046a59103bbeb1. > > The maven artifacts are available via repository.apache.org at: > https://repository.apache.org/content/repositories/orgapachehadoop-1067/ > <https://www.google.com/url?q=https%3A%2F%2Frepository. 
> apache.org%2Fcontent%2Frepositories%2Forgapachehadoop-1066&sa=D& > sntz=1&usg=AFQjCNFcern4uingMV_sEreko_zeLlgdlg> > > Please try the release and vote; the vote will run for the usual 5 days, > ending on Friday 17th November 2017 2pm PT time. > > We want to give a big shout out to Sunil, Varun, Rohith, Wangda, Vrushali > and Inigo for the extensive testing/validation which helped prepare for > RC2. Do report your results in this vote as it'll be very useful to the > entire community. > > Thanks, > -Subru/Arun >
Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)
Thanks Sunil for confirmation. Btw, I have raised YARN-7453 <https://issues.apache.org/jira/browse/YARN-7453> JIRA to track this issue. - Rohith Sharma K S On 7 November 2017 at 16:44, Sunil G wrote: > Hi Subru and Arun. > > Thanks for driving 2.9 release. Great work! > > I installed cluster built from source. > - Ran few MR jobs with application priority enabled. Runs fine. > - Accessed new UI and it also seems fine. > > However I am also getting same issue as Rohith reported. > - Started an HA cluster > - Pushed RM to standby > - Pushed back RM to active then seeing an exception. > > org.apache.hadoop.ha.ServiceFailedException: RM could not transition to > Active > at > org.apache.hadoop.yarn.server.resourcemanager. > ActiveStandbyElectorBasedElectorServic > e.becomeActive(ActiveStandbyElectorBasedElectorService.java:146) > at > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive( > ActiveStandbyElector.java:894 > ) > > Caused by: org.apache.zookeeper.KeeperException$NoAuthException: > KeeperErrorCode = NoAuth > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:113) > at org.apache.zookeeper.ZooKeeper.multiInternal( > ZooKeeper.java:949) > > Will check and post more details, > > - Sunil > > > On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S < > rohithsharm...@apache.org> > wrote: > > > Thanks Subru/Arun for the great work! > > > > Downloaded source and built from it. Deployed RM HA non-secured cluster > > along with new YARN UI and ATSv2. > > > > I am facing basic RM HA switch issue after first time successful start. > > *Can > > anyone else is facing this issue?* > > > > When RM is switched from ACTIVE to STANDBY to ACTIVE, RM never switch to > > active successfully. 
Exception trace I see from the log is > > > > 2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.ActiveStandbyElector: > > Exception handling the winning of election > > org.apache.hadoop.ha.ServiceFailedException: RM could not transition to > > Active > > at > > > > org.apache.hadoop.yarn.server.resourcemanager. > ActiveStandbyElectorBasedElectorService.becomeActive( > ActiveStandbyElectorBasedElectorService.java:146) > > at > > > > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive( > ActiveStandbyElector.java:894) > > at > > > > org.apache.hadoop.ha.ActiveStandbyElector.processResult( > ActiveStandbyElector.java:473) > > at > > > > org.apache.zookeeper.ClientCnxn$EventThread. > processEvent(ClientCnxn.java:599) > > at org.apache.zookeeper.ClientCnxn$EventThread.run( > ClientCnxn.java:498) > > Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when > > transitioning to Active mode > > at > > > > org.apache.hadoop.yarn.server.resourcemanager.AdminService. > transitionToActive(AdminService.java:325) > > at > > > > org.apache.hadoop.yarn.server.resourcemanager. > ActiveStandbyElectorBasedElectorService.becomeActive( > ActiveStandbyElectorBasedElectorService.java:144) > > ... 4 more > > Caused by: org.apache.hadoop.service.ServiceStateException: > > org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = > > NoAuth > > at > > > > org.apache.hadoop.service.ServiceStateException.convert( > ServiceStateException.java:105) > > at > > org.apache.hadoop.service.AbstractService.start( > AbstractService.java:205) > > at > > > > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager. 
> startActiveServices(ResourceManager.java:1131) > > at > > > > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run( > ResourceManager.java:1171) > > at > > > > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run( > ResourceManager.java:1167) > > at java.security.AccessController.doPrivileged(Native Method) > > at javax.security.auth.Subject.doAs(Subject.java:422) > > at > > > > org.apache.hadoop.security.UserGroupInformation.doAs( > UserGroupInformation.java:1886) > > at > > > > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager. > transitionToActive(ResourceManager.java:1167) > > at > > > > org.apache.hadoop.yarn.server.resourcemanager.AdminService. > transitionToActive(AdminService.java:320) > > ... 5 more > > Caused by: org.apache.zookeeper.KeeperException$NoAuthException: > > KeeperErrorCode = NoAuth
Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)
Thanks Subru/Arun for the great work! Downloaded the source and built from it. Deployed an RM HA non-secured cluster along with the new YARN UI and ATSv2. I am facing a basic RM HA switch issue after the first successful start. *Is anyone else facing this issue?* When the RM is switched from ACTIVE to STANDBY and back to ACTIVE, it never transitions to active successfully. The exception trace I see in the log is: 2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146) at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894) at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:325) at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) ... 
4 more Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:205) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1131) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1171) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1167) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320) ... 
5 more Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth at org.apache.zookeeper.KeeperException.create(KeeperException.java:113) at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949) at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915) at org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159) at org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44) at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129) at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125) at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) at org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122) at org.apache.hadoop.util.curator.ZKCuratorManager$SafeTransaction.commit(ZKCuratorManager.java:403) at org.apache.hadoop.util.curator.ZKCuratorManager.safeSetData(ZKCuratorManager.java:372) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.getAndIncrementEpoch(ZKRMStateStore.java:493) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:754) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ... 13 more Thanks & Regards Rohith Sharma K S On 4 November 2017 at 04:20, Arun Suresh wrote: > Hi folks, > > Apache Hadoop 2.9.0 is the first stable release of Hadoop 2.9 line and > will be the latest stable/production release for Apache Hadoop - it > includes 30 New Features with 500+ subtasks, 407 Improvements, 787 Bug > fixes new fixed issues since 2.8.2 . 
> > More information about the 2.9.0 release plan can be found here: > *https://cwiki.apache.org/confluence/display/HADOOP/ > Roadmap#Roadmap-Version2.9 > <https://cwiki.apache.org/confluence/display/HADOOP/ > Roadmap#Roadmap-Version2.9>* > > New RC is available at: > http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/ > > The RC tag in git is: release-2.9.0-RC0, and the latest commit id is: > 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a > > The maven artifacts are available via repository.apache.org at: > *https://repository.apache.org/co
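When reading a nested trace like the one in this mail, the innermost "Caused by:" entry is usually the place to start. The sketch below pulls that entry out of a log excerpt; the sample text is abbreviated from the trace above, and the helper name `root_cause` is hypothetical.

```python
import re
from typing import Optional


def root_cause(trace: str) -> Optional[str]:
    """Return the class name of the last 'Caused by:' entry, or None if absent."""
    causes = re.findall(r"Caused by: ([\w.$]+)", trace)
    return causes[-1] if causes else None


# Abbreviated excerpt of the trace quoted in this thread.
sample = """\
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
"""

print(root_cause(sample))
```

For this trace the innermost cause is the ZooKeeper NoAuthException, thrown while the state store commits a transaction through ZKCuratorManager, rather than the ServiceFailedException that appears first in the log.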
Re: [VOTE] Merge yarn-native-services branch into trunk
+1 (binding). Thanks, Jian, for all the great work! Built from the branch and deployed; I was able to bring up services with ATSv2 enabled and the new YARN UI integration. Tried flex, start, and stop operations using the REST APIs. - Rohith Sharma K S On 31 October 2017 at 01:36, Jian He wrote: > Hi All, > > I would like to restart the vote for merging yarn-native-services to trunk. > Since last vote, we have been working on several issues in documentation, > DNS, CLI modifications etc. We believe now the feature is in a much better > shape. > > Some background: > At a high level, the following are the key features implemented. > - YARN-5079[1]. A native YARN framework (ApplicationMaster) to orchestrate > existing services to YARN either docker or non-docker based. > - YARN-4793[2]. A REST API service embedded in RM (optional) for user to > deploy a service via a simple JSON spec > - YARN-4757[3]. Extending today's service registry with a simple DNS > service to enable users to discover services deployed on YARN via standard > DNS lookup > - YARN-6419[4]. UI support for native-services on the new YARN UI > All these new services are optional and are sitting outside of the > existing system, and have no impact on existing system if disabled. > > Special thanks to a team of folks who worked hard towards this: Billie > Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma > K S, Sunil G, Akhil PB, Eric Yang. This effort would not have been possible > without their ideas and hard work. > Also thanks Allen for some review and verifications. > > Thanks, > Jian > > [1] https://issues.apache.org/jira/browse/YARN-5079 > [2] https://issues.apache.org/jira/browse/YARN-4793 > [3] https://issues.apache.org/jira/browse/YARN-4757 > [4] https://issues.apache.org/jira/browse/YARN-6419 >
Re: [VOTE] Release Apache Hadoop 3.0.0-beta1 RC0
+1 (binding) Built from source and deployed a YARN HA cluster with ATSv2 enabled in a non-secured cluster. - tested RM HA/work-preserving restart/NM work-preserving restart for ATSv2 entities. - verified all ATSv2 REST endpoints to retrieve the entities - ran sample MR jobs and distributed jobs Thanks & Regards Rohith Sharma K S On 4 October 2017 at 05:31, Andrew Wang wrote: > Thanks everyone for voting! With 4 binding +1s and 7 non-binding +1s, the > vote passes. > > I'll get started on pushing out the release. > > Best, > Andrew > > On Tue, Oct 3, 2017 at 3:45 PM, Aaron Fabbri wrote: > > > +1 > > > > Built from source. Ran S3A integration tests in us-west-2 with S3Guard > > (both Local and Dynamo metadatastore). > > > > Everything worked fine except I hit one integration test failure. It is > a > > minor test issue IMO and I've filed HADOOP-14927 > > > > Failed tests: > > ITestS3GuardToolDynamoDB>AbstractS3GuardToolTestBase. > testDestroyNoBucket:228 > > Expected an exception, got 0 > > ITestS3GuardToolLocal>AbstractS3GuardToolTestBase. 
> testDestroyNoBucket:228 > > Expected an exception, got 0 > > > > > > > > On Tue, Oct 3, 2017 at 2:45 PM, Ajay Kumar > > wrote: > > > >> +1 (non-binding) > >> > >> - built from source > >> - deployed on single node cluster > >> - Basic hdfs operations > >> - Run wordcount on a text file > >> Thanks, > >> Ajay > >> > >> > >> On 10/3/17, 1:04 PM, "Eric Badger" wrote: > >> > >> +1 (non-binding) > >> > >> - Verified all checksums and signatures > >> - Built native from source on macOS 10.12.6 and RHEL 7.1 > >> - Deployed a single node pseudo cluster > >> - Ran pi and sleep jobs > >> - Verified Docker was marked as experimental > >> > >> Thanks, > >> > >> Eric > >> > >> On Tue, Oct 3, 2017 at 1:41 PM, John Zhuge > >> wrote: > >> > >> > +1 (binding) > >> > > >> >- Verified checksums and signatures of all tarballs > >> >- Built source with native, Java 1.8.0_131-b11 on Mac OS X > >> 10.12.6 > >> >- Verified cloud connectors: > >> > - All S3A integration tests > >> > - All ADL live unit tests > >> >- Deployed both binary and built source to a pseudo cluster, > >> passed the > >> >following sanity tests in insecure, SSL, and SSL+Kerberos mode: > >> > - HDFS basic and ACL > >> > - DistCp basic > >> > - MapReduce wordcount (only failed in SSL+Kerberos mode for > >> binary > >> > tarball, probably unrelated) > >> > - KMS and HttpFS basic > >> > - Balancer start/stop > >> > > >> > Hit the following errors but they don't seem to be blocking: > >> > > >> > == Missing dependencies during build == > >> > > >> > > ERROR: hadoop-aliyun has missing dependencies: > json-lib-jdk15.jar > >> > > ERROR: hadoop-azure has missing dependencies: > >> jetty-util-ajax-9.3.19. > >> > > v20170502.jar > >> > > ERROR: hadoop-azure-datalake has missing dependencies: > >> okhttp-2.4.0.jar > >> > > ERROR: hadoop-azure-datalake has missing dependencies: > >> okio-1.4.0.jar > >> > > >> > > >> > Filed HADOOP-14923, HADOOP-14924, and HADOOP-14925. 
> >> > > >> > == Unit tests failed in Kerberos+SSL mode for KMS and HttpFs > >> default HTTP > >> > servlet /conf, /stacks, and /logLevel == > >> > > >> > One example below: > >> > > >> > >Connecting to > >> > > https://localhost:14000/logLevel?log=org.apache.hadoop.fs. > >> http.server. > >> > HttpFSServer > >> > >Exception in thread "main" > >> > > org.apache.hadoop.security.authentication.client. > >> > AuthenticationException: > >> > > Authentication failed, URL: > >> > > https://localhost:14000/logLevel?log=org.apache.hadoop.fs. > >> http.server. > >> > HttpFSServer&user.name=jzhuge, > >> &
Re: [DISCUSS] Branches and versions for Hadoop 3
On 29 August 2017 at 06:24, Andrew Wang wrote: > So far I've seen no -1's to the branching proposal, so I plan to execute > this tomorrow unless there's further feedback. > For ongoing branch merge threads, i.e. TSv2, voting closes tomorrow. Will that mean merging into both trunk (3.1.0-SNAPSHOT) and branch-3.0 (3.0.0-beta1-SNAPSHOT)? If so, could you wait a couple more days before creating branch-3.0, so that the TSv2 branch merge can be done directly to trunk? > > Regarding the above discussion, I think Jason and I have essentially the > same opinion. > > I hope that keeping trunk a release branch means a higher bar for merges > and code review in general. In the past, I've seen some patches committed > to trunk-only as a way of passing responsibility to a future user or > reviewer. That doesn't help anyone; patches should be committed with the > intent of running them in production. > > I'd also like to repeat the above thanks to the many, many contributors > who've helped with release improvements. Allen's work on create-release and > automated changes and release notes were essential, as was Xiao's work on > LICENSE and NOTICE files. I'm also looking forward to Marton's site > improvements, which addresses one of the remaining sore spots in the > release process. > > Things have gotten smoother with each alpha we've done over the last year, > and it's a testament to everyone's work that we have a good probability of > shipping beta and GA later this year. > > Cheers, > Andrew > >
Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk
+1 (binding) Thank you very much for the great team work! Built from source and deployed in secured cluster. The below are the test result. Deployment : Standard hadoop security deployment authentication and authorization as well. Branch-2 Hadoop and Hbase security cluster. Branch-3 Hadoop security cluster. HBase client is pointing to Branch-2 hbase cluster. All security configurations are set in-place. Each service is running with its own user. Say, HDFS is running with hdfs, YARN user is running with yarn, Hbase is running with hbase Smoke test user : test-user Test Cases : Authentication : Verify for all daemons start up successful : OK Run a MR job using test-user : OK Verify for REST API’s with in the scope of application : OK Verify for REST API’s newly added I.e outside scope of application : OK. RM Restart/ NM restart / RM_work-preserving restart has executed and verified for data : OK. (Entity validation is done, but not entity data validation! Token redistribution to AM, NM is verified. Authorization : 1 . Basic whitelisting of users to read has been validated. Works as expected! Disabling TSv2 configuration is also being tested. Thanks & Regards Rohith Sharma K S On 22 August 2017 at 12:02, Vrushali Channapattan wrote: > Hi folks, > > Per earlier discussion [1], I'd like to start a formal vote to merge > feature branch YARN-5355 [2] (Timeline Service v.2) to trunk. The vote will > run for 7 days, and will end August 29 11:00 PM PDT. > > We have previously completed one merge onto trunk [3] and Timeline Service > v2 has been part of Hadoop release 3.0.0-alpha1. > > Since then, we have been working on extending the capabilities of Timeline > Service v2 in a feature branch [2] for a while, and we are reasonably > confident that the state of the feature meets the criteria to be merged > onto trunk and we'd love folks to get their hands on it in a test capacity > and provide valuable feedback so that we can make it production-ready. 
> > In a nutshell, Timeline Service v.2 delivers significant scalability and > usability improvements based on a new architecture. What we would like to > merge to trunk is termed "alpha 2" (milestone 2). The feature has a > complete end-to-end read/write flow with security and read level > authorization via whitelists. You should be able to start setting it up and > testing it. > > At a high level, the following are the key features that have been > implemented since alpha1: > - Security via Kerberos Authentication and delegation tokens > - Read side simple authorization via whitelist > - Client configurable entity sort ordering > - Richer REST APIs for apps, app attempts, containers, fetching metrics by > timerange, pagination, sub-app entities > - Support for storing sub-application entities (entities that exist outside > the scope of an application) > - Configurable TTLs (time-to-live) for tables, configurable table prefixes, > configurable hbase cluster > - Flow level aggregations done as dynamic (table level) coprocessors > - Uses latest stable HBase release 1.2.6 > > There are a total of 82 subtasks that were completed as part of this > effort. > > We paid close attention to ensure that once disabled Timeline Service v.2 > does not impact existing functionality when disabled (by default). > > Special thanks to a team of folks who worked hard and contributed towards > this effort with patches, reviews and guidance: Rohith Sharma K S, Varun > Saxena, Haibo Chen, Sangjin Lee, Li Lu, Vinod Kumar Vavilapalli, Joep > Rottinghuis, Jason Lowe, Jian He, Robert Kanter, Micheal Stack. > > Regards, > Vrushali > > [1] http://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27383.html > [2] https://issues.apache.org/jira/browse/YARN-5355 > [3] https://issues.apache.org/jira/browse/YARN-2928 > [4] https://github.com/apache/hadoop/commits/YARN-5355 >
Re: Branch merges and 3.0.0-beta1 scope
On 25 August 2017 at 22:39, Andrew Wang wrote: > Hi Rohith, > > Given that we're advertising TSv2 as an alpha feature, I think we're > allowed to break compatibility. Let's make sure this is clear in the > release notes and documentation. > > That said, with TSv2 phase 2, is the API going to be frozen? The umbrella > JIRA refers to "TSv2 alpha2" which indicated to me it was still alpha-level > quality and stability. > Yes, we have decided to freeze the APIs. I do not expect us to make any compatibility-breaking changes in the future. > > Best, > Andrew >
Re: Branch merges and 3.0.0-beta1 scope
Hi Andrew, Thanks for the update on the release plan! I would like to discuss release compatibility specifically. What compatibility must be maintained for GA if we don't merge into the beta1 release? IIUC, all the releases so far were alpha, where compatibility was not that important and all public interfaces were subject to modification. Once we release beta, compatibility will matter. During this gap, i.e. between the beta and GA releases, should we maintain compatibility? If my understanding is right, then TSv2 has to be merged for the beta1 release, because TSv2 phase 2 contains compatibility changes from phase 1. Thanks & Regards Rohith Sharma K S On 25 August 2017 at 02:03, Andrew Wang wrote: > Glad to see the discussion continued in my absence :) > > From a release management perspective, it's *extremely* reasonable to block > the inclusion of new features a month from the planned release date. A > typical software development lifecycle includes weeks of feature freeze and > weeks of code freeze. It is no knock on any developer or any feature to say > that we should not include something in 3.0.0. > > I've been very open and clear about the goals, schedule, and scope of 3.0.0 > over the last year plus. The point of the extended alpha process was to get > all our features in during alpha, and the alpha merge window has been open > for a year. I'm unmoved by arguments about how long a feature has been > worked on. None of these were part of the original 3.0.0 scope, and our > users have been waiting even longer for big-ticket 3.0 items like JDK8 and > HDFS EC that were part of the discussed scope. > > I see that two VOTEs have gone out since I was out. I still plan to follow > the proposal in my original email. This means I'll cut branch-3 and > branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will open > up development for Hadoop 3.1.0 and 4.0.0. > > I'm reaching out to the lead contributor of each of these features > individually to discuss. 
We need to close on this quickly, and email is too > low bandwidth at this stage. > > Best, > Andrew >
Re: [DISCUSS] Merging YARN-5355 (Timeline Service v.2) to trunk
Hi Sangjin, Thanks for raising this point. We did a similar exercise with the current YARN-5355 branch today. Tests were validated against the default configuration and against Timeline Server v.1.5 as well. All major YARN features, such as RM HA/restart/work-preserving restart, NM restart, scheduling, and sample MR/distributed shell job runs, were validated. Everything looks fine. This testing will be repeated once the YARN-5355 branch code is completely frozen, in a couple of days. Thanks & Regards Rohith Sharma K S On 18 August 2017 at 22:56, Sangjin Lee wrote: > Kudos to Vrushali and the team for getting ready for this large and > important feature! I know a huge team effort went into this. I look forward > to seeing this merged. > > I'd like to ask one piece of due diligence. Could you please inspect > rigorously to ensure that when disabled Timeline Service v.2 does not > impact other features in any way? We did a similar exercise when we had the > first drop, and it would be good to repeat that... Thanks! > > Sangjin > > On Wed, Aug 16, 2017 at 11:44 AM, Andrew Wang > wrote: > > > Great, thanks Vrushali! Sounds good to me. > > > > I have a few procedural release notes comments I'll put on YARN-5355, to > > make sure we advertise this to our users appropriately. > > > > On Wed, Aug 16, 2017 at 11:32 AM, Vrushali Channapattan < > > vrushal...@gmail.com> wrote: > > > > > Hi Andrew, > > > > > > Thanks for your response! > > > > > > There have been no changes to existing APIs since alpha1. > > > > > > We at Twitter have tested the feature to demonstrate it works at what > we > > > consider moderate scale but this did not include the security related > > > testing. The security testing is in progress at present by Timeline > > Service > > > V2 team in the community and we think we will have more details on this > > > very soon. > > > > > > About the jiras under YARN-5355: Only 3 of those sub-tasks are what we > > > think of as "merge-blockers". 
The issues being targeted for merge are > in > > > [link1] below. There are about 59 jiras of which 56 are completed. > > > > > > We plan to make a new umbrella jira after the merge to trunk. We will > > then > > > create a new branch with the new jira name and move these open jiras > > under > > > YARN-5355 as subtasks of that new umbrella jira. > > > > > > thanks > > > Vrushali > > > [link1] https://issues.apache.org/jira/projects/YARN/versions/12337991 > > > > > > > > > On Wed, Aug 16, 2017 at 10:47 AM, Andrew Wang < > andrew.w...@cloudera.com> > > > wrote: > > > > > >> Hi Vrushali, > > >> > > >> Glad to hear this major dev milestone is nearing completion! > > >> > > >> Repeating my request on other merge [DISCUSS] threads, could you > comment > > >> on testing and API stability of this merge? Our timeline for beta1 is > > about > > >> a month out, so there's not much time to fix things beforehand. > > >> > > >> Looking at YARN-5355 there are also many unresolved subtasks. Should > > most > > >> of these be moved out to a new umbrella? I'm wondering what needs to > be > > >> completed before sending the merge vote. > > >> > > >> Given that TSv2 is committed for 3.0.0 GA, I'm more willing to flex > the > > >> beta1 release date for this feature than others. Hopefully that won't > be > > >> necessary though :) > > >> > > >> Best, > > >> Andrew > > >> > > >> On Wed, Aug 16, 2017 at 10:26 AM, Vrushali Channapattan < > > >> vrushalic2...@gmail.com> wrote: > > >> > > >>> Looks like some of the hyperlinks appear messed up, my apologies, > > >>> resending > > >>> the same email with hopefully better looking content: > > >>> > > >>> Hi All, > > >>> > > >>> I'd like to open a discussion for merging Timeline Service v2 > > (YARN-5355) > > >>> to trunk in a few weeks. > > >>> > > >>> We have previously completed one merge onto trunk [1] and Timeline > > >>> Service > > >>> v2 has been part of Hadoop release 3.0.0-alpha1. 
> > >>> > > >>> Since then, we have been working on extending the capabilities of > > >>> Timeline > > >>> Service v
Re: About 2.7.4 Release
A couple more JIRAs need to be backported for the 2.7.4 release. These will solve RM HA instability issues. https://issues.apache.org/jira/browse/YARN-5333 https://issues.apache.org/jira/browse/YARN-5988 https://issues.apache.org/jira/browse/YARN-6304 I will raise JIRAs to backport them. @Akira, could you help add these JIRAs to the wiki? Thanks & Regards Rohith Sharma K S On 29 May 2017 at 12:19, Akira Ajisaka wrote: > Created a page for 2.7.4 release. > https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.7.4 > > If you want to edit this wiki, please ping me. > > Regards, > Akira > > > On 2017/05/23 4:42, Brahma Reddy Battula wrote: > >> Hi Konstantin Shvachko >> >> >> how about creating a wiki page for 2.7.4 release status like 2.8 and >> trunk in following link.?? >> >> >> https://cwiki.apache.org/confluence/display/HADOOP >> >> >> >> From: Konstantin Shvachko >> Sent: Saturday, May 13, 2017 3:58 AM >> To: Akira Ajisaka >> Cc: Hadoop Common; Hdfs-dev; mapreduce-dev@hadoop.apache.org; >> yarn-...@hadoop.apache.org >> Subject: Re: About 2.7.4 Release >> >> Latest update on the links and filters. Here is the correct link for the >> filter: >> https://issues.apache.org/jira/secure/IssueNavigator.jspa?requestId=12340814 >> >> Also updated: https://s.apache.org/Dzg4 >> >> Had to do some Jira debugging. Sorry for confusion. >> >> Thanks, >> --Konstantin >> >> On Wed, May 10, 2017 at 2:30 PM, Konstantin Shvachko < >> shv.had...@gmail.com> >> wrote: >> >> Hey Akira, >>> >>> I didn't have private filters. Most probably Jira caches something. >>> Your filter is in the right direction, but for some reason it lists only >>> 22 issues, while mine has 29. >>> It misses e.g. YARN-5543 <https://issues.apache.org/jira/browse/YARN-5543>. >>> >>> Anyways, I created a Jira filter now "Hadoop 2.7.4 release blockers", >>> shared it with "everybody", and updated my link to point to that filter. 
>>> So >>> you can use any of the three methods below to get the correct list: >>> 1. Go to https://s.apache.org/Dzg4 >>> 2. Go to the filter via >>> https://issues.apache.org/jira/issues?filter=12340814 >>>or by finding "Hadoop 2.7.4 release blockers" filter in the jira >>> 3. On Advanced issues search page paste this: >>> project in (HDFS, HADOOP, YARN, MAPREDUCE) AND labels = release-blocker >>> AND "Target Version/s" = 2.7.4 >>> >>> Hope this solves the confusion for which issues are included. >>> Please LMK if it doesn't, as it is important. >>> >>> Thanks, >>> --Konstantin >>> >>> On Tue, May 9, 2017 at 9:58 AM, Akira Ajisaka >>> wrote: >>> >>> Hi Konstantin, >>>> >>>> Thank you for volunteering as release manager! >>>> >>>> Actually the original link works fine: https://s.apache.org/Dzg4 >>>>> >>>> I couldn't see the link. Maybe is it private filter? >>>> >>>> Here is a link I generated: https://s.apache.org/ehKy >>>> This filter includes resolved issue and excludes fixversion == 2.7.4 >>>> >>>> Thanks and Regards, >>>> Akira >>>> >>>> On 2017/05/08 19:20, Konstantin Shvachko wrote: >>>> >>>> Hi Brahma Reddy Battula, >>>>> >>>>> Actually the original link works fine: https://s.apache.org/Dzg4 >>>>> Your link excludes closed and resolved issues, which needs backporting, >>>>> and >>>>> which we cannot reopen, as discussed in this thread earlier. >>>>> >>>>> Looked through the issues you proposed: >>>>> >>>>> HDFS-9311 <https://issues.apache.org/jira/browse/HDFS-9311> >>>>> Seems like a new feature. It helps failover to standby node when >>>>> primary >>>>> is >>>>> under heavy load, but it introduces new APIs, addresses, config >>>>> parameters. >>>>> And needs at least one follow up jira. >>>>> Looks like a backward compatible change, though. >>>>> Did you have a chance to run it in production? >>>>> >>>>> +1 on >>>>> HDFS-10987 <https://issues.apache.org/jira/browse/HDFS-10987>
[jira] [Created] (MAPREDUCE-6889) FileSystem leak when Job#submit() used when ATS1.5 enabled
Rohith Sharma K S created MAPREDUCE-6889:

Summary: FileSystem leak when Job#submit() used when ATS1.5 enabled
Key: MAPREDUCE-6889
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6889
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S

ATS1.5 uses FileSystemTimelineWriter, which creates an FS object on every writer initialization. If the writer is not closed, there is a possibility of OOM; see YARN-5438, which fixes closing the FS object. TimelineClient is used by YarnClient, so every user of YarnClient with ATS1.5 needs to stop the service properly. Otherwise there is a high chance of an FS object leak.

MR uses YarnClient to submit jobs, so if MR does not stop YarnClient, an FS object leaks. JobClient provides an API to stop all these services, *JobClient#close*. However, many MR clients use a *Job* object to submit a job, and Job does not stop the started services by default. The Job class should therefore provide an API to close the services, very similar to JobClient#close.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
- To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
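The leak pattern reported in this issue can be illustrated with a minimal, self-contained Java sketch. It deliberately does not use the Hadoop API: FakeFileSystem, FakeTimelineWriter, and FakeJob are hypothetical stand-ins for the FS object, the timeline writer, and Job, and the close() method on FakeJob only models the kind of API the issue asks for. It is an assumption about the shape of a fix, not the actual implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class JobCloseSketch {
    // Number of stand-in "FS objects" currently open (hypothetical, for illustration only).
    static final AtomicInteger OPEN_FS = new AtomicInteger();

    // Stand-in for the FS object created on every writer initialization.
    static class FakeFileSystem implements AutoCloseable {
        FakeFileSystem() { OPEN_FS.incrementAndGet(); }
        @Override public void close() { OPEN_FS.decrementAndGet(); }
    }

    // Stand-in for a timeline writer that owns one FS object.
    static class FakeTimelineWriter implements AutoCloseable {
        private final FakeFileSystem fs = new FakeFileSystem();
        @Override public void close() { fs.close(); }
    }

    // Stand-in for Job: submit() starts client services, close() stops them.
    static class FakeJob implements AutoCloseable {
        private FakeTimelineWriter writer;
        void submit() { writer = new FakeTimelineWriter(); }
        @Override public void close() {
            if (writer != null) {
                writer.close();
                writer = null;
            }
        }
    }

    // Submits n jobs and never closes them; returns how many FS objects stay open.
    static int submitWithoutClose(int n) {
        int before = OPEN_FS.get();
        for (int i = 0; i < n; i++) {
            new FakeJob().submit();
        }
        return OPEN_FS.get() - before;
    }

    // Submits n jobs inside try-with-resources; returns how many FS objects stay open.
    static int submitWithClose(int n) {
        int before = OPEN_FS.get();
        for (int i = 0; i < n; i++) {
            try (FakeJob job = new FakeJob()) {
                job.submit();
            }
        }
        return OPEN_FS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("left open without close: " + submitWithoutClose(3));
        System.out.println("left open with close: " + submitWithClose(3));
    }
}
```

In this model, every submit without a matching close leaves one FS object open, while the try-with-resources variant releases it as soon as the block exits. That is the behaviour a close API on Job would make available to clients that never go through JobClient.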
Re: [VOTE] Release Apache Hadoop 2.6.5 (RC1)
+1 (non-binding) - Downloaded source and built from it. - Installed a 3-node cluster with HA deployed. - Ran sample MR jobs and tested basic operations. - Integrated the JCarder tool to find potential deadlock cycles, and did not observe any. Thanks & Regards Rohith Sharma K S > On Oct 3, 2016, at 5:42 AM, Sangjin Lee wrote: > > Hi folks, > > I have pushed a new release candidate (RC1) for the Apache Hadoop 2.6.5 > release (the next maintenance release in the 2.6.x release line). RC1 > contains fixes to CHANGES.txt, and is otherwise identical to RC0. > > Below are the details of this release candidate: > > The RC is available for validation at: > http://home.apache.org/~sjlee/hadoop-2.6.5-RC1/. > > The RC tag in git is release-2.6.5-RC1 and its git commit is > e8c9fe0b4c252caf2ebf1464220599650f119997. > > The maven artifacts are staged via repository.apache.org at: > https://repository.apache.org/content/repositories/orgapachehadoop-1050/. > > You can find my public key at > http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS. > > Please try the release and vote. The vote will run for the usual 5 days. I > would greatly appreciate your timely vote. Thanks! > > Regards, > Sangjin - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
Re: [VOTE] Release Apache Hadoop 2.7.3 RC2
+1 (non-binding) - Downloaded source and built from it with native compilation. - Installed 5 node cluster with HA enabled. - Verified HA scenarios such as RM HA-RMRestart-RMworkpreserving restart while running MR and distributed shell applications. Thanks & Regards Rohith Sharma K S > On Aug 20, 2016, at 11:07 AM, Kuhu Shukla > wrote: > > +1 (non-binding).- Downloaded tarball (source and binary) > - Verified signatures. > - Compiled, built source code and deployed on a single node cluster > - Ran sample MR jobs (Sleep, Wordcount) and some "hadoop fs" commands. > Thanks a lot Vinod for your work on this release! > Regards,Kuhu Shukla > >On Friday, August 19, 2016 5:32 PM, Eric Payne > wrote: > > > Thanks, Vinod, for working so hard on each 2.7 release. > > +1 (non-binding) > > Here's what I did: > > - Built native > - Installed on 3-node unsecure cluster > - Configured 2 queues with 2 separate label partitions > - Verified that a job will successfully run on the correctly labelled node by > specifying a non-default (but queue-accessible) label. > - Verified that a distributed shell job would keep non-AM containers running > across an App Master attempt restart. > - Verified that preemption happens as expected (sort of). I say "sort of" > because about twice as many containers were preempted as I thought should > have been, but once the other underserved app began to run, it stopped > preempting. Also, it didn't preempt between 2 queues with the same partition > label. Partition preemption may not be supported in 2.7, so this is probably > also okay. > > > Thanks! > Eric Payne > > > > > > From: Vinod Kumar Vavilapalli > To: "common-...@hadoop.apache.org" ; > hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; > "mapreduce-dev@hadoop.apache.org" > Cc: Vinod Kumar Vavilapalli > Sent: Wednesday, August 17, 2016 9:05 PM > Subject: [VOTE] Release Apache Hadoop 2.7.3 RC2 > > > Hi all, > > I've created a new release candidate RC2 for Apache Hadoop 2.7.3. 
> > As discussed before, this is the next maintenance release to follow up 2.7.2. > > The RC is available for validation at: > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/ > <http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/> > > The RC tag in git is: release-2.7.3-RC2 > > The maven artifacts are available via repository.apache.org > <http://repository.apache.org/> at > https://repository.apache.org/content/repositories/orgapachehadoop-1046 > <https://repository.apache.org/content/repositories/orgapachehadoop-1046> > > The release-notes are inside the tar-balls at location > hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted > this at http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/releasenotes.html > <http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/releasenotes.html> for your > quick perusal. > > As you may have noted, > - few issues with RC0 forced a RC1 [1] > - few more issues with RC1 forced a RC2 [2] > - a very long fix-cycle for the License & Notice issues (HADOOP-12893) caused > 2.7.3 (along with every other Hadoop release) to slip by quite a bit. This > release's related discussion thread is linked below: [3]. > > Please try the release and vote; the vote will run for the usual 5 days. 
> > Thanks, > Vinod > > [1] [VOTE] Release Apache Hadoop 2.7.3 RC0: > https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/index.html#26106 > <https://www.mail-archive.com/hdfs-dev@hadoop.apache.org/index.html#26106> > [2] [VOTE] Release Apache Hadoop 2.7.3 RC1: > https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg26336.html > <https://www.mail-archive.com/hdfs-dev@hadoop.apache.org/msg26336.html> > [3] 2.7.3 release plan: > https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html > <http://markmail.org/thread/6yv2fyrs4jlepmmr> > > - > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org > > > - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
Re: [VOTE] Release Apache Hadoop 2.7.3 RC0
+1(non-binding) Downloaded and built from source Cluster installed in 3 nodes and verified running simple MR jobs. Verified for RM HA , RM work preserving restart with CapacityScheduler Thanks & Regards Rohith Sharma K S > On Jul 26, 2016, at 6:50 PM, Vinayakumar B wrote: > > +1 (binding) > > 1. Downloaded and Built from branch-2.7.3 > 2. Started up HDFS and YARN in Single Node cluster. > 3. Ran WordCount job multiple times and Success. > 4. Verified the "Release Notes" available at the URL mentioned by Vinod. > > > Apart from that, > Faced same issues as Andrew wang, while running the WordCount job first time > in my new Ubuntu installation, without 'configuring the shuffle handler > properly'. Whole session got logged by closing all other applications open. > After configuring the shuffle handler properly, job was successful though. > > -Vinay > > -Original Message- > From: Andrew Wang [mailto:andrew.w...@cloudera.com] > Sent: 26 July 2016 00:22 > To: Karthik Kambatla > Cc: larry mccay ; Vinod Kumar Vavilapalli > ; common-...@hadoop.apache.org; > hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; > mapreduce-dev@hadoop.apache.org > Subject: Re: [VOTE] Release Apache Hadoop 2.7.3 RC0 > > I'll also add that, as a YARN newbie, I did hit two usability issues. These > are very unlikely to be regressions, and I can file JIRAs if they seem > fixable. > > * I didn't have SSH to localhost set up (new laptop), and when I tried to run > the Pi job, it'd exit my window manager session. I feel there must be a more > developer-friendly solution here. > * If you start the NodeManager and not the RM, the NM has a handler for > SIGTERM and SIGINT that blocked my Ctrl-C and kill attempts during startup. > I had to kill -9 it. > > On Mon, Jul 25, 2016 at 11:44 AM, Andrew Wang > wrote: > >> I got asked this off-list, so as a reminder, only PMC votes are >> binding on releases. Everyone is encouraged to vote on releases though! 
>> >> +1 (binding) >> >> * Downloaded source, built >> * Started up HDFS and YARN >> * Ran Pi job which as usual returned 4, and a little teragen >> >> On Mon, Jul 25, 2016 at 11:08 AM, Karthik Kambatla >> >> wrote: >> >>> +1 (binding) >>> >>> * Downloaded and build from source >>> * Checked LICENSE and NOTICE >>> * Pseudo-distributed cluster with FairScheduler >>> * Ran MR and HDFS tests >>> * Verified basic UI >>> >>> On Sun, Jul 24, 2016 at 1:07 PM, larry mccay wrote: >>> >>>> +1 binding >>>> >>>> * downloaded and built from source >>>> * checked LICENSE and NOTICE files >>>> * verified signatures >>>> * ran standalone tests >>>> * installed pseudo-distributed instance on my mac >>>> * ran through HDFS and mapreduce tests >>>> * tested credential command >>>> * tested webhdfs access through Apache Knox >>>> >>>> >>>> On Fri, Jul 22, 2016 at 10:15 PM, Vinod Kumar Vavilapalli < >>>> vino...@apache.org> wrote: >>>> >>>>> Hi all, >>>>> >>>>> I've created a release candidate RC0 for Apache Hadoop 2.7.3. >>>>> >>>>> As discussed before, this is the next maintenance release to >>>>> follow up 2.7.2. >>>>> >>>>> The RC is available for validation at: >>>>> http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/ < >>>>> http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/> >>>>> >>>>> The RC tag in git is: release-2.7.3-RC0 >>>>> >>>>> The maven artifacts are available via repository.apache.org < >>>>> http://repository.apache.org/> at >>>>> >>> https://repository.apache.org/content/repositories/orgapachehadoop-10 >>> 40/ >>>> < >>>>> >>> https://repository.apache.org/content/repositories/orgapachehadoop-10 >>> 40/ >>>>> >>>>> >>>>> The release-notes are inside the tar-balls at location >>>>> hadoop-common-project/hadoop-common/src/main/docs/releasenotes.ht >>>>> ml. I hosted this at >>>>> http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/releasenotes.htm >>>>> l < >>>>> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/releasenotes.h >>>>> tml> >>>> for >&g
RE: Failure to submit job on trunk hadoop cluster
Hi Tsuyoshi, I think it was modified for security reasons. I also did not find any JIRA ID corresponding to this commit. A related discussion happened in the following JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-6704 Thanks & Regards Rohith Sharma K S

-Original Message-
From: Tsuyoshi Ozawa [mailto:oz...@apache.org]
Sent: 27 June 2016 16:10
To: common-...@hadoop.apache.org; yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Failure to submit job on trunk hadoop cluster

Hi, Has anyone tried to submit jobs on a trunk cluster? Launching jobs failed for me with a class-not-found error and the following messages. Should I change the environment variables from the branch-2 setup? I'm using the same configuration that runs on Hadoop 2.

$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-alpha1-SNAPSHOT.jar randomtextwriter 2 -> error
$ yarn logs -applicationId application_1467022691488_0004
...
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:946)
    at org.apache.hadoop.util.Shell.run(Shell.java:850)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1144)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:238)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:385)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:281)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:89)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1.
Last 4096 bytes of stderr : Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster Container exited with a non-zero exit code 1. Last 4096 bytes of stderr : Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster Thanks, - Tsuyoshi - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[Review Request] HADOOP-12687 | Blocker for hadoop2.8 release
Hi Folks, Could anyone review HADOOP-12687? Basically, the patch is going to break RFC 1535. Does Hadoop meet mandatory RFC standards? It would be greatly appreciated if folks could express their opinions and help reach consensus. Thanks & Regards Rohith Sharma K S
RE: [Release thread] 2.8.0 release activities
A few test cases are regularly failing in YARN because of HADOOP-12687 (an issue with DNS). This issue is a blocker for releases, since failing YARN test cases cannot guarantee quality. Some patches are ready for this issue, but we are unsure whether they violate RFC 1535. I would like to hear the community's opinion on this. Thanks & Regards Rohith Sharma K S -Original Message- From: Jian He [mailto:j...@hortonworks.com] Sent: 12 May 2016 06:34 To: mapreduce-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; common-...@hadoop.apache.org Subject: Re: [Release thread] 2.8.0 release activities For MapReduce/YARN, I closed a few stale ones. Only 4 jiras need attention for 2.8: MAPREDUCE-6288 YARN-1815 YARN-4685 YARN-4844 The rest are either improvements or long-standing issues and do not qualify as release blockers, IMO. I think we’ll try to get these 4 jiras in asap. The rest will be on best effort, resolve as much as possible and move them out if not resolved in time. Jian On May 11, 2016, at 5:37 PM, Wangda Tan <wheele...@gmail.com> wrote: Sounds good to me :). Jian and I have looked at all existing 2.8.0 blockers and criticals today. To me more than half of MR/YARN blockers/criticals of 2.8 should be moved out. Left comments on these JIRAs asking the original owners; plan to update the target version of these JIRAs early next week. Will keep this thread updated. Thanks, Wangda On Wed, May 11, 2016 at 5:06 PM, Sangjin Lee <sj...@apache.org> wrote: How about this? I'll review the HADOOP/HDFS bugs in that list to come up with true blockers for 2.8.0 or JIRAs that are close to being ready. I'll report the list here. Then folks can chime in if you agree. Perhaps Wangda, you can go over the YARN/MR bugs. Sound like a plan? Thanks, Sangjin On Wed, May 11, 2016 at 4:26 PM, Wangda Tan <wheele...@gmail.com> wrote: +1, we should close such stale JIRAs to avoid doing unnecessary checks for every release. 
I'm working on reviewing YARN/MR critical/blocker patches currently; it would be very helpful if someone else could help with reviewing Common/HDFS JIRAs. Thanks, Wangda On Wed, May 11, 2016 at 4:20 PM, Sangjin Lee <sj...@apache.org> wrote: Where do we stand in terms of closing out blocker/critical issues for 2.8.0? I still see 50 open JIRAs in Vinod's list: https://issues.apache.org/jira/issues/?filter=12334985 But I see a lot of JIRAs with no patches or very stale patches. It would be a good exercise to come up with the list of JIRAs that we need to block 2.8.0 for and focus our attention on closing them out. Thoughts? Thanks, Sangjin On Sat, Apr 23, 2016 at 5:05 AM, Steve Loughran <ste...@hortonworks.com> wrote: On 23 Apr 2016, at 01:24, Vinod Kumar Vavilapalli <vino...@apache.org> wrote: We are not converging - there’s still 58 more. I need help from the community in addressing / reviewing 2.8.0 blockers. If folks can start with reviewing Patch available tickets, that’ll be great. I'm still doing the s3a stuff, other people testing and reviewing this stuff welcome. In particular, I could do with others playing with this patch of mine, which adds counters and things into S3a, based on the azure instrumentation https://issues.apache.org/jira/browse/HADOOP-13028 - To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-6700) Jobhistory server attempt and task table not loading maps/reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith Sharma K S resolved MAPREDUCE-6700.
Resolution: Implemented

This was a missed commit in branch-2.8 for YARN-3840. It is fixed by reopening YARN-3840. Closing as Implemented.

Thanks [~bibinchundatt] for finding this issue during regression testing; I really appreciate it :-) It would otherwise have become a downstream defect. Thanks [~varun_saxena] for finding the root cause and quickly providing a rebased patch in YARN-3840.

> Jobhistory server attempt and task table not loading maps/reduce
>
> Key: MAPREDUCE-6700
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6700
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Priority: Blocker
>
> Browser
> ===
> Chrome
> Steps to reproduce
> ==
> # Submit mapreduce application with 20 maps
> # Wait till completion of mapreduce application
> # Check maps attempts page
> {{jobhistory/attempts/job_1463446678437_0003/m/SUCCESSFUL}}
> and {{jobhistory/tasks/job_1463446678437_0003/m}} page
> Actual
> =
> Table not loading.
> Sorting on any column other than attempt loads the contents.
> Column 0 uses *natural sorting*, which is not working, so the table waits forever to be sorted.
> {noformat}
> SCRIPT438: Object doesn't support property or method 'natural-asc'
> jquery.dataTables.min.js, line 86 character 179
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
- To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
Re: [VOTE] Release Apache Hadoop 2.7.2 RC2
+1(non binding) -downloaded package and installed in 3 node cluster -Verified set up for RM HA, RM work preserving restart, NM work preserving restart -Ran few MR jobs and distributed shell applications. On Sat, Jan 23, 2016 at 3:34 AM, Jian He wrote: > +1, > - built from source code > - deployed a single cluster > - run sample jobs which pass successfully. > > Jian > > > On Jan 22, 2016, at 6:52 AM, Sunil Govind > wrote: > > > > +1 (Non Binding) > > > > * Built tar ball from source and deployed > > * Verified few MR Jobs for various nodelabel and preemption cases. > > * Verified RM Web UI and REST queries. looks fine. > > > > Thanks and Regards > > Sunil G > > > > On Fri, Jan 15, 2016 at 10:27 AM Vinod Kumar Vavilapalli < > vino...@apache.org> > > wrote: > > > >> Hi all, > >> > >> I've created an updated release candidate RC2 for Apache Hadoop 2.7.2. > >> > >> As discussed before, this is the next maintenance release to follow up > >> 2.7.1. > >> > >> The RC is available for validation at: > >> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/ > >> > >> The RC tag in git is: release-2.7.2-RC2 > >> > >> The maven artifacts are available via repository.apache.org < > >> http://repository.apache.org/> at > >> https://repository.apache.org/content/repositories/orgapachehadoop-1027 > < > >> https://repository.apache.org/content/repositories/orgapachehadoop-1027 > > > >> > >> The release-notes are inside the tar-balls at location > >> hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I > >> hosted this at > >> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/releasenotes.html < > >> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/releasenotes.html> > for > >> your quick perusal. > >> > >> As you may have noted, > >> - I terminated the RC1 related voting thread after finding out that we > >> didn’t have a bunch of patches that are already in the released 2.6.3 > >> version. 
After a brief discussion, we decided to keep the parallel 2.6.x > >> and 2.7.x releases incremental, see [4] for this discussion. > >> - The RC0 related voting thread got halted due to some critical issues. > >> It took a while again for getting all those blockers out of the way. See > >> the previous voting thread [3] for details. > >> - Before RC0, an unusually long 2.6.3 release caused 2.7.2 to slip by > >> quite a bit. This release's related discussion threads are linked below: > >> [1] and [2]. > >> > >> Please try the release and vote; the vote will run for the usual 5 days. > >> > >> Thanks, > >> Vinod > >> > >> [1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes < > >> http://markmail.org/message/oozq3gvd4nhzsaes> > >> [2]: Planning Apache Hadoop 2.7.2 > >> http://markmail.org/message/iktqss2qdeykgpqk < > >> http://markmail.org/message/iktqss2qdeykgpqk> > >> [3]: [VOTE] Release Apache Hadoop 2.7.2 RC0: > >> http://markmail.org/message/5txhvr2qdiqglrwc < > >> http://markmail.org/message/5txhvr2qdiqglrwc> > >> [4] Retracted [VOTE] Release Apache Hadoop 2.7.2 RC1: > >> http://markmail.org/thread/n7ljbsnquihn3wlw > > -- Thanks & Regards Rohith Sharma K S
RE: [VOTE] Release Apache Hadoop 2.7.2 RC1
+1(non-binding) -Built from source and deployed in 3 node cluster -Tested with combination of RMRestart/RM HA/RM work preserving restart/NM work preserving restart modes. -Ran MR and distributed shell sample applications. -Verified signature and md5 of binary Thanks & Regards Rohith Sharma K S -Original Message- From: Tsuyoshi Ozawa [mailto:oz...@apache.org] Sent: 22 December 2015 09:04 To: common-...@hadoop.apache.org Cc: yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org; Vinod Kumar Vavilapalli Subject: Re: [VOTE] Release Apache Hadoop 2.7.2 RC1 +1 - downloaded src and bin tar balls and verified signatures. - built Tez and Spark with 2.7.2 artifacts and JDK7. - ran tests of Tez with 2.7.2 artifacts, it passed. FYI: YARN-4348, reported by Jian, is one of critical issues of 2.7.2 release.It's better to release 2.7.3 as soon as possible after this release. Thanks, - Tsuyoshi On Tue, Dec 22, 2015 at 4:51 AM, Wangda Tan wrote: > +1 (binding) > > - Build & deploy single-node Hadoop from source code > - Add/Remove node labels to queues/nodes > - Run distributed shell commanding using default/specified node labels > > Thanks, > Wangda > > > On Mon, Dec 21, 2015 at 9:58 AM, Masatake Iwasaki < > iwasak...@oss.nttdata.co.jp> wrote: > >> +1(non-binding) >> >> - verified mds and signature of source and binary tarball >> - started 3 node cluster and ran example jobs such as wordcount and >> terasort >> - built from source tarball with -Pnative on CentOS 7 and OpenJDK 7 >> - built site documentation and skimmed the contents >> >> Thanks, >> Masatake Iwasaki >> >> >> >> On 12/17/15 11:49, Vinod Kumar Vavilapalli wrote: >> >>> Hi all, >>> >>> I've created a release candidate RC1 for Apache Hadoop 2.7.2. >>> >>> As discussed before, this is the next maintenance release to follow >>> up 2.7.1. 
>>> >>> The RC is available for validation at: >>> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/ < >>> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/> >>> >>> The RC tag in git is: release-2.7.2-RC1 >>> >>> The maven artifacts are available via repository.apache.org < >>> http://repository.apache.org/> at >>> https://repository.apache.org/content/repositories/orgapachehadoop-1 >>> 026/ >>> <https://repository.apache.org/content/repositories/orgapachehadoop- >>> 1026/ >>> > >>> >>> The release-notes are inside the tar-balls at location >>> hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. >>> I hosted this at >>> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/releasenotes.html >>> for quick perusal. >>> >>> As you may have noted, >>> - The RC0 related voting thread got halted due to some critical issues. >>> It took a while again for getting all those blockers out of the way. >>> See the previous voting thread [3] for details. >>> - Before RC0, an unusually long 2.6.3 release caused 2.7.2 to slip >>> by quite a bit. This release's related discussion threads are linked below: >>> [1] and [2]. >>> >>> Please try the release and vote; the vote will run for the usual 5 days. >>> >>> Thanks, >>> Vinod >>> >>> [1]: 2.7.2 release plan: >>> http://markmail.org/message/oozq3gvd4nhzsaes < >>> http://markmail.org/message/oozq3gvd4nhzsaes> >>> [2]: Planning Apache Hadoop 2.7.2 >>> http://markmail.org/message/iktqss2qdeykgpqk < >>> http://markmail.org/message/iktqss2qdeykgpqk> >>> [3]: [VOTE] Release Apache Hadoop 2.7.2 RC0: >>> http://markmail.org/message/5txhvr2qdiqglrwc >>> >>> >>> >>
[jira] [Created] (MAPREDUCE-6580) Test failure : TestMRJobsWithProfiler
Rohith Sharma K S created MAPREDUCE-6580:

Summary: Test failure : TestMRJobsWithProfiler
Key: MAPREDUCE-6580
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6580
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Rohith Sharma K S

From [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt]
TestMRJobsWithProfiler fails intermittently
{code}
Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 212.973 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler
testDifferentProfilers(org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler) Time elapsed: 133.116 sec <<< FAILURE!
java.lang.AssertionError: expected:<4> but was:<1>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.junit.Assert.assertEquals(Assert.java:542)
    at org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfilerInternal(TestMRJobsWithProfiler.java:212)
    at org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testDifferentProfilers(TestMRJobsWithProfiler.java:117)
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6579) Test failure : TestNetworkedJob
Rohith Sharma K S created MAPREDUCE-6579:

Summary: Test failure : TestNetworkedJob
Key: MAPREDUCE-6579
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6579
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Rohith Sharma K S

From [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt]
TestNetworkedJob fails intermittently.
{code}
Running org.apache.hadoop.mapred.TestNetworkedJob
Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 81.131 sec <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob
testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob) Time elapsed: 30.55 sec <<< FAILURE!
org.junit.ComparisonFailure: expected:<[[Tue Dec 15 14:02:45 + 2015] Application is Activated, waiting for resources to be assigned for AM. Details : AM Partition = ; Partition Resource = ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> but was:<[]>
    at org.junit.Assert.assertEquals(Assert.java:115)
    at org.junit.Assert.assertEquals(Assert.java:144)
    at org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:174)
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
RE: [VOTE] Release Apache Hadoop 2.6.2
+1 (non-binding)

Downloaded the tar.gz and installed a 2-node HA cluster. RM HA, RM restart, and RM work-preserving restart work fine on the cluster. Verified the cluster for high availability. Ran sample MR applications with RM HA enabled. Ran sanity test cases; they work fine.

Thanks & Regards
Rohith Sharma K S

-Original Message-
From: sjl...@gmail.com [mailto:sjl...@gmail.com] On Behalf Of Sangjin Lee
Sent: 23 October 2015 02:44
To: common-...@hadoop.apache.org; yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Cc: Vinod Kumar Vavilapalli
Subject: [VOTE] Release Apache Hadoop 2.6.2

Hi all,

I have created a release candidate (RC0) for Hadoop 2.6.2.

The RC is available at: http://people.apache.org/~sjlee/hadoop-2.6.2-RC0/
The RC tag in git is: release-2.6.2-RC0
The list of JIRAs committed for 2.6.2: https://issues.apache.org/jira/browse/YARN-4101?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20fixVersion%20%3D%202.6.2
The maven artifacts are staged at https://repository.apache.org/content/repositories/orgapachehadoop-1022/

Please try out the release candidate and vote. The vote will run for 5 days.

Thanks,
Sangjin
[jira] [Created] (MAPREDUCE-6508) TestNetworkedJob fails intermittently
Rohith Sharma K S created MAPREDUCE-6508: Summary: TestNetworkedJob fails intermittently Key: MAPREDUCE-6508 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6508 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Rohith Sharma K S {noformat} Running org.apache.hadoop.mapred.TestNetworkedJob Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 84.215 sec <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob) Time elapsed: 31.537 sec <<< ERROR! java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: Delegation Token can be issued only with kerberos authentication at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1044) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getDelegationToken(ApplicationClientProtocolPBServiceImpl.java:325) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:483) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2236) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2232) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2230) Caused by: java.io.IOException: Delegation Token can be issued only with kerberos authentication at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1017) ... 
10 more at org.apache.hadoop.ipc.Client.call(Client.java:1448) at org.apache.hadoop.ipc.Client.call(Client.java:1379) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy84.getDelegationToken(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getDelegationToken(ApplicationClientProtocolPBClientImpl.java:339) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:251) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) at com.sun.proxy.$Proxy85.getDelegationToken(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getRMDelegationToken(YarnClientImpl.java:541) at org.apache.hadoop.mapred.ResourceMgrDelegate.getDelegationToken(ResourceMgrDelegate.java:177) at org.apache.hadoop.mapred.YARNRunner.getDelegationToken(YARNRunner.java:231) at org.apache.hadoop.mapreduce.Cluster.getDelegationToken(Cluster.java:401) at org.apache.hadoop.mapred.JobClient$16.run(JobClient.java:1234) at org.apache.hadoop.mapred.JobClient$16.run(JobClient.java:1231) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667) at org.apache.hadoop.mapred.JobClient.getDelegationToken(JobClient.java:1230) at org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:260) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6507) TestRMNMInfo fails intermittently
Rohith Sharma K S created MAPREDUCE-6507:

Summary: TestRMNMInfo fails intermittently
Key: MAPREDUCE-6507
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6507
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Rohith Sharma K S

TestRMNMInfo fails intermittently. Below is the trace for the failure:
{noformat}
testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 sec <<< FAILURE!
java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but was:<3>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
{noformat}
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
RE: [VOTE] Release Apache Hadoop 2.6.1 RC1
+1 (non-binding) Downloaded source, built package and installed 5 node cluster. 1. Verified for RMHA/RMRestart/RMWorkpreservingRestart cluster in secure/non-secure mode. 2. Attached JCarder tool to cluster for identifying deadlock cycles, No cycles found. 3. Verified cluster for high availability. 4. Ran sample MR applications with RM HA enabled. 5. Run sanity test cases and working fine. Thanks & Regards Rohith Sharma K S -Original Message- From: sjl...@gmail.com [mailto:sjl...@gmail.com] On Behalf Of Sangjin Lee Sent: 18 September 2015 09:50 To: yarn-...@hadoop.apache.org Cc: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org Subject: Re: [VOTE] Release Apache Hadoop 2.6.1 RC1 +1 (non-binding) Verified the signatures, set up a pseudo-distributed cluster, ran several test jobs, and ran an uber job. Also verified that the UI issue I saw on RC0 is now gone. Thanks Vinod! Sangjin On Thu, Sep 17, 2015 at 7:24 PM, Jian He wrote: > +1 (binding) > > Build from source code. > Deployed a local cluster. > Validated sample jobs passed. > > Jian > > > On Sep 18, 2015, at 7:34 AM, Wangda Tan wrote: > > > > Deployed a local cluster, verified configured cluster with node > > labels, > run > > jobs with/without node labels. > > > > +1 (non-binding) > > > > Thanks! > > > > On Thu, Sep 17, 2015 at 2:40 PM, Xuan Gong > wrote: > > > >> Update my vote from +1 (non-binding) to +1 binding > >> > >> Thanks > >> > >> Xuan Gong > >> > >>> On Sep 17, 2015, at 2:05 PM, Xuan Gong wrote: > >>> > >>> +1 (non-binding) > >>> Download and compile the source code, run several MR jobs. > >>> > >>> Xuan Gong > >>> > >>>> On Sep 16, 2015, at 7:10 PM, Vinod Kumar Vavilapalli < > >> vino...@apache.org> wrote: > >>>> > >>>> Hi all, > >>>> > >>>> After a nearly month long [1] toil, with loads of help from > >>>> Sangjin > Lee > >> and > >>>> Akira Ajisaka, and 153 (RC0)+7(RC1) commits later, I've created a > >> release > >>>> candidate RC1 for hadoop-2.6.1. 
> >>>> > >>>> RC1 is RC0 [0] (for which I opened and closed a vote last week) + > >>>> UI > >> fixes > >>>> for the issue Sangjin raised (YARN-3171 and the dependencies > YARN-3779, > >>>> YARN-3248), additional fix to avoid incompatibility (YARN-3740), > >>>> other > >> UI > >>>> bugs (YARN-1884, YARN-3544) and the MiniYARNCluster issue (right > >>>> patch > >> for > >>>> YARN-2890) that Jeff Zhang raised. > >>>> > >>>> The RC is available at: > >> http://people.apache.org/~vinodkv/hadoop-2.6.1-RC1/ > >>>> > >>>> The RC tag in git is: release-2.6.1-RC1 > >>>> > >>>> The maven artifacts are available via repository.apache.org at > >>>> > https://repository.apache.org/content/repositories/orgapachehadoop-102 > 1 > >>>> > >>>> Some notes from our release process > >>>> - - Sangjin and I moved out a bunch of items pending from 2.6.1 > >>>> [2] - non-committed but desired patches. 2.6.1 is already big as > >>>> is and is > >> late > >>>> by any standard, we can definitely include them in the next release. > >>>> - The 2.6.1 wiki page [3] captures some (but not all) of the > >>>> context > of > >>>> the patches that we pushed in. > >>>> - Given the number of fixes pushed [4] in, we had to make a bunch > >>>> of changes to our original plan - we added a few improvements > >>>> that helped > >> us > >>>> backport patches easier (or in many cases made backports > >>>> possible), > and > >> we > >>>> dropped a few that didn't make sense (HDFS-7831, HDFS-7926, > >>>> HDFS-7676, HDFS-7611, HDFS-7843, HDFS-8850). > >>>> - I ran all the unit tests which (surprisingly?) passed. (Except > >>>> for > >> one, > >>>> which pointed out a missing fix HDFS-7552). > >>>> > >>>> As discussed before [5] > >>>> - This release is the first point release after 2.6.0 > >>>> - I’d like to use this as a starting release for 2.6.2 in a few > >>>> weeks > >> an
RE: Planning Hadoop 2.6.1 release
Can we add following fixes to 2.6.1? YARN-3733 YARN-2865 YARN-3990 YARN-2894 Thanks & Regards Rohith Sharma K S -Original Message- From: Allan Wilson [mailto:awils...@pandora.com] Sent: 05 August 2015 23:25 To: common-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org; yarn-...@hadoop.apache.org Cc: hdfs-...@hadoop.apache.org Subject: Re: Planning Hadoop 2.6.1 release Another +1 to add those fixes. YARN-3487 bug can grind a large cluster to a halt repeatedly -Allan Allan Wilson | Sr. Software Engineer | Pandora m 919.841.2449 | awils...@pandora.com On 8/5/15, 1:52 PM, "Rich Haase" wrote: >+1 to add those fixes. > > >Rich Haase | Sr. Software Engineer | Pandora m 303.887.1146 | >rha...@pandora.com > > > > >On 8/5/15, 11:42 AM, "Wangda Tan" wrote: > >>Can we add following two fixes to 2.6.1? >> >>https://issues.apache.org/jira/browse/YARN-2922 and >>https://issues.apache.org/jira/browse/YARN-3487. >> >>They're not fatal issue, but they can cause lots of issue in a large >>cluster. >> >>Thanks, >>Wangda >> >> >>On Mon, Aug 3, 2015 at 1:21 PM, Sangjin Lee wrote: >> >>> See my later update in the thread. HDFS-7704 is in the list. >>> >>> Thanks, >>> Sangjin >>> >>> On Mon, Aug 3, 2015 at 1:19 PM, Vinod Kumar Vavilapalli < >>> vino...@hortonworks.com> wrote: >>> >>> > Makes sense, it was caused by HDFS-7704 which got into 2.7.0 only >>> > and >>>is >>> > not part of the candidate list. Removed HDFS-7916 from the list. >>> > >>> > Thanks >>> > +Vinod >>> > >>> > > On Jul 24, 2015, at 6:32 PM, Sangjin Lee wrote: >>> > > >>> > > Out of the JIRAs we proposed, please remove HDFS-7916. I don't >>>think it >>> > > applies to 2.6. >>> > > >>> > > Thanks, >>> > > Sangjin >>> > >>> > >>> >
RE: [VOTE] Release Apache Hadoop 2.7.1 RC0
+1 (non-binding) Built from source and deployed on a 4-node cluster in both secure and non-secure mode. Tested Spark and MapReduce applications with RM HA, RM work-preserving restart, and NM work-preserving restart. - Rohith Sharma K S -Original Message- From: Mit Desai [mailto:mitdesa...@gmail.com] Sent: 30 June 2015 23:33 To: hdfs-...@hadoop.apache.org Cc: common-...@hadoop.apache.org; yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org Subject: Re: [VOTE] Release Apache Hadoop 2.7.1 RC0 +1 (non-binding) + Built from source + Verified signatures + Deployed on a single node cluster + Ran some sample jobs to successful completion Thanks for driving the release Vinod! -Mit Desai On Tue, Jun 30, 2015 at 12:51 PM, Varun Vasudev wrote: > +1 (non-binding) > > Built from source, deployed in a single node cluster and ran some test > jobs. > > -Varun > > > > On 6/30/15, 9:58 AM, "Zhijie Shen" wrote: > > >+1 (binding) > > > >Built from source, deployed a single node cluster and tried some MR jobs. > > > >- Zhijie > > > >From: Devaraj K > >Sent: Monday, June 29, 2015 9:24 PM > >To: common-...@hadoop.apache.org > >Cc: hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; > mapreduce-dev@hadoop.apache.org > >Subject: Re: [VOTE] Release Apache Hadoop 2.7.1 RC0 > > > >+1 (non-binding) > > > >Deployed in a 3 node cluster and ran some Yarn Apps and MR examples, > >works fine. > > > > > >On Tue, Jun 30, 2015 at 1:46 AM, Xuan Gong wrote: > > > >> +1 (non-binding) > >> > >> Compiled and deployed a single node cluster, ran all the tests. > >> > >> > >> Xuan Gong > >> > >> On 6/29/15, 1:03 PM, "Arpit Gupta" wrote: > >> > >> >+1 (non binding) > >> > > >> >We have been testing rolling upgrades and downgrades from 2.6 to > >> >this release and have had successful runs. > >> > > >> >-- > >> >Arpit Gupta > >> >Hortonworks Inc. 
> >> >http://hortonworks.com/ > >> > > >> >> On Jun 29, 2015, at 12:45 PM, Lei Xu wrote: > >> >> > >> >> +1 binding > >> >> > >> >> Downloaded src and bin distribution, verified md5, sha1 and > >> >> sha256 checksums of both tar files. > >> >> Built src using mvn package. > >> >> Ran a pseudo HDFS cluster > >> >> Ran dfs -put some files, and checked files on NN's web interface. > >> >> > >> >> > >> >> > >> >> On Mon, Jun 29, 2015 at 11:54 AM, Wangda Tan > >> >> > >> >>wrote: > >> >>> +1 (non-binding) > >> >>> > >> >>> Compiled and deployed a single node cluster, tried to change > >> >>>node labels and run distributed_shell with node label > >> >>>specified. > >> >>> > >> >>> On Mon, Jun 29, 2015 at 10:30 AM, Ted Yu > wrote: > >> >>> > >> >>>> +1 (non-binding) > >> >>>> > >> >>>> Compiled hbase branch-1 with Java 1.8.0_45 Ran unit test suite > >> >>>> which passed. > >> >>>> > >> >>>> On Mon, Jun 29, 2015 at 7:22 AM, Steve Loughran > >> >>>> > >> >>>> wrote: > >> >>>> > >> >>>>> > >> >>>>> +1 binding from me. > >> >>>>> > >> >>>>> Tests: > >> >>>>> > >> >>>>> Rebuild slider with Hadoop.version=2.7.1; ran all the tests > including > >> >>>>> against a secure cluster. > >> >>>>> Repeated for windows running Java 8. > >> >>>>> > >> >>>>> All tests passed > >> >>>>> > >> >>>>> > >> >>>>>> On 29 Jun 2015, at 09:45, Vinod Kumar Vavilapalli > >> >>>>>> > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> Hi all, > >> >>>>>> > >> >>>>>> I've created a release candidate RC0 for Apache Hadoop 2.7.1. > >> >>>>>> > >> >>>>>> As discussed before, this is the next stable release to > >> >>>>>> follow up > >> >>>> 2.6.0, > >> >>>>>> and the first stable one in the 2.7.x line. 
> >> >>>>>> > >> >>>>>> The RC is available for validation at: > >> >>>>>> *http://people.apache.org/~vinodkv/hadoop-2.7.1-RC0/ > >> >>>>>> <http://people.apache.org/~vinodkv/hadoop-2.7.1-RC0/>* > >> >>>>>> > >> >>>>>> The RC tag in git is: release-2.7.1-RC0 > >> >>>>>> > >> >>>>>> The maven artifacts are available via repository.apache.org > >> >>>>>> at > >> >>>>>> * > >> >>>>> > >> >>>>> > >> https://repository.apache.org/content/repositories/orgapachehadoop- > >> 101 > >> >>>>>9/ > >> >>>>>> < > >> >>>>> > >> >>>>> > >> https://repository.apache.org/content/repositories/orgapachehadoop- > >> 101 > >> >>>>>9/ > >> >>>>> * > >> >>>>>> > >> >>>>>> Please try the release and vote; the vote will run for the > >> >>>>>> usual > 5 > >> >>>> days. > >> >>>>>> > >> >>>>>> Thanks, > >> >>>>>> Vinod > >> >>>>>> > >> >>>>>> PS: It took 2 months instead of the planned [1] 2 weeks in > getting > >> >>>>>>this > >> >>>>>> release out: post-mortem in a separate thread. > >> >>>>>> > >> >>>>>> [1]: A 2.7.1 release to follow up 2.7.0 > >> >>>>>> http://markmail.org/thread/zwzze6cqqgwq4rmw > >> >>>>> > >> >>>>> > >> >>>> > >> >> > >> >> > >> >> > >> >> -- > >> >> Lei (Eddy) Xu > >> >> Software Engineer, Cloudera > >> >> > >> > > >> > > >> > >> > > > > > >-- > > > > > >Thanks > >Devaraj K > >
RE: [VOTE] Release Apache Hadoop 2.7.0 RC0
+1 (non-binding)

Built from source and deployed on a 3-node cluster. Verified the RM HA, RM work-preserving restart, and NM restart features by submitting simple MR jobs. Basic sanity testing is also done.

Thanks & Regards
Rohith Sharma K S

-Original Message-
From: Vinod Kumar Vavilapalli [mailto:vino...@apache.org]
Sent: 11 April 2015 05:14
To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Cc: vino...@apache.org
Subject: [VOTE] Release Apache Hadoop 2.7.0 RC0

Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.0.

The RC is available at: http://people.apache.org/~vinodkv/hadoop-2.7.0-RC0/
The RC tag in git is: release-2.7.0-RC0
The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1017/

As discussed before
- This release will only work with JDK 1.7 and above
- I’d like to use this as a starting release for 2.7.x [1], depending on how it goes, get it stabilized and potentially use a 2.7.1 in a few weeks as the stable release.

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: A 2.7.1 release to follow up 2.7.0 http://markmail.org/thread/zwzze6cqqgwq4rmw
[Discuss] : Container JVM reuse feature for MRv2.
Hi folks,

Is there any feature for container JVM reuse across tasks of the same type in MRv2? I see some open JIRAs (MAPREDUCE-3902, MAPREDUCE-4502) related to this feature, but none of them seems to be available in any released version or in trunk. Is there a separate branch for MAPREDUCE-3902? It would be very useful to have this feature in MRv2.

Thanks & Regards
Rohith Sharma K S
JobHistoryEventHandler failed with AvroTypeException.
Hi all,

I am using Hadoop 2.3 for a YARN cluster. While running a job, I encountered the exception below in the MRAppMaster. Why is this error logged?

2014-02-21 22:10:33,841 INFO [Thread-355] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: org.apache.avro.AvroTypeException: Attempt to process a enum when a string was expected.
org.apache.avro.AvroTypeException: Attempt to process a enum when a string was expected.
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:93)
    at org.apache.avro.io.JsonEncoder.writeEnum(JsonEncoder.java:217)
    at org.apache.avro.specific.SpecificDatumWriter.writeEnum(SpecificDatumWriter.java:54)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:67)
    at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
    at org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:66)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:870)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:517)
    at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:332)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
    at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
    at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:159)
    at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1386)
    at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:550)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:602)

Thanks & Regards
Rohith Sharma K S
RE: Reducers are launched after jobClient is exited.
Thank you, Vinod, for your reply. I used FileOutputCommitter (the default) for committing files to the job output directory. I noticed that no commit/abort happens for a reducer task when the reducer is killed by the NodeManager (stopContainer request). To reproduce this, I manually killed the reducer task (with "kill" and "kill -9") and ended up with the same issue. I walked through the YarnChild class and found that no shutdown hook is registered. Why is there no shutdown hook for YarnChild? Is this intentional? -Original Message- From: Vinod Kumar Vavilapalli [mailto:vino...@hortonworks.com] Sent: 29 January 2014 12:17 To: yarn-...@hadoop.apache.org Cc: mapreduce-dev@hadoop.apache.org; mapreduce-u...@hadoop.apache.org; yarn-u...@hadoop.apache.org Subject: Re: Reducers are launched after jobClient is exited. MapReduce AppMaster and YARN at large use asynchronous event handling inside the JVM and so you may run into race conditions like this. Even otherwise, doing this in a deterministic manner is better achieved by overriding your OutputCommitter. Job output commit/abort happens only once. +Vinod On Jan 28, 2014, at 7:06 PM, Rohith Sharma K S wrote: > Hi All , > > I ran job with 1 Map and 1 Reducers ( > mapreduce.job.reduce.slowstart.completedmaps=1 ). Map failed ( > because of error in Mapper implementation), but still Reducers are > launched by applicationMaster. These reducers killed by > applicationMaster while > > stopping RMCommunicator service. > > > 1. Why Reducers are launching after job is finished.? ( Is this is bug > in MR? ) > > > > Our use case is when job is finished(succeeded/failed),client program > delete the JobOutput directory. Here, jobclient exit immediately after > jobStatus is set. ( in below log, at 2014-01-23 07:34:43,166) > > > > But , in the below log as mentioned reducers are launched later , Reducer > temporary directory and files are created(_temporary). These files left in > hdfs undeleted forever. 
> > Kindly suggest your thoughts, how we can handle this situation? > > > > 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: > task_1389970937094_0047_m_00 Task Transitioned from RUNNING to > FAILED > 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed > Tasks: 1 > 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job failed as > tasks failed. failedMaps:1 failedReduces:0 > 2014-01-23 07:34:43,153 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1389970937094_0047Job Transitioned from RUNNING to FAIL_ABORT > 2014-01-23 07:34:43,153 INFO [CommitterEvent Processor #0] > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: > Processing the event EventType: JOB_ABORT > 2014-01-23 07:34:43,166 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1389970937094_0047Job Transitioned from FAIL_ABORT to FAILED > ... > ... > 2014-01-23 07:34:43,707 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before > Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 > AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 > ContAlloc:4 ContRel:0 HostLocal:1 RackLocal:0 > 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: > Recalculating schedule, headroom=12288 > 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start > threshold reached. Scheduling reduces. > 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: All maps > assigned. Ramping up all remaining reduces:1 ... > ... 
> 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got > allocated containers 1 > 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned > to reduce > 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned > container container_1389970937094_0047_01_06 to > attempt_1389970937094_0047_r_00_0 > ... > ... > 2014-01-23 07:34:45,724 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.T
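On the shutdown hook question raised in this reply: a JVM shutdown hook runs on normal exit and on SIGTERM (plain "kill"), but never on SIGKILL ("kill -9"), so a hook alone cannot guarantee that task output is cleaned up. Below is a minimal sketch using only the JDK. It is not YarnChild's code; the class name, the method names, and the scratch directory (standing in for a task's _temporary attempt directory) are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ShutdownCleanupSketch {
    // Recursively deletes dir; deepest paths go first so directories are empty when removed.
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Registers a hook that cleans up dir on normal exit or SIGTERM (it does not run on SIGKILL).
    static Thread registerCleanupHook(Path dir) {
        Thread hook = new Thread(() -> deleteRecursively(dir), "cleanup-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch directory standing in for a task attempt directory.
        Path scratch = Files.createTempDirectory("attempt-scratch");
        Files.createFile(scratch.resolve("part-r-00000"));
        registerCleanupHook(scratch);
        System.out.println("working in " + scratch);
    }
}
```

Because the hook is skipped on "kill -9", the advice quoted above still applies: deterministic cleanup belongs in the OutputCommitter's abort path rather than in a hook.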
Reducers are launched after jobClient is exited.
Hi All, I ran a job with 1 map and 1 reducer (mapreduce.job.reduce.slowstart.completedmaps=1). The map failed (because of an error in the Mapper implementation), but reducers were still launched by the ApplicationMaster. These reducers were then killed by the ApplicationMaster while stopping the RMCommunicator service. 1. Why are reducers launched after the job has finished? (Is this a bug in MR?) Our use case is that when the job finishes (succeeded/failed), the client program deletes the job output directory. Here, the job client exits immediately after the job status is set (in the log below, at 2014-01-23 07:34:43,166). But, as the log below shows, reducers are launched later, and the reducer temporary directory and files (_temporary) are created. These files are left undeleted in HDFS forever. Kindly share your thoughts on how we can handle this situation. 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1389970937094_0047_m_00 Task Transitioned from RUNNING to FAILED 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job failed as tasks failed. failedMaps:1 failedReduces:0 2014-01-23 07:34:43,153 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1389970937094_0047Job Transitioned from RUNNING to FAIL_ABORT 2014-01-23 07:34:43,153 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_ABORT 2014-01-23 07:34:43,166 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1389970937094_0047Job Transitioned from FAIL_ABORT to FAILED ... ... 
2014-01-23 07:34:43,707 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:4 ContRel:0 HostLocal:1 RackLocal:0 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=12288 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold reached. Scheduling reduces. 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: All maps assigned. Ramping up all remaining reduces:1 ... ... 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1389970937094_0047_01_06 to attempt_1389970937094_0047_r_00_0 ... ... 
2014-01-23 07:34:45,724 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1389970937094_0047_r_00_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED 2014-01-23 07:34:45,725 INFO [ContainerLauncher #8] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1389970937094_0047_01_06 taskAttempt attempt_1389970937094_0047_r_00_0 2014-01-23 07:34:45,725 INFO [ContainerLauncher #8] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1389970937094_0047_r_00_0 2014-01-23 07:34:45,727 INFO [ContainerLauncher #8] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1389970937094_0047_r_00_0 : 11234 2014-01-23 07:34:45,728 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1389970937094_0047_r_00_0] using containerId: [container_1389970937094_0047_01_06 on NM: [linux85:11232] 2014-01-23 07:34:45,728 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1389970937094_0047_r_00_0 TaskAttempt Transitioned from ASSIGNED to RUNNING 2014-01-23 07:34:45,728 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1389970937094_0047_r_00 Task Transitioned from SCHEDULED to RUNNING ... . 2014-01-23 07:34:48,178 INFO [Thread-59] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1389970937094_0047_r
Reducers are launched after jobClient is exited.
Hi All , I ran job with 1 Map and 1 Reducers ( mapreduce.job.reduce.slowstart.completedmaps=1 ). Map failed ( because of error in Mapper implementation), but still Reducers are launched by applicationMaster. These reducers killed by applicationMaster while stopping RMCommunicator service. 1. Why Reducers are launching after job is finished.? ( Is this is bug in MR? ) Our use case is when job is finished(succeeded/failed),client program delete the JobOutput directory. Here, jobclient exit immediately after jobStatus is set. ( in below log, at 2014-01-23 07:34:43,166) But , in the below log as mentioned reducers are launched later , Reducer temporary directory and files are created(_temporary). These files left in hdfs undeleted forever. Kindly suggest your thoughts, how we can handle this situation? 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1389970937094_0047_m_00 Task Transitioned from RUNNING to FAILED 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1 2014-01-23 07:34:43,151 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job failed as tasks failed. failedMaps:1 failedReduces:0 2014-01-23 07:34:43,153 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1389970937094_0047Job Transitioned from RUNNING to FAIL_ABORT 2014-01-23 07:34:43,153 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_ABORT 2014-01-23 07:34:43,166 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1389970937094_0047Job Transitioned from FAIL_ABORT to FAILED ... ... 
2014-01-23 07:34:43,707 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:4 ContRel:0 HostLocal:1 RackLocal:0 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=12288 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold reached. Scheduling reduces. 2014-01-23 07:34:43,709 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: All maps assigned. Ramping up all remaining reduces:1 ... ... 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce 2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1389970937094_0047_01_06 to attempt_1389970937094_0047_r_00_0 ... ... 
2014-01-23 07:34:45,724 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1389970937094_0047_r_00_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2014-01-23 07:34:45,725 INFO [ContainerLauncher #8] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1389970937094_0047_01_06 taskAttempt attempt_1389970937094_0047_r_00_0
2014-01-23 07:34:45,725 INFO [ContainerLauncher #8] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1389970937094_0047_r_00_0
2014-01-23 07:34:45,727 INFO [ContainerLauncher #8] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1389970937094_0047_r_00_0 : 11234
2014-01-23 07:34:45,728 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1389970937094_0047_r_00_0] using containerId: [container_1389970937094_0047_01_06 on NM: [linux85:11232]
2014-01-23 07:34:45,728 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1389970937094_0047_r_00_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2014-01-23 07:34:45,728 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1389970937094_0047_r_00 Task Transitioned from SCHEDULED to RUNNING
...
.
2014-01-23 07:34:48,178 INFO [Thread-59] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1389970937094_0047_r
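The timing problem described in this message (a reducer container assigned after the job has already reached FAILED) can be checked mechanically against an ApplicationMaster log. The following is only a rough sketch in Python, not part of Hadoop: the function name `late_assignments` and the two regular expressions are assumptions based on the two log lines quoted in the message, and other Hadoop versions may word these lines differently.

```python
import re
from datetime import datetime

# Two lines copied from the ApplicationMaster log quoted in the message.
LOG = """\
2014-01-23 07:34:43,166 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1389970937094_0047Job Transitioned from FAIL_ABORT to FAILED
2014-01-23 07:34:45,714 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1389970937094_0047_01_06 to attempt_1389970937094_0047_r_00_0
"""

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
# Assumed patterns, derived only from the log lines above.
JOB_FINAL = re.compile(r"Job Transitioned from \S+ to (FAILED|KILLED|SUCCEEDED)$")
ASSIGNED = re.compile(r"Assigned container (\S+) to (\S+)$")


def late_assignments(log_text):
    """Return (container, attempt, delay_seconds) for every container
    assigned after the job reached a final state."""
    final_at = None
    late = []
    for raw in log_text.splitlines():
        line = raw.rstrip()
        try:
            # The first 23 characters hold "YYYY-MM-DD HH:MM:SS,mmm".
            when = datetime.strptime(line[:23], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # not a timestamped log line (e.g. "...")
        if JOB_FINAL.search(line):
            final_at = when
            continue
        match = ASSIGNED.search(line)
        if match and final_at is not None:
            delay = (when - final_at).total_seconds()
            late.append((match.group(1), match.group(2), delay))
    return late


for container, attempt, delay in late_assignments(LOG):
    print(f"{container} assigned to {attempt} {delay:.3f}s after job completion")
```

A script like this only confirms the ordering of events; it does not explain why the allocator keeps scheduling reduces after the job has failed.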
[jira] [Created] (MAPREDUCE-5486) Potential file handler leak in JobHistoryServer web ui.
Rohith Sharma K S created MAPREDUCE-5486:

Summary: Potential file handler leak in JobHistoryServer web ui.
Key: MAPREDUCE-5486
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5486
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver
Affects Versions: 2.0.5-alpha, 2.1.1-beta
Reporter: Rohith Sharma K S

If any problem occurs while fetching aggregated logs for rendering on the web UI, the LogReader is not closed. Because the reader is not closed, many connections are left in the CLOSE_WAIT state.

hadoopuser@hadoopuser:> jps
*27909* JobHistoryServer

The DataNode port is 50010. Grepping for the DataNode port shows many connections from the JHS in CLOSE_WAIT.

hadoopuser@hadoopuser:> netstat -tanlp |grep 50010
tcp 0 0 10.18.40.48:50010 0.0.0.0:* LISTEN 21453/java
tcp 1 0 10.18.40.48:20596 10.18.40.48:50010 CLOSE_WAIT *27909*/java
tcp 1 0 10.18.40.48:19667 10.18.40.152:50010 CLOSE_WAIT *27909*/java
tcp 1 0 10.18.40.48:20593 10.18.40.48:50010 CLOSE_WAIT *27909*/java
tcp 1 0 10.18.40.48:12290 10.18.40.48:50010 CLOSE_WAIT *27909*/java
tcp 1 0 10.18.40.48:19662 10.18.40.152:50010 CLOSE_WAIT *27909*/java

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
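The manual check in MAPREDUCE-5486 (grep the netstat output for the DataNode port and look for CLOSE_WAIT entries owned by the JobHistoryServer PID) can be automated. Below is a minimal Python sketch under stated assumptions: the helper name `close_wait_by_pid` is hypothetical, the input is assumed to follow the `netstat -tanlp` column order (protocol, Recv-Q, Send-Q, local address, remote address, state, PID/program), and the asterisks around the PID are treated as email emphasis and stripped.

```python
from collections import Counter

# Sample modelled on the netstat output quoted in the issue.
SAMPLE = """\
tcp 0 0 10.18.40.48:50010 0.0.0.0:* LISTEN 21453/java
tcp 1 0 10.18.40.48:20596 10.18.40.48:50010 CLOSE_WAIT *27909*/java
tcp 1 0 10.18.40.48:19667 10.18.40.152:50010 CLOSE_WAIT *27909*/java
tcp 1 0 10.18.40.48:20593 10.18.40.48:50010 CLOSE_WAIT *27909*/java
tcp 1 0 10.18.40.48:12290 10.18.40.48:50010 CLOSE_WAIT *27909*/java
tcp 1 0 10.18.40.48:19662 10.18.40.152:50010 CLOSE_WAIT *27909*/java
"""


def close_wait_by_pid(netstat_output, remote_port):
    """Count CLOSE_WAIT connections to remote_port, grouped by owning PID."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: proto, recv-q, send-q, local, remote, state, pid/program
        if len(fields) < 7 or fields[5] != "CLOSE_WAIT":
            continue
        port = fields[4].rsplit(":", 1)[-1]
        if port != str(remote_port):
            continue
        # Drop the program name and any emphasis asterisks around the PID.
        pid = fields[6].split("/", 1)[0].strip("*")
        counts[pid] += 1
    return dict(counts)


print(close_wait_by_pid(SAMPLE, 50010))
```

A count that keeps growing for the JobHistoryServer PID while the web UI is in use would be consistent with the unclosed reader the issue describes.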
[jira] [Created] (MAPREDUCE-5444) MRAppMaster throws InvalidStateTransitonException: Invalid event: JOB_AM_REBOOT at SUCCEEDED
Rohith Sharma K S created MAPREDUCE-5444:

Summary: MRAppMaster throws InvalidStateTransitonException: Invalid event: JOB_AM_REBOOT at SUCCEEDED
Key: MAPREDUCE-5444
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5444
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: applicationmaster
Reporter: Rohith Sharma K S
Priority: Minor

{noformat}
2013-08-02 14:55:11,537 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Calling handler for JobFinishedEvent
2013-08-02 14:55:11,538 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1375199817609_0049Job Transitioned from COMMITTING to SUCCEEDED
2013-08-02 14:55:11,663 INFO [Thread-52] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://0.0.0.0:45000/home/restest/staging-dir/restest/.staging/job_1375199817609_0049/job_1375199817609_0049_2.jhist to hdfs://0.0.0.0:45000/home/restest/staging-dir/history/done_intermediate/restest/job_1375199817609_0049-1375435337429-restest-word+count-1375435511533-10-1-SUCCEEDED-a.jhist_tmp
2013-08-02 14:55:11,750 INFO [Thread-52] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://0.0.0.0:45000/home/restest/staging-dir/history/done_intermediate/restest/job_1375199817609_0049-1375435337429-restest-word+count-1375435511533-10-1-SUCCEEDED-a.jhist_tmp
2013-08-02 14:55:11,769 INFO [Thread-52] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://0.0.0.0:45000/home/restest/staging-dir/restest/.staging/job_1375199817609_0049/job_1375199817609_0049_2_conf.xml to hdfs://0.0.0.0:45000/home/restest/staging-dir/history/done_intermediate/restest/job_1375199817609_0049_conf.xml_tmp
2013-08-02 14:55:11,880 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:1 CompletedMaps:10 CompletedReds:1 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2013-08-02 14:55:13,649 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error communicating with RM: Resource Manager doesn't recognize AttemptId: application_1375199817609_0049
org.apache.hadoop.yarn.YarnException: Resource Manager doesn't recognize AttemptId: application_1375199817609_0049
	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:626)
	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:238)
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:250)
	at java.lang.Thread.run(Thread.java:662)
2013-08-02 14:55:13,649 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_AM_REBOOT at SUCCEEDED
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:914)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:129)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1114)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1110)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
	at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.realDispatch(RecoveryService.java:309)
	at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.dispatch(RecoveryService.java:305)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
	at java.lang.Thread.run(Thread.java:662)
2013-08-02 14:55:13,652 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: JobHistoryEvent is triggered from JobImpl
2013-08-02 14:55:13,652 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1375199817609_0049Job Transitioned from SUCCEEDED to ERROR
{noformat}
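The failure in MAPREDUCE-5444 comes down to a table-driven state machine that has no entry for the pair (SUCCEEDED, JOB_AM_REBOOT), so a job that has already succeeded ends up in ERROR. The toy Python model below only illustrates that mechanism. It is not Hadoop code: the transition table is a heavily reduced, hypothetical stand-in for JobImpl, the event name `JOB_COMMIT_COMPLETED` is used for illustration, and the `ignore_in_terminal` switch shows one possible remedy, not necessarily the fix adopted in Hadoop.

```python
class InvalidStateTransition(Exception):
    """Raised when no transition is registered for (state, event)."""


# Hypothetical, heavily reduced transition table for illustration only.
TRANSITIONS = {
    ("COMMITTING", "JOB_COMMIT_COMPLETED"): "SUCCEEDED",
}

# States after which the job outcome should no longer change (assumption).
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "KILLED", "ERROR"}


def do_transition(state, event, ignore_in_terminal=False):
    """Look up the next state; optionally ignore events in terminal states."""
    key = (state, event)
    if key in TRANSITIONS:
        return TRANSITIONS[key]
    if ignore_in_terminal and state in TERMINAL_STATES:
        return state  # stay put instead of treating the event as an error
    raise InvalidStateTransition(f"Invalid event: {event} at {state}")


def handle(state, event, ignore_in_terminal=False):
    """Mimic the behaviour in the log: an unhandled event moves the job to ERROR."""
    try:
        return do_transition(state, event, ignore_in_terminal)
    except InvalidStateTransition as exc:
        print(f"Can't handle this event at current state: {exc}")
        return "ERROR"


state = handle("COMMITTING", "JOB_COMMIT_COMPLETED")
print(state)
print(handle(state, "JOB_AM_REBOOT"))
print(handle(state, "JOB_AM_REBOOT", ignore_in_terminal=True))
```

With the switch off, the model reproduces the SUCCEEDED to ERROR move seen in the log; with it on, the late reboot event leaves the job in SUCCEEDED.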
[jira] [Created] (MAPREDUCE-5441) JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
Rohith Sharma K S created MAPREDUCE-5441:

Summary: JobClient exit whenever RM issue Reboot command to 1st attempt App Master.
Key: MAPREDUCE-5441
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5441
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster, client
Affects Versions: 2.1.1-beta
Reporter: Rohith Sharma K S

When the RM issues a Reboot command to the app master, the app master shuts down gracefully. All history events are written to HDFS with the job status set to ERROR. The JobClient gets the job state as ERROR and exits. But the RM then launches a 2nd-attempt app master, and no client is left to get its job status. In the RM UI the job status is displayed as SUCCESS, but for the client the job has failed.