Thanks Robert, All,
So it seems that YARN-1493 and YARN-1490 are introducing serious regressions.

I would propose to revert them and the follow-up JIRAs from the 2.3 branch and keep working on them on trunk/branch-2 until they are stable (I would even prefer reverting them from branch-2 so as not to block a 2.4 if they are not ready in time).

As I've mentioned before, the list of JIRAs to revert was:

YARN-1493
YARN-1490
YARN-1166
YARN-1041
YARN-1566

Plus 2 additional JIRAs committed since my email on this issue 2 days ago:

YARN-1661
YARN-1689 (not sure if this JIRA is related in functionality to the previous ones, but it is creating conflicts).

I think we should hold off on continuing work on top of something that is broken until the broken stuff is fixed.

Quoting Arun, "Committers - Henceforth, please use extreme caution while committing to branch-2.3. Please commit *only* blockers to 2.3."

YARN-1661 & YARN-1689 are not blockers.

Unless there are objections, I'll revert all these JIRAs from branch-2.3 tomorrow around noon and I'll update fixedVersion in the JIRAs.

I'm inclined to revert them from branch-2 as well.

Thoughts?

Thanks.

On Thu, Feb 6, 2014 at 3:54 PM, Robert Kanter <rkan...@cloudera.com> wrote:

> I think we should revert YARN-1490 from Hadoop 2.3 branch. I think it was
> causing some strange behavior in the Oozie unit tests:
>
> Basically, we use a single MiniMRCluster and MiniDFSCluster across all unit
> tests in a module. With YARN-1490 we saw that, regardless of test order,
> the last few tests would timeout waiting for an MR job to finish; on slower
> machines, the entire test suite would timeout. Through some digging, I
> found that we were getting a ton of "Connection refused" Exceptions on
> LeaseRenewer talking to the NN and a few on the AM talking to the RM.
>
> After a bunch of investigation, I found that the problem went away once
> YARN-1490 was removed. Though I couldn't figure out the exact problem.
> Even though this occurred in unit tests, it does make me concerned that it
> could indicate some bigger issue in a long-running real cluster (where
> everything isn't running on the same machine) that we haven't seen yet.
>
> On Thu, Feb 6, 2014 at 3:06 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>
> > I have marked MAPREDUCE-5744 a blocker for 2.3. Committing it shortly.
> > Will pull it out of branch-2.3 if anyone objects.
> >
> > On Thu, Feb 6, 2014 at 2:04 PM, Arpit Agarwal <aagar...@hortonworks.com>
> > wrote:
> >
> > > Merged HADOOP-10273 to branch-2.3 as r1565456.
> > >
> > > On Wed, Feb 5, 2014 at 4:49 PM, Arpit Agarwal <aagar...@hortonworks.com>
> > > wrote:
> > >
> > > > IMO HADOOP-10273 (Fix 'mvn site') should be included in 2.3.
> > > >
> > > > I will merge it to branch-2.3 tomorrow PST if no one disagrees.
> > > >
> > > > On Tue, Feb 4, 2014 at 5:03 PM, Alejandro Abdelnur <t...@cloudera.com>
> > > > wrote:
> > > >
> > > >> IMO YARN-1577 is a blocker, it is breaking unmanaged AMs in very odd
> > > >> ways (to the point it seems un-deterministic).
> > > >>
> > > >> I'd say either YARN-1577 is fixed or we revert
> > > >> YARN-1493/YARN-1490/YARN-1166/YARN-1041/YARN-1566 (almost clean
> > > >> reverts) from Hadoop 2.3 branch before doing the release.
> > > >>
> > > >> I've verified that after reverting those JIRAs things work fine with
> > > >> unmanaged AMs.
> > > >>
> > > >> Thanks.
> > > >>
> > > >> On Tue, Feb 4, 2014 at 11:45 AM, Arun C Murthy <a...@hortonworks.com>
> > > >> wrote:
> > > >>
> > > >> > I punted YARN-1444 to 2.4 since it's a long-standing issue.
> > > >> >
> > > >> > Jian is away and I don't see YARN-1577 & YARN-1206 making much
> > > >> > progress till he is back; so I'm inclined to push both to 2.4 too.
> > > >> > Any objections?
> > > >> >
> > > >> > Looks like Daryn has both HADOOP-10301 & HDFS-4564 covered.
> > > >> >
> > > >> > Overall, I'll try to get this out in the next couple of days if we
> > > >> > can clear the list.
> > > >> >
> > > >> > thanks,
> > > >> > Arun
> > > >> >
> > > >> > On Feb 3, 2014, at 12:14 PM, Arun C Murthy <a...@hortonworks.com>
> > > >> > wrote:
> > > >> >
> > > >> > > An update. Per https://s.apache.org/hadoop-2.3.0-blockers we are
> > > >> > > now down to 5 blockers: 1 Common, 1 HDFS, 3 YARN.
> > > >> > >
> > > >> > > Daryn (thanks!) has both the non-YARN covered. Vinod is helping
> > > >> > > out with the YARN ones.
> > > >> > >
> > > >> > > thanks,
> > > >> > > Arun
> > > >> >
> > > >> > --
> > > >> > Arun C. Murthy
> > > >> > Hortonworks Inc.
> > > >> > http://hortonworks.com/
> > > >> >
> > > >> > --
> > > >> > CONFIDENTIALITY NOTICE
> > > >> > NOTICE: This message is intended for the use of the individual or
> > > >> > entity to which it is addressed and may contain information that is
> > > >> > confidential, privileged and exempt from disclosure under applicable
> > > >> > law. If the reader of this message is not the intended recipient,
> > > >> > you are hereby notified that any printing, copying, dissemination,
> > > >> > distribution, disclosure or forwarding of this communication is
> > > >> > strictly prohibited. If you have received this communication in
> > > >> > error, please contact the sender immediately and delete it from your
> > > >> > system. Thank You.
> > > >>
> > > >> --
> > > >> Alejandro

--
Alejandro