Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Andrew Purtell
Looking at the voting, it appears YARN wants to become a TLP RIGHT NOW but at the price of the complete decoherence of the Apache Hadoop platform. For all of us who have invested in the Apache Hadoop platform, how does this benefit us? Certainly our interests seem to get little consideration with

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Mattmann, Chris A (388J)
Hi Andrew, How many new Apache Foundation *members* has the Hadoop PMC added over the past 3-4 years, and by whom (the answer to this question might surprise you)? The thing you and others continue not to see is that the ASF isn't about the most superior technical solutions, or the best

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Mattmann, Chris A (388J)
One quick fix to the below, sorry for the confusion: On Aug 30, 2012, at 11:15 PM, Mattmann, Chris A (388J) wrote: Hi Andrew, How many new Apache Foundation *members* has the Hadoop PMC added over the past 3-4 years, and by whom (the answer to this question might surprise you)? To

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Andrew Purtell
If Apache Hadoop -- as an umbrella or sum of its parts -- isn't practical to develop end applications or downstream projects on, the community will disappear. I don't follow your logic. I deal with the technical realities of actually trying to use an Apache Hadoop distribution, the pieces released

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Mattmann, Chris A (388J)
Hi Andrew, On Aug 30, 2012, at 11:42 PM, Andrew Purtell wrote: If Apache Hadoop -- as an umbrella or sum of its parts -- isn't practical to develop end applications or downstream projects on, the community will disappear. Sure, the end-user community might disappear, but the point I'm trying

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Andrew Purtell
The end user community might disappear, and you are ok with this? I'm simply astonished. Who are these people showing up to help, document, be on lists, whatever, if not current or prospective end users? Who the hell shows up to write unit tests? Who is this public in public good? Looks to me like

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Steve Loughran
On 29 August 2012 21:34, Tom White t...@cloudera.com wrote: Eric - I agree with Common being included in HDFS. That's what I meant by Common not having a clear enough mission to be a TLP by itself. That makes sense too. Even better if you could do JIRAs/commits to the same codebase together.

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Robert Evans
Andrew, I agree with you that the DLL/CLASSPATH issue is one huge concern that needs to be addressed before we can really move forward with a valid long-term split. There is hope on the horizon for that, though, with some of the OSGi work that Tom White has been doing. Chris, I completely agree

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Mahadev Konar
I agree with Bobby and Andrew here. As has been said on this thread, I think the technical issues should be addressed. Just going ahead and doing the split will be counterproductive. I am all for the projects going TLP (sooner rather than later) but I think we need to work through a plan on when/how

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Mattmann, Chris A (388J)
Hi Bobby, and Andrew, Sorry I think both of you are still missing my point (maybe I'm wrong). And sorry that I've failed to explain it in such a way that you guys understand, that's as much my issue as anyone else's. My point is - technical issues, such as how to pull apart components and

Re: Heads up: next hadoop-2 release

2012-08-31 Thread Arun C Murthy
Eli, Good point. Looks like both HDFS and MR committers have been lax in maintaining branch-2.1.0-alpha. Lots of stabilization has occurred via branch-2/branch-0.23. Since branch-0.23 is nearly there, I'll go ahead and cherry-pick to branch-2.1.0-alpha - do you mind doing the same for HDFS from

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Roman Shaposhnik
On Tue, Aug 28, 2012 at 7:33 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: [decided to minimize traffic and to simply put this in one thread] Hi Guys, See the recent discussion on these threads: YARN as its own Hadoop sub project: http://s.apache.org/WW1 Maintain a

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Doug Cutting
On Fri, Aug 31, 2012 at 8:09 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: I am saying that the current members of the Apache Software Foundation's Hadoop Project Management Committee exhibit the characteristics (not just during discrete events; it's been happening for a

Re: [VOTE] Maintain a single committer list for the Hadoop project

2012-08-31 Thread Arun C Murthy
On Aug 30, 2012, at 9:30 PM, Eli Collins wrote: With 12 +1s (9 binding) and 3 -1s (binding) the vote passes. I'll update the bylaws to reflect the merged committer lists (http://s.apache.org/Owx). Also, could you please update the site to reflect this? thanks, Arun Thanks, Eli On

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Mattmann, Chris A (388J)
Hey Doug, On Aug 31, 2012, at 9:00 AM, Doug Cutting wrote: On Fri, Aug 31, 2012 at 8:09 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: I am saying that the current members of the Apache Software Foundation's Hadoop Project Management Committee exhibit the characteristics

Re: Heads up: next hadoop-2 release

2012-08-31 Thread Eli Collins
On Fri, Aug 31, 2012 at 8:48 AM, Arun C Murthy a...@hortonworks.com wrote: Eli, Good point. Looks like both HDFS and MR committers have been lax in maintaining branch-2.1.0-alpha. Lots of stabilization has occurred via branch-2/branch-0.23. Since branch-0.23 is nearly there, I'll go ahead

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Eli Collins
How about a proposal to just spin YARN off as a TLP? Rationale: 1. YARN started as a separate project and has a more independent community than Common/HDFS/MR (per below these communities do not divide at sub-project boundaries) that appears to want to be even more independent. 2. YARN is

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Robert Evans
The problem there is that YARN depends on Common, and MapReduce depends on YARN, so we would either have a circular dependency or we would have to split off MapReduce too. --Bobby On 8/31/12 11:54 AM, Eli Collins e...@cloudera.com wrote: How about a proposal to just spin YARN off as a TLP?

Re: [VOTE] Maintain a single committer list for the Hadoop project

2012-08-31 Thread Eli Collins
On Fri, Aug 31, 2012 at 9:04 AM, Arun C Murthy a...@hortonworks.com wrote: On Aug 30, 2012, at 9:30 PM, Eli Collins wrote: With 12 +1s (9 binding) and 3 -1s (binding) the vote passes. I'll update the bylaws to reflect the merged committer lists (http://s.apache.org/Owx). Also, could you

Re: Heads up: next hadoop-2 release

2012-08-31 Thread Eli Collins
Great, it definitely makes sense on the HDFS side (I was going to merge in everything from branch-2) so if it makes sense for MR/YARN as well then let's just cut a new branch and update all the 2.2.0 fixed jiras to be 2.1.0. Thanks, Eli On Fri, Aug 31, 2012 at 9:45 AM, Arun C Murthy

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Todd Lipcon
On Fri, Aug 31, 2012 at 9:58 AM, Robert Evans ev...@yahoo-inc.com wrote: The problem there is that YARN depends on Common, and MapReduce depends on YARN, so we would either have a circular dependency or we would have to split off MapReduce too. I haven't been in the MR codebase much of late,

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Alejandro Abdelnur
On Fri, Aug 31, 2012 at 9:59 AM, Todd Lipcon t...@cloudera.com wrote: As for technical things we need to do to get to a feasible split: big +1 that classpath pollution issues are near top of the list. We need a reasonable classloader strategy, and I think Tom's OSGi stuff is a good start in

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Alejandro Abdelnur
s/pen/pencil/ On Fri, Aug 31, 2012 at 10:10 AM, Alejandro Abdelnur t...@cloudera.com wrote: On Fri, Aug 31, 2012 at 9:59 AM, Todd Lipcon t...@cloudera.com wrote: As for technical things we need to do to get to a feasible split: big +1 that classpath pollution issues are near top of the list.

RE: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Jagane Sundar
As for technical things we need to do to get to a feasible split: big +1 that classpath pollution issues are near top of the list. We need a reasonable classloader strategy, and I think Tom's OSGi stuff is a good start in that direction. But it's going to be quite some time before that's all

Re: Heads up: next hadoop-2 release

2012-08-31 Thread Robert Evans
Yes, I hope to branch and make the RC Tuesday of next week. There was a flurry of new issues that came up recently. I am now waiting on YARN-60, YARN-66, MAPREDUCE-4612, and MAPREDUCE-4611, but I would also like to have in HDFS-3852, MAPREDUCE-4604, and HADOOP-8727. --Bobby On 8/31/12 11:45 AM, Arun C

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Robert Evans
That would be wonderful to have. +1 I would love to see MR run on more than just HDFS/YARN, so people can pick what execution environment makes sense for them, just like what MPI does, or something like what HDFS does with FileSystem. My perspective was just from the current state of things, if

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Inder.dev Java
How often do you call for emeriti lists? Otherwise, if the list simply keeps growing, people may be surprised like this. Some people (Eli Collins) also raised concerns above about growing lists. But in reality all those people may not be active and may not have looked at the project for a long time. Having them in

Re: Heads up: next hadoop-2 release

2012-08-31 Thread Vinod Kumar Vavilapalli
On Aug 31, 2012, at 9:45 AM, Arun C Murthy wrote: Makes sense. The only issue is rejiggering fix-versions, CHANGES.txt etc., but I can do that. My plan is to create the RC after 0.23.3 completes the test/integration cycle it's currently in - should be done in the next week or so barring

Re: Heads up: next hadoop-2 release

2012-08-31 Thread Eli Collins
On Fri, Aug 31, 2012 at 12:21 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: On Aug 31, 2012, at 9:45 AM, Arun C Murthy wrote: Makes sense. The only issue is rejiggering fix-versions, CHANGES.txt etc., but I can do that. My plan is to create the RC after 0.23.3 completes the

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Doug Cutting
On Fri, Aug 31, 2012 at 12:00 PM, Inder.dev Java java.in...@gmail.com wrote: How often do you call for emeriti lists? Otherwise, if the list simply keeps growing, people may be surprised like this. Some people (Eli Collins) also raised concerns above about growing lists. But in reality all

Re: svn branches cleanup

2012-08-31 Thread Owen O'Malley
On Mon, Aug 27, 2012 at 6:55 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Seems to me that stale branches have started accumulating. I just cleaned up some obvious trash from the branches directory: MR-279, MR-279-merge, MR-279-merge-to-trunk, yahoo-merge. I'd also like to delete

Re: svn branches cleanup

2012-08-31 Thread Arun C Murthy
On Aug 31, 2012, at 2:26 PM, Owen O'Malley wrote: We should rename the 2.x branches as: branch-2.0.1-alpha -> branch-2.0 and branch-2.1.0-alpha -> branch-2.1. I'm ok with the rest, but please don't rename the 'alpha' branches. We'll need those for later after hadoop-2 is stable. thanks, Arun

Re: svn branches cleanup

2012-08-31 Thread Eli Collins
On Fri, Aug 31, 2012 at 2:26 PM, Owen O'Malley omal...@apache.org wrote: On Mon, Aug 27, 2012 at 6:55 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Seems to me that stale branches have started accumulating. I just cleaned up some obvious trash from the branches directory:

Re: svn branches cleanup

2012-08-31 Thread Owen O'Malley
On Fri, Aug 31, 2012 at 2:28 PM, Arun C Murthy a...@hortonworks.com wrote: On Aug 31, 2012, at 2:26 PM, Owen O'Malley wrote: We should rename the 2.x branches as: branch-2.0.1-alpha -> branch-2.0 and branch-2.1.0-alpha -> branch-2.1. I'm ok with the rest, but please don't rename the 'alpha'

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Eric Baldeschwieler
I'd be fascinated to hear more from folks who have led other projects at Apache about how Hadoop's (and Lucene's, same DNA) committer management process compares, and some good and bad lessons learned from those projects. Chris M has mentioned his experience from other projects. Can others comment?

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

2012-08-31 Thread Eric Baldeschwieler
Hi Folks, Lots of good points raised here. I remain convinced that the time to split Hadoop into TLPs is here, but I think we should also consider the practical concerns raised. Hadoop 2.0 has been years of work in the making and is finally relatively close. I think it would be a mistake to