[DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-25 Thread Arun C Murthy
Folks, It's been nearly a year since we merged Hadoop YARN into trunk and we have made several releases since. It's exciting to see various open-source communities (both in the ASF and externally) start to explore integration with YARN such as Apache Hama, Apache Giraph, Apache S4, Spark etc.

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-25 Thread Mattmann, Chris A (388J)
Hi Arun, IMHO, it sounds like you guys might be better off proposing a new project for the Apache Incubator. Looking at the things you list below the ---, it looks like an Incubator proposal minus the initial committer list, and affiliations and mentors/champions ;) If you don't want to go to t

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-25 Thread Edward J. Yoon
> Given this, I think it would be useful to make YARN a sub-project of Apache > Hadoop along with Common, HDFS & MapReduce. I believe this would help other > communities realize that they could consider using YARN as a general-purpose > resource management layer and help us enhance YARN beyond i

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-25 Thread Arun C Murthy
Hi Chris, On Jul 25, 2012, at 7:03 PM, Mattmann, Chris A (388J) wrote: > Hi Arun, > > IMHO, it sounds like you guys might be better off proposing a new project for > the Apache Incubator. > Looking at the things you list below the ---, it looks like an Incubator > proposal minus the initial co

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-25 Thread Mattmann, Chris A (388J)
Hey Arun, On Jul 25, 2012, at 7:11 PM, Arun C Murthy wrote: > Hi Chris, > > On Jul 25, 2012, at 7:03 PM, Mattmann, Chris A (388J) wrote: > >> Hi Arun, >> >> IMHO, it sounds like you guys might be better off proposing a new project >> for the Apache Incubator. >> Looking at the things you list

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Aaron T. Myers
On Wed, Jul 25, 2012 at 7:30 PM, Mattmann, Chris A (388J) < chris.a.mattm...@jpl.nasa.gov> wrote: > I realize I'm asking a hard question here: why *aren't* they separate > projects? What's the barrier? They seem > to be operating that way (and have been for a while). And I don't see how > Hadoop s

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Mahadev Konar
+1 mahadev On Wed, Jul 25, 2012 at 7:09 PM, Edward J. Yoon wrote: >> Given this, I think it would be useful to make YARN a sub-project of Apache >> Hadoop along with Common, HDFS & MapReduce. I believe this would help other >> communities realize that they could consider using YARN as a g

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Tom White
On Wed, Jul 25, 2012 at 9:40 PM, Arun C Murthy wrote: > Folks, > > It's been nearly a year since we merged Hadoop YARN into trunk and we have > made several releases since. > > It's exciting to see various open-source communities (both in the ASF and > externally) start to explore integration wi

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Robert Evans
+1 for what Aaron said. The projects are not ready to split yet. MAPREDUCE-3300 for example. YARN cannot display a UI for aggregated container logs unless we also have the MR History Server up and running. If we do want to split all of the projects HDFS, COMMON, YARN, and MAPREDUCE it will take s

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Mattmann, Chris A (388J)
Hey Aaron, On Jul 25, 2012, at 11:16 PM, Aaron T. Myers wrote: > On Wed, Jul 25, 2012 at 7:30 PM, Mattmann, Chris A (388J) < > chris.a.mattm...@jpl.nasa.gov> wrote: > >> I realize I'm asking a hard question here: why *aren't* they separate >> projects? What's the barrier? They seem >> to be oper

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Mattmann, Chris A (388J)
Thanks for your comments Bobby, makes sense. Cheers, Chris On Jul 26, 2012, at 7:28 AM, Robert Evans wrote: > +1 for what Aaron said. The projects are not ready to split yet. > MAPREDUCE-3300 for example. YARN cannot display a UI for aggregated > container logs unless we also have the MR Histo

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Alejandro Abdelnur
+1 on moving hadoop-yarn to trunk/ level. As part of that, can we flatten the internal hierarchy so there are not multiple nested modules within the hadoop-yarn module? Just one level, as in common, hdfs & tools? This will make the build more consistent and will allow us to consolidate logic in the POMs. T

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Steve Loughran
On 25 July 2012 18:40, Arun C Murthy wrote: > Folks, > > It's been nearly a year since we merged Hadoop YARN into trunk and we have > made several releases since. > > It's exciting to see various open-source communities (both in the ASF and > externally) start to explore integration with YARN suc

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Steve Loughran
On 26 July 2012 08:10, Alejandro Abdelnur wrote: > As part of that, can we flatten > the internal hierarchy so there are not multiple nested modules within > hadoop-yarn module? just one level as in common, hdfs & tools? this will > make the build more consistent and will allow to consolidate log

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Suresh Srinivas
+1 from me. The main question is, is this a good idea without considering the details of how easy/hard it is to do? I think it is a good idea and we should move in this direction. If we all agree on this, let's discuss the main issues that need to be resolved to split YARN into a separate project. As o

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Finger, Jay
I'm not sure what the goal of that is. If this is an Apache organizational/political thing then I am oblivious. If the point is that YARN should not be a subproject of MapReduce, then I agree completely. Any argument by which YARN is a subproject of MR could also be made that YARN should be a su

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Aaron T. Myers
Hi Chris, On Thu, Jul 26, 2012 at 8:00 AM, Mattmann, Chris A (388J) < chris.a.mattm...@jpl.nasa.gov> wrote: > Sub projects are not a good thing at Apache. Well, "official" sub projects > that have their own committees, mailing lists, etc. You guys aren't talking > about sub projects (though you c

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Mattmann, Chris A (388J)
Thanks Aaron, makes total sense. Take care! Cheers, Chris On Jul 26, 2012, at 10:20 AM, Aaron T. Myers wrote: > Hi Chris, > > On Thu, Jul 26, 2012 at 8:00 AM, Mattmann, Chris A (388J) < > chris.a.mattm...@jpl.nasa.gov> wrote: > >> Sub projects are not a good thing at Apache. Well, "official"

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Luke Lu
+1. Probably should've done so when we mavenized the whole thing :) On Wed, Jul 25, 2012 at 6:40 PM, Arun C Murthy wrote: > Folks, > > It's been nearly a year since we merged Hadoop YARN into trunk and we have > made several releases since. > > It's exciting to see various open-source communitie

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Thomas Graves
+1 for the idea. I think separating the framework from the MR application makes sense. Tom On 7/25/12 8:40 PM, "Arun C Murthy" wrote: > Folks, > > It's been nearly a year since we merged Hadoop YARN into trunk and we have > made several releases since. > > It's exciting to see various op

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Hitesh Shah
+1. -- Hitesh On Jul 25, 2012, at 6:40 PM, Arun C Murthy wrote: > Folks, > > It's been nearly a year since we merged Hadoop YARN into trunk and we have > made several releases since. > > It's exciting to see various open-source communities (both in the ASF and > externally) start to explore

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Doug Cutting
+1 This would be an improved layering of components. As others have noted we should probably stop using the term "subproject" for these, as that's most often used at Apache for things that are released independently. Better terms might be "components" or "modules". Addressing that might also re

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Ahmed Radwan
Thanks Arun! +1, this organization makes sense. Also, what will be the strategy for applications other than MapReduce going forward? Will they be part of YARN or separate sub-projects like MapReduce? They now live inside hadoop-yarn-applications. I think they can remain there, and when getting matu

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Jun Ping Du
+1. Separating YARN will definitely take some work, but it is worth it. Thanks, Junping - Original Message - From: "Arun C Murthy" To: general@hadoop.apache.org Sent: Thursday, July 26, 2012 9:40:21 AM Subject: [DISCUSS] - YARN as a sub-project of Apache Hadoop Fo

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Arun C Murthy
Looks like the feedback has been very positive, I'll start a vote to formalize it. thanks, Arun On Jul 25, 2012, at 6:40 PM, Arun C Murthy wrote: > Folks, > > It's been nearly a year since we merged Hadoop YARN into trunk and we have > made several releases since. > > It's exciting to see va

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Zizon Qiu
Why not rename MAPREDUCE to YARN, since in Hadoop 2.0 MR2 is an implementation of YARN? On Fri, Jul 27, 2012 at 11:20 AM, Arun C Murthy wrote: > Looks like the feedback has been very positive, I'll start a vote to > formalize it. > > thanks, > Arun > > On Jul 25, 2012, at 6:40 PM, Arun C Murthy wrot

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-26 Thread Harsh J
Hi Zizon, MR is still MR, while YARN is a resource scheduler (generic, agnostic of 'MR'). MR1 ran over JobTracker and TaskTrackers, while MR2 runs from an AM and runs tasks via YARN. It would not make sense to rename MR to YARN as these are separate things, and calling YARN MR2 only adds to t

Re: [DISCUSS] - YARN as a sub-project of Apache Hadoop

2012-07-27 Thread Steve Loughran
One more thing: I think the service lifecycle stuff (inner start/stop methods) is actually a layer below YARN and could go into common, though there are some things I'd like to fix there first (state machine doesn't let you stop without starting, implementations state checks happen after subclasse
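
For readers unfamiliar with the "stop without starting" complaint, here is a minimal sketch of a lifecycle state machine that does allow it. It is written in Python for brevity and is not Hadoop's Java service API; the `State` enum, the `ALLOWED` table, the `Service` class and the `inner_*` hook names are all hypothetical, chosen only to illustrate the idea.

```python
from enum import Enum


class State(Enum):
    NOTINITED = 0
    INITED = 1
    STARTED = 2
    STOPPED = 3


# Hypothetical transition table for this sketch.
# STOPPED is reachable from every state, so a service that was never
# started (or failed during init) can still be stopped and cleaned up.
ALLOWED = {
    State.NOTINITED: {State.INITED, State.STOPPED},
    State.INITED: {State.STARTED, State.STOPPED},
    State.STARTED: {State.STOPPED},
    State.STOPPED: {State.STOPPED},
}


class Service:
    def __init__(self):
        self.state = State.NOTINITED

    def _enter(self, target):
        # Validate the transition before any subclass hook runs.
        if target not in ALLOWED[self.state]:
            raise RuntimeError(
                f"cannot go from {self.state.name} to {target.name}"
            )
        self.state = target

    def init(self):
        self._enter(State.INITED)
        self.inner_init()

    def start(self):
        self._enter(State.STARTED)
        self.inner_start()

    def stop(self):
        # stop() is idempotent: the hook runs only on the first call.
        already_stopped = self.state is State.STOPPED
        self._enter(State.STOPPED)
        if not already_stopped:
            self.inner_stop()

    # Hooks for subclasses to override; no-ops by default.
    def inner_init(self):
        pass

    def inner_start(self):
        pass

    def inner_stop(self):
        pass
```

With this table, `Service().stop()` succeeds on a service that was never initialized or started, while `start()` on a stopped service raises `RuntimeError`. Whether stopped services should be restartable is a separate design question that this sketch deliberately answers with "no".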