+1. -- Hitesh
On Jul 25, 2012, at 6:40 PM, Arun C Murthy wrote:

> Folks,
>
> It's been nearly a year since we merged Hadoop YARN into trunk and we have
> made several releases since.
>
> It's exciting to see various open-source communities (both in the ASF and
> externally) start to explore integration with YARN such as Apache Hama,
> Apache Giraph, Apache S4, Spark etc. This promises to help us realize our
> hopes of making Apache Hadoop a much more general data processing platform (&
> storage, of course) and not tied to MapReduce alone for processing data.
> Furthermore, we already have people contributing interesting prototypes such
> as DistributedShell and PaaS on YARN.
>
> Given this, I think it would be useful to make YARN a sub-project of Apache
> Hadoop along with Common, HDFS & MapReduce. I believe this would help other
> communities realize that they could consider using YARN as a general-purpose
> resource management layer and help us enhance YARN beyond its humble
> beginnings.
>
> Clearly, YARN and MapReduce are different enough that they can and will
> attract a diverse community.
>
> I'd like to clarify that this proposal *does not* mean we move the code base
> out of the hadoop/common/ tree. It just elevates hadoop-yarn alongside
> hadoop-common, hadoop-hdfs & hadoop-mapreduce in hadoop/trunk. Also, there
> would be *no changes* to release cycles - YARN would be co-released with
> Common, HDFS & MapReduce.
>
> Thoughts?
>
> ----
>
> What does it mean to the Hadoop developer community?
>
> # Project dependencies
>
> The change is that Hadoop would now have 4 sub-projects: Common, HDFS, YARN &
> MapReduce. As today, the dependencies *do not change*:
> - Common is the base
> - HDFS depends only on Common
> - YARN depends only on Common & HDFS
> - MapReduce depends on Common, HDFS & YARN.
>
> # Jira & Mailing lists
>
> We would have a separate YARN jira project and a yarn-dev@ mailing list.
>
> We already use separate MAPREDUCE jira issues for making changes to YARN
> (ResourceManager, NodeManager) and to the MapReduce framework (MapReduce
> ApplicationMaster, MapReduce runtime etc.). Hence, this isn't much of a
> change.
>
> # Subversion
>
> Not much at all! YARN has, since the beginning, been developed with the
> understanding that it is very independent of MapReduce and the code-bases are
> already independent i.e. hadoop-mapreduce-project/hadoop-yarn and
> hadoop-mapreduce-project/hadoop-mapreduce-client.
>
> Essentially the change would be:
> $ svn mv hadoop-mapreduce-project/hadoop-yarn hadoop-yarn-project/hadoop-yarn
> ... and the necessary, albeit small, changes to our maven build
> infrastructure.
>
> # Release Cycles
>
> No changes.
>
> YARN would be co-released with Common, HDFS & MapReduce, as is the case today.
>
> thanks,
> Arun