Hello,

Below is a proposal for a new incubator project. The idea came out of, and had 
strong support on, the Hadoop general list; see the thread at 
http://mail-archives.apache.org/mod_mbox/hadoop-general/201209.mbox/browser.

We are looking for feedback, for others interested in contributing to this 
effort as committers, and for an additional mentor.

Cheers,
Adam Berry


Hadoop Development Tools Proposal

I’d like to propose the Hadoop Development Tools, a set of extensions to the 
Eclipse IDE to support developing against Apache Hadoop technologies.

= HDT (Hadoop Development Tools) =

== Abstract ==
 Tools to support developing applications that use Apache Hadoop from within 
Eclipse.

== Proposal ==
 Hadoop Development Tools are a set of extensions to Eclipse providing support 
for creating, launching and debugging distributed applications, as well as 
interacting with HDFS filesystems. This work will build on the existing Map 
Reduce Tools present in the Apache Hadoop project.

== Background ==
 Map Reduce Tools have existed as part of contrib for Apache Hadoop. 
Unfortunately they are tied at the source level to a single version of Hadoop, 
and development has stalled, with little movement past the Hadoop 0.20 line.

== Rationale ==
 Support for newer versions of Hadoop from within Eclipse is regularly raised 
on the Hadoop mailing lists, so there is a clear need to drive these tools 
forward. Development tools are generally developed separately from the target 
tools/platform. Separating the tools out will allow them to support multiple 
versions, so a developer could work with a heterogeneous environment.

== Initial Goals ==
 * Give the tools project a home of its own.
 * Port current MapReduce tools feature set to all current release lines of 
Hadoop in a single Eclipse install.
 * Documentation and tutorials for all features.
 * Publish Eclipse update site, and join Eclipse marketplace listing.
 * Establish release cycle that combines support for Hadoop and Eclipse release 
cycles.
 * Look to build support for YARN, MRUnit and possibly other Hadoop-related 
projects.

== Current Status ==
The source for the current MapReduceTools lives in the contrib section of the 
Hadoop source. In its current implementation it is tied to the version of 
Hadoop against which it is compiled. The layout and API that it was developed 
with mean that it can only be used with the 0.20 or 1.0 Hadoop releases; the 
new layout and YARN API introduced with the 0.23 and 2.0 lines are not 
supported.


=== Meritocracy ===
Several people and companies have already expressed an interest in contributing 
to this project, and we hope to attract additional interest during the proposal 
discussion. We plan to invest in and support a meritocracy that attracts, 
invites, and supports newcomers to build a vibrant and diverse community.

=== Community ===
The target community is developers who are building Map/Reduce applications 
against Hadoop. Given the success of Hadoop, the target group is likely to be 
quite large. Separation from the Hadoop community would make it easier to 
support multiple versions of Hadoop, and to align with the release cycles of 
both Hadoop and Eclipse to provide predictable iteration and improvement in 
the toolset.

=== Core Developers ===
The initial list of developers includes people experienced with developing 
against the Eclipse platform.
 * Adam Berry (amberry at yahoo-inc dot com)
 * Jeffrey Zemerick (jeffrrey at mtnfog dot com)

=== Alignment ===
Hadoop Development Tools aligns with both Hadoop and Eclipse: Hadoop is the 
platform that development targets, and Eclipse is the IDE platform used as 
the base for the tools.

== Known Risks ==

=== Orphaned Products ===

=== Inexperience with Open Source ===
Adam Berry has experience with the Eclipse open source community, and has been 
building familiarity with the Apache processes through patches to the existing 
source.

=== Reliance on Salaried Developers ===
Hadoop Development Tools will be developed with a mix of salaried and volunteer 
time.

=== Relationships with Other Apache Projects ===
Hadoop Development Tools is closely related to Apache Hadoop.

=== An Excessive Fascination with the Apache Brand ===
 Given the success of Hadoop and associated projects, Apache is the natural 
place for the Hadoop Development Tools. Chris Mattmann suggested the Apache 
Incubator as appropriate on the Hadoop general mailing list, following the 
success that MRUnit had taking the path from Hadoop contrib to an Apache 
top-level project.

== Documentation ==
Documentation for the current tools can be found at 
http://wiki.apache.org/hadoop/EclipsePlugIn

== Initial Source ==
http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/eclipse-plugin/

== Source and Intellectual Property Submission Plan ==
The source and any suggested initial patches are already hosted in either 
Apache’s Subversion or JIRA.

== External Dependencies ==
 * Eclipse Platform
 * Eclipse Java Development Tools

== Cryptography ==
Hadoop Development Tools likely does not fall into this area.

== Required Resources ==
=== Mailing lists ===
 * hdt-dev
 * hdt-commits
 * hdt-user

=== Subversion Directory ===
 * https://svn.apache.org/repos/asf/incubator/hdt

=== Issue Tracking ===
 * JIRA Hadoop Development Tools (HDT)

=== Other Resources ===
 * Jenkins/Hudson for builds and test running.

== Initial Committers ==
 * Adam Berry (amberry at yahoo-inc dot com)
 * Jeffrey Zemerick (jeffrrey at mtnfog dot com)

== Affiliations ==
 * Adam Berry - Yahoo!
 * Jeffrey Zemerick - Mountain Fog

== Sponsors ==
=== Champion ===
Chris Douglas

=== Nominated Mentors ===
 * Chris Douglas
 * Chris Mattmann

=== Sponsoring Entity ===
Incubator PMC
