As long as hadoop-tools is in some directory at some depth under trunk, the release of hadoop-tools is tied to the release of core.
So we actually have these two options instead:

(1) Separate source tree (http://svn.apache.org/repos/asf/hadoop/tools)
 -- Sources at tools/trunk/hadoop-distcp
 -- Each tool will work with a specific version of Hadoop core.
 -- Releases can really be separate.

(2) Same source tree: trunk/
 -- Sources at either (2.1) trunk/hadoop-tools or (2.2) trunk/hadoop-mapreduce-project/hadoop-mr-tools/hadoop-distcp/
 -- Given the release isn't decoupled anyway, either will work. (2.2) is preferable if building mapreduce builds the tools also.

+Vinod

On Tue, Aug 30, 2011 at 1:31 PM, Amareshwari Sri Ramadasu <amar...@yahoo-inc.com> wrote:
> Copying common-dev.
>
> Summarizing the below discussion: what should the tools layout be after mavenization?
>
> Option #1: Have hadoop-tools at the top level, i.e.
> trunk/
>   hadoop-tools/
>     hadoop-distcp/
> Pros:
> Cleaner layout.
> In future, tools could be released separately from Hadoop releases.
>
> Cons: Difficult to maintain.
>
> Option #2: Keep a tools aggregator module under MapReduce/HDFS/Common if the tools depend on MapReduce/HDFS/Common respectively.
> For example:
> hadoop-mapreduce-project/
>   hadoop-mr-tools/
>     hadoop-distcp/
>
> Pros: Easy to maintain.
> Cons: Still has tight coupling with the related projects.
>
> Personally, I'm fine with either of the above options. Looking for suggestions and reaching a consensus on this.
>
> Thanks
> Amareshwari
>
> On 8/30/11 12:10 AM, "Allen Wittenauer" <a...@apache.org> wrote:
>
> I have a feeling this discussion should get moved to common-dev or even to general.
>
> My #1 question is whether tools is basically contrib reborn. If not, what makes it different?
>
> On Aug 29, 2011, at 1:43 AM, Amareshwari Sri Ramadasu wrote:
>
> > Some questions on making hadoop-tools top level under trunk:
> >
> > 1. Should the patches for tools be created against Hadoop Common?
> > 2. What will happen to the tools test automation? Will it run as part of Hadoop Common tests?
> > 3. Will it introduce a dependency from MapReduce to Common? Or is this taken care of in Mavenization?
> >
> > Thanks
> > Amareshwari
> >
> > On 8/26/11 10:17 PM, "Alejandro Abdelnur" <t...@cloudera.com> wrote:
> >
> > Please, don't add more Mavenization work on us (eventually I want to go back to coding).
> >
> > Given that Hadoop is already Mavenized, the patch should be Mavenized.
> >
> > What will have to be done extra (besides Mavenizing distcp) is to create a hadoop-tools module at root level and within it a hadoop-distcp module.
> >
> > The hadoop-tools POM will look pretty much like the hadoop-common-project POM.
> >
> > The hadoop-distcp POM should follow the hadoop-common POM patterns.
> >
> > Thanks.
> >
> > Alejandro
> >
> > On Fri, Aug 26, 2011 at 9:37 AM, Amareshwari Sri Ramadasu <amar...@yahoo-inc.com> wrote:
> >
> >> Agree with Mithun and Robert. DistCp and tools restructuring are separate tasks. Since the DistCp code is ready to be committed, it need not wait for the tools separation from MR/HDFS.
> >> I would say it can go into contrib as the patch is now, and when the tools restructuring happens it would be just an svn mv. If there are no issues with this proposal I can commit the code tomorrow.
> >>
> >> Thanks
> >> Amareshwari
> >>
> >> On 8/26/11 7:45 PM, "Robert Evans" <ev...@yahoo-inc.com> wrote:
> >>
> >> I agree with Mithun. They are related, but this goes beyond distcpv2 and should not block distcpv2 from going in. It would be very nice, however, to get the layout settled soon so that we all know where to find something when we want to work on it.
> >>
> >> Also, +1 for Alejandro's suggestion; I also prefer to keep tools at the trunk level.
> >>
> >> Even though HDFS, Common, MapReduce, and perhaps soon tools are separate modules right now, there is still tight coupling between the different pieces, especially with tests. IMO, until we can reduce that coupling we should treat building and testing Hadoop as a single project instead of trying to keep them separate.
> >>
> >> --Bobby
> >>
> >> On 8/26/11 7:45 AM, "Mithun Radhakrishnan" <mithun.radhakrish...@yahoo.com> wrote:
> >>
> >> Would it be acceptable if the retooling of tools/ were taken up separately? It sounds to me like this might be a distinct (albeit related) task.
> >>
> >> Mithun
> >>
> >> ________________________________
> >> From: Giridharan Kesavan <gkesa...@hortonworks.com>
> >> To: mapreduce-dev@hadoop.apache.org
> >> Sent: Friday, August 26, 2011 12:04 PM
> >> Subject: Re: DistCpV2 in 0.23
> >>
> >> +1 to Alejandro's suggestion.
> >>
> >> I prefer to keep hadoop-tools at the trunk level.
> >>
> >> -Giri
> >>
> >> On Thu, Aug 25, 2011 at 9:15 PM, Alejandro Abdelnur <t...@cloudera.com> wrote:
> >>> I'd suggest putting hadoop-tools either at the trunk/ level, or having a tools aggregator module for HDFS and another for Common.
> >>>
> >>> I personally would prefer trunk/.
> >>>
> >>> Thanks.
> >>>
> >>> Alejandro
> >>>
> >>> On Thu, Aug 25, 2011 at 9:06 PM, Amareshwari Sri Ramadasu <amar...@yahoo-inc.com> wrote:
> >>>
> >>>> Agree. It should be a separate maven module (and the patch puts it as a separate maven module now). And a top level for hadoop tools is nice to have, but it becomes hard to maintain until the patch automation runs the tests under tools. Currently we often see changes in HDFS affecting the RAID tests in MapReduce. So, I'm fine putting the tools under hadoop-mapreduce.
> >>>>
> >>>> I propose we can have something like the following:
> >>>>
> >>>> trunk/
> >>>>   - hadoop-mapreduce
> >>>>     - hadoop-mr-client
> >>>>     - hadoop-yarn
> >>>>     - hadoop-tools
> >>>>       - hadoop-streaming
> >>>>       - hadoop-archives
> >>>>       - hadoop-distcp
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> @Eli and @JD, we did not replace the old legacy distcp because this is really a complete rewrite, and we did not want to remove it until users are familiar with the new one.
> >>>>
> >>>> On 8/26/11 12:51 AM, "Todd Lipcon" <t...@cloudera.com> wrote:
> >>>>
> >>>> Maybe a separate toplevel for hadoop-tools? Stuff like RAID could go in there as well - i.e. tools that are downstream of MR and/or HDFS.
> >>>>
> >>>> On Thu, Aug 25, 2011 at 12:09 PM, Mahadev Konar <maha...@hortonworks.com> wrote:
> >>>>> +1 for a separate module in hadoop-mapreduce-project. I think hadoop-mapreduce-client might not be the right place for it. We might have to pick a new maven module under hadoop-mapreduce-project that could host streaming/distcp/hadoop archives.
> >>>>>
> >>>>> thanks
> >>>>> mahadev
> >>>>>
> >>>>> On Thu, Aug 25, 2011 at 11:04 AM, Alejandro Abdelnur <t...@cloudera.com> wrote:
> >>>>>> Agree, it should be a separate maven module.
> >>>>>>
> >>>>>> And it should be under hadoop-mapreduce-client, right?
> >>>>>>
> >>>>>> And now that we are on the topic, the same should go for streaming, no?
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>> Alejandro
> >>>>>>
> >>>>>> On Thu, Aug 25, 2011 at 10:58 AM, Todd Lipcon <t...@cloudera.com> wrote:
> >>>>>>> On Thu, Aug 25, 2011 at 10:36 AM, Eli Collins <e...@cloudera.com> wrote:
> >>>>>>>> Nice work! I definitely think this should go in 23 and 20x.
> >>>>>>>>
> >>>>>>>> Agree with JD that it should be in the core code, not contrib. If it's going to be maintained then we should put it in the core code.
> >>>>>>>
> >>>>>>> Now that we're all mavenized, though, a separate maven module and artifact does make sense IMO - i.e. "hadoop jar hadoop-distcp-0.23.0-SNAPSHOT" rather than "hadoop distcp".
> >>>>>>>
> >>>>>>> -Todd
> >>>>>>> --
> >>>>>>> Todd Lipcon
> >>>>>>> Software Engineer, Cloudera
> >>>>
> >>>> --
> >>>> Todd Lipcon
> >>>> Software Engineer, Cloudera
> >>
> >> --
> >> -Giri
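For reference, a minimal sketch of the kind of root-level aggregator POM Alejandro describes above, modeled loosely on the hadoop-common-project pattern. The parent coordinates, relativePath, version, and module list shown here are illustrative placeholders only, not taken from the thread or from any committed code:

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                                 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>

      <!-- Placeholder parent: in practice this would mirror whatever the
           hadoop-common-project aggregator declares as its parent. -->
      <parent>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-project</artifactId>
        <version>0.23.0-SNAPSHOT</version>
        <relativePath>../hadoop-project</relativePath>
      </parent>

      <artifactId>hadoop-tools</artifactId>
      <packaging>pom</packaging>
      <name>Apache Hadoop Tools</name>

      <!-- Aggregator only: each tool is its own module and ships its own
           artifact (e.g. hadoop-distcp-0.23.0-SNAPSHOT.jar), which is what
           makes "hadoop jar hadoop-distcp-..." possible as Todd suggests. -->
      <modules>
        <module>hadoop-distcp</module>
      </modules>
    </project>

Under option (2.2), the same aggregator would simply live at trunk/hadoop-mapreduce-project/hadoop-mr-tools/ with the MapReduce project POM as its parent; the distcp module itself would not change.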