Agree. It should be a separate Maven module (and the patch now puts it in
one). A top level for hadoop tools would be nice to have, but it will be
hard to maintain until the patch automation runs the tests under tools.
Currently we often see changes in HDFS affecting the RAID tests in
MapReduce. So, I'm fine with putting the tools under hadoop-mapreduce.
I propose something like the following:
trunk/
  - hadoop-mapreduce
    - hadoop-mr-client
    - hadoop-yarn
    - hadoop-tools
      - hadoop-streaming
      - hadoop-archives
      - hadoop-distcp
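As a rough sketch of what that could mean in Maven terms, and assuming
hadoop-tools becomes a plain aggregator module (packaging "pom"), its
pom.xml might list the three tools as child modules. The module names come
from the layout above; the groupId, artifactId and version shown here are
illustrative assumptions, not taken from the patch:

```xml
<!-- hadoop-tools/pom.xml: hypothetical aggregator POM, for illustration only -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- groupId, artifactId and version are assumed values -->
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-tools</artifactId>
  <version>0.23.0-SNAPSHOT</version>
  <!-- "pom" packaging: this module only aggregates its children -->
  <packaging>pom</packaging>
  <modules>
    <module>hadoop-streaming</module>
    <module>hadoop-archives</module>
    <module>hadoop-distcp</module>
  </modules>
</project>
```

With a layout like this, each tool would still build its own jar, and
building hadoop-tools would build all three.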
Thoughts?
@Eli and @JD, we did not replace the legacy distcp because this is really a
complete rewrite, and we did not want to remove the old one until users are
familiar with the new one.
On 8/26/11 12:51 AM, "Todd Lipcon" <[email protected]> wrote:
Maybe a separate toplevel for hadoop-tools? Stuff like RAID could go
in there as well - ie tools that are downstream of MR and/or HDFS.
On Thu, Aug 25, 2011 at 12:09 PM, Mahadev Konar <[email protected]> wrote:
> +1 for a separate module in hadoop-mapreduce-project. I think
> hadoop-mapreduce-client might not be the right place for it. We might
> have to pick a new maven module under hadoop-mapreduce-project that could
> host streaming/distcp/hadoop archives.
>
> thanks
> mahadev
>
> On Thu, Aug 25, 2011 at 11:04 AM, Alejandro Abdelnur <[email protected]>
> wrote:
>> Agree, it should be a separate maven module.
>>
>> And it should be under hadoop-mapreduce-client, right?
>>
>> And now that we are on the topic, the same should go for streaming, no?
>>
>> Thanks.
>>
>> Alejandro
>>
>> On Thu, Aug 25, 2011 at 10:58 AM, Todd Lipcon <[email protected]> wrote:
>>
>>> On Thu, Aug 25, 2011 at 10:36 AM, Eli Collins <[email protected]> wrote:
>>> > Nice work! I definitely think this should go in 23 and 20x.
>>> >
>>> > Agree with JD that it should be in the core code, not contrib. If
>>> > it's going to be maintained then we should put it in the core code.
>>>
>>> Now that we're all mavenized, though, a separate maven module and
>>> artifact does make sense IMO - ie "hadoop jar
>>> hadoop-distcp-0.23.0-SNAPSHOT" rather than "hadoop distcp"
>>>
>>> -Todd
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>>
>>
>
--
Todd Lipcon
Software Engineer, Cloudera