MapReduce and HDFS are distinct functions of Hadoop, and they are loosely coupled. A tools aggregator module would not have as clear and distinct a function as the other Hadoop modules, so a tool could end up depending on both HDFS and MapReduce. If something breaks in the tools module, it is unclear which subproject is responsible for maintaining that tool. It is therefore safer to send tools to the Incubator or Apache Extras than to deposit them in a tools subcategory. Many short-lived projects have attempted to associate themselves with Hadoop and then gone unmaintained. It would be better to spin off those utility projects than to use Hadoop as a dumping ground.
In the previous discussion about removing contrib, most people were in favor of doing so, and only a few contrib owners were reluctant. Even fewer people have helped restore the functionality of broken contrib projects. History speaks for itself.

-1 (non-binding) for hadoop-tools.

regards,
Eric

On Tue, Sep 6, 2011 at 6:55 PM, Alejandro Abdelnur <t...@cloudera.com> wrote:

> Eric,
>
> Personally I'm fine either way.
>
> Still, I fail to see why a generic/categorized tools increase/reduce the
> risk of dead code and how they make more-difficult/easier the
> package&deployment.
>
> Would you please explain this?
>
> Thanks.
>
> Alejandro
>
> On Tue, Sep 6, 2011 at 6:38 PM, Eric Yang <eric...@gmail.com> wrote:
>
>> Option #2 proposed by Amareshwari, seems like a better proposal. We don't
>> want to repeat history for contrib again with hadoop-tools. Having a
>> generic module like hadoop-tools increases the risk of accumulate dead code.
>> It would be better to categorize the hdfs or mapreduce specific tools in
>> their respected subcategories. It is also easier to manage from
>> package/deployment prospective.
>>
>> regards,
>> Eric
>>
>> On Sep 6, 2011, at 4:32 PM, Eli Collins wrote:
>>
>> > On Tue, Sep 6, 2011 at 10:11 AM, Allen Wittenauer <a...@apache.org> wrote:
>> >>
>> >> On Sep 6, 2011, at 9:30 AM, Vinod Kumar Vavilapalli wrote:
>> >>> We still need to answer Amareshwari's question (2) she asked some time back
>> >>> about the automated code compilation and test execution of the tools module.
>> >>
>> >>>>> My #1 question is if tools is basically contrib reborn. If not, what makes
>> >>>>> it different?
>> >>
>> >> I'm still waiting for this answer as well.
>> >>
>> >> Until such, I would be pretty much against a tools module.
>> >> Changing the name of the dumping ground doesn't make it any less of a
>> >> dumping ground.
>> >
>> > IMO if the tools module only gets stuff like distcp that's maintained
>> > then it's not contrib, if it contains all the stuff from the current
>> > MR contrib then tools is just a re-labeling of contrib. Given that
>> > this proposal only covers moving distcp to tools it doesn't sound like
>> > contrib to me.
>> >
>> > Thanks,
>> > Eli