Re: Hadoop Utils

Grant Ingersoll Wed, 22 Feb 2012 07:06:33 -0800

On Feb 22, 2012, at 9:52 AM, Sean Owen wrote:

> I think it's fine to let them live in integration here rather than a new
> module. The iterators could be useful upstream, yes, and maybe a few more
> bits.

> The AbstractJob might still be a little too app specific.

I've been reusing some of it, although I don't much need the default options
(other than in/out).  Lately, instead of extending AbstractJob, I construct a
command line object that uses the CLI processing pieces of AbstractJob and
then feed the results in as needed through APIs.  That separates out the
command line processing a bit and feels cleaner to me.

I'll see if I can do some refactoring to clean up and show what I mean.
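
Roughly, the separation described above could look like the sketch below.  It
is plain Java with no Hadoop or Mahout dependency, and every name in it
(CommandLineOptions, ExampleJob, ExampleDriver, the --input/--output flags) is
hypothetical and made up for illustration; it is not AbstractJob's actual API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for parsed command-line options (not a Mahout class). */
final class CommandLineOptions {
    private final Map<String, String> values = new HashMap<String, String>();

    /** Parses "--name value" pairs; rejects a bare value or a flag with no value. */
    static CommandLineOptions parse(String[] args) {
        CommandLineOptions opts = new CommandLineOptions();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("--") || i + 1 >= args.length) {
                throw new IllegalArgumentException("Expected --name value, got: " + args[i]);
            }
            opts.values.put(args[i].substring(2), args[i + 1]);
        }
        return opts;
    }

    /** Returns the value for a required option, or fails with a clear message. */
    String require(String name) {
        String value = values.get(name);
        if (value == null) {
            throw new IllegalArgumentException("Missing --" + name);
        }
        return value;
    }
}

/** Hypothetical job configured only through its API; it never sees argv. */
final class ExampleJob {
    private final String input;
    private final String output;

    ExampleJob(String input, String output) {
        this.input = input;
        this.output = output;
    }

    String describe() {
        return "ExampleJob: " + input + " -> " + output;
    }
}

/** Thin driver: the only place where command-line parsing meets the job. */
final class ExampleDriver {
    static ExampleJob fromArgs(String[] args) {
        CommandLineOptions opts = CommandLineOptions.parse(args);
        return new ExampleJob(opts.require("input"), opts.require("output"));
    }

    public static void main(String[] args) {
        System.out.println(fromArgs(args).describe());
    }
}
```

The point of the split is that ExampleJob can be built and driven from other
code (or a test) without going through a command line at all, while the driver
stays a few lines long.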

> On Feb 22, 2012 2:37 PM, "Grant Ingersoll" <[email protected]> wrote:
> 
>> We've collected a fair bit of Hadoop utils over the years.  I am finding
>> them generally useful in other projects.  Would it make sense to either
>> split them out to a standalone jar and/or donate them upstream to Hadoop
>> itself?
>> 
>> I'm thinking of things like:
>> Seq File iterators and potentially the SeqFileDumper too
>> AbstractJob and related
>> 
>> My gut preference is that we maintain ownership of them but pub them in a
>> separate JAR.
>> 
>> Thoughts?
>> 
>> -Grant
