Hey all
Looking at (converting to) the new .20 API, I see that the static
config setters take Job or JobContext, not Configuration.
>> public static Path[] getInputPaths(JobContext context)
I get the utility of this from the perspective of a user writing
Hadoop jobs. A lot fewer job.getConfiguration() calls.
But I do find it odd that FileInputFormat, for example, knows about
Job and JobContext (and their children) when it feels as if it should
only know about Configuration (considering that all these methods do
is get/set properties).
From my perspective, Cascading is in part not much more than a fancy
Configuration builder, and its internals really only care about
Configuration, since they may be asked to provide a property outside
the context of a job.
So, being a builder, Cascading passes a Configuration object around
the system at different stages (planning, execution, etc.) in order to
accumulate properties from nested components.
With the new API, it all adds up to the need to wrap Configuration in
a Job/JobContext and then unwrap it so the Configuration instance can
move down the configuration chain.
But this isn't really possible, simply because new Job( configuration )
copies the configuration into a default property collection, and any
set() on the Job won't influence those defaults. The result is a lot
of Configuration algebra to merge the final results (or a bit of
reflection).
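To make the merge-back dance concrete, here is a minimal sketch. It uses
java.util.Properties as a stand-in for Hadoop's Configuration so it is
self-contained; the property name "mapred.input.dir" is assumed to be what
addInputPath() writes to, and the wrap/copy behavior mimics what new
Job(conf) does internally.

```java
import java.util.Properties;

// Sketch of the "Configuration algebra" described above, using
// java.util.Properties as a stand-in for Hadoop's Configuration.
// new Job(conf) copies the conf it is handed, so any set() done by a
// static setter like FileInputFormat.addInputPath() lands in the copy
// and must be merged back into the original by hand.
public class MergeBack {
    public static void main(String[] args) {
        Properties conf = new Properties();   // the conf moving down our chain
        conf.setProperty("some.upstream.key", "value");

        // wrap: roughly what new Job(conf) does internally
        Properties jobCopy = new Properties();
        jobCopy.putAll(conf);

        // the static setter mutates the copy, not the original
        // ("mapred.input.dir" assumed as the property addInputPath() sets)
        jobCopy.setProperty("mapred.input.dir", "/input");

        // unwrap/merge: copy everything back so the original conf sees it
        for (String key : jobCopy.stringPropertyNames()) {
            conf.setProperty(key, jobCopy.getProperty(key));
        }

        if (!"/input".equals(conf.getProperty("mapred.input.dir"))) {
            throw new AssertionError("merge-back failed");
        }
        System.out.println(conf.getProperty("mapred.input.dir"));
    }
}
```

Multiply that loop by every nested component that wants to contribute
properties, and it adds up fast.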
Would it make sense to accept Configuration instead of JobContext and
its subclasses?
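Something like these hypothetical overloads is what I have in mind (not
the current API; the existing Job-taking variants could simply delegate):

```java
// hypothetical Configuration-based overloads
public static void addInputPath(Configuration conf, Path path) throws IOException;
public static Path[] getInputPaths(Configuration conf);

// the existing Job/JobContext variants would just delegate:
public static void addInputPath(Job job, Path path) throws IOException {
    addInputPath(job.getConfiguration(), path);
}
```

That would let builder-style code work against Configuration directly,
while job-writing users keep the convenience methods.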
You could argue I should just use JobContext in my APIs. But again,
many of my subsystems shouldn't really know of JobContext; they only
care about manipulating the Configuration object. Further, the use of
Job, JobContext, TaskAttemptContext, etc. in the static setters is
inconsistent.
>> public static void addInputPath(Job job, Path path) throws IOException {
I wonder if Hive and Pig (will) have similar issues.
cheers,
chris
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com