Hi Harsh,

Thanks for your reply.

Consider that my main program does many activities (reading/writing/updating;
non-Hadoop activities) before invoking JobClient.runJob(conf).
Is there any way to separate the process flow programmatically, instead of
going for a workflow engine?
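
To illustrate, my driver currently looks roughly like the sketch below (the
class and helper method names are only placeholders, not my real code):

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class MyDriver {

        // Non-Hadoop pre-processing: reading/writing/updating (placeholder)
        static void doNonHadoopPreWork() { }

        // All MR-specific setup kept in one place
        static JobConf buildJobConf() {
            JobConf conf = new JobConf(MyDriver.class);
            // set input/output paths, mapper/reducer classes, etc. here
            return conf;
        }

        // Non-Hadoop post-processing (placeholder)
        static void doNonHadoopPostWork() { }

        public static void main(String[] args) throws Exception {
            doNonHadoopPreWork();
            JobClient.runJob(buildJobConf()); // blocks until the job finishes
            doNonHadoopPostWork();
        }
    }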

Cheers!
Manoj.



On Mon, Aug 13, 2012 at 4:10 PM, Harsh J <ha...@cloudera.com> wrote:

> Hi Manoj,
>
> Reply inline.
>
> On Mon, Aug 13, 2012 at 3:42 PM, Manoj Babu <manoj...@gmail.com> wrote:
> > Hi All,
> >
> > The normal Hadoop job submission process involves:
> >
> > 1. Checking the input and output specifications of the job.
> > 2. Computing the InputSplits for the job.
> > 3. Setting up the requisite accounting information for the
> > DistributedCache of the job, if necessary.
> > 4. Copying the job's jar and configuration to the map-reduce system
> > directory on the distributed file-system.
> > 5. Submitting the job to the JobTracker and optionally monitoring its
> > status.
> >
> > I have a doubt about the 4th point of the job execution flow; could any
> > of you explain it?
> >
> > What is the job's jar?
>
> The job.jar is the jar you supply via "hadoop jar <jar>". Technically,
> though, it is the jar pointed to by JobConf.getJar() (set via the setJar
> or setJarByClass calls).
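>
> For example (just an illustration; MyMapper and the jar path are made up):
>
>     JobConf conf = new JobConf();
>     conf.setJarByClass(MyMapper.class); // locates the jar containing MyMapper
>     // or, equivalently, point at the jar directly:
>     // conf.setJar("/path/to/myjob.jar");
>     String jar = conf.getJar();         // this is what gets shipped as job.jar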
>
> > Is the job's jar the one we submitted to Hadoop, or will Hadoop build
> > one based on the job configuration object?
>
> It is the former, as explained above.
>
> --
> Harsh J
>
