Hi Alexey,
Thanks for pointing this out.
The doc is indeed confusing.

The example in the doc applies only to the pig action, where Pig may submit 
multiple MR jobs. We will clarify this in the doc.

General steps are:

1. For each action, Oozie submits a launcher MR job (map-only).
2. The launcher map task in turn executes the action-specific work as follows:
      a. For a java action, it executes the Java code, and control passes to 
the user code.
      b. For a pig/hive action, it calls the Pig/Hive API to execute the script. 
Control moves to the Pig/Hive API.
      c. For an MR action, it submits the MR job using the Hadoop JobClient API, 
and the launcher mapper returns right away.
    Note: All action types (except MR) wait for the execution to finish before 
completing the launcher MR job.
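
To make the steps above concrete, here is a rough Java sketch of the per-action decision the launcher makes. It is only an illustration of the behaviour described in this mail: LauncherSketch, ActionType, launcherWaits and describe are made-up names, not Oozie classes, and no real Hadoop, Pig or Hive API is called.

```java
// Hypothetical sketch: none of these names exist in Oozie or Hadoop.
final class LauncherSketch {

    /** The action types discussed in this mail. */
    enum ActionType { JAVA, PIG, HIVE, MAP_REDUCE }

    /** True if the launcher map task blocks until the action's work finishes. */
    static boolean launcherWaits(ActionType type) {
        // Per the note above: every action type except MR waits.
        return type != ActionType.MAP_REDUCE;
    }

    /** Describes what the launcher map task does for one action. */
    static String describe(ActionType type) {
        switch (type) {
            case JAVA:
                return "run the user's Java code and wait for it to return";
            case PIG:
            case HIVE:
                // The script may start several MR jobs; the launcher does not control them.
                return "call the Pig/Hive API to execute the script and wait";
            case MAP_REDUCE:
                return "submit the MR job via the Hadoop JobClient API and return right away";
            default:
                throw new IllegalArgumentException("unknown action type: " + type);
        }
    }

    public static void main(String[] args) {
        // One launcher job per action: print what each launcher would do.
        for (ActionType t : ActionType.values()) {
            System.out.println(t + ": " + describe(t)
                    + " (launcher waits: " + launcherWaits(t) + ")");
        }
    }
}
```

The only branch where the launcher does not wait is the MR action, which is why the launcher job for an MR action finishes well before the MR job it submitted.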


Addressing your comments:

>Does it mean, that map launcher is kind of daemon, that runs all the
>time and processes multiple requests from Oozie server (delivered
>through Job Tracker),

No. It submits a new launcher job for each action.


>Or it means that one request from Oozie server can actually trigger
>multiple map-reduce jobs?


No, it can't. A pig or hive action can result in multiple MR jobs, but those 
are submitted by Pig/Hive and are not under the Oozie launcher's control.

>Or it means something else? Or this is an error in doc?


The doc doesn't explain this well.

>Also: is this Map launcher used only for Oozie originated jobs? What
>about job from Oozie that come from java actions? What java classes in
>Hadoop (Oozie) implement Map launcher?


Oozie only controls the jobs submitted directly through it. For example, if the 
Java code of a java action submits another set of MR jobs, Oozie will not know 
about them; it only waits for the Java code to return.


Please let me know if it is not clear.

Regards,
Mohammad  




----- Original Message -----
From: Alexey Yakubovich <[email protected]>
To: [email protected]
Cc: 
Sent: Sunday, February 26, 2012 3:41 PM
Subject: Oozie map launcher and multiple MR jobs on Hadoop

This is a rather academic question; I am just trying to understand how
Oozie and Hadoop work together.

In the Apache-incubator documentation, there is a picture and
description about Map launcher.
http://incubator.apache.org/oozie/overview.html#Launcher-Mapper
That pic/desc. show multiple map-reduce jobs coordinated from map
launcher (one instance?).

Does it mean, that map launcher is kind of daemon, that runs all the
time and processes multiple requests from Oozie server (delivered
through Job Tracker),

Or it means that one request from Oozie server can actually trigger
multiple map-reduce jobs?

Or it means something else? Or this is an error in doc?

Also: is this Map launcher used only for Oozie originated jobs? What
about job from Oozie that come from java actions? What java classes in
Hadoop (Oozie) implement Map launcher?

Thanks

Alexey
