Hi Alexey,

Thanks for pointing this out. The doc is indeed confusing. The example applies only to the pig action, where Pig may submit multiple MR jobs. We will clarify this.

General steps are:

1. For each action, Oozie submits a launcher MR job (map-only).
2. The launcher map in turn executes the action-specific task as follows:
   a. For a java action, Oozie executes the java code and control passes to the user code.
   b. For a pig/hive action, it calls the Pig/Hive API to execute the script. Control moves to the Pig/Hive API.
   c. For an MR action, it submits the MR job using the hadoop JobClient API and the launcher mapper returns right away.

Note: All action types (except MR) wait for the execution to finish before the launcher MR job completes.

Addressing your comments:

>Does it mean, that map launcher is kind of daemon, that runs all the
>time and processes multiple requests from Oozie server (delivered
>through Job Tracker),

No. Oozie submits a new launcher job for each action.

>Or it means that one request from Oozie server can actually trigger
>multiple map-reduce jobs?

No, it can't. For pig and hive it can happen, but those jobs are not under the Oozie launcher's control.

>Or it means something else? Or this is an error in doc?

The doc doesn't explain this well.

>Also: is this Map launcher used only for Oozie originated jobs? What
>about job from Oozie that come from java actions? What java classes in
>Hadoop (Oozie) implement Map launcher?

Oozie controls only the jobs submitted directly through it. For example, if the java code submits another set of MR jobs, Oozie will not know about them; it simply waits for the java code to return.

Please let me know if anything is unclear.

Regards,
Mohammad

----- Original Message -----
From: Alexey Yakubovich <[email protected]>
To: [email protected]
Cc:
Sent: Sunday, February 26, 2012 3:41 PM
Subject: Oozie map launcher and multiple MR jobs on Hadoop

This is a rather academic question. I am just trying to understand how Oozie and Hadoop work together.

In the Apache incubator documentation, there is a picture and description of the Map launcher:
http://incubator.apache.org/oozie/overview.html#Launcher-Mapper

That picture and description show multiple map-reduce jobs coordinated from the map launcher (one instance?).

Does it mean that the map launcher is a kind of daemon that runs all the time and processes multiple requests from the Oozie server (delivered through the Job Tracker)?
Or does it mean that one request from the Oozie server can actually trigger multiple map-reduce jobs?
Or does it mean something else? Or is this an error in the doc?

Also: is this Map launcher used only for Oozie-originated jobs? What about jobs from Oozie that come from java actions? What java classes in Hadoop (Oozie) implement the Map launcher?

Thanks,
Alexey
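
For illustration only, the general steps in the reply above can be modeled with a small Python sketch. This is not Oozie code and does not use any Oozie or Hadoop API: `LauncherResult`, `run_launcher`, and `run_workflow` are hypothetical names, and the action-type strings are assumptions chosen for the example. It only models the two points made in the reply: one fresh launcher per action, and every action type except MR waiting for its work to finish.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LauncherResult:
    """Outcome of one (hypothetical) launcher map task."""
    action_type: str
    waited_for_completion: bool  # did the launcher block until the work ended?
    log: List[str] = field(default_factory=list)


def run_launcher(action_type: str) -> LauncherResult:
    """Toy model of one launcher map task handling exactly one action."""
    log = [f"launcher started for {action_type} action"]
    if action_type == "java":
        # Control passes to the user's code; jobs it submits are invisible here.
        log.append("run user java code and wait for it to return")
        waited = True
    elif action_type in ("pig", "hive"):
        # The Pig/Hive API may submit several MR jobs outside the launcher's control.
        log.append(f"call {action_type} API to execute the script and wait")
        waited = True
    elif action_type == "map-reduce":
        # The MR job is submitted and the launcher returns without waiting.
        log.append("submit MR job and return right away")
        waited = False
    else:
        raise ValueError(f"unsupported action type: {action_type}")
    log.append("launcher finished")
    return LauncherResult(action_type, waited, log)


def run_workflow(action_types: List[str]) -> List[LauncherResult]:
    """One new launcher per action; there is no long-lived launcher daemon."""
    return [run_launcher(t) for t in action_types]


if __name__ == "__main__":
    for result in run_workflow(["java", "pig", "map-reduce"]):
        print(result.action_type, "waited:", result.waited_for_completion)
```

The sketch deliberately keeps no state between calls to `run_launcher`, which mirrors the answer that the launcher is not a daemon serving multiple requests.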
