Are you looking to pass information on to Hadoop by detecting a specific type in the job's configuration, or are you looking to control the job's execution?
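If it is the former, one approach that should not need any change to Oozie's parser is to carry the type as an ordinary property in the action's `<configuration>` block, since Oozie passes those properties into the job's configuration. Below is a minimal sketch of that idea. The property name `example.job.type`, the workflow and action names, and the helper `action_properties` are all made up for illustration. The Python only reads the property back out of the XML to show where it lives. In a real job you would read it from the Hadoop job configuration instead, for example with `Configuration.get("example.job.type")`.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow.xml. "example.job.type" is an invented property name,
# not something Oozie or Hadoop defines.
WORKFLOW_XML = """\
<workflow-app xmlns="uri:oozie:workflow:0.4" name="typed-job-wf">
  <start to="mr-node"/>
  <action name="mr-node">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>example.job.type</name>
          <value>TYPE_1</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Job failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
"""

# Namespace used by the workflow-app element above.
NS = {"wf": "uri:oozie:workflow:0.4"}


def action_properties(xml_text, action_name):
    """Return the <configuration> properties of one action as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for action in root.findall("wf:action", NS):
        if action.get("name") != action_name:
            continue
        # Collect every <property> under the action's <configuration>.
        for prop in action.findall(".//wf:configuration/wf:property", NS):
            name = prop.findtext("wf:name", namespaces=NS)
            value = prop.findtext("wf:value", namespaces=NS)
            props[name] = value
    return props


if __name__ == "__main__":
    print(action_properties(WORKFLOW_XML, "mr-node"))
```

Note that this only gets the value to the job. What Hadoop does with it at scheduling time is a separate matter.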
I also want to point out that Oozie is not a Hadoop job scheduler. It is a workflow scheduler that works at a level above Hadoop. Once an Oozie-submitted launcher or MR job reaches Hadoop, the actual scheduling of the tasks that the job needs is handled by Hadoop's scheduler, not by Oozie.

In other words, Oozie has no notion of a "cluster" or its "nodes". It submits packaged and configured jobs to Hadoop and lets Hadoop's scheduler handle their execution, distribution, etc. If you are looking to control the actual execution of a Hadoop job, then Oozie isn't the right place to do it.

On Wed, May 21, 2014 at 9:19 AM, Tina Samuel <[email protected]> wrote:
> I would like to modify the Oozie code to introduce a new scheduling pattern
> in Hadoop. I am new to Oozie. I read that there is a file called
> workflow.xml which has the actions that are to be performed by Hadoop. I
> want to introduce a new field to the job, something like a JOB_TYPE. For
> eg, if a job belongs to TYPE_1, then it should be replicated in all the
> worker nodes. If a job belongs to TYPE_2, then it should be replicated in
> only a fraction of nodes. Is it possible to modify the parser of Oozie
> which parses the workflow.xml? Please do help
>
> --
> Tina
>

--
Harsh J
