[ 
https://issues.apache.org/jira/browse/OOZIE-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated OOZIE-654:
--------------------------------

    Attachment: OOZIE-654.patch

If the version of Hadoop that Oozie is using doesn't include MAPREDUCE-4408, 
the MapReduce task will fail when Oozie tries to use an uber jar.  To guard 
against this, I've added a property to oozie-default.xml, 
"oozie.action.mapreduce.uber.jar.enable" (default is false).  When it is 
disabled, Oozie rejects any workflow that specifies an uber jar and gives a 
clearer error message up front, instead of submitting the job and leaving you 
to dig through the logs to find out what happened.  
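
To make the intended behavior concrete, here is a minimal sketch of that kind of 
up-front check.  It is not the code from OOZIE-654.patch: the class name, the 
"example.uber.jar.path" key and the use of java.util.Properties in place of a 
Hadoop Configuration are all assumptions made so the example runs on its own.  
Only the "oozie.action.mapreduce.uber.jar.enable" name and its false default come 
from the description above.

```java
import java.util.Properties;

/** Hypothetical sketch of the enable/disable check; not the code from the patch. */
final class UberJarGuard {
    // Property name from the comment above; it defaults to false.
    static final String UBER_JAR_ENABLE = "oozie.action.mapreduce.uber.jar.enable";
    // Hypothetical key holding the uber jar path requested by a workflow action.
    static final String UBER_JAR_PATH = "example.uber.jar.path";

    /** Throws before submission if an uber jar is requested while the feature is off. */
    static void check(Properties serverConf, Properties actionConf) {
        String uberJar = actionConf.getProperty(UBER_JAR_PATH);
        if (uberJar == null || uberJar.trim().isEmpty()) {
            return; // no uber jar requested, nothing to verify
        }
        boolean enabled =
                Boolean.parseBoolean(serverConf.getProperty(UBER_JAR_ENABLE, "false"));
        if (!enabled) {
            throw new IllegalStateException("An uber jar (" + uberJar
                    + ") was specified, but " + UBER_JAR_ENABLE
                    + " is not enabled on this server");
        }
    }

    public static void main(String[] args) {
        Properties server = new Properties();
        Properties action = new Properties();
        action.setProperty(UBER_JAR_PATH, "/user/example/app/uber.jar");
        try {
            check(server, action); // feature is off by default, so this is rejected
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        server.setProperty(UBER_JAR_ENABLE, "true");
        check(server, action); // passes once the property is true
        System.out.println("Accepted once the property is true");
    }
}
```

Failing at this point means the user sees the reason immediately, rather than 
a task failure buried in the MapReduce logs.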

I've modified the previous tests to look at the 
oozie.action.mapreduce.uber.jar.enable property.  There is an additional test 
in TestMapReduceActionExecutorUberJar.java which actually submits a job with an 
uber jar and verifies that it worked correctly.  This test is currently 
disabled by default (i.e. "mvn test" won't run it) because it fails against the 
current release of Hadoop, which doesn't have MAPREDUCE-4408.  However, I 
have tested it locally against branch-1 and trunk.  
                
> Provide a way to use 'uber' jars with Oozie MR actions
> ------------------------------------------------------
>
>                 Key: OOZIE-654
>                 URL: https://issues.apache.org/jira/browse/OOZIE-654
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Harsh J
>            Assignee: Robert Kanter
>            Priority: Minor
>         Attachments: OOZIE-654.patch, OOZIE-654.patch
>
>
> Right now, if you have custom MR code in a jar that has a {{lib/}} 
> folder inside it carrying more dependent jars (a structure known as an 
> 'uber' jar), and you submit the job via a regular 'hadoop jar' command, these 
> lib/*.jars get picked up by the framework because the supplied jar is 
> specified explicitly via conf.setJarByClass or conf.setJar. That is, if this 
> user uber jar goes to the JobTracker (JT) as the mapred.jar, then it is 
> handled by the framework properly and the lib/*.jars are all considered and 
> placed on the classpath.
> Distributed cache jars do not have this effect, because the MR 
> framework does not treat them as uber jars and does not extract and use 
> their internal lib/ directories.
> We should have a way in Oozie to let users promote one of their jars to an 
> uber jar, as an option.
> Proposal: Have an optional Oozie-prefixed config, or an optional element in 
> the MR action XML, that lets the user specify which class should be loaded 
> and passed to setJarByClass(...). This has to be a class at the top level of 
> the uber jar, i.e. any class inside the targeted jar itself, just not one 
> from a jar under lib/. We then call jobConf.setJarByClass(loadedCls) and run 
> the job.
> Thoughts?
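
For readers unfamiliar with the layout described in the issue, the following is 
a rough sketch of what "a jar with a lib/ folder of dependent jars" looks like 
when inspected with the standard java.util.jar API.  The class name, method names 
and entry names are hypothetical, the sample jar contains only empty placeholder 
entries, and the code only lists the nested jars.  It does not reproduce how the 
MapReduce framework unpacks them onto the task classpath.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

/** Hypothetical helper that inspects the layout of an 'uber' jar. */
final class UberJarLayout {
    /** Returns the names of jars nested directly under lib/ in the given jar. */
    static List<String> nestedLibJars(Path jar) throws IOException {
        List<String> result = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Keep lib/x.jar but skip deeper paths such as lib/sub/x.jar.
                if (name.startsWith("lib/") && name.endsWith(".jar")
                        && name.indexOf('/', 4) < 0) {
                    result.add(name);
                }
            }
        }
        return result;
    }

    /** Writes a small jar with empty placeholder entries, for demonstration only. */
    static Path writeSampleJar(String... entryNames) throws IOException {
        Path jar = Files.createTempFile("uber-sample", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            for (String name : entryNames) {
                jos.putNextEntry(new JarEntry(name));
                jos.closeEntry();
            }
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        // A top-level job class plus two dependency jars under lib/.
        Path jar = writeSampleJar(
                "com/example/MyMapper.class", "lib/dep-a.jar", "lib/dep-b.jar");
        System.out.println(nestedLibJars(jar));
        Files.delete(jar);
    }
}
```

In this layout, the class named in the proposal would be one like the 
placeholder com/example/MyMapper.class at the top level, not a class packed 
inside one of the lib/ jars.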

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
