Attila Sasvari created OOZIE-3042:
-------------------------------------

             Summary: Allow isolation of workflow action dependencies 
                 Key: OOZIE-3042
                 URL: https://issues.apache.org/jira/browse/OOZIE-3042
             Project: Oozie
          Issue Type: Bug
    Affects Versions: 4.3.0, 5.0.0
            Reporter: Attila Sasvari


Currently, JAR files stored in the lib/ subdirectory of the application 
directory are automatically added to the classpath of Hadoop jobs started by 
Oozie. This can cause problems if a workflow includes multiple actions whose 
dependencies contain incompatible classes.
For example, I recently hit an issue in a workflow that ran a Spark action 
after a Java action. A JAR file that my Java action depended on contained 
classes that caused runtime issues for the Spark action. I enabled verbose 
class loading for the Spark driver, the Oozie launcher, and its application 
master to see what was going on. Incompatible classes were loaded from the 
classpath when Oozie tried to run the second action. (Each action ran without 
error on its own.)

The problem is that Oozie puts everything in the workflow {{lib}} directory on 
the classpath, even if it might cause problems for other actions.

The workflow {{lib}} directory could contain subdirectories named after the 
workflow actions, and each action could store its dependencies in its own 
subdirectory. For example, if my workflow has actions A1 and A2, there could be 
{{lib/A1}} and {{lib/A2}}.
{{JavaActionExecutor}} could check for the presence of such a directory based 
on the action name.
If the directory exists, it would build the classpath accordingly: when A1 is 
the action to be executed, jars from {{lib/A1/}} would come first. Jars placed 
directly in the {{lib}} directory could go at the end of the classpath; these 
can be considered common to all workflow actions.
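
The proposed ordering could be sketched roughly as follows. This is only an 
illustration, not Oozie code: the class {{ActionClasspathSketch}}, the method 
{{buildClasspath}}, and the jar names are hypothetical, and paths are plain 
strings rather than HDFS paths. It also assumes that jars in other actions' 
subdirectories are left off the classpath, which the proposal implies but does 
not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed ordering; not part of Oozie.
public class ActionClasspathSketch {

    /**
     * Returns the jars for one action: jars directly under lib/<actionName>/
     * first, followed by the common jars directly under lib/.
     * Jars under other actions' subdirectories are skipped (an assumption).
     */
    public static List<String> buildClasspath(String actionName, List<String> libEntries) {
        String commonPrefix = "lib/";
        String actionPrefix = commonPrefix + actionName + "/";
        List<String> actionJars = new ArrayList<>();
        List<String> commonJars = new ArrayList<>();

        for (String entry : libEntries) {
            if (!entry.endsWith(".jar")) {
                continue; // only jars are considered in this sketch
            }
            if (entry.startsWith(actionPrefix)
                    && entry.indexOf('/', actionPrefix.length()) < 0) {
                actionJars.add(entry);  // action-specific dependency
            } else if (entry.startsWith(commonPrefix)
                    && entry.indexOf('/', commonPrefix.length()) < 0) {
                commonJars.add(entry);  // shared by all actions
            }
        }

        // Action-specific jars take precedence; common jars go last.
        List<String> classpath = new ArrayList<>(actionJars);
        classpath.addAll(commonJars);
        return classpath;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
                "lib/common.jar",
                "lib/A1/java-deps.jar",
                "lib/A2/spark-deps.jar");
        System.out.println(buildClasspath("A1", entries));
        System.out.println(buildClasspath("A2", entries));
    }
}
```

With the sample entries above, A1 would see {{lib/A1/java-deps.jar}} followed 
by {{lib/common.jar}}, and A2 would see {{lib/A2/spark-deps.jar}} followed by 
{{lib/common.jar}}, so neither action picks up the other's dependencies.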



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
