[
https://issues.apache.org/jira/browse/OOZIE-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820095#comment-16820095
]
Junfan Zhang commented on OOZIE-3472:
-------------------------------------
Can you give me some advice? [~asalamon74] :)
> Improve Spark Action compatibility with Oozie launcher
> ------------------------------------------------------
>
> Key: OOZIE-3472
> URL: https://issues.apache.org/jira/browse/OOZIE-3472
> Project: Oozie
> Issue Type: Improvement
> Components: action
> Affects Versions: 5.1.0
> Reporter: Junfan Zhang
> Assignee: Junfan Zhang
> Priority: Major
>
> In our production environment, users of the Spark action often encounter
> conflicts between the user jar and the launcher, causing the launcher to
> fail to start.
> To work around this, we have a Maven plugin that guides users to remove
> Hadoop-related dependencies from the user jar. But user jars are often
> complicated, and these dependencies are sometimes not easy to remove.
> Therefore, it is appropriate to solve this problem on the Oozie side.
> Looking at the code, we found that the Spark action inherits from the Java
> action. The conflict arises because the Java action puts the user jar into
> the distributed cache before the MR job starts (related
> [link|https://github.com/apache/oozie/blob/b91457edd2a76f94f41a89ec718eec574c200c71/core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java#L722]).
> If the user jar contains a Hadoop dependency of an incompatible version, a
> conflict occurs.
> The root cause analysis shows that the Spark action only uses the map task
> of the MR job as a spark-submit client, so it does not need to add the user
> jar to the MR distributed cache. We solved this conflict by using the spark
> submit SDK to load the user jar from HDFS directly. It currently works well
> in our production environment. :)
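
The issue does not include the actual patch, so the following is only a minimal sketch of the idea described above: the launcher acts as a spark-submit client and refers to the user jar by its HDFS URI instead of shipping it through the MR distributed cache. The function name `build_spark_submit_args`, the jar path, and the class name are hypothetical; only the standard `spark-submit` options `--master`, `--deploy-mode`, and `--class` are assumed.

```python
from typing import List, Optional


def build_spark_submit_args(
    app_jar: str,
    main_class: str,
    master: str = "yarn",
    deploy_mode: str = "cluster",
    app_args: Optional[List[str]] = None,
) -> List[str]:
    """Build a spark-submit command line that references the user jar on HDFS.

    Hypothetical helper: the jar is never added to the launcher's own
    classpath or cache, so its Hadoop dependencies cannot clash with the
    launcher's.
    """
    # Assumption for this sketch: the user jar must already live on HDFS.
    if not app_jar.startswith("hdfs://"):
        raise ValueError("user jar must be an hdfs:// URI, got: " + app_jar)

    args = [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        "--class", main_class,
        app_jar,  # application jar, resolved by Spark rather than by the launcher
    ]
    # Arguments after the jar are passed through to the user's main class.
    args.extend(app_args or [])
    return args


if __name__ == "__main__":
    # Hypothetical jar location and class name, for illustration only.
    print(build_spark_submit_args(
        "hdfs://namenode/apps/user-app.jar", "com.example.Main", app_args=["in", "out"]
    ))
```

With this shape, the only thing the launcher needs from the user is the jar's HDFS location; whether the real change uses a command line like this or a programmatic launcher API is not stated in the issue.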
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)