[jira] [Commented] (OOZIE-2606) Set spark.yarn.jars to fix Spark 2.0 with Oozie

Jonathan Kelly (JIRA) Thu, 07 Jul 2016 11:17:31 -0700

    [ 
https://issues.apache.org/jira/browse/OOZIE-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366545#comment-15366545
 ]


Jonathan Kelly commented on OOZIE-2606:
---------------------------------------

Good points, [~satishsaley] and [~rkanter]. Yes, what happens here is that
Oozie uploads the jars to HDFS and adds them to the DistributedCache, and then
Spark uploads them to HDFS a second time and adds them to the DistributedCache
again for use in its executors (and in the driver for yarn-cluster mode).
It would indeed be better for Oozie to pass HDFS paths for spark.yarn.jars
rather than local paths.
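
To make the two options concrete, here is a minimal sketch (not the actual
patch) of the spark-submit arguments in each case. The function name and the
HDFS directory in the usage note are hypothetical; the only things taken from
this issue are the spark.yarn.jars property and the *.jar glob.

```python
def spark_yarn_jars_conf(hdfs_jar_dir=None):
    """Build the spark-submit arguments that set spark.yarn.jars.

    hdfs_jar_dir is a hypothetical HDFS directory that already holds the
    Spark jars. When it is omitted, fall back to the local "*.jar" glob from
    the issue description, which matches the jars that the DistributedCache
    has placed in the container's current working directory.
    """
    if hdfs_jar_dir is None:
        # Local glob: Spark finds the jars, but uploads them again.
        value = "*.jar"
    else:
        # HDFS glob: the jars are referenced where they already live.
        value = hdfs_jar_dir.rstrip("/") + "/*.jar"
    return ["--conf", "spark.yarn.jars=" + value]
```

With no argument this yields `--conf spark.yarn.jars=*.jar`; with an assumed
directory such as `hdfs:///path/to/sharelib/spark` it yields a glob over that
directory, which should let Spark use the jars in place rather than upload a
second copy.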

> Set spark.yarn.jars to fix Spark 2.0 with Oozie
> -----------------------------------------------
>
>                 Key: OOZIE-2606
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2606
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.2.0
>            Reporter: Jonathan Kelly
>              Labels: spark, spark2.0.0
>             Fix For: trunk
>
>         Attachments: OOZIE-2606.patch
>
>
> Oozie adds all of the jars in the Oozie Spark sharelib to the 
> DistributedCache so that they are present in the current working 
> directory of the YARN container (as well as on the container classpath). 
> However, this is not quite enough to make Spark 2.0 work: by default, 
> Spark 2.0 looks for the jars in assembly/target/scala-2.11/jars [1] (as if 
> it were a locally built distribution for development) and does not find 
> them in the current working directory.
> To fix this, we can set spark.yarn.jars to *.jar so that Spark finds the 
> jars in the current working directory rather than looking in the wrong 
> place. [2]
> [1] 
> https://github.com/apache/spark/blob/v2.0.0-rc2/launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java#L357
> [2] 
> https://github.com/apache/spark/blob/v2.0.0-rc2/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L476
> Note: This property will be ignored by Spark 1.x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
