[ https://issues.apache.org/jira/browse/OOZIE-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504967#comment-15504967 ]
Rohini Palaniswamy commented on OOZIE-2606:
-------------------------------------------

Some comments:
1) Please use a separate method rather than fixFsDefaultUris.
2) Please use separate if blocks for pattern matching and getting the file
{code}
SPARK_YARN_JAR_PATTERN.matcher(p.getName()).find() || SPARK_ASSEMBLY_JAR_PATTERN.matcher(p.getName()).find()
{code}
3) Hadoop has an API to get the version: org.apache.hadoop.util.VersionInfo.getVersion(). Check whether Spark has something similar that can be used instead of looking at the manifest directly.
4) Skip adding --conf spark.yarn.jars if it is version 1.x.
5) Currently all jars are in both --files and spark.yarn.jars, which will be confusing for the user and will also generate a lot of log messages about duplicate jars. Will it work if you just put spark-yarn*.jar in spark.yarn.jars and the rest in --files?

> Set spark.yarn.jars to fix Spark 2.0 with Oozie
> -----------------------------------------------
>
>                 Key: OOZIE-2606
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2606
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.2.0
>            Reporter: Jonathan Kelly
>            Assignee: Satish Subhashrao Saley
>              Labels: spark, spark2.0.0
>             Fix For: 4.3.0
>
>         Attachments: OOZIE-2606-2.patch, OOZIE-2606.patch
>
>
> Oozie adds all of the jars in the Oozie Spark sharelib to the
> DistributedCache such that all jars will be present in the current working
> directory of the YARN container (as well as in the container classpath).
> However, this is not quite enough to make Spark 2.0 work, since Spark 2.0 by
> default looks for the jars in assembly/target/scala-2.11/jars [1] (as if it
> is a locally built distribution for development) and will not find them in
> the current working directory.
> To fix this, we can set spark.yarn.jars to *.jar so that it finds the jars in
> the current working directory rather than looking in the wrong place. [2]
> [1]
> https://github.com/apache/spark/blob/v2.0.0-rc2/launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java#L357
> [2]
> https://github.com/apache/spark/blob/v2.0.0-rc2/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L476
> Note: This property will be ignored by Spark 1.x.
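
For illustration only, points 4 and 5 of the comment could look roughly like the sketch below. It is not taken from either attached patch. The class name SparkJarArgsSketch, the method names, and the regex are hypothetical, and the Spark version is passed in as a plain string because the comment leaves open how it should be obtained. Whether Spark 2.0 actually starts with only the spark-yarn jar in spark.yarn.jars is the open question from point 5; the sketch only shows the proposed split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class SparkJarArgsSketch {
    // Name echoes the pattern quoted in the comment; the regex itself is an assumption.
    static final Pattern SPARK_YARN_JAR_PATTERN = Pattern.compile("^spark-yarn.*\\.jar$");

    // Point 4: treat any version string starting with "1." as Spark 1.x.
    static boolean isSpark1(String sparkVersion) {
        return sparkVersion != null && sparkVersion.startsWith("1.");
    }

    // Point 5: put only spark-yarn*.jar in spark.yarn.jars and the rest in --files.
    // For Spark 1.x, spark.yarn.jars is skipped and every jar goes to --files.
    static List<String> buildArgs(String sparkVersion, List<String> jarNames) {
        boolean spark1 = isSpark1(sparkVersion);
        List<String> yarnJars = new ArrayList<>();
        List<String> files = new ArrayList<>();
        for (String name : jarNames) {
            if (!spark1 && SPARK_YARN_JAR_PATTERN.matcher(name).find()) {
                yarnJars.add(name);
            } else {
                files.add(name);
            }
        }
        List<String> args = new ArrayList<>();
        if (!yarnJars.isEmpty()) {
            args.add("--conf");
            args.add("spark.yarn.jars=" + String.join(",", yarnJars));
        }
        if (!files.isEmpty()) {
            args.add("--files");
            args.add(String.join(",", files));
        }
        return args;
    }
}
```

With this split no jar appears in both lists, which is what should avoid the duplicate-jar log messages mentioned in point 5.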