[ https://issues.apache.org/jira/browse/SPARK-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Reynold Xin updated SPARK-16540:
--------------------------------
    Fix Version/s:     (was: 2.0.1)
                       2.0.0

> Jars specified with --jars will be added twice when running on YARN
> -------------------------------------------------------------------
>
>                 Key: SPARK-16540
>                 URL: https://issues.apache.org/jira/browse/SPARK-16540
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, YARN
>    Affects Versions: 2.0.0
>            Reporter: Saisai Shao
>            Assignee: Saisai Shao
>             Fix For: 2.0.0
>
>
> Currently, when running Spark on YARN, jars specified with \--jars or \--packages are added twice: once to Spark's own file server and once to YARN's distributed cache. This can be seen in the log.
> For example, with the scopt jar specified:
> {code}
> ./bin/spark-shell --master yarn-client --jars examples/target/scala-2.11/jars/scopt_2.11-3.3.0.jar
> {code}
> it is added twice:
> {noformat}
> ...
> 16/07/14 15:06:48 INFO Server: Started @5603ms
> 16/07/14 15:06:48 INFO Utils: Successfully started service 'SparkUI' on port 4040.
> 16/07/14 15:06:48 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.0.102:4040
> 16/07/14 15:06:48 INFO SparkContext: Added JAR file:/Users/sshao/projects/apache-spark/examples/target/scala-2.11/jars/scopt_2.11-3.3.0.jar at spark://192.168.0.102:63996/jars/scopt_2.11-3.3.0.jar with timestamp 1468480008637
> 16/07/14 15:06:49 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> 16/07/14 15:06:49 INFO Client: Requesting a new application from cluster with 1 NodeManagers
> 16/07/14 15:06:49 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
> 16/07/14 15:06:49 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
> 16/07/14 15:06:49 INFO Client: Setting up container launch context for our AM
> 16/07/14 15:06:49 INFO Client: Setting up the launch environment for our AM container
> 16/07/14 15:06:49 INFO Client: Preparing resources for our AM container
> 16/07/14 15:06:49 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
> 16/07/14 15:06:50 INFO Client: Uploading resource file:/private/var/folders/tb/8pw1511s2q78mj7plnq8p9g40000gn/T/spark-a446300b-84bf-43ff-bfb1-3adfb0571a42/__spark_libs__6486179704064718817.zip -> hdfs://localhost:8020/user/sshao/.sparkStaging/application_1468468348998_0009/__spark_libs__6486179704064718817.zip
> 16/07/14 15:06:51 INFO Client: Uploading resource file:/Users/sshao/projects/apache-spark/examples/target/scala-2.11/jars/scopt_2.11-3.3.0.jar -> hdfs://localhost:8020/user/sshao/.sparkStaging/application_1468468348998_0009/scopt_2.11-3.3.0.jar
> 16/07/14 15:06:51 INFO Client: Uploading resource file:/private/var/folders/tb/8pw1511s2q78mj7plnq8p9g40000gn/T/spark-a446300b-84bf-43ff-bfb1-3adfb0571a42/__spark_conf__326416236462420861.zip -> hdfs://localhost:8020/user/sshao/.sparkStaging/application_1468468348998_0009/__spark_conf__.zip
> ...
> {noformat}
> Actually it is not necessary to add the jars to Spark's file server. This problem exists in both client and cluster modes, and it was introduced by SPARK-15782, which fixed \--packages not working in spark-shell.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
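The duplication described above can be illustrated with a small sketch. This is not the actual Spark source or fix; `JarDedupSketch` and `jarsForFileServer` are hypothetical names, and the sketch only captures the gist: on YARN, the Client uploads \--jars to the distributed cache, so they should not also be registered with the driver's own file server via `SparkContext.addJar`.

```scala
// Hypothetical illustration, NOT Spark code: on YARN, user jars are shipped
// through YARN's distributed cache, so the driver-side file server should
// skip them; on other masters the file server is the only distribution path.
object JarDedupSketch {
  def jarsForFileServer(master: String, userJars: Seq[String]): Seq[String] =
    if (master.startsWith("yarn")) Seq.empty // YARN distributed cache serves them
    else userJars                            // other masters need the file server

  def main(args: Array[String]): Unit = {
    val jars = Seq("scopt_2.11-3.3.0.jar")
    println(jarsForFileServer("yarn-client", jars)) // empty: no double-add
    println(jarsForFileServer("local[*]", jars))    // jars served by the driver
  }
}
```

Under this sketch, the yarn-client run above would no longer log the "Added JAR ... at spark://..." line for scopt, only the single upload to the staging directory.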