Hello, I'm new here. I'm using Spark 1.6.0 and trying to programmatically access a YARN cluster from my Scala app. I create a SparkContext as usual, with the following code:
val sc = SparkContext.getOrCreate(new SparkConf().setMaster("yarn-client"))

My yarn-site.xml is being read correctly, as far as I know. My guess is that for some reason some files are not being sent to the cluster. I get the following output:

[info] o.a.s.d.y.Client - Setting up container launch context for our AM
[info] o.a.s.d.y.Client - Setting up the launch environment for our AM container
[info] o.a.s.d.y.Client - Preparing resources for our AM container
[info] o.a.s.d.y.Client - Source and destination file systems are the same. Not copying file:/home/jose/.ivy2/cache/org.apache.spark/spark-yarn_2.11/jars/spark-yarn_2.11-1.6.0.jar
[info] o.a.s.d.y.Client - Source and destination file systems are the same. Not copying file:/tmp/spark-df384d4c-2d8c-4101-b1c2-7caee897e227/__spark_conf__4685218164631909844.zip
[info] o.a.s.SecurityManager - Changing view acls to: jose
[info] o.a.s.SecurityManager - Changing modify acls to: jose
[info] o.a.s.SecurityManager - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jose); users with modify permissions: Set(jose)
[info] o.a.s.d.y.Client - Submitting application 674 to ResourceManager
[info] o.a.h.y.c.a.i.YarnClientImpl - Submitted application application_1462219356760_0674 to ResourceManager at /10.10.10.142:8032
[info] o.a.s.d.y.Client - Application report for application_1462219356760_0674 (state: ACCEPTED)
[info] o.a.s.d.y.Client -
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1475613117174
    final status: UNDEFINED
    tracking URL: http://server_address:20888/proxy/application_1462219356760_0674/
    user: jose
[info] o.a.s.d.y.Client - Application report for application_1462219356760_0674 (state: ACCEPTED)
[info] o.a.s.d.y.Client - Application report for application_1462219356760_0674 (state: FAILED)
[info] o.a.s.d.y.Client -
    client token: N/A
    diagnostics: Application application_1462219356760_0674 failed 2 times due to AM Container for appattempt_1462219356760_0674_000002 exited with exitCode: -1000
    For more detailed output, check application tracking page: http://server_address:8088/cluster/app/application_1462219356760_0674 Then, click on links to logs of each attempt.
    Diagnostics: java.io.FileNotFoundException: File file:/home/jose/.ivy2/cache/org.apache.spark/spark-yarn_2.11/jars/spark-yarn_2.11-1.6.0.jar does not exist
    Failing this attempt. Failing the application.
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1475613117174
    final status: FAILED
    tracking URL: http://server_address:8088/cluster/app/application_1462219356760_0674
    user: jose
[error] o.a.s.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124) ~[spark-yarn_2.11-1.6.0.jar:1.6.0]
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64) ~[spark-yarn_2.11-1.6.0.jar:1.6.0]
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144) ~[spark-core_2.11-1.6.0.jar:1.6.0]
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:530) ~[spark-core_2.11-1.6.0.jar:1.6.0]
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2281) [spark-core_2.11-1.6.0.jar:1.6.0]

The exception suggests the AM container is looking for the spark-yarn jar at my local ivy cache path, which of course doesn't exist on the cluster nodes.

Thanks for any help!
Alberto
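For reference, here is a fuller sketch of how I build the context. Since the failure points at my local ivy cache path, I suspect I need to tell YARN where to find a cluster-visible copy of the Spark jar via the `spark.yarn.jar` property (the Spark 1.6 name for it). This is only a sketch; the HDFS path and app name below are placeholders, and it assumes the jar has already been uploaded to HDFS:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: point the ApplicationMaster at a copy of the Spark assembly jar
// that lives on the cluster's filesystem instead of my local ivy cache.
// The hdfs:// path is a placeholder for wherever the jar was uploaded.
val conf = new SparkConf()
  .setMaster("yarn-client")
  .setAppName("my-app") // placeholder name
  .set("spark.yarn.jar",
       "hdfs:///user/jose/jars/spark-assembly-1.6.0-hadoop2.6.0.jar")

val sc = SparkContext.getOrCreate(conf)
```

Does this look like the right direction, or is there a better way to ship the dependencies when submitting programmatically rather than through spark-submit?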