[jira] [Updated] (SPARK-18160) SparkContext.addFile doesn't work in yarn-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated SPARK-18160:
-------------------------------
    Description:
The following command fails for Spark 2.0:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --conf spark.files=/usr/spark-client/conf/hive-site.xml examples/target/original-spark-examples_2.11.jar
{noformat}
The above command reproduces the following error on a multi-node cluster. Note that this issue only happens on a multi-node cluster; on a single-node cluster the AM uses the same local filesystem as the driver.
{noformat}
16/10/28 07:21:42 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/usr/spark-client/conf/hive-site.xml does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1443)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1415)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2296)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:843)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:835)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:835)
        at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
        at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
{noformat}

  was:
The following command fails for Spark 2.0:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --conf spark.files=/usr/spark-client/conf/hive-site.xml examples/target/original-spark-examples_2.11.jar
{noformat}
and this command fails for Spark 1.6:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --files /usr/spark-client/conf/hive-site.xml examples/target/original-spark-examples_2.11.jar
{noformat}
The above command reproduces the following error on a multi-node cluster. Note that this issue only happens on a multi-node cluster; on a single-node cluster the AM uses the same local filesystem as the driver.
{noformat}
16/10/28 07:21:42 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/usr/spark-client/conf/hive-site.xml does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1443)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1415)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2296)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:843)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:835)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:835)
        at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
        at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
{noformat}
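The failure mode the trace points at, in brief: in yarn-cluster mode the driver runs inside the YARN AM container, so a file:/ entry from spark.files is resolved against the AM node's local filesystem, where the file does not exist. Below is a minimal sketch of that resolution step using plain Hadoop FileSystem APIs; this is not Spark's exact code, and the object name is made up for illustration.
{noformat}
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical illustration of the existence check SparkContext.addFile performs.
object AddFileResolutionSketch {
  def main(args: Array[String]): Unit = {
    val uri = new URI("file:/usr/spark-client/conf/hive-site.xml")
    // For a file:/ URI this returns the local filesystem of whichever
    // node runs the driver -- the AM node in yarn-cluster mode.
    val fs = FileSystem.get(uri, new Configuration())
    // On a node that does not have the file, getFileStatus throws the
    // FileNotFoundException seen in the stack trace above.
    val status = fs.getFileStatus(new Path(uri))
    println(s"resolved: ${status.getPath} (${status.getLen} bytes)")
  }
}
{noformat}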
[jira] [Updated] (SPARK-18160) SparkContext.addFile doesn't work in yarn-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated SPARK-18160:
-------------------------------
    Description:
The following command fails for Spark 2.0:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --conf spark.files=/usr/spark-client/conf/hive-site.xml examples/target/original-spark-examples_2.11.jar
{noformat}
and this command fails for Spark 1.6:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --files /usr/spark-client/conf/hive-site.xml examples/target/original-spark-examples_2.11.jar
{noformat}
The above command reproduces the following error on a multi-node cluster. Note that this issue only happens on a multi-node cluster; on a single-node cluster the AM uses the same local filesystem as the driver.
{noformat}
16/10/28 07:21:42 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/usr/spark-client/conf/hive-site.xml does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1443)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1415)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2296)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:843)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:835)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:835)
        at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
        at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
{noformat}

  was:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --conf spark.files=/usr/spark-client/conf/hive-site.xml examples/target/original-spark-examples_2.11.jar
{noformat}
The above command reproduces the following error on a multi-node cluster. Note that this issue only happens on a multi-node cluster; on a single-node cluster the AM uses the same local filesystem as the driver.
{noformat}
16/10/28 07:21:42 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/usr/spark-client/conf/hive-site.xml does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1443)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1415)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2296)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:843)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:835)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:835)
        at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
        at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
{noformat}
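For context on the API named in the title: entries in spark.files are fed through SparkContext.addFile while the SparkContext is being constructed (the addFile frames at SparkContext.scala:462 in the trace), and tasks later resolve their shipped copy via SparkFiles.get. A minimal usage sketch follows; the object name is made up, and the path is the one from this ticket.
{noformat}
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object AddFileUsageSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("addFile-usage"))
    // Equivalent to passing the path via spark.files; in yarn-cluster
    // mode this call runs on the AM node, which is where the lookup fails.
    sc.addFile("/usr/spark-client/conf/hive-site.xml")
    // Executors fetch the distributed copy and resolve it by file name.
    val localPath = SparkFiles.get("hive-site.xml")
    println(localPath)
    sc.stop()
  }
}
{noformat}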
[jira] [Updated] (SPARK-18160) SparkContext.addFile doesn't work in yarn-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated SPARK-18160:
-------------------------------
    Description:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --conf spark.files=/usr/spark-client/conf/hive-site.xml examples/target/original-spark-examples_2.11.jar
{noformat}
The above command reproduces the following error on a multi-node cluster. Note that this issue only happens on a multi-node cluster; on a single-node cluster the AM uses the same local filesystem as the driver.
{noformat}
16/10/28 07:21:42 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/usr/spark-client/conf/hive-site.xml does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1443)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1415)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2296)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:843)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:835)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:835)
        at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
        at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
{noformat}

  was:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --files /usr/spark-client/conf/hive-site.xml examples/target/original-spark-examples_2.11.jar
{noformat}
The above command reproduces the following error on a multi-node cluster. Note that this issue only happens on a multi-node cluster; on a single-node cluster the AM uses the same local filesystem as the driver.
{noformat}
16/10/28 07:21:42 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/usr/spark-client/conf/hive-site.xml does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1443)
        at org.apache.spark.SparkContext.addFile(SparkContext.scala:1415)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2296)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:843)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:835)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:835)
        at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
        at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
{noformat}
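Until the resolution itself is fixed, one way to sidestep the AM-local lookup is to stage the file on a filesystem every node can reach. This is an illustrative workaround, not something this ticket prescribes, and the /tmp target path below is made up for the example:
{noformat}
# Stage the file on HDFS so the driver can resolve it from any node.
hdfs dfs -put /usr/spark-client/conf/hive-site.xml /tmp/hive-site.xml

bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --conf spark.files=hdfs:///tmp/hive-site.xml \
  examples/target/original-spark-examples_2.11.jar
{noformat}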