[jira] [Updated] (SPARK-18160) SparkContext.addFile doesn't work in yarn-cluster mode

2016-10-31 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated SPARK-18160:
---
Description: 
The following command fails for Spark 2.0:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --conf spark.files=/usr/spark-client/conf/hive-site.xml \
  examples/target/original-spark-examples_2.11.jar
{noformat}


The above command reproduces the error below on a multi-node cluster. Note that 
this issue only occurs on a multi-node cluster: on a single-node cluster, the AM 
uses the same local filesystem as the driver.
{noformat}
16/10/28 07:21:42 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/usr/spark-client/conf/hive-site.xml does not exist
	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1443)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1415)
	at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
	at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2296)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:843)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:835)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:835)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
{noformat}
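For context (not part of the original report), the failure mode can be sketched in Python: addFile resolves bare and file: paths against the local filesystem of whichever process constructs the SparkContext, which in yarn-cluster mode is the ApplicationMaster node rather than the machine that ran spark-submit. The add_file function below is a hypothetical illustration of that resolution step, not Spark's actual implementation:

```python
import os
from urllib.parse import urlparse

def add_file(path):
    # Hypothetical sketch: bare and file: paths are checked against the
    # local filesystem of the process creating the SparkContext -- the
    # YARN ApplicationMaster in yarn-cluster mode, not the submit host.
    parsed = urlparse(path)
    if parsed.scheme in ("", "file"):
        local = parsed.path or path
        if not os.path.exists(local):
            raise FileNotFoundError("File file:%s does not exist" % local)
        return "file:" + local
    # Remote schemes (hdfs:// etc.) are fetched from the shared
    # filesystem, so they resolve regardless of which node runs the AM.
    return path
```

A common workaround (an assumption on my part, not stated in this ticket) is to stage the file on a shared filesystem and pass an hdfs:// URI in spark.files, so resolution no longer depends on the AM node's local disk.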

  was:
The following command fails for Spark 2.0:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --conf spark.files=/usr/spark-client/conf/hive-site.xml \
  examples/target/original-spark-examples_2.11.jar
{noformat}
and this command fails for Spark 1.6:
{noformat}
bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --files /usr/spark-client/conf/hive-site.xml \
  examples/target/original-spark-examples_2.11.jar
{noformat}

The above command reproduces the error below on a multi-node cluster. Note that 
this issue only occurs on a multi-node cluster: on a single-node cluster, the AM 
uses the same local filesystem as the driver.
{noformat}
16/10/28 07:21:42 ERROR SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File file:/usr/spark-client/conf/hive-site.xml does not exist
	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:537)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:750)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:527)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1443)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1415)
	at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
	at org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:462)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2296)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:843)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:835)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:835)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at 
