[jira] [Commented] (SPARK-21618) http(s) not accepted in spark-submit jar uri

2017-08-08 Thread Ben Mayne (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118946#comment-16118946
 ] 

Ben Mayne commented on SPARK-21618:
---

[~jerryshao] confirmed that the behavior on the master branch is what I was 
originally trying to accomplish and what the doc suggests. Thanks. 
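
For the 2.1/2.2 versions discussed below, a minimal client-side workaround is to fetch the http(s) jar yourself and hand spark-submit the local copy, which those releases accept. This is only a sketch with illustrative names (the helper, the /tmp download directory, and the example class are assumptions, not anything from Spark):

```python
import os
from urllib.parse import urlparse

def local_submit_args(jar_uri, main_class, download_dir="/tmp"):
    """Workaround sketch for Spark 2.1/2.2: if the jar URI uses an http(s)
    scheme (which Hadoop's FileSystem cannot resolve there), rewrite it to
    a local path under download_dir and build the spark-submit argv."""
    parsed = urlparse(jar_uri)
    if parsed.scheme in ("http", "https"):
        # In a real script, download the jar here first, e.g. with
        # urllib.request.urlretrieve(jar_uri, local_path).
        jar_uri = os.path.join(download_dir, os.path.basename(parsed.path))
    return ["spark-submit", "--deploy-mode", "client",
            "--master", "local[2]", "--class", main_class, jar_uri]
```

For example, local_submit_args("https://test.com/path/to/jar.jar", "class.name.Test") yields a command ending in /tmp/jar.jar instead of the https URI.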

> http(s) not accepted in spark-submit jar uri
> 
>
> Key: SPARK-21618
> URL: https://issues.apache.org/jira/browse/SPARK-21618
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy
>Affects Versions: 2.1.1, 2.2.0
> Environment: pre-built for hadoop 2.6 and 2.7 on mac and ubuntu 
> 16.04. 
>Reporter: Ben Mayne
>Priority: Minor
>  Labels: documentation
>
> The documentation suggests I should be able to use an http(s) uri for a jar 
> in spark-submit, but I haven't been successful 
> https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
> {noformat}
> benmayne@Benjamins-MacBook-Pro ~ $ spark-submit --deploy-mode client --master local[2] --class class.name.Test https://test.com/path/to/jar.jar
> log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Exception in thread "main" java.io.IOException: No FileSystem for scheme: https
>   at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
>   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>   at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>   at org.apache.spark.deploy.SparkSubmit$.downloadFile(SparkSubmit.scala:865)
>   at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
>   at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
>   at scala.Option.map(Option.scala:146)
>   at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:316)
>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> benmayne@Benjamins-MacBook-Pro ~ $
> {noformat}
> If I replace the path with a valid hdfs path 
> (hdfs:///user/benmayne/valid-jar.jar), it works as expected. I've seen the 
> same behavior across 2.2.0 (hadoop 2.6 & 2.7 on mac and ubuntu) and on 2.1.1 
> on ubuntu. 
> This is the example I'm trying to replicate from 
> https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management:
>  
> > Spark uses the following URL scheme to allow different strategies for 
> > disseminating jars:
> > file: - Absolute paths and file:/ URIs are served by the driver’s HTTP file 
> > server, and every executor pulls the file from the driver HTTP server.
> > hdfs:, http:, https:, ftp: - these pull down files and JARs from the URI as 
> > expected
> {noformat}
> # Run on a Mesos cluster in cluster deploy mode with supervise
> ./bin/spark-submit \
>   --class org.apache.spark.examples.SparkPi \
>   --master mesos://207.184.161.138:7077 \
>   --deploy-mode cluster \
>   --supervise \
>   --executor-memory 20G \
>   --total-executor-cores 100 \
>   http://path/to/examples.jar \
>   1000
> {noformat}
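
The quoted stack trace makes the failure mode concrete: prepareSubmitEnvironment hands every remote URI to Hadoop's FileSystem.get, which resolves an implementation by URI scheme, and http/https have no FileSystem registered in a stock Hadoop 2.6/2.7 client, so the lookup throws. A tiny Python sketch of that scheme check (the REGISTERED set is illustrative, not an exhaustive or authoritative list):

```python
from urllib.parse import urlparse

# Illustrative (not exhaustive) set of schemes with a FileSystem
# implementation registered in a stock Hadoop 2.6/2.7 client.
REGISTERED = {"file", "hdfs", "viewfs", "ftp", "har", "webhdfs"}

def can_spark_submit_fetch(uri):
    # Mirrors the failing step in the trace: SparkSubmit.downloadFile asks
    # Hadoop for a FileSystem matching the URI's scheme; an unregistered
    # scheme raises "No FileSystem for scheme: <scheme>".
    scheme = urlparse(uri).scheme or "file"
    return scheme in REGISTERED
```

This matches the report: hdfs:///user/benmayne/valid-jar.jar resolves, while https://test.com/path/to/jar.jar does not.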



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-21618) http(s) not accepted in spark-submit jar uri

2017-08-02 Thread Ben Mayne (JIRA)
Ben Mayne created SPARK-21618:
-

 Summary: http(s) not accepted in spark-submit jar uri
 Key: SPARK-21618
 URL: https://issues.apache.org/jira/browse/SPARK-21618
 Project: Spark
  Issue Type: Bug
  Components: Deploy
Affects Versions: 2.2.0, 2.1.1
 Environment: pre-built for hadoop 2.6 and 2.7 on mac and ubuntu 16.04. 
Reporter: Ben Mayne
Priority: Minor


The documentation suggests I should be able to use an http(s) uri for a jar in 
spark-submit, but I haven't been successful 
https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management

{noformat}
benmayne@Benjamins-MacBook-Pro ~ $ spark-submit --deploy-mode client --master local[2] --class class.name.Test https://test.com/path/to/jar.jar
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.io.IOException: No FileSystem for scheme: https
  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
  at org.apache.spark.deploy.SparkSubmit$.downloadFile(SparkSubmit.scala:865)
  at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
  at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
  at scala.Option.map(Option.scala:146)
  at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:316)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
benmayne@Benjamins-MacBook-Pro ~ $
{noformat}

If I replace the path with a valid hdfs path 
(hdfs:///user/benmayne/valid-jar.jar), it works as expected. I've seen the same 
behavior across 2.2.0 (hadoop 2.6 & 2.7 on mac and ubuntu) and on 2.1.1 on 
ubuntu. 

This is the example I'm trying to replicate from 
https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management:
 

> Spark uses the following URL scheme to allow different strategies for 
> disseminating jars:
> file: - Absolute paths and file:/ URIs are served by the driver’s HTTP file 
> server, and every executor pulls the file from the driver HTTP server.
> hdfs:, http:, https:, ftp: - these pull down files and JARs from the URI as 
> expected


{noformat}
# Run on a Mesos cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  http://path/to/examples.jar \
  1000
{noformat}



