[ 
https://issues.apache.org/jira/browse/SPARK-21618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114379#comment-16114379
 ] 

Steve Loughran commented on SPARK-21618:
----------------------------------------

Yes, and that 2.9+ feature breaks things: when you ask for an http or 
https connection, you get back a Hadoop wrapper class, which is not what 
other code (e.g. Wasb) expects, so its attempt to cast the connection to the 
normal java.io base class fails. This doesn't surface on any shipping Hadoop 
release (or HDP/CDH/EMR), and a fix is in progress.
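
A minimal sketch of the failure mode described above, using only the JDK. 
The comment does not name the class being cast to; this assumes it is 
java.net.HttpURLConnection. WrapperConnection and WrapperHandler are 
hypothetical stand-ins for the Hadoop wrapper, not real Hadoop classes, and no 
network access takes place.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

final class WrappedConnectionDemo {
    // Hypothetical wrapper: a URLConnection that is NOT an HttpURLConnection.
    static final class WrapperConnection extends URLConnection {
        WrapperConnection(URL url) { super(url); }
        @Override public void connect() throws IOException {
            // Intentionally empty: this sketch never touches the network.
        }
    }

    // Hypothetical handler that answers http URLs with the wrapper type.
    static final class WrapperHandler extends URLStreamHandler {
        @Override protected URLConnection openConnection(URL u) {
            return new WrapperConnection(u);
        }
    }

    // Opens a connection through the wrapper handler instead of the JDK default.
    static URLConnection openWrapped(String spec) {
        try {
            return new URL(null, spec, new WrapperHandler()).openConnection();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Mimics client code that assumes every http connection is an HttpURLConnection.
    static boolean castSucceeds(URLConnection conn) {
        try {
            HttpURLConnection http = (HttpURLConnection) conn;
            return http != null;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        URLConnection conn = openWrapped("http://example.invalid/data");
        System.out.println("connection type: " + conn.getClass().getSimpleName());
        System.out.println("cast to HttpURLConnection succeeds: " + castSucceeds(conn));
    }
}
```

Code that checks the type with instanceof before casting would degrade more 
gracefully than an unconditional cast when a wrapper is installed.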

> http(s) not accepted in spark-submit jar uri
> --------------------------------------------
>
>                 Key: SPARK-21618
>                 URL: https://issues.apache.org/jira/browse/SPARK-21618
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 2.1.1, 2.2.0
>         Environment: pre-built for hadoop 2.6 and 2.7 on mac and ubuntu 
> 16.04. 
>            Reporter: Ben Mayne
>            Priority: Minor
>              Labels: documentation
>
> The documentation suggests I should be able to use an http(s) uri for a jar 
> in spark-submit, but I haven't been successful 
> https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
> {noformat}
> benmayne@Benjamins-MacBook-Pro ~ $ spark-submit --deploy-mode client --master local[2] --class class.name.Test https://test.com/path/to/jar.jar
> log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Exception in thread "main" java.io.IOException: No FileSystem for scheme: https
>       at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
>       at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
>       at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>       at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
>       at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
>       at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>       at org.apache.spark.deploy.SparkSubmit$.downloadFile(SparkSubmit.scala:865)
>       at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
>       at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
>       at scala.Option.map(Option.scala:146)
>       at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:316)
>       at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
>       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
>       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> benmayne@Benjamins-MacBook-Pro ~ $
> {noformat}
> If I replace the path with a valid hdfs path 
> (hdfs:///user/benmayne/valid-jar.jar), it works as expected. I've seen the 
> same behavior across 2.2.0 (hadoop 2.6 & 2.7 on mac and ubuntu) and on 2.1.1 
> on ubuntu. 
> This is the example that I'm trying to replicate from 
> https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management:
>  
> > Spark uses the following URL scheme to allow different strategies for 
> > disseminating jars:
> > file: - Absolute paths and file:/ URIs are served by the driver’s HTTP file 
> > server, and every executor pulls the file from the driver HTTP server.
> > hdfs:, http:, https:, ftp: - these pull down files and JARs from the URI as 
> > expected
> {noformat}
> # Run on a Mesos cluster in cluster deploy mode with supervise
> ./bin/spark-submit \
>   --class org.apache.spark.examples.SparkPi \
>   --master mesos://207.184.161.138:7077 \
>   --deploy-mode cluster \
>   --supervise \
>   --executor-memory 20G \
>   --total-executor-cores 100 \
>   http://path/to/examples.jar \
>   1000
> {noformat}
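
The trace in the quoted report shows the https URI being handed to Hadoop's 
FileSystem.get, which has no filesystem registered for that scheme. As a rough 
sketch only, a submit tool could inspect the URI scheme first and route http, 
https and ftp to a plain download instead. JarUriSchemes and classify are 
hypothetical names for illustration and are not Spark's actual code.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

final class JarUriSchemes {
    // Assumption: these schemes are fetched directly rather than through Hadoop FileSystem.
    private static final Set<String> DIRECT = Set.of("http", "https", "ftp");

    // Returns a label for how a jar location would be resolved in this sketch.
    static String classify(String location) {
        String scheme = URI.create(location).getScheme();
        if (scheme == null) {
            return "local"; // plain path with no scheme
        }
        String s = scheme.toLowerCase(Locale.ROOT);
        if (s.equals("file")) {
            return "local";
        }
        if (DIRECT.contains(s)) {
            return "direct-download";
        }
        return "hadoop-filesystem"; // e.g. hdfs:
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://test.com/path/to/jar.jar",
            "hdfs:///user/benmayne/valid-jar.jar",
            "/tmp/app.jar"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```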



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
