[ https://issues.apache.org/jira/browse/SPARK-21618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113806#comment-16113806 ]
Saisai Shao commented on SPARK-21618:
-------------------------------------

[~benmayne] If you try the master branch of Spark, which includes SPARK-21012, jars can be downloaded from an http(s) URL. Please give it a try.

> http(s) not accepted in spark-submit jar uri
> --------------------------------------------
>
>                 Key: SPARK-21618
>                 URL: https://issues.apache.org/jira/browse/SPARK-21618
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 2.1.1, 2.2.0
>        Environment: pre-built for hadoop 2.6 and 2.7 on mac and ubuntu 16.04
>           Reporter: Ben Mayne
>           Priority: Minor
>             Labels: documentation
>
> The documentation suggests I should be able to use an http(s) URI for a jar in spark-submit, but I haven't been successful:
> https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
> {noformat}
> benmayne@Benjamins-MacBook-Pro ~ $ spark-submit --deploy-mode client --master local[2] --class class.name.Test https://test.com/path/to/jar.jar
> log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Exception in thread "main" java.io.IOException: No FileSystem for scheme: https
>         at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>         at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>         at org.apache.spark.deploy.SparkSubmit$.downloadFile(SparkSubmit.scala:865)
>         at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
>         at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$1.apply(SparkSubmit.scala:316)
>         at scala.Option.map(Option.scala:146)
>         at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:316)
>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> benmayne@Benjamins-MacBook-Pro ~ $
> {noformat}
> If I replace the path with a valid hdfs path (hdfs:///user/benmayne/valid-jar.jar), it works as expected. I've seen the same behavior across 2.2.0 (hadoop 2.6 & 2.7 on mac and ubuntu) and on 2.1.1 on ubuntu.
> This is the example that I'm trying to replicate from https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management:
>
> > Spark uses the following URL scheme to allow different strategies for disseminating jars:
> > file: - Absolute paths and file:/ URIs are served by the driver’s HTTP file server, and every executor pulls the file from the driver HTTP server.
> > hdfs:, http:, https:, ftp: - these pull down files and JARs from the URI as expected
>
> {noformat}
> # Run on a Mesos cluster in cluster deploy mode with supervise
> ./bin/spark-submit \
>   --class org.apache.spark.examples.SparkPi \
>   --master mesos://207.184.161.138:7077 \
>   --deploy-mode cluster \
>   --supervise \
>   --executor-memory 20G \
>   --total-executor-cores 100 \
>   http://path/to/examples.jar \
>   1000
> {noformat}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
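Until a build containing the SPARK-21012 change is available, one workaround on the affected 2.1.x/2.2.0 releases is to fetch the jar over http(s) yourself and hand spark-submit a local path, which those versions do accept. A minimal sketch, reusing the reporter's example URL and class name (both placeholders, not a real jar):

```shell
# Workaround sketch for "No FileSystem for scheme: https" on Spark 2.1.x/2.2.0:
# download the jar manually, then submit the local copy.
jar_url="https://test.com/path/to/jar.jar"     # placeholder URL from the report
jar_local="/tmp/$(basename "$jar_url")"        # /tmp/jar.jar

# -f: fail on HTTP errors; -L: follow redirects; -sS: quiet but show errors
curl -fsSL -o "$jar_local" "$jar_url"

# class.name.Test is the placeholder class from the original report
spark-submit --deploy-mode client --master "local[2]" \
  --class class.name.Test "$jar_local"
```

This sidesteps Hadoop's FileSystem scheme resolution entirely, since the driver never has to open an https URI itself.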