[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ilya Ostrovskiy updated SPARK-13960:
------------------------------------
    Summary: JAR/File HTTP Server doesn't respect "spark.driver.host" and there is no "spark.fileserver.host" option  (was: JAR/File HTTP Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option)

> JAR/File HTTP Server doesn't respect "spark.driver.host" and there is no "spark.fileserver.host" option
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-13960
>                 URL: https://issues.apache.org/jira/browse/SPARK-13960
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Spark Submit
>    Affects Versions: 1.6.1
>         Environment: Any system with more than one IP address
>            Reporter: Ilya Ostrovskiy
>
> There is no option to specify which hostname/IP address the jar/file server listens on. Rather than using "spark.driver.host" when it is specified, the jar/file server listens on the system's primary IP address. This is an issue when submitting an application in client mode from a machine with two NICs connected to two different networks.
>
> Steps to reproduce:
> 1) Have a cluster in a remote network, whose master is on 192.168.255.10.
> 2) Have a machine at another location with a "primary" IP address of 192.168.1.2, also connected to the "remote network" with the IP address 192.168.255.250. Let's call this the "client machine".
> 3) Ensure every machine in the Spark cluster at the remote location can ping 192.168.255.250 and reach the client machine via that address.
> 4) On the client:
> {noformat}
> spark-submit --deploy-mode client --conf "spark.driver.host=192.168.255.250" --master spark://192.168.255.10:7077 --class <any valid spark application> <local jar with spark application> <whatever args you want>
> {noformat}
> 5) Navigate to http://192.168.255.250:4040/ and ensure that executors from the remote cluster have found the driver on the client machine.
> 6) Navigate to http://192.168.255.250:4040/environment/ and scroll to the bottom.
> 7) Observe that the JAR specified in step 4 is listed under http://192.168.1.2:<random port>/jars/<your jar here>.jar.
> 8) Enjoy this stack trace periodically appearing on the client machine when the nodes in the remote cluster can't connect to 192.168.1.2 to fetch your JAR:
> {noformat}
> 16/03/17 03:25:55 WARN TaskSetManager: Lost task 1.2 in stage 0.0 (TID 5, 192.168.255.11): java.net.SocketTimeoutException: connect timed out
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>         at java.net.Socket.connect(Socket.java:589)
>         at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
>         at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>         at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>         at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
>         at sun.net.www.http.HttpClient.New(HttpClient.java:308)
>         at sun.net.www.http.HttpClient.New(HttpClient.java:326)
>         at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
>         at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
>         at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
>         at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
>         at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:588)
>         at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
>         at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:405)
>         at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:397)
>         at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>         at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>         at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>         at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:397)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
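For illustration, a minimal sketch of the host-resolution order this report is asking for. The FileServerHost helper and the "spark.fileserver.host" key are hypothetical (the key is the proposed option and does not exist today); only "spark.driver.host" is an existing Spark setting, and this is not a description of Spark's actual internals:

{noformat}
import org.apache.spark.SparkConf

// Hypothetical helper sketching the proposed preference order:
// an explicit "spark.fileserver.host" (proposed), then the existing
// "spark.driver.host", and only then the machine's primary address
// (the current behaviour that breaks multi-NIC client-mode submits).
object FileServerHost {
  def resolve(conf: SparkConf, primaryAddress: => String): String =
    conf.getOption("spark.fileserver.host")        // proposed new option
      .orElse(conf.getOption("spark.driver.host")) // already set by the user in step 4
      .getOrElse(primaryAddress)                   // today's behaviour, e.g. 192.168.1.2
}

object Example extends App {
  val conf = new SparkConf(false).set("spark.driver.host", "192.168.255.250")
  // Prints 192.168.255.250 instead of the primary NIC's address
  println(FileServerHost.resolve(conf, java.net.InetAddress.getLocalHost.getHostAddress))
}
{noformat}

The sketch only covers which host the jar/file server advertises in its URLs; whether the same key should also control the bind address is a separate design choice.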
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org