[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ilya Ostrovskiy updated SPARK-13960:
------------------------------------

Description:

There is no option to specify which hostname/IP address the jar/file server listens on. Rather than using "spark.driver.host" when it is set, the jar/file server listens on the system's primary IP address. This is a problem when submitting an application in client mode from a machine with two NICs connected to two different networks.

Steps to reproduce:
1) Have a cluster in a remote network, whose master is on 192.168.255.10.
2) Have a machine at another location with a "primary" IP address of 192.168.1.2, also connected to the remote network with the IP address 192.168.255.250. Call this the "client machine".
3) Ensure every machine in the Spark cluster at the remote location can ping 192.168.255.250 and reach the client machine via that address.
4) On the client:
{noformat}
spark-submit --deploy-mode client --conf "spark.driver.host=192.168.255.250" --master spark://192.168.255.10:7077 --class <any valid spark application> <local jar with spark application> <whatever args you want>
{noformat}
5) Navigate to http://192.168.255.250:4040/ and confirm that executors from the remote cluster have found the driver on the client machine.
6) Navigate to http://192.168.255.250:4040/environment/ and scroll to the bottom.
7) Observe that the JAR specified in step 4 is listed under http://192.168.1.2:<random port>/jars/<your jar here>.jar.
8) Enjoy this stack trace periodically appearing on the client machine when the nodes in the remote cluster can't connect to 192.168.1.2:
{noformat}
16/03/17 03:25:55 WARN TaskSetManager: Lost task 1.2 in stage 0.0 (TID 5, 172.17.74.1): java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
        at sun.net.www.http.HttpClient.New(HttpClient.java:308)
        at sun.net.www.http.HttpClient.New(HttpClient.java:326)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
        at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:588)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:405)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:397)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:397)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{noformat}

was:

There is no option to specify which hostname/IP address the jar/file server listens on. Rather than using "spark.driver.host" when it is set, the jar/file server listens on the system's primary IP address. This is a problem when submitting an application in client mode from a machine with two NICs connected to two different networks.

Steps to reproduce:
1) Have a cluster in a remote network, whose master is on 192.168.255.10.
2) Have a machine at another location with a "primary" IP address of 192.168.1.2, also connected to the remote network with the IP address 192.168.255.250. Call this the "client machine".
3) Ensure every machine in the Spark cluster at the remote location can ping 192.168.255.250 and reach the client machine via that address.
4) On the client:
{noformat}
spark-submit --deploy-mode client --conf "spark.driver.host=192.168.255.250" --master spark://192.168.255.10:7077 --class <any valid spark application> <local jar with spark application> <whatever args you want>
{noformat}
5) Navigate to http://192.168.255.250:4040/ and confirm that executors from the remote cluster have found the driver on the client machine.
6) Navigate to http://192.168.255.250:4040/environment/ and scroll to the bottom.
7) Observe that the JAR specified in step 4 is listed under http://192.168.1.2:<random port>/jars/<your jar here>.jar.

> HTTP-based JAR Server doesn't respect spark.driver.host and there is no
> "spark.fileserver.host" option
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-13960
>                 URL: https://issues.apache.org/jira/browse/SPARK-13960
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Spark Submit
>    Affects Versions: 1.6.1
>         Environment: Any system with more than one IP address
>            Reporter: Ilya Ostrovskiy
>
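The failure mode reported above comes down to which local address a server process is bound to and advertises: a server started without an explicit host ends up associated with whichever interface the OS considers primary, which is exactly what a "spark.fileserver.host"-style option would override. A minimal sketch in plain Python sockets (illustration only, not Spark's actual code):

```python
import socket

def bind_server(host):
    # Bind a TCP server socket to the given local address;
    # port 0 lets the OS pick an ephemeral port, much like the jar server's
    # "<random port>" seen on the environment page.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))
    return s

# Pinned to one interface: reachable only via that address -- the behavior
# a hypothetical "spark.fileserver.host" option would allow.
pinned = bind_server("127.0.0.1")
print(pinned.getsockname()[0])    # 127.0.0.1

# No explicit host: the wildcard address. The server accepts on every
# interface, but any URL derived from the machine's "primary" address
# (here, 192.168.1.2 in the report) is the only one executors are told about.
wildcard = bind_server("")
print(wildcard.getsockname()[0])  # 0.0.0.0

pinned.close()
wildcard.close()
```

The sketch only demonstrates the bind-address distinction; in the reported bug the server is reachable on 192.168.255.250, but the advertised URL uses 192.168.1.2, which the remote cluster cannot route to.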
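Until something like "spark.fileserver.host" exists, one candidate workaround is SPARK_LOCAL_IP, the documented spark-env.sh variable for the address Spark binds local services to. Whether it actually reaches the HTTP jar/file server in 1.6.x is untested here and is essentially the question this issue raises; the sketch below reuses the reproduction command from step 4 with its placeholders:

```shell
# Assumption: SPARK_LOCAL_IP also governs the jar/file server's address,
# not just the driver's RPC endpoints -- unverified for 1.6.x.
export SPARK_LOCAL_IP=192.168.255.250
spark-submit --deploy-mode client \
  --conf "spark.driver.host=192.168.255.250" \
  --master spark://192.168.255.10:7077 \
  --class <any valid spark application> <local jar with spark application> <whatever args you want>
```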
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org