niefei created SPARK-21159:
------------------------------

             Summary: Cluster mode, driver throws connection refused exception submitted by SparkLauncher
                 Key: SPARK-21159
                 URL: https://issues.apache.org/jira/browse/SPARK-21159
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, Spark Submit
    Affects Versions: 2.1.0
         Environment: Server A-Master
Server B-Slave
            Reporter: niefei


When a Spark application is submitted with the SparkLauncher#startApplication
method, the caller gets back a SparkAppHandle. In the test environment the
launcher runs on Server A. In client mode everything works. In cluster mode the
launcher still runs on Server A, but the driver runs on Server B. In this
scenario, when the SparkContext is initialized, a LauncherBackend tries to
connect back to the launcher application on a specified port and IP address.
The problem is that the LauncherBackend implementation connects to the loopback
address, 127.0.0.1. The connection is refused because the launcher never ran on
Server B.
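
The failure mode can be reproduced without Spark. The sketch below is only an
illustration, not Spark code: the class and method names are made up, and a
closed listener stands in for "Server B", where no launcher is listening. A
client that connects to the loopback address succeeds only while a listener
exists on the same host.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class LoopbackConnectDemo {

    /** Returns true if a TCP connection to the loopback address on this port succeeds. */
    static boolean canConnect(int port) {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            return true;
        } catch (IOException e) {
            // Typically java.net.ConnectException: Connection refused
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the launcher on Server A: listen on an ephemeral loopback port.
        ServerSocket launcher = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
        int port = launcher.getLocalPort();

        // Client mode: the driver is on the same host, so loopback reaches the launcher.
        System.out.println("listener on this host: " + canConnect(port));

        // Stand-in for Server B: nothing listens on that port on this host.
        launcher.close();
        System.out.println("no listener on this host: " + canConnect(port));
    }
}
```

The second attempt normally fails with a refused connection, which is the same
symptom the driver shows when it dials 127.0.0.1 on a host that never started
the launcher.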

The expected behavior is that the LauncherBackend connects to Server A's IP
address to report the application's running status.
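
One way to picture the expected behavior is a host lookup that prefers an
address supplied by the launcher and falls back to loopback only when none is
given. This is a hypothetical sketch, not how Spark is implemented: the key
EXAMPLE_LAUNCHER_HOST, the class name, and the address 10.0.0.5 are
placeholders invented for the example.

```java
import java.util.Collections;
import java.util.Map;

public class LauncherAddress {

    // Hypothetical key, invented for this example; it is not a real Spark setting.
    static final String HOST_KEY = "EXAMPLE_LAUNCHER_HOST";
    static final String LOOPBACK = "127.0.0.1";

    /** Picks the launcher host from the given environment, defaulting to loopback. */
    static String resolveHost(Map<String, String> env) {
        String host = env.get(HOST_KEY);
        if (host == null || host.trim().isEmpty()) {
            return LOOPBACK; // only correct when driver and launcher share a host
        }
        return host.trim();
    }

    public static void main(String[] args) {
        // Client mode: no host passed, loopback is fine.
        System.out.println(resolveHost(Collections.<String, String>emptyMap()));
        // Cluster mode: the launcher passes its own address (placeholder value).
        System.out.println(resolveHost(Collections.singletonMap(HOST_KEY, "10.0.0.5")));
    }
}
```

The design point is that the address must travel from the launcher to the
driver along with the port, because the driver has no other way to learn which
host it was launched from.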

Below is the stacktrace:
17/06/20 17:24:37 ERROR SparkContext: Error initializing SparkContext.
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at java.net.Socket.<init>(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:244)
        at org.apache.spark.launcher.LauncherBackend.connect(LauncherBackend.scala:43)
        at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.start(StandaloneSchedulerBackend.scala:60)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
        at com.asura.grinder.datatask.task.AbstractCommonSparkTask.executeSparkJob(AbstractCommonSparkTask.scala:91)
        at com.asura.grinder.datatask.task.AbstractCommonSparkTask.runSparkJob(AbstractCommonSparkTask.scala:25)
        at com.asura.grinder.datatask.main.TaskMain$.main(TaskMain.scala:61)
        at com.asura.grinder.datatask.main.TaskMain.main(TaskMain.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
        at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
17/06/20 17:24:37 INFO SparkUI: Stopped Spark web UI at http://172.25.108.62:4040
17/06/20 17:24:37 INFO StandaloneSchedulerBackend: Shutting down all executors
17/06/20 17:24:37 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
17/06/20 17:24:37 ERROR Utils: Uncaught exception in thread main
java.lang.NullPointerException
        at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.org$apache$spark$scheduler$cluster$StandaloneSchedulerBackend$$stop(StandaloneSchedulerBackend.scala:214)
        at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.stop(StandaloneSchedulerBackend.scala:116)
        at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:467)
        at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1588)
        at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1826)
        at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1283)
        at org.apache.spark.SparkContext.stop(SparkContext.scala:1825)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:587)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
        at com.asura.grinder.datatask.task.AbstractCommonSparkTask.executeSparkJob(AbstractCommonSparkTask.scala:91)
        at com.asura.grinder.datatask.task.AbstractCommonSparkTask.runSparkJob(AbstractCommonSparkTask.scala:25)
        at com.asura.grinder.datatask.main.TaskMain$.main(TaskMain.scala:61)
        at com.asura.grinder.datatask.main.TaskMain.main(TaskMain.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
        at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
17/06/20 17:24:37 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/06/20 17:24:37 INFO MemoryStore: MemoryStore cleared
17/06/20 17:24:37 INFO BlockManager: BlockManager stopped
17/06/20 17:24:37 INFO BlockManagerMaster: BlockManagerMaster stopped
17/06/20 17:24:37 WARN MetricsSystem: Stopping a MetricsSystem that is not running
17/06/20 17:24:37 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/06/20 17:24:37 INFO SparkContext: Successfully stopped SparkContext
17/06/20 17:24:37 ERROR MongoPilotTask: error occurred group{2}:task(222)
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at java.net.Socket.<init>(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:244)
        at org.apache.spark.launcher.LauncherBackend.connect(LauncherBackend.scala:43)
        at org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend.start(StandaloneSchedulerBackend.scala:60)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
        at com.asura.grinder.datatask.task.AbstractCommonSparkTask.executeSparkJob(AbstractCommonSparkTask.scala:91)
        at com.asura.grinder.datatask.task.AbstractCommonSparkTask.runSparkJob(AbstractCommonSparkTask.scala:25)
        at com.asura.grinder.datatask.main.TaskMain$.main(TaskMain.scala:61)
        at com.asura.grinder.datatask.main.TaskMain.main(TaskMain.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
        at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
 


