Hi,

Any input on whether the following is expected?

The driver starts and is unable to connect to the external shuffle service on one of the nodes (the reason doesn't matter). This makes the framework go to Inactive in the Mesos UI. However, the driver doesn't seem to exit and continues to execute tasks (or tries to). The stack trace below shows a few lines around the connection error and the abort message.
The question is: is this expected behaviour? Here is the stack trace:

I0412 07:31:25.827283 274 sched.cpp:759] Framework registered with 15d9838f-b266-413b-842d-f7c3567bd04a-0051
Exception in thread "Thread-295" java.io.IOException: Failed to connect to my-company.com/x.x.x.x:7337
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
    at org.apache.spark.network.shuffle.mesos.MesosExternalShuffleClient.registerDriverWithShuffleService(MesosExternalShuffleClient.java:75)
    at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.statusUpdate(MesosCoarseGrainedSchedulerBackend.scala:537)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: my-company.com/x.x.x.x:7337
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:257)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:631)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:748)
I0412 07:35:12.032925 277 sched.cpp:2055] Asked to abort the driver
I0412 07:35:12.033035 277 sched.cpp:1233] Aborting framework 15d9838f-b266-413b-842d-f7c3567bd04a-0051
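For anyone trying to reproduce this, here is a minimal sketch of a standalone check for whether the shuffle service port on a node accepts TCP connections at all. It is not part of Spark; the class name ShuffleServiceProbe and the helper isReachable are made up for illustration, and the port 7337 is simply the one that appears in the trace above. It only tests reachability from the machine it runs on, and says nothing about whether driver registration would succeed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not a Spark class: probes a host:port with a plain TCP connect.
public class ShuffleServiceProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs,
    // false on connection refused, timeout, or any other I/O error.
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults for illustration; pass the agent host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7337;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

Running it from the driver host against each agent (for example "java ShuffleServiceProbe my-company.com 7337") should show which node is refusing connections, which at least narrows down where the "Connection refused" in the trace comes from.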