Hi,
I would like to know whether the following behaviour is expected.
The driver starts but is unable to connect to the external shuffle service on
one of the nodes (regardless of the reason).
This causes the framework to be shown as Inactive in the Mesos UI.
However, the driver does not seem to exit and continues to execute tasks (or
tries to). The stack trace below shows a few lines around the connection
error and the abort message.
So the question is: is this expected behaviour?
Here is the stack trace:
I0412 07:31:25.827283 274 sched.cpp:759] Framework registered with 15d9838f-b266-413b-842d-f7c3567bd04a-0051
Exception in thread "Thread-295" java.io.IOException: Failed to connect to my-company.com/x.x.x.x:7337
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:232)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:182)
    at org.apache.spark.network.shuffle.mesos.MesosExternalShuffleClient.registerDriverWithShuffleService(MesosExternalShuffleClient.java:75)
    at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.statusUpdate(MesosCoarseGrainedSchedulerBackend.scala:537)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused:my-company.com/x.x.x.x:7337
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:257)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:631)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:748)
I0412 07:35:12.032925 277 sched.cpp:2055] Asked to abort the driver
I0412 07:35:12.033035 277 sched.cpp:1233] Aborting framework 15d9838f-b266-413b-842d-f7c3567bd04a-0051
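
In case it helps anyone trying to reproduce this, here is a minimal sketch of a check for whether the shuffle service port accepts TCP connections from the driver host. It assumes port 7337, as seen in the trace above. The function names are hypothetical and are not part of Spark or Mesos. It only tests that a TCP connection can be opened, not that the shuffle service behind the port is healthy.

```python
import socket


def shuffle_service_reachable(host, port=7337, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: 7337 is the port shown in the trace above.
    DNS failures, refused connections and timeouts all count as unreachable.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, socket.timeout and socket.gaierror.
        return False


def unreachable_nodes(hosts, port=7337, timeout=3.0):
    """Return the subset of hosts whose shuffle service port refuses connections."""
    return [h for h in hosts if not shuffle_service_reachable(h, port, timeout)]
```

Calling `unreachable_nodes` with the list of agent host names (placeholders, to be replaced with the real ones) before submitting the job would show which node is the one refusing connections.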