Hi Jürgen,

Did you ever find a way to resolve this issue?
Looking at the implementation of the application master, there does not seem to be any heartbeat/keep-alive mechanism for the connection between the driver and the AM, so when something closes the connection due to inactivity, the AM shuts down:
https://github.com/apache/spark/blob/branch-2.3/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L807

Jürgen Thomann wrote
> Hi,
>
> I'm using the Spark Thrift Server and after some time the driver and
> application master are shutting down because of timeouts. There is a
> firewall in between and there is no traffic between them as it seems.
> Is there a way to configure TCP keep alive for the connection or some
> other way to make the firewall happy?
>
> Environment:
> CentOS 7, HDP 2.6.5 with Spark 2.3.0
>
> The Error on the driver is "ERROR YarnClientSchedulerBackend: Yarn
> application has already exited with state finished" and a bit later
> there are some Exceptions with ClosedChannelException.
>
> The application master has the following message:
> WARN TransportChannelHandler: Exception in connection from <driver Host>
> java.io.IOException: Connection timed out
> ... Stacktrace omitted
>
> The messages are at the same time (same second, sadly no milliseconds
> in the logs).
>
> Thanks,
> Jürgen
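
On the TCP keep-alive question in the quoted message, here is a rough sketch of what keep-alive means at the socket level. It is plain Python, not Spark code, and it does not show how Spark configures its own connections; the thread does not establish that. The helper names `enable_keepalive` and `probes_beat_firewall` and the 3600-second firewall timeout are made up for illustration. The 7200-second figure is the usual Linux default for `net.ipv4.tcp_keepalive_time`, which may be longer than a firewall's idle timeout.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keep-alive for one socket and tune its timers where possible."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer options exist on Linux; other platforms may lack some
    # of them, so each one is applied only if the constant is defined.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            # The OS refused this option; keep-alive itself stays enabled.
            pass


def probes_beat_firewall(keepalive_idle_s, firewall_idle_timeout_s):
    """True if the first keep-alive probe goes out before the firewall's idle timer expires."""
    return keepalive_idle_s < firewall_idle_timeout_s


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keep-alive on:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
    # Hypothetical firewall that drops idle connections after one hour,
    # compared with the common Linux default of 7200 s before the first probe.
    print("default probes arrive in time:", probes_beat_firewall(7200, 3600))
```

The point of the second helper is only that keep-alive being enabled is not sufficient by itself: the first probe has to be sent before the firewall gives up on the idle connection, so the kernel timers on the driver and AM hosts may need to be checked as well.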