RE: Executor could not connect to Driver?
Thanks. My case does not seem to be caused by GC: CPU usage is pretty low, and both YGC and FGC seem to behave quite normally. Hmm, weird.

Best Regards,
Raymond Liu

From: Aaron Davidson [mailto:ilike...@gmail.com]
Sent: Saturday, November 02, 2013 12:07 AM
To: user@spark.incubator.apache.org
Subject: Re: Executor could not connect to Driver?

I've seen this happen before due to the driver doing long GCs when the driver machine was heavily memory-constrained. For this particular issue, simply freeing up memory used by other applications fixed the problem.
Re: Executor could not connect to Driver?
I've seen this happen before due to the driver doing long GCs when the driver machine was heavily memory-constrained. For this particular issue, simply freeing up memory used by other applications fixed the problem.

On Fri, Nov 1, 2013 at 12:14 AM, Liu, Raymond wrote:

> Hi,
>
> I am encountering an issue where the executor actor cannot connect to the
> Driver actor, but I cannot figure out the reason.
>
> Say the Driver actor is listening on :35838
>
> root@sr434:~# netstat -lpv
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 *:50075                 *:*                     LISTEN      18242/java
> tcp        0      0 *:50020                 *:*                     LISTEN      18242/java
> tcp        0      0 *:ssh                   *:*                     LISTEN      1325/sshd
> tcp        0      0 *:50010                 *:*                     LISTEN      18242/java
> tcp6       0      0 sr434:35838             [::]:*                  LISTEN      9420/java
> tcp6       0      0 [::]:40390              [::]:*                  LISTEN      9420/java
> tcp6       0      0 [::]:4040               [::]:*                  LISTEN      9420/java
> tcp6       0      0 [::]:8040               [::]:*                  LISTEN      28324/java
> tcp6       0      0 [::]:60712              [::]:*                  LISTEN      28324/java
> tcp6       0      0 [::]:8042               [::]:*                  LISTEN      28324/java
> tcp6       0      0 [::]:34028              [::]:*                  LISTEN      9420/java
> tcp6       0      0 [::]:ssh                [::]:*                  LISTEN      1325/sshd
> tcp6       0      0 [::]:45528              [::]:*                  LISTEN      9420/java
> tcp6       0      0 [::]:13562              [::]:*                  LISTEN      28324/java
>
> while the executor reports the errors below:
>
> 13/11/01 13:16:43 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka://spark@sr434:35838/user/CoarseGrainedScheduler
> 13/11/01 13:16:43 ERROR executor.CoarseGrainedExecutorBackend: Driver terminated or disconnected! Shutting down.
>
> Any idea?
>
> Best Regards,
> Raymond Liu
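
One way to check the long-GC theory above is to sample the driver JVM with `jstat -gcutil` and see how much the GC counters move between samples. The following is only a rough sketch, not anything from Spark itself: the function names `parse_gcutil` and `gc_activity` are made up for illustration, and it assumes the usual `jstat -gcutil` layout of a header row followed by numeric rows that include the `YGC`, `YGCT`, `FGC` and `FGCT` columns.

```python
import math
from typing import Dict, List


def parse_gcutil(text: str) -> List[Dict[str, float]]:
    """Parse `jstat -gcutil` output into one dict per sample row, keyed by column name."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        return []
    header = rows[0]
    samples = []
    for row in rows[1:]:
        # Skip repeated header lines and rows that do not match the header width.
        if row == header or len(row) != len(header):
            continue
        sample = {}
        for name, value in zip(header, row):
            try:
                sample[name] = float(value)
            except ValueError:
                # Non-numeric placeholder (for example "-") for an unavailable column.
                sample[name] = math.nan
        samples.append(sample)
    return samples


def gc_activity(samples: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Compare the first and last sample: collections run and GC seconds spent in between."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a difference")
    first, last = samples[0], samples[-1]
    result = {}
    for count_col, time_col, label in (("YGC", "YGCT", "young"), ("FGC", "FGCT", "full")):
        collections = last[count_col] - first[count_col]
        seconds = last[time_col] - first[time_col]
        result[label] = {
            "collections": int(collections),
            "seconds": seconds,
            "avg_seconds_per_collection": seconds / collections if collections else 0.0,
        }
    return result
```

To use it, capture something like `jstat -gcutil <driver pid> 1000` into a file around the time the executor tries to connect, then pass the file contents to `parse_gcutil` and the result to `gc_activity`. A large `seconds` value for `full` over a short window would point toward the GC explanation; small values suggest looking elsewhere.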
Executor could not connect to Driver?
Hi,

I am encountering an issue where the executor actor cannot connect to the Driver actor, but I cannot figure out the reason.

Say the Driver actor is listening on :35838

root@sr434:~# netstat -lpv
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 *:50075                 *:*                     LISTEN      18242/java
tcp        0      0 *:50020                 *:*                     LISTEN      18242/java
tcp        0      0 *:ssh                   *:*                     LISTEN      1325/sshd
tcp        0      0 *:50010                 *:*                     LISTEN      18242/java
tcp6       0      0 sr434:35838             [::]:*                  LISTEN      9420/java
tcp6       0      0 [::]:40390              [::]:*                  LISTEN      9420/java
tcp6       0      0 [::]:4040               [::]:*                  LISTEN      9420/java
tcp6       0      0 [::]:8040               [::]:*                  LISTEN      28324/java
tcp6       0      0 [::]:60712              [::]:*                  LISTEN      28324/java
tcp6       0      0 [::]:8042               [::]:*                  LISTEN      28324/java
tcp6       0      0 [::]:34028              [::]:*                  LISTEN      9420/java
tcp6       0      0 [::]:ssh                [::]:*                  LISTEN      1325/sshd
tcp6       0      0 [::]:45528              [::]:*                  LISTEN      9420/java
tcp6       0      0 [::]:13562              [::]:*                  LISTEN      28324/java

while the executor reports the errors below:

13/11/01 13:16:43 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka://spark@sr434:35838/user/CoarseGrainedScheduler
13/11/01 13:16:43 ERROR executor.CoarseGrainedExecutorBackend: Driver terminated or disconnected! Shutting down.

Any idea?

Best Regards,
Raymond Liu
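
Since the netstat output shows the driver port bound to sr434:35838 rather than to a wildcard address, one thing that may be worth checking from the executor machine is what sr434 resolves to there and whether a plain TCP connection to that port succeeds. The sketch below is only an illustration under that assumption; the helper names `driver_endpoint`, `resolved_addresses` and `can_connect` are made up here, and a successful TCP connect does not prove that the Akka remoting handshake itself would succeed.

```python
import socket
from typing import List, Tuple
from urllib.parse import urlsplit


def driver_endpoint(akka_url: str) -> Tuple[str, int]:
    """Extract (host, port) from a URL like akka://spark@host:port/user/..."""
    parts = urlsplit(akka_url)
    if parts.hostname is None or parts.port is None:
        raise ValueError("no host:port found in %r" % akka_url)
    return parts.hostname, parts.port


def resolved_addresses(host: str, port: int) -> List[str]:
    """Return the distinct IP addresses this machine resolves host to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run on the executor host, this could be used as `host, port = driver_endpoint("akka://spark@sr434:35838/user/CoarseGrainedScheduler")`, followed by `resolved_addresses(host, port)` and `can_connect(host, port)`. If the resolved address differs from the one the driver is bound to, or the connect fails, the problem is more likely name resolution or networking than the driver process itself.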