Hi Matei,

Thanks! Will look out for long GC pauses.
On Oct 3, 2013 3:00 PM, "Matei Zaharia" <matei.zaha...@gmail.com> wrote:

> Hi Ashish,
> Those "removing" messages mean that the node in question didn't
> communicate with your application for 45 seconds. Most likely the executor
> process on the node died, though there's also a chance that it was doing a
> super-long garbage collection or that there was a network problem. Look at
> the logs on the node to see if it crashed due to an exception or the OS
> killed it for some reason. For garbage collection debugging, you can turn
> on -XX:+PrintGCDetails in SPARK_JAVA_OPTS to log the lengths of GC pauses.
> Matei
> On Oct 3, 2013, at 1:10 PM, Ashish Rangole <arang...@gmail.com> wrote:
> > Hi,
> >
> > I'm trying to figure out what it means when the application (driver
> > program) logs end with lines like the ones below. This is with the
> > application running on Spark 0.8.0 on EC2.
> >
> > Any help will be greatly appreciated.
> >
> > Thanks!
> >
> >
> > 13/10/03 16:17:33 INFO cluster.ClusterTaskSetManager: Finished TID 1744 in 1183507 ms on ip-10-232-80-206.ec2.internal (progress: 46/60)
> > 13/10/03 16:17:33 INFO scheduler.DAGScheduler: Completed ResultTask(4, 20)
> > 13/10/03 16:17:57 WARN storage.BlockManagerMasterActor: Removing BlockManager BlockManagerId(1, ip-10-170-16-83.ec2.internal, 46907, 0) with no recent heart beats: 68685ms exceeds 45000ms
> > 13/10/03 16:18:57 WARN storage.BlockManagerMasterActor: Removing BlockManager BlockManagerId(0, ip-10-232-27-176.ec2.internal, 55654, 0) with no recent heart beats: 95376ms exceeds 45000ms
> > 13/10/03 16:19:13 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Registering block manager ip-10-232-27-176.ec2.internal:55654 with 19.3 GB
> > 13/10/03 16:21:38 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Added rdd_7_293 in memory on ip-10-232-27-176.ec2.internal:55654 (size: 1975.7 KB, free: 19.2 GB)
> > 13/10/03 16:21:38 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Added rdd_7_293 in memory on ip-10-232-27-176.ec2.internal:55654 (size: 1975.7 KB, free: 19.2 GB)
> > 13/10/03 16:22:57 WARN storage.BlockManagerMasterActor: Removing BlockManager BlockManagerId(0, ip-10-232-27-176.ec2.internal, 55654, 0) with no recent heart beats: 78761ms exceeds 45000ms
> >
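
To see which nodes went silent and for how long, the "Removing BlockManager" warnings quoted above can be pulled out of a driver log with a short script. This is only a rough sketch: the pattern is inferred from the log lines in this thread rather than from Spark's source, and the function name find_removals is made up for illustration.

```python
import re

# Pattern inferred from the WARN lines quoted in this thread (an assumption,
# not taken from Spark's source code).
REMOVAL = re.compile(
    r"Removing BlockManager BlockManagerId\((\d+), ([^,]+), (\d+), \d+\) "
    r"with no recent heart beats: (\d+)ms exceeds (\d+)ms"
)


def find_removals(log_text):
    """Return (executor_id, host, silent_ms, timeout_ms) for each removal warning."""
    # Mail clients wrap long log lines and add '>' quote markers, so join
    # wrapped lines back together before matching.
    flat = re.sub(r"\s*\n[>\s]*", " ", log_text)
    return [
        (m.group(1), m.group(2), int(m.group(4)), int(m.group(5)))
        for m in REMOVAL.finditer(flat)
    ]
```

Each returned tuple pairs a host with the time it was silent and the timeout it exceeded, which makes it easier to tell whether one node keeps dropping out (as ip-10-232-27-176.ec2.internal appears to here) and to line those times up with GC pauses logged on that node.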
