Re: One task hangs and never finishes

2015-12-21 Thread Akhil Das
Pasting the relevant code would help us understand exactly what you are
doing.

Thanks
Best Regards

On Thu, Dec 17, 2015 at 9:25 PM, Daniel Haviv <
daniel.ha...@veracity-group.com> wrote:

> Hi,
>
> I have an application that runs a set of transformations and finishes with
> saveAsTextFile.
>
> Out of 80 tasks, all finish quickly except one, which hangs and writes
> these messages to STDERR:
>
> 15/12/17 17:22:19 INFO collection.ExternalAppendOnlyMap: Thread 82 spilling in-memory map of 4.0 GB to disk (6 times so far)
>
> 15/12/17 17:23:41 INFO collection.ExternalAppendOnlyMap: Thread 82 spilling in-memory map of 3.8 GB to disk (7 times so far)
>
>
> In the web UI I can see that, for some reason, the shuffle spill (memory)
> of this task is extremely high (15 GB) compared to the other tasks (from a
> few MB to about 1 GB), and as a result its GC time is extremely bad:
>
>
>
> Index:                        17
> ID:                           153
> Attempt:                      0
> Status:                       RUNNING
> Locality Level:               PROCESS_LOCAL
> Executor ID / Host:           8 / impact3.indigo.co.il
> Launch Time:                  2015/12/17 17:18:32
> Duration:                     32 min
> Scheduler Delay:              0 ms
> Task Deserialization Time:    0 ms
> GC Time:                      25 min
> Result Serialization Time:    0 ms
> Getting Result Time:          0 ms
> Peak Execution Memory:        0.0 B
> Output Size / Records:        0.0 B / 0
> Shuffle Read Size / Records:  835.8 MB / 5217833
> Shuffle Spill (Memory):       15.2 GB
> Shuffle Spill (Disk):         662.9 MB
> Errors:
>
> I'm running 8 executors, each with 8 CPUs and 25 GB of RAM, and the tasks
> seem to be spread correctly across the nodes:
>
>
>
> Executor ID  Address                     RDD Blocks  Storage Memory   Disk Used  Active Tasks  Failed Tasks  Complete Tasks  Total Tasks  Task Time  Input   Shuffle Read  Shuffle Write
> 1            impact1.indigo.co.il:38120  0           0.0 B / 12.9 GB  0.0 B      0             0             25              25           3.54 h     2.1 GB  377.6 MB      555.3 MB
> 2            impact4.indigo.co.il:40768  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           4.32 h     2.0 GB  513.1 MB      495.9 MB
> 3            impact2.indigo.co.il:43666  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           3.78 h     2.0 GB  332.7 MB      503.1 MB
> 4            impact3.indigo.co.il:49020  0           0.0 B / 12.9 GB  0.0 B      0             0             26              26           3.39 h     2.2 GB  532.0 MB      596.1 MB
> 5            impact1.indigo.co.il:49068  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           3.30 h     2.0 GB  187.3 MB      502.1 MB
> 6            impact4.indigo.co.il:50069  0           0.0 B / 12.9 GB  0.0 B      0             0             28              28           3.64 h     2.4 GB  336.4 MB      498.9 MB
> 7            impact2.indigo.co.il:40225  0           0.0 B / 12.9 GB  0.0 B      0             0             28              28           3.62 h     2.0 GB  93.6 MB       496.2 MB
> 8            impact3.indigo.co.il:50767  0           0.0 B / 12.9 GB  0.0 B      1             0             24              25           3.38 h     2.1 GB  336.2

One task hangs and never finishes

2015-12-17 Thread Daniel Haviv
Hi,

I have an application that runs a set of transformations and finishes
with saveAsTextFile.

Out of 80 tasks, all finish quickly except one, which hangs and
writes these messages to STDERR:

15/12/17 17:22:19 INFO collection.ExternalAppendOnlyMap: Thread 82 spilling in-memory map of 4.0 GB to disk (6 times so far)

15/12/17 17:23:41 INFO collection.ExternalAppendOnlyMap: Thread 82 spilling in-memory map of 3.8 GB to disk (7 times so far)
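
To see how much a single thread has spilled over time, spill messages like the two above can be tallied from an executor's stderr. This is only a rough sketch in plain Python: the regular expression is modelled on the two lines shown here, the helper name summarize_spills is made up for illustration, and the unit conversion assumes 1024-based sizes.

```python
import re

# Pattern modelled on the two spill messages quoted above (an assumption;
# other Spark versions may word the message differently).
SPILL_RE = re.compile(
    r"Thread (?P<thread>\d+) spilling in-memory map of "
    r"(?P<size>[\d.]+) (?P<unit>[KMGT]?B) to disk "
    r"\((?P<count>\d+) times? so far\)"
)

# Assumes 1024-based units.
UNIT_TO_MB = {
    "B": 1.0 / (1024 * 1024),
    "KB": 1.0 / 1024,
    "MB": 1.0,
    "GB": 1024.0,
    "TB": 1024.0 * 1024,
}


def summarize_spills(lines):
    """Return {thread_id: (latest_spill_count, total_mb_in_matched_lines)}."""
    summary = {}
    for line in lines:
        match = SPILL_RE.search(line)
        if match is None:
            continue  # not a spill message
        thread = int(match.group("thread"))
        size_mb = float(match.group("size")) * UNIT_TO_MB[match.group("unit")]
        _, total_mb = summary.get(thread, (0, 0.0))
        summary[thread] = (int(match.group("count")), total_mb + size_mb)
    return summary


# The two messages from above, with the timestamps left out.
log = [
    "INFO collection.ExternalAppendOnlyMap: Thread 82 spilling "
    "in-memory map of 4.0 GB to disk (6 times so far)",
    "INFO collection.ExternalAppendOnlyMap: Thread 82 spilling "
    "in-memory map of 3.8 GB to disk (7 times so far)",
]

for thread, (count, total_mb) in summarize_spills(log).items():
    print("thread %d: %d spills so far, %.1f MB in the lines seen"
          % (thread, count, total_mb))
```

The total only covers the lines passed in, so with a partial log it will be lower than the cumulative figure the web UI reports.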


In the web UI I can see that, for some reason, the shuffle spill
(memory) of this task is extremely high (15 GB) compared to the other
tasks (from a few MB to about 1 GB), and as a result its GC time is
extremely bad:



Index:                        17
ID:                           153
Attempt:                      0
Status:                       RUNNING
Locality Level:               PROCESS_LOCAL
Executor ID / Host:           8 / impact3.indigo.co.il
Launch Time:                  2015/12/17 17:18:32
Duration:                     32 min
Scheduler Delay:              0 ms
Task Deserialization Time:    0 ms
GC Time:                      25 min
Result Serialization Time:    0 ms
Getting Result Time:          0 ms
Peak Execution Memory:        0.0 B
Output Size / Records:        0.0 B / 0
Shuffle Read Size / Records:  835.8 MB / 5217833
Shuffle Spill (Memory):       15.2 GB
Shuffle Spill (Disk):         662.9 MB
Errors:
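
The figures in the task row above can be turned into a few ratios that make the problem easier to compare against a healthy task. This is a back-of-the-envelope sketch in plain Python; it assumes the UI sizes are 1024-based, and the variable names are only illustrative.

```python
# Values copied from the running task's row above.
MB = 1024 ** 2  # assumes the web UI reports 1024-based units
GB = 1024 ** 3

shuffle_read_bytes = 835.8 * MB
shuffle_read_records = 5217833
spill_memory_bytes = 15.2 * GB
spill_disk_bytes = 662.9 * MB
duration_min = 32
gc_time_min = 25

# Average serialized size of one shuffle record read so far.
bytes_per_record = shuffle_read_bytes / shuffle_read_records

# How much larger the spilled in-memory data is than the shuffle input.
memory_expansion = spill_memory_bytes / shuffle_read_bytes

# How much the same data shrinks when it is written to disk.
memory_to_disk = spill_memory_bytes / spill_disk_bytes

# Share of the task's wall-clock time spent in garbage collection.
gc_fraction = gc_time_min / duration_min

print("bytes per shuffle record: %.0f" % bytes_per_record)
print("spill (memory) / shuffle read: %.1fx" % memory_expansion)
print("spill (memory) / spill (disk): %.1fx" % memory_to_disk)
print("GC share of task duration: %.0f%%" % (gc_fraction * 100))
```

Running the same arithmetic on one of the fast tasks would show whether this task differs mainly in input size or in how much its data grows in memory.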

I'm running 8 executors, each with 8 CPUs and 25 GB of RAM, and the
tasks seem to be spread correctly across the nodes:



Executor ID  Address                     RDD Blocks  Storage Memory   Disk Used  Active Tasks  Failed Tasks  Complete Tasks  Total Tasks  Task Time  Input   Shuffle Read  Shuffle Write
1            impact1.indigo.co.il:38120  0           0.0 B / 12.9 GB  0.0 B      0             0             25              25           3.54 h     2.1 GB  377.6 MB      555.3 MB
2            impact4.indigo.co.il:40768  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           4.32 h     2.0 GB  513.1 MB      495.9 MB
3            impact2.indigo.co.il:43666  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           3.78 h     2.0 GB  332.7 MB      503.1 MB
4            impact3.indigo.co.il:49020  0           0.0 B / 12.9 GB  0.0 B      0             0             26              26           3.39 h     2.2 GB  532.0 MB      596.1 MB
5            impact1.indigo.co.il:49068  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           3.30 h     2.0 GB  187.3 MB      502.1 MB
6            impact4.indigo.co.il:50069  0           0.0 B / 12.9 GB  0.0 B      0             0             28              28           3.64 h     2.4 GB  336.4 MB      498.9 MB
7            impact2.indigo.co.il:40225  0           0.0 B / 12.9 GB  0.0 B      0             0             28              28           3.62 h     2.0 GB  93.6 MB       496.2 MB
8            impact3.indigo.co.il:50767  0           0.0 B / 12.9 GB  0.0 B      1             0             24              25           3.38 h     2.1 GB  336.2 MB      564.4 MB
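
As a quick check on how evenly the work is spread, the per-executor totals above can be compared against their averages. This is a minimal sketch in plain Python using only the task counts and shuffle-read sizes listed above; the helper max_over_mean is a made-up name, not part of any Spark API.

```python
from statistics import mean

# (executor ID, total tasks, shuffle read in MB), copied from the
# executor summary above.
EXECUTORS = [
    (1, 25, 377.6),
    (2, 24, 513.1),
    (3, 24, 332.7),
    (4, 26, 532.0),
    (5, 24, 187.3),
    (6, 28, 336.4),
    (7, 28, 93.6),
    (8, 25, 336.2),
]


def max_over_mean(values):
    """Ratio of the largest value to the average; 1.0 means perfectly even."""
    return max(values) / mean(values)


task_counts = [tasks for _, tasks, _ in EXECUTORS]
shuffle_reads = [read_mb for _, _, read_mb in EXECUTORS]

print("total tasks: %d" % sum(task_counts))
print("task-count imbalance: %.2f" % max_over_mean(task_counts))
print("shuffle-read imbalance: %.2f" % max_over_mean(shuffle_reads))
```

These are executor-level totals, so they only show whether tasks are distributed across executors; they cannot show whether one individual task received far more data than the rest.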