Pasting the relevant code would help us understand exactly what you are doing.
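Until the code is posted, one thing that may be worth ruling out is key skew. A single task that reads 835.8 MB and spills 15.2 GB while the rest finish quickly would be consistent with one key holding most of the records, though that is only a guess without seeing the transformations. Below is a minimal sketch, in plain Python, of the kind of check I mean. The function name `key_skew_report` and the sample data are made up for illustration; in a real job the keys could come from something like a small `sample()` of the pair RDD that feeds the shuffle.

```python
from collections import Counter


def key_skew_report(keys, top_n=3):
    """Return (key, count, share_of_total) for the most frequent keys."""
    counts = Counter(keys)
    total = sum(counts.values())
    if total == 0:
        return []
    # most_common() sorts by count, highest first
    return [(k, c, c / total) for k, c in counts.most_common(top_n)]


# Hypothetical sample of shuffle keys (an assumption, not data from this job).
sample_keys = ["a"] * 90 + ["b"] * 5 + ["c"] * 3 + ["d"] * 2

for key, count, share in key_skew_report(sample_keys):
    print(f"{key}: {count} records ({share:.0%} of sample)")
```

If one key accounts for a large share of the sample, that would point to skew rather than to executor sizing.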
Thanks
Best Regards

On Thu, Dec 17, 2015 at 9:25 PM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:

> Hi,
>
> I have an application that runs a set of transformations and finishes with
> saveAsTextFile.
>
> Out of 80 tasks, all finish pretty fast except one, which just hangs and
> outputs these messages to STDERR:
>
> 15/12/17 17:22:19 INFO collection.ExternalAppendOnlyMap: Thread 82 spilling in-memory map of 4.0 GB to disk (6 times so far)
> 15/12/17 17:23:41 INFO collection.ExternalAppendOnlyMap: Thread 82 spilling in-memory map of 3.8 GB to disk (7 times so far)
>
> In the web UI I can see that, for some reason, the shuffle spill (memory)
> for this task is extremely high (15 GB) compared to the others (from a few
> MB to about 1 GB), and as a result the GC time is extremely bad:
>
> Index/ID/Attempt:             171530
> Status:                       RUNNING
> Locality Level:               PROCESS_LOCAL
> Executor ID / Host:           8 / impact3.indigo.co.il
> Launch Time:                  2015/12/17 17:18:32
> Duration:                     32 min
> Scheduler Delay:              0 ms
> Task Deserialization Time:    0 ms
> GC Time:                      25 min
> Result Serialization Time:    0 ms
> Getting Result Time:          0 ms
> Peak Execution Memory:        0.0 B
> Output Size / Records:        0.0 B / 0
> Shuffle Read Size / Records:  835.8 MB / 5217833
> Shuffle Spill (Memory):       15.2 GB
> Shuffle Spill (Disk):         662.9 MB
>
> I'm running 8 executors with 8 CPUs and 25 GB of RAM each, and the tasks
> seem to be spread correctly across the nodes:
>
> ID      Address                     RDD Blocks  Storage Memory   Disk Used  Active Tasks  Failed Tasks  Complete Tasks  Total Tasks  Task Time  Input   Shuffle Read  Shuffle Write
> 1       impact1.indigo.co.il:38120  0           0.0 B / 12.9 GB  0.0 B      0             0             25              25           3.54 h     2.1 GB  377.6 MB      555.3 MB
> 2       impact4.indigo.co.il:40768  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           4.32 h     2.0 GB  513.1 MB      495.9 MB
> 3       impact2.indigo.co.il:43666  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           3.78 h     2.0 GB  332.7 MB      503.1 MB
> 4       impact3.indigo.co.il:49020  0           0.0 B / 12.9 GB  0.0 B      0             0             26              26           3.39 h     2.2 GB  532.0 MB      596.1 MB
> 5       impact1.indigo.co.il:49068  0           0.0 B / 12.9 GB  0.0 B      0             0             24              24           3.30 h     2.0 GB  187.3 MB      502.1 MB
> 6       impact4.indigo.co.il:50069  0           0.0 B / 12.9 GB  0.0 B      0             0             28              28           3.64 h     2.4 GB  336.4 MB      498.9 MB
> 7       impact2.indigo.co.il:40225  0           0.0 B / 12.9 GB  0.0 B      0             0             28              28           3.62 h     2.0 GB  93.6 MB       496.2 MB
> 8       impact3.indigo.co.il:50767  0           0.0 B / 12.9 GB  0.0 B      1             0             24              25           3.38 h     2.1 GB  336.2 MB      564.4 MB
> driver  15.17.198.82:57654          0           0.0 B / 9.6 GB   0.0 B      0             0             0               0            0 ms       0.0 B   0.0 B         0.0 B
>
> Any idea what it could be?
>
> Thank you.
>
> Daniel