Hello,

I'm trying to run a Spark 1.5 job with:

 ./spark-shell --driver-java-options "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1044 -Xms16g -Xmx48g -Xss128m"

I get lots of log messages like these:

15/08/24 20:24:33 INFO ExternalSorter: Thread 172 spilling in-memory map of 352.2 MB to disk (40 times so far)
15/08/24 20:24:33 INFO ExternalSorter: Thread 179 spilling in-memory map of 352.2 MB to disk (39 times so far)
15/08/24 20:24:34 INFO ExternalSorter: Thread 197 spilling in-memory map of 352.2 MB to disk (39 times so far)
15/08/24 20:24:34 INFO ExternalSorter: Thread 192 spilling in-memory map of 352.2 MB to disk (39 times so far)
15/08/24 20:24:36 INFO ShuffleMemoryManager: TID 798 waiting for at least 1/2N of shuffle memory pool to be free
15/08/24 20:24:36 INFO ExternalSorter: Thread 170 spilling in-memory map of 352.2 MB to disk (39 times so far)
15/08/24 20:24:36 INFO ExternalSorter: Thread 171 spilling in-memory map of 352.2 MB to disk (40 times so far)
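
For what it's worth, my rough reading of the ShuffleMemoryManager message is that each running task is only allowed between 1/(2N) and 1/N of the shuffle memory pool before it has to spill. Here's a back-of-the-envelope sketch under that assumption (the 0.2 / 0.8 fractions are what I believe are the Spark 1.5 defaults for spark.shuffle.memoryFraction and spark.shuffle.safetyFraction; I haven't overridden them as far as I know):

    // Sketch only: assumes Spark 1.5 legacy shuffle memory management with
    // default spark.shuffle.memoryFraction (0.2) and spark.shuffle.safetyFraction (0.8).
    val maxHeap        = 48L * 1024 * 1024 * 1024   // -Xmx48g
    val memoryFraction = 0.2                        // spark.shuffle.memoryFraction
    val safetyFraction = 0.8                        // spark.shuffle.safetyFraction
    val runningTasks   = 36                         // one task per core here

    val shufflePool  = maxHeap * memoryFraction * safetyFraction   // ~7.7 GiB
    val perTaskUpper = shufflePool / runningTasks                  // 1/N share,  ~218 MiB
    val perTaskLower = shufflePool / (2 * runningTasks)            // 1/2N share, ~109 MiB
    println(f"pool ${shufflePool / (1L << 30)}%.1f GiB, " +
            f"per task ${perTaskLower / (1L << 20)}%.0f-${perTaskUpper / (1L << 20)}%.0f MiB")

If that reading is right, even with -Xmx48g each of the 36 tasks only gets a couple hundred MB of shuffle memory, which would be roughly consistent with ~350 MB maps spilling over and over.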

When I take a thread dump with jstack, I see a stack trace like this for
each of my 36 cores:

"Executor task launch worker-44" daemon prio=10 tid=0x00007fb9c404d000
nid=0x417f runnable [0x00007fb5e3ffe000]
   java.lang.Thread.State: RUNNABLE
        at
scala.collection.mutable.ResizableArray$class.$init$(ResizableArray.scala:32)
        at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:49)
        at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:63)
        at
org.apache.spark.util.SizeEstimator$SearchState.<init>(SizeEstimator.scala:154)
        at
org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:183)
        at
org.apache.spark.util.SizeEstimator$$anonfun$sampleArray$1.apply$mcVI$sp(SizeEstimator.scala:262)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:166)
        at
org.apache.spark.util.SizeEstimator$.sampleArray(SizeEstimator.scala:254)
        at
org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:238)
        at
org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:194)
        at
org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:186)
        at
org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:54)
        at
org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
        at
org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
        at
org.apache.spark.util.collection.PartitionedPairBuffer.afterUpdate(PartitionedPairBuffer.scala:30)
        at
org.apache.spark.util.collection.PartitionedPairBuffer.insert(PartitionedPairBuffer.scala:53)
        at
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
        at
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
        at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
        at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)



jmap shows:

root@ip-172-31-28-227:~# jmap -heap 16421
Attaching to process ID 16421, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.79-b02

using thread-local object allocation.
Parallel GC with 25 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 0
   MaxHeapFreeRatio = 100
   MaxHeapSize      = 51539607552 (49152.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 268435456 (256.0MB)
   G1HeapRegionSize = 0 (0.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 15290335232 (14582.0MB)
   used     = 11958528232 (11404.541236877441MB)
   free     = 3331807000 (3177.4587631225586MB)
   78.20971908433302% used
From Space:
   capacity = 869269504 (829.0MB)
   used     = 868910560 (828.6576843261719MB)
   free     = 358944 (0.342315673828125MB)
   99.9587073976082% used
To Space:
   capacity = 965738496 (921.0MB)
   used     = 0 (0.0MB)
   free     = 965738496 (921.0MB)
   0.0% used
PS Old Generation
   capacity = 11453595648 (10923.0MB)
   used     = 1423152248 (1357.223747253418MB)
   free     = 10030443400 (9565.776252746582MB)
   12.425375329611077% used
PS Perm Generation
   capacity = 107479040 (102.5MB)
   used     = 107107360 (102.14553833007812MB)
   free     = 371680 (0.354461669921875MB)
   99.65418373666158% used

27990 interned Strings occupying 2892496 bytes.


This seems to indicate that the threads are spilling extra data into temp
files on disk, but as far as I can tell none of the temp files in /tmp/ are
growing. I think this means the spilling isn't actually happening, or else I
don't know where to look for it.
I can't tell what's going on here. Does anyone have any suggestions on how
to work through this?
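
In case it's relevant to the "where are the spill files" part, this is the check I'm planning to run from inside spark-shell (just a sketch; the blockmgr-* directory-name prefix is my assumption about where the Spark 1.5 block manager puts its temp files):

    // Sketch: print where Spark should be writing shuffle spill files.
    // SPARK_LOCAL_DIRS overrides spark.local.dir, which defaults to /tmp.
    val localDirs = sys.env.getOrElse("SPARK_LOCAL_DIRS",
      sc.getConf.get("spark.local.dir", "/tmp"))
    println(s"local dirs: $localDirs  (spill files should show up in blockmgr-* subdirectories)")

If the spill files land in a blockmgr-* subdirectory rather than directly in /tmp/, that might be why I don't see any files growing where I expected.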

Best,
Dan




