Hello, I'm trying to run a Spark 1.5 job with:
./spark-shell --driver-java-options "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1044 -Xms16g -Xmx48g -Xss128m"

and I get a flood of log messages like:

15/08/24 20:24:33 INFO ExternalSorter: Thread 172 spilling in-memory map of 352.2 MB to disk (40 times so far)
15/08/24 20:24:33 INFO ExternalSorter: Thread 179 spilling in-memory map of 352.2 MB to disk (39 times so far)
15/08/24 20:24:34 INFO ExternalSorter: Thread 197 spilling in-memory map of 352.2 MB to disk (39 times so far)
15/08/24 20:24:34 INFO ExternalSorter: Thread 192 spilling in-memory map of 352.2 MB to disk (39 times so far)
15/08/24 20:24:36 INFO ShuffleMemoryManager: TID 798 waiting for at least 1/2N of shuffle memory pool to be free
15/08/24 20:24:36 INFO ExternalSorter: Thread 170 spilling in-memory map of 352.2 MB to disk (39 times so far)
15/08/24 20:24:36 INFO ExternalSorter: Thread 171 spilling in-memory map of 352.2 MB to disk (40 times so far)

When I force a stack trace with jstack, I get stack traces like this for each of my 36 cores:

"Executor task launch worker-44" daemon prio=10 tid=0x00007fb9c404d000 nid=0x417f runnable [0x00007fb5e3ffe000]
   java.lang.Thread.State: RUNNABLE
        at scala.collection.mutable.ResizableArray$class.$init$(ResizableArray.scala:32)
        at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:49)
        at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:63)
        at org.apache.spark.util.SizeEstimator$SearchState.<init>(SizeEstimator.scala:154)
        at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:183)
        at org.apache.spark.util.SizeEstimator$$anonfun$sampleArray$1.apply$mcVI$sp(SizeEstimator.scala:262)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:166)
        at org.apache.spark.util.SizeEstimator$.sampleArray(SizeEstimator.scala:254)
        at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:238)
        at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:194)
        at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:186)
        at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:54)
        at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
        at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
        at org.apache.spark.util.collection.PartitionedPairBuffer.afterUpdate(PartitionedPairBuffer.scala:30)
        at org.apache.spark.util.collection.PartitionedPairBuffer.insert(PartitionedPairBuffer.scala:53)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
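In case it's relevant: all of my memory tuning so far has gone through raw JVM options. My (possibly wrong) reading of the ShuffleMemoryManager message is that each of the 36 tasks is capped at a slice of a pool sized by spark.shuffle.memoryFraction, so I've been wondering whether launching with Spark's own memory flags and a larger fraction would help. Roughly something like this -- the 0.4 value is just a guess on my part, and I haven't verified it changes the spill behaviour:

  ./spark-shell --driver-memory 48g \
    --conf spark.shuffle.memoryFraction=0.4 \
    --driver-java-options "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1044 -Xss128m"

Corrections welcome if that's not how the shuffle memory pool is sized in 1.5.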
jmap shows:

root@ip-172-31-28-227:~# jmap -heap 16421
Attaching to process ID 16421, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.79-b02

using thread-local object allocation.
Parallel GC with 25 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 0
   MaxHeapFreeRatio = 100
   MaxHeapSize      = 51539607552 (49152.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 268435456 (256.0MB)
   G1HeapRegionSize = 0 (0.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 15290335232 (14582.0MB)
   used     = 11958528232 (11404.541236877441MB)
   free     = 3331807000 (3177.4587631225586MB)
   78.20971908433302% used
From Space:
   capacity = 869269504 (829.0MB)
   used     = 868910560 (828.6576843261719MB)
   free     = 358944 (0.342315673828125MB)
   99.9587073976082% used
To Space:
   capacity = 965738496 (921.0MB)
   used     = 0 (0.0MB)
   free     = 965738496 (921.0MB)
   0.0% used
PS Old Generation
   capacity = 11453595648 (10923.0MB)
   used     = 1423152248 (1357.223747253418MB)
   free     = 10030443400 (9565.776252746582MB)
   12.425375329611077% used
PS Perm Generation
   capacity = 107479040 (102.5MB)
   used     = 107107360 (102.14553833007812MB)
   free     = 371680 (0.354461669921875MB)
   99.65418373666158% used

27990 interned Strings occupying 2892496 bytes.

All of this seems to indicate that the threads are dumping extra data into temp files on disk, but as far as I can tell none of the temp files in /tmp/ are growing. I think this means either the spilling isn't actually happening, or I don't know where to look for it. I can't tell what's going on here; does anyone have suggestions on how to work through this?

Best,
Dan
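P.S. In case it helps anyone answer: the only place I've looked for spill output is /tmp. I believe (but haven't verified) that ExternalSorter's spill files land under the block manager directories in spark.local.dir, so this is roughly how I've been watching for them:

  # assumes the default spark.local.dir of /tmp; adjust if SPARK_LOCAL_DIRS points somewhere else
  du -sh /tmp/spark-*/ /tmp/blockmgr-* 2>/dev/null

If that's the wrong place to look, I'd appreciate a pointer to where the spills actually go.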