Hi, I'm receiving a memory allocation error with a recent build of Spark 1.5:
java.io.IOException: Unable to acquire 67108864 bytes of memory
    at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPageIfNecessary(UnsafeExternalSorter.java:348)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(UnsafeExternalSorter.java:398)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(UnsafeExternalRowSorter.java:92)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:174)
    at org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$3.apply(sort.scala:146)
    at org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$3.apply(sort.scala:126)

The issue appears when joining two datasets: one with 6084 records, the other with 200 records. I expect 200 records in the result.

I'm using a homemade build prepared from "branch-1.5" at commit "eedb996", built with "mvn -DskipTests clean install". Apart from that, I'm using Java 1.7.0_51 and Maven 3.3.3.

I've prepared a test case that can be built and executed very easily (data files are included in the repo): https://github.com/aseigneurin/spark-testcase

One thing to note: the issue arises when the master is set to "local[*]" but not when it is set to "local". Both options work without problem with Spark 1.4, though.

Any help will be greatly appreciated!

Many thanks,
Alexis
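P.S. For context, here is a rough sketch of the shape of the code that triggers the error. The file names, the "key" join column, and the app name are placeholders I made up for this email; the real code and data are in the repo linked above.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object JoinTestCase {
      def main(args: Array[String]): Unit = {
        // "local[*]" triggers the IOException; plain "local" does not
        val conf = new SparkConf()
          .setAppName("join-testcase")
          .setMaster("local[*]")
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)

        // hypothetical file names -- the real data files are in the repo
        val large = sqlContext.read.json("data/large.json") // ~6084 records
        val small = sqlContext.read.json("data/small.json") // ~200 records

        // the join whose Tungsten sort fails in UnsafeExternalSorter
        val joined = large.join(small, large("key") === small("key"))
        println(joined.count()) // expected: 200
      }
    }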