Works like a charm. Thanks Reynold for the quick and efficient response!
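
For anyone reading this thread in the archive, here is a rough sketch of the arithmetic behind the explanation quoted below. It is a back-of-the-envelope check, not Spark's actual allocator: the two fraction defaults (0.2 and 0.8) and the even split of the pool across tasks are assumptions on my part, and the function names are made up for illustration. Note that "local[*]" runs one thread per core while "local" runs a single thread, which is why only the former hits the limit.

```python
# Rough model only: the fractions below are assumed defaults, not values
# stated in this thread, and the even split is a simplification.
MB = 1024 * 1024

def per_task_share(heap_bytes, num_tasks, memory_fraction=0.2, safety_fraction=0.8):
    """Approximate execution memory one task can hold if the pool is split evenly."""
    pool = heap_bytes * memory_fraction * safety_fraction
    return pool / num_tasks

def page_fits(page_bytes, heap_bytes, num_tasks):
    """True if a single page of page_bytes fits within one task's share."""
    return page_bytes <= per_task_share(heap_bytes, num_tasks)

if __name__ == "__main__":
    heap = 1024 * MB  # 1G in total, as in the quoted explanation
    for tasks in (1, 2, 4, 8):
        share_mb = per_task_share(heap, tasks) / MB
        print("tasks=%d share=%.1fMB 64m_fits=%s 16m_fits=%s" % (
            tasks, share_mb,
            page_fits(64 * MB, heap, tasks),
            page_fits(16 * MB, heap, tasks)))
```

Under these assumptions a 64MB page (the 67108864 bytes in the error) stops fitting once several tasks share a 1G heap, while a 16m page still fits with 8 tasks. The property itself can be set in the usual ways, for example SparkConf.set("spark.buffer.pageSize", "16m") or --conf spark.buffer.pageSize=16m on spark-submit.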


2015-08-05 19:19 GMT+02:00 Reynold Xin <>:

> In Spark 1.5, we have a new way to manage memory (part of Project
> Tungsten). The default unit of memory allocation is 64MB, which is way too
> high when you have 1G of memory allocated in total and have more than 4
> threads.
> We will reduce the default page size before releasing 1.5. For now, you
> can just set spark.buffer.pageSize to a lower value (e.g. 16m).
>
> On Wed, Aug 5, 2015 at 9:25 AM, Alexis Seigneurin <>
> wrote:
>> Hi,
>>
>> I'm receiving a memory allocation error with a recent build of Spark 1.5:
>>
>> Unable to acquire 67108864 bytes of memory
>> at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPageIfNecessary(
>> at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.insertRecord(
>> at org.apache.spark.sql.execution.UnsafeExternalRowSorter.insertRow(
>> at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(
>> at org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$3.apply(sort.scala:146)
>> at org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$3.apply(sort.scala:126)
>>
>> The issue appears when joining 2 datasets. One with 6084 records, the
>> other one with 200 records. I'm expecting to receive 200 records in the
>> result.
>>
>> I'm using a homemade build prepared from "branch-1.5" with commit ID
>> "eedb996". I have run "mvn -DskipTests clean install" to generate that
>> build.
>>
>> Apart from that, I'm using Java 1.7.0_51 and Maven 3.3.3.
>>
>> I've prepared a test case that can be built and executed very easily
>> (data files are included in the repo):
>>
>> One thing to note is that the issue arises when the master is set to
>> "local[*]" but not when set to "local". Both options work without problem
>> with Spark 1.4, though.
>>
>> Any help will be greatly appreciated!
>>
>> Many thanks,
>> Alexis
