Thanks, Chris

I'm going to try it soon, perhaps by setting spark.sql.shuffle.partitions to
2001. Also, I was wondering whether it would help to repartition the data by
the fields I am using in the group-by and window operations, along the lines
of the sketch below.
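
Roughly, this is what I have in mind (just a sketch; df, key1, key2, ts and
the input path are placeholders for my actual dataframe, columns and data):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().getOrCreate()

// Going past 2000 shuffle partitions so Spark switches to the highly
// compressed map status format and each shuffle partition stays smaller.
spark.conf.set("spark.sql.shuffle.partitions", "2001")

val df = spark.read.parquet("s3://my-bucket/input")  // placeholder path

// Repartition by the same keys used in the group-by / window operations,
// hoping the later shuffles can reuse this partitioning.
val repartitioned = df.repartition(2001, col("key1"), col("key2"))

val w = Window.partitionBy("key1", "key2").orderBy("ts")
val result = repartitioned.withColumn("rn", row_number().over(w))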

Best Regards
Ankit Khettry

On Sat, 7 Sep, 2019, 1:05 PM Chris Teoh, <chris.t...@gmail.com> wrote:

> Hi Ankit,
>
> Without looking at the Spark UI and the stages/DAG, I'm guessing you're
> running with the default number of Spark shuffle partitions.
>
> If you're seeing a lot of shuffle spill, you likely have to increase the
> number of shuffle partitions to accommodate the huge shuffle size.
>
> I hope that helps
> Chris
>
> On Sat, 7 Sep 2019, 4:18 pm Ankit Khettry, <justankit2...@gmail.com>
> wrote:
>
>> Nope, it's a batch job.
>>
>> Best Regards
>> Ankit Khettry
>>
>> On Sat, 7 Sep, 2019, 6:52 AM Upasana Sharma, <028upasana...@gmail.com>
>> wrote:
>>
>>> Is it a streaming job?
>>>
>>> On Sat, Sep 7, 2019, 5:04 AM Ankit Khettry <justankit2...@gmail.com>
>>> wrote:
>>>
>>>> I have a Spark job that consists of a large number of Window operations
>>>> and hence involves large shuffles. I have roughly 900 GiB of data,
>>>> although I am using a reasonably large cluster (10 * m5.4xlarge
>>>> instances). I am currently using the following configuration for the
>>>> job, though I have tried various other combinations without success.
>>>>
>>>> spark.yarn.driver.memoryOverhead 6g
>>>> spark.storage.memoryFraction 0.1
>>>> spark.executor.cores 6
>>>> spark.executor.memory 36g
>>>> spark.memory.offHeap.size 8g
>>>> spark.memory.offHeap.enabled true
>>>> spark.executor.instances 10
>>>> spark.driver.memory 14g
>>>> spark.yarn.executor.memoryOverhead 10g
>>>>
>>>> I keep running into the following OOM error:
>>>>
>>>> org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 16384
>>>> bytes of memory, got 0
>>>> at
>>>> org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:157)
>>>> at
>>>> org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:98)
>>>> at
>>>> org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.<init>(UnsafeInMemorySorter.java:128)
>>>> at
>>>> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:163)
>>>>
>>>> I see there are a large number of JIRAs filed for similar issues, and a
>>>> great many of them are even marked resolved.
>>>> Can someone guide me on how to approach this problem? I am using
>>>> Databricks Spark 2.4.1.
>>>>
>>>> Best Regards
>>>> Ankit Khettry
>>>>
>>>
