2*2 cents
1. You can try repartition() with a larger partition count so that each
partition (and therefore each task) is smaller.
2. OOM errors can often be avoided by increasing executor memory or using
off-heap storage.
3. How are you persisting? You can try persisting with the DISK_ONLY_SER
storage level.
4. You may take a look in the
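A minimal sketch of suggestions 1 and 3 in the Spark Java API (all names and the partition count are illustrative, not from the thread). One note on suggestion 3: the thread says DISK_ONLY_SER, but in Spark's StorageLevel API the on-disk level is DISK_ONLY, and data on disk is always stored serialized, so DISK_ONLY is the matching level:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.storage.StorageLevel;

public class RepartitionAndPersistSketch {
    public static void main(String[] args) {
        // Local context purely for illustration. On a cluster, executor
        // memory (suggestion 2) is a config change at submit time, e.g.:
        //   spark-submit --executor-memory 14g ...
        SparkConf conf = new SparkConf().setAppName("sketch").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4));

        // Suggestion 1: raise the partition count so each task holds
        // fewer of the large objects in memory at once (1000 is a
        // placeholder; tune it to your data volume).
        JavaRDD<Integer> small = rdd.repartition(1000);

        // Suggestion 3: keep the persisted data serialized on disk
        // instead of as deserialized objects on the JVM heap.
        small.persist(StorageLevel.DISK_ONLY());

        sc.stop();
    }
}
```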
I give the executor 14 GB and would like to cut that down.
I expect the critical operations to run hundreds of millions of times, which
is why we run on a cluster. I will try DISK_ONLY_SER.
Thanks
Steven Lewis (sent from my phone)
On May 7, 2015 10:59 AM, ayan guha <guha.a...@gmail.com> wrote:
I am performing a job that runs a number of steps in succession.
One step is a map on a JavaRDD which generates objects taking up
significant memory.
This is followed by a join and an aggregateByKey.
The problem is that the job is getting OutOfMemoryErrors, although
most tasks work.
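The map / join / aggregateByKey pipeline described above could be sketched like this (a hedged sketch with illustrative names and toy data, not the poster's actual job; the real map step produces much larger objects):

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class PipelineSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("pipeline").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> input = sc.parallelize(Arrays.asList("a", "b", "a"));

        // The map step: here the value is a small stand-in string, but in
        // the job described above each mapped object is memory-heavy.
        JavaPairRDD<String, String> mapped =
            input.mapToPair(k -> new Tuple2<>(k, "big-object-for-" + k));

        // Some keyed reference data to join against (illustrative).
        JavaPairRDD<String, Integer> reference = sc.parallelizePairs(
            Arrays.asList(new Tuple2<>("a", 1), new Tuple2<>("b", 2)));

        // The join followed by aggregateByKey: here it simply counts
        // joined records per key.
        JavaPairRDD<String, Integer> counts =
            mapped.join(reference)
                  .aggregateByKey(0, (acc, v) -> acc + 1, Integer::sum);

        counts.collect().forEach(System.out::println);
        sc.stop();
    }
}
```

Both the join and aggregateByKey are shuffle boundaries, which is where the large mapped objects pile up in executor memory; that is why the replies above focus on smaller partitions and serialized, on-disk persistence.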