Saving intermediate results in mapPartitions

2016-03-18 Thread Krishna
Hi, I have a situation where the elements output by each partition from mapPartitions don't fit into RAM, even with the lowest possible number of rows per partition (there is a hard lower limit on this value). What's the best way to address this problem? During the mapPartitions phase, is ther

Re: Saving intermediate results in mapPartitions

2016-03-18 Thread Enrico Rotundo
Try setting MEMORY_AND_DISK as the RDD’s storage persistence level: http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence

> On 19 Mar 2016, at 00:55, Krishna wrote:
>
> Hi,
>
> I've a situation where th
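
For readers following the thread, here is a minimal PySpark sketch of that suggestion. It is an illustration under assumptions, not code from the original posters: the names `expand_partition` and `build_expanded_rdd` and the three-outputs-per-row expansion are hypothetical. The generator is an extra assumption beyond the reply itself: yielding results one at a time means the function never builds a full list of a partition's output, and `MEMORY_AND_DISK` lets Spark keep on disk the persisted partitions that do not fit in memory.

```python
def expand_partition(rows):
    """Hypothetical per-partition function: yields outputs lazily.

    Because this is a generator, the outputs for a partition are produced
    one at a time rather than collected into a list first.
    """
    for row in rows:
        # Stand-in for the real expansion: three outputs per input row.
        for i in range(3):
            yield (row, i)


def build_expanded_rdd(rdd):
    """Apply expand_partition and persist the result with MEMORY_AND_DISK."""
    # Imported here so expand_partition can be tried without Spark installed.
    from pyspark import StorageLevel

    expanded = rdd.mapPartitions(expand_partition)
    # Persisted partitions that do not fit in memory are stored on disk.
    return expanded.persist(StorageLevel.MEMORY_AND_DISK)
```

With an existing SparkContext `sc`, this could be used as `build_expanded_rdd(sc.parallelize(range(1000), 10)).count()`. Whether this is enough depends on the job: persistence only governs how the computed partitions are stored, so the per-partition function itself still has to avoid holding all of its output at once.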