It seems possible that you are running out of memory unrolling a single
partition of the RDD. This is something that can cause your executor to
OOM, especially if the cache is close to being full, so the executor doesn't
have much free memory left. How large are your executors? At the time of
BTW - the reason the workaround could help is that when persisting
to DISK_ONLY, we explicitly avoid materializing the RDD partition in
memory... we just pass it through to disk.
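In case it helps, here is a minimal PySpark sketch of what that workaround looks like; the app name, input path, and executor size below are made up for illustration, and DISK_ONLY hands each partition straight to disk instead of unrolling it in memory first:

    from pyspark import SparkConf, SparkContext, StorageLevel

    # Illustrative executor size: more headroom per executor leaves room to
    # unroll a partition even when the cache is nearly full.
    conf = (SparkConf()
            .setAppName("disk-only-workaround")      # hypothetical app name
            .set("spark.executor.memory", "4g"))     # illustrative value
    sc = SparkContext(conf=conf)

    # Hypothetical input; the workaround is to persist DISK_ONLY instead of MEMORY_ONLY.
    rdd = sc.textFile("hdfs:///path/to/large/input")
    rdd.persist(StorageLevel.DISK_ONLY)
    rdd.count()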
On Mon, Aug 4, 2014 at 1:10 AM, Patrick Wendell pwend...@gmail.com wrote:
It seems possible that you are
[Forking this thread.]
According to the Spark Programming Guide
(http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence),
persisting RDDs with MEMORY_ONLY should not choke if the RDD cannot be held
entirely in memory: "If the RDD does not fit in memory, some partitions will
not be cached and will be recomputed on the fly each time they're needed."
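In other words (a rough sketch with a made-up input path, assuming the usual shell SparkContext `sc`), the guide suggests something like this should complete even when the data is larger than the executors' storage memory, just with some partitions left uncached:

    from pyspark import StorageLevel

    big = sc.textFile("hdfs:///path/to/large/input")  # hypothetical path
    big.persist(StorageLevel.MEMORY_ONLY)             # equivalent to big.cache() for RDDs
    big.count()                                       # partitions that don't fit should be recomputed later, not crash the executor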
Isn't this your worker running out of its memory for computations,
rather than for caching RDDs? So it has enough memory when you don't
actually use a lot of the heap for caching, but when the cache uses
its share, you actually run out of memory. If I'm right, and even I am
not sure I have this
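If that's what's happening, the split Sean describes is tunable in Spark 1.x through the spark.storage.memoryFraction setting (the share of the heap reserved for cached RDDs, 0.6 by default, since deprecated in later releases); a rough sketch with illustrative values:

    from pyspark import SparkConf, SparkContext

    # Illustrative values only: shrink the cache's share of the heap so that
    # task computation has more room, or raise executor memory instead.
    conf = (SparkConf()
            .setAppName("cache-vs-compute-split")          # hypothetical app name
            .set("spark.storage.memoryFraction", "0.4"))
    sc = SparkContext(conf=conf)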
On Fri, Aug 1, 2014 at 12:39 PM, Sean Owen so...@cloudera.com wrote:
Isn't this your worker running out of its memory for computations,
rather than for caching RDDs?
I’m not sure how to interpret the stack trace, but let’s say that’s true.
I’m even seeing this with a simple a =