Hi,
I am running Spark 0.9.2 on an EC2 cluster with about 16 r3.4xlarge machines.
The cluster is running Spark standalone and is launched with the ec2
scripts.
In my Spark job, I am using ephemeral HDFS to checkpoint some of my RDDs.
I'm also reading and writing to S3. My jobs also involve a large
Unfortunately it doesn't look like my executors are OOM. On the slave
machines I checked both the logs in /spark/log (which I assume is from the
slave driver?) and in /spark/work/... which I assume are from each
worker/executor.
On Thu, Aug 21, 2014 at 11:19 AM, Yana Kadiyska
Hi Shay,
You can try setting spark.storage.blockManagerSlaveTimeoutMs to a higher
value.
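For example, a minimal sketch, assuming the job builds its own SparkConf (the
app name and the timeout value below are placeholders for illustration, not
recommendations):

```scala
import org.apache.spark.SparkConf

// Hypothetical app name and an arbitrary timeout of 300000 ms (5 minutes);
// pick a value that suits how long your executors can stall.
val conf = new SparkConf()
  .setAppName("my-checkpointing-job")
  .set("spark.storage.blockManagerSlaveTimeoutMs", "300000")

// Pass `conf` to your SparkContext as usual, e.g. new SparkContext(conf),
// with the master URL supplied the way your cluster setup already does it.
```

The property has to be set before the SparkContext is created, since it is read
when the context starts up.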
Cheers,
Jayant
On Thu, Aug 21, 2014 at 1:33 PM, Shay Seng s...@urbanengines.com wrote:
Unfortunately it doesn't look like my executors are OOM. On the slave
machines I checked both the logs in