After a lot of grovelling through logs, I found that the Nagios monitoring
process had detected the machine was almost out of memory and killed the
SNAP executor process.

So why is the machine running out of memory? Each node has 128 GB of RAM,
4 executors, and about 40 GB of data. It did run out of memory when I tried
to cache() the RDD, but I would hope that persist() with a disk-backed
storage level streams data to disk without trying to materialize too much
of it in RAM.
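For reference, here's a minimal sketch of the storage levels I mean
(assuming an existing RDD named rdd; I haven't verified that any of these
actually avoids the OOM in my case):

    import org.apache.spark.storage.StorageLevel

    // cache() is shorthand for persist(StorageLevel.MEMORY_ONLY):
    // partitions that don't fit in memory are dropped and recomputed later.
    rdd.cache()

    // MEMORY_AND_DISK keeps what fits in RAM and spills the rest to local disk.
    rdd.persist(StorageLevel.MEMORY_AND_DISK)

    // DISK_ONLY writes cached partitions to disk instead of holding them in RAM.
    rdd.persist(StorageLevel.DISK_ONLY)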

Ravi


