On 24/01/2017 at 02:43, Matthew Dailey wrote:
In general, Java processes fail with an OutOfMemoryError when your code
and data do not fit into the memory allocated to the runtime.  In
Spark, that memory is controlled through the --executor-memory flag.
If you are running Spark on YARN, then YARN configuration will dictate
the maximum memory that your Spark executors can request.  Here is a
pretty good article about setting memory in Spark on YARN:
http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_ig_running_spark_on_yarn.html
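
To make that concrete, here is a minimal sketch in Scala of how those settings are usually wired up in application code; --executor-memory on the spark-submit command line maps to the spark.executor.memory property shown here. The app name and the values are just illustrative, and the overhead property name is the Spark 1.x-era one, so adjust for your version:

// Minimal sketch: equivalent of passing --executor-memory 4g to spark-submit.
// Values are illustrative; size them for your cluster.
import org.apache.spark.{SparkConf, SparkContext}

object MemoryConfigExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("memory-config-example")
      // Heap available to each executor JVM; the resulting container
      // request must fit within what YARN allows per container
      // (yarn.scheduler.maximum-allocation-mb).
      .set("spark.executor.memory", "4g")
      // Extra off-heap overhead YARN adds on top of the executor heap
      // when sizing the container (Spark 1.x-era property name).
      .set("spark.yarn.executor.memoryOverhead", "512")

    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}
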

If the OS were to kill your process because the system has run out of
memory, you would see an error printed to standard error that looks like
this:

Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000e2320000, 37601280, 0) failed; error='Cannot allocate memory' (errno=12)
# There is insufficient memory for the Java Runtime Environment to continue.


Thanks for your answer. But that does not explain my observation that the chance of an OOM seems to rise when more jobs are running, right?

Is the maximum memory YARN allows a statically defined value (I see yarn.scheduler.maximum-allocation-mb and other settings), or can the actual value change dynamically depending on the environment?

Is there anything else that could significantly change the memory requirements from run to run (with the same program, data and settings)?

--
David

