This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 8feb80a  [SPARK-27811][CORE][DOCS] Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead.
8feb80a is described below

commit 8feb80ad86bb6a832d784a2780caccba5c428fbc
Author: gengjiaan <gengji...@360.cn>
AuthorDate: Sat Jun 1 08:19:50 2019 -0500

    [SPARK-27811][CORE][DOCS] Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead.

    ## What changes were proposed in this pull request?

    I found that the docs of `spark.driver.memoryOverhead` and `spark.executor.memoryOverhead` are a little ambiguous. For example, the original docs of `spark.driver.memoryOverhead` start with `The amount of off-heap memory to be allocated per driver in cluster mode`. But `MemoryManager` also manages a memory area named off-heap, used to allocate memory in Tungsten mode. So the description of `spark.driver.memoryOverhead` is easily confused with that area. `spark.executor.memoryOverhead` has the same problem.

    ## How was this patch tested?

    Existing UT.

    Closes #24671 from beliefer/improve-docs-of-overhead.
    Authored-by: gengjiaan <gengji...@360.cn>
    Signed-off-by: Sean Owen <sean.o...@databricks.com>
---
 .../org/apache/spark/internal/config/package.scala |  4 ++--
 docs/configuration.md                              | 26 +++++++++++++++++-----
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala
index 8ea8887..32221ee 100644
--- a/core/src/main/scala/org/apache/spark/internal/config/package.scala
+++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -71,7 +71,7 @@ package object config {
     .createWithDefaultString("1g")
 
   private[spark] val DRIVER_MEMORY_OVERHEAD = ConfigBuilder("spark.driver.memoryOverhead")
-    .doc("The amount of off-heap memory to be allocated per driver in cluster mode, " +
+    .doc("The amount of non-heap memory to be allocated per driver in cluster mode, " +
       "in MiB unless otherwise specified.")
     .bytesConf(ByteUnit.MiB)
     .createOptional
@@ -196,7 +196,7 @@ package object config {
     .createWithDefaultString("1g")
 
   private[spark] val EXECUTOR_MEMORY_OVERHEAD = ConfigBuilder("spark.executor.memoryOverhead")
-    .doc("The amount of off-heap memory to be allocated per executor in cluster mode, " +
+    .doc("The amount of non-heap memory to be allocated per executor in cluster mode, " +
       "in MiB unless otherwise specified.")
     .bytesConf(ByteUnit.MiB)
     .createOptional
diff --git a/docs/configuration.md b/docs/configuration.md
index 24e66e1..6632dea 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -181,10 +181,16 @@ of the most common options to set are:
   <td><code>spark.driver.memoryOverhead</code></td>
   <td>driverMemory * 0.10, with minimum of 384 </td>
   <td>
-    The amount of off-heap memory to be allocated per driver in cluster mode, in MiB unless
-    otherwise specified. This is memory that accounts for things like VM overheads, interned strings,
+    Amount of non-heap memory to be allocated per driver process in cluster mode, in MiB unless
+    otherwise specified. This is memory that accounts for things like VM overheads, interned strings,
     other native overheads, etc. This tends to grow with the container size (typically 6-10%).
     This option is currently supported on YARN, Mesos and Kubernetes.
+    <em>Note:</em> Non-heap memory includes off-heap memory
+    (when <code>spark.memory.offHeap.enabled=true</code>) and memory used by other driver processes
+    (e.g. python process that goes with a PySpark driver) and memory used by other non-driver
+    processes running in the same container. The maximum memory size of container to running
+    driver is determined by the sum of <code>spark.driver.memoryOverhead</code>
+    and <code>spark.driver.memory</code>.
   </td>
 </tr>
 <tr>
@@ -244,10 +250,17 @@ of the most common options to set are:
   <td><code>spark.executor.memoryOverhead</code></td>
   <td>executorMemory * 0.10, with minimum of 384 </td>
   <td>
-    The amount of off-heap memory to be allocated per executor, in MiB unless otherwise specified.
-    This is memory that accounts for things like VM overheads, interned strings, other native
-    overheads, etc. This tends to grow with the executor size (typically 6-10%).
+    Amount of non-heap memory to be allocated per executor process in cluster mode, in MiB unless
+    otherwise specified. This is memory that accounts for things like VM overheads, interned strings,
+    other native overheads, etc. This tends to grow with the executor size (typically 6-10%).
     This option is currently supported on YARN and Kubernetes.
+    <br/>
+    <em>Note:</em> Non-heap memory includes off-heap memory
+    (when <code>spark.memory.offHeap.enabled=true</code>) and memory used by other executor processes
+    (e.g. python process that goes with a PySpark executor) and memory used by other non-executor
+    processes running in the same container. The maximum memory size of container to running executor
+    is determined by the sum of <code>spark.executor.memoryOverhead</code> and
+    <code>spark.executor.memory</code>.
   </td>
 </tr>
 <tr>
@@ -1283,6 +1296,9 @@ Apart from these, the following properties are also available, and may be useful
   <td>
     If true, Spark will attempt to use off-heap memory for certain operations. If off-heap memory
     use is enabled, then <code>spark.memory.offHeap.size</code> must be positive.
+    <em>Note:</em> If off-heap memory is enabled, may need to raise the non-heap memory size
+    (e.g. increase <code>spark.driver.memoryOverhead</code> or
+    <code>spark.executor.memoryOverhead</code>).
   </td>
 </tr>
 <tr>

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
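[Editor's note: the sizing rule the updated docs describe — a default overhead of 10% of the driver/executor memory with a 384 MiB floor, and a container request equal to memory plus memoryOverhead — can be sketched as below. This is an illustrative calculation only, not Spark's actual implementation; the function names are hypothetical.]

```python
from typing import Optional

def memory_overhead_mib(memory_mib: int, configured_overhead_mib: Optional[int] = None) -> int:
    """Default overhead: 10% of spark.{driver,executor}.memory, floored at 384 MiB,
    unless spark.{driver,executor}.memoryOverhead is set explicitly."""
    if configured_overhead_mib is not None:
        return configured_overhead_mib
    return max(int(memory_mib * 0.10), 384)

def container_request_mib(memory_mib: int, configured_overhead_mib: Optional[int] = None) -> int:
    """Per the updated docs, the maximum container size is the sum of
    spark.{driver,executor}.memory and its memoryOverhead."""
    return memory_mib + memory_overhead_mib(memory_mib, configured_overhead_mib)

# A 2 GiB executor: 10% (204 MiB) is below the floor, so 384 MiB applies.
print(container_request_mib(2048))   # 2432
# An 8 GiB executor: 10% (819 MiB) exceeds the floor.
print(container_request_mib(8192))   # 9011
```

As the `spark.memory.offHeap.enabled` note in the diff suggests, anything allocated off-heap also has to fit inside this overhead, so enabling off-heap memory is a reason to raise the configured overhead above the 10% default.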