This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push: new cd7ca92051b [SPARK-40675][DOCS] Supplement undocumented spark configurations in `configuration.md` cd7ca92051b is described below commit cd7ca92051b55c615b8db07030ea3af469dd4da4 Author: Qian.Sun <qian.sun2...@gmail.com> AuthorDate: Sun Oct 9 10:12:19 2022 -0500 [SPARK-40675][DOCS] Supplement undocumented spark configurations in `configuration.md` ### What changes were proposed in this pull request? This PR aims to supplement missing spark configurations in `org.apache.spark.internal.config` in `configuration.md`. ### Why are the changes needed? Help users to confirm configuration through documentation instead of code. ### Does this PR introduce _any_ user-facing change? Yes, more configurations in documentation. ### How was this patch tested? Pass the GitHub Actions. Closes #38131 from dcoliversun/SPARK-40675. Authored-by: Qian.Sun <qian.sun2...@gmail.com> Signed-off-by: Sean Owen <sro...@gmail.com> --- docs/configuration.md | 314 +++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 313 insertions(+), 1 deletion(-) diff --git a/docs/configuration.md b/docs/configuration.md index 16c9fdfdf9f..b528c766884 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -468,6 +468,43 @@ of the most common options to set are: </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.decommission.enabled</code></td> + <td>false</td> + <td> + When decommissioning is enabled, Spark will try its best to shut down the executor gracefully. + Spark will try to migrate all the RDD blocks (controlled by <code>spark.storage.decommission.rddBlocks.enabled</code>) + and shuffle blocks (controlled by <code>spark.storage.decommission.shuffleBlocks.enabled</code>) from the decommissioning + executor to a remote executor when <code>spark.storage.decommission.enabled</code> is enabled.
+ With decommissioning enabled, Spark will also decommission an executor instead of killing it when <code>spark.dynamicAllocation.enabled</code> is enabled. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.killInterval</code></td> + <td>(none)</td> + <td> + Duration after which a decommissioned executor will be killed forcefully by an outside (e.g. non-Spark) service. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.forceKillTimeout</code></td> + <td>(none)</td> + <td> + Duration after which Spark will force a decommissioning executor to exit. + This should be set to a high value in most situations, as low values will prevent block migrations from having enough time to complete. + </td> + <td>3.2.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.signal</code></td> + <td>PWR</td> + <td> + The signal used to trigger the executor to start decommissioning. + </td> + <td>3.2.0</td> +</tr> </table> Apart from these, the following properties are also available, and may be useful in some situations: @@ -681,7 +718,7 @@ Apart from these, the following properties are also available, and may be useful </tr> <tr> <td><code>spark.redaction.regex</code></td> - <td>(?i)secret|password|token</td> + <td>(?i)secret|password|token|access[.]key</td> <td> Regex to decide which Spark configuration properties and environment variables in driver and executor environments contain sensitive information. When this regex matches a property key or @@ -689,6 +726,16 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.1.2</td> </tr> +<tr> + <td><code>spark.redaction.string.regex</code></td> + <td>(none)</td> + <td> + Regex to decide which parts of strings produced by Spark contain sensitive information. + When this regex matches a string part, that string part is replaced by a dummy value. + This is currently used to redact the output of SQL explain commands.
+ </td> + <td>2.2.0</td> +</tr> <tr> <td><code>spark.python.profile</code></td> <td>false</td> <td> @@ -906,6 +953,23 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.4.0</td> </tr> +<tr> + <td><code>spark.shuffle.unsafe.file.output.buffer</code></td> + <td>32k</td> + <td> + The size of the buffered output stream used by the unsafe shuffle writer when writing each partition to the file system. + In KiB unless otherwise specified. + </td> + <td>2.3.0</td> +</tr> +<tr> + <td><code>spark.shuffle.spill.diskWriteBufferSize</code></td> + <td>1024 * 1024</td> + <td> + The buffer size, in bytes, to use when writing the sorted records to an on-disk file. + </td> + <td>2.3.0</td> +</tr> <tr> <td><code>spark.shuffle.io.maxRetries</code></td> <td>3</td> <td> @@ -988,6 +1052,17 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.2.0</td> </tr> +<tr> + <td><code>spark.shuffle.service.name</code></td> + <td>spark_shuffle</td> + <td> + The configured name of the Spark shuffle service the client should communicate with. + This must match the name used to configure the shuffle service within the YARN NodeManager configuration + (<code>yarn.nodemanager.aux-services</code>). Only takes effect + when <code>spark.shuffle.service.enabled</code> is set to true. + </td> + <td>3.2.0</td> +</tr> <tr> <td><code>spark.shuffle.service.index.cache.size</code></td> <td>100m</td> <td> @@ -1028,6 +1103,14 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.1.1</td> </tr> +<tr> + <td><code>spark.shuffle.sort.io.plugin.class</code></td> + <td>org.apache.spark.shuffle.sort.io.LocalDiskShuffleDataIO</td> + <td> + Name of the class to use for shuffle IO.
+ </td> + <td>3.0.0</td> +</tr> <tr> <td><code>spark.shuffle.spill.compress</code></td> <td>true</td> <td> @@ -1063,6 +1146,58 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.3.0</td> </tr> +<tr> + <td><code>spark.shuffle.reduceLocality.enabled</code></td> + <td>true</td> + <td> + Whether to compute locality preferences for reduce tasks. + </td> + <td>1.5.0</td> +</tr> +<tr> + <td><code>spark.shuffle.mapOutput.minSizeForBroadcast</code></td> + <td>512k</td> + <td> + The size at which we use Broadcast to send the map output statuses to the executors. + </td> + <td>2.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt</code></td> + <td>true</td> + <td> + Whether to detect any corruption in fetched blocks. + </td> + <td>2.2.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt.useExtraMemory</code></td> + <td>false</td> + <td> + If enabled, part of a compressed/encrypted stream will be decompressed/decrypted using extra memory + to detect corruption early. Any IOException thrown will cause the task to be retried once, + and if it fails again with the same exception, a FetchFailedException will be thrown to retry the previous stage. + </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.useOldFetchProtocol</code></td> + <td>false</td> + <td> + Whether to use the old protocol while doing the shuffle block fetching. It is only needed for compatibility + when a job running on a new Spark version fetches shuffle blocks from an old-version external shuffle service. + </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.readHostLocalDisk</code></td> + <td>true</td> + <td> + If enabled (and <code>spark.shuffle.useOldFetchProtocol</code> is disabled), shuffle blocks requested from those block managers + which are running on the same host are read from the disk directly instead of being fetched as remote blocks over the network.
+ </td> + <td>3.0.0</td> +</tr> <tr> <td><code>spark.files.io.connectionTimeout</code></td> <td>value of <code>spark.network.timeout</code></td> @@ -1102,6 +1237,22 @@ Apart from these, the following properties are also available, and may be useful </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.shuffle.service.db.enabled</code></td> + <td>true</td> + <td> + Whether to use a database in the ExternalShuffleService. Note that this only affects standalone mode. + </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.service.db.backend</code></td> + <td>LEVELDB</td> + <td> + Specifies the disk-based store used for the shuffle service's local database. Set to LEVELDB or ROCKSDB. + </td> + <td>3.4.0</td> +</tr> </table> ### Spark UI @@ -1735,6 +1886,14 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.6.0</td> </tr> +<tr> + <td><code>spark.storage.unrollMemoryThreshold</code></td> + <td>1024 * 1024</td> + <td> + Initial memory to request before unrolling any block. + </td> + <td>1.1.0</td> +</tr> <tr> <td><code>spark.storage.replication.proactive</code></td> <td>false</td> <td> @@ -1745,6 +1904,16 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.2.0</td> </tr> +<tr> + <td><code>spark.storage.localDiskByExecutors.cacheSize</code></td> + <td>1000</td> + <td> + The max number of executors for which the local dirs are stored. This size is applied to both the driver and + the executor side to avoid having an unbounded store. This cache is used to avoid network transfers + when fetching disk-persisted RDD blocks or shuffle blocks (when <code>spark.shuffle.readHostLocalDisk</code> is set) from the same host.
+ </td> + <td>3.0.0</td> +</tr> <tr> <td><code>spark.cleaner.periodicGC.interval</code></td> <td>30min</td> <td> @@ -1816,6 +1985,14 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.1.1</td> </tr> +<tr> + <td><code>spark.broadcast.UDFCompressionThreshold</code></td> + <td>1 * 1024 * 1024</td> + <td> + The threshold, in bytes unless otherwise specified, above which user-defined functions (UDFs) and Python RDD commands are compressed by broadcast. + </td> + <td>3.0.0</td> +</tr> <tr> <td><code>spark.executor.cores</code></td> <td> @@ -1891,6 +2068,24 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.0.0</td> </tr> +<tr> + <td><code>spark.files.ignoreCorruptFiles</code></td> + <td>false</td> + <td> + Whether to ignore corrupt files. If true, the Spark jobs will continue to run when encountering corrupted or + non-existing files, and the contents that have been read will still be returned. + </td> + <td>2.1.0</td> +</tr> +<tr> + <td><code>spark.files.ignoreMissingFiles</code></td> + <td>false</td> + <td> + Whether to ignore missing files. If true, the Spark jobs will continue to run when encountering missing files and + the contents that have been read will still be returned. + </td> + <td>2.4.0</td> +</tr> <tr> <td><code>spark.files.maxPartitionBytes</code></td> <td>134217728 (128 MiB)</td> <td> @@ -1944,6 +2139,67 @@ Apart from these, the following properties are also available, and may be useful </td> <td>0.9.2</td> </tr> +<tr> + <td><code>spark.storage.decommission.enabled</code></td> + <td>false</td> + <td> + Whether to decommission the block manager when decommissioning an executor. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer shuffle blocks during block manager decommissioning. Requires a migratable shuffle resolver + (like sort-based shuffle).
+ </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.maxThreads</code></td> + <td>8</td> + <td> + Maximum number of threads to use in migrating shuffle files. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.rddBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer RDD blocks during block manager decommissioning. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.fallbackStorage.path</code></td> + <td>(none)</td> + <td> + The location for fallback storage during block manager decommissioning. For example, <code>s3a://spark-storage/</code>. + If empty, fallback storage is disabled. The storage should be managed with a TTL because Spark will not clean it up. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.fallbackStorage.cleanUp</code></td> + <td>false</td> + <td> + If true, Spark cleans up its fallback storage data at shutdown. + </td> + <td>3.2.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.maxDiskSize</code></td> + <td>(none)</td> + <td> + Maximum disk space to use to store shuffle blocks before rejecting remote shuffle blocks. + Rejecting remote shuffle blocks means that an executor will not receive any shuffle migrations, + and if there are no other executors available for migration then shuffle blocks will be lost unless + <code>spark.storage.decommission.fallbackStorage.path</code> is configured. + </td> + <td>3.2.0</td> +</tr> <tr> <td><code>spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version</code></td> <td>1</td> <td> @@ -1971,6 +2227,7 @@ Apart from these, the following properties are also available, and may be useful </td> <td>3.0.0</td> </tr> +<tr> <td><code>spark.executor.processTreeMetrics.enabled</code></td> <td>false</td> <td> @@ -1981,6 +2238,7 @@ Apart from these, the following properties are also available, and may be useful exists.
</td> <td>3.0.0</td> +</tr> <tr> <td><code>spark.executor.metrics.pollingInterval</code></td> <td>0</td> <td> @@ -1993,6 +2251,32 @@ Apart from these, the following properties are also available, and may be useful </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.eventLog.gcMetrics.youngGenerationGarbageCollectors</code></td> + <td>Copy,PS Scavenge,ParNew,G1 Young Generation</td> + <td> + Names of supported young generation garbage collectors. A name is usually the return value of GarbageCollectorMXBean.getName. + The built-in young generation garbage collectors are Copy,PS Scavenge,ParNew,G1 Young Generation. + </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.eventLog.gcMetrics.oldGenerationGarbageCollectors</code></td> + <td>MarkSweepCompact,PS MarkSweep,ConcurrentMarkSweep,G1 Old Generation</td> + <td> + Names of supported old generation garbage collectors. A name is usually the return value of GarbageCollectorMXBean.getName. + The built-in old generation garbage collectors are MarkSweepCompact,PS MarkSweep,ConcurrentMarkSweep,G1 Old Generation. + </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.executor.metrics.fileSystemSchemes</code></td> + <td>file,hdfs</td> + <td> + The file system schemes to report in executor metrics. + </td> + <td>3.1.0</td> +</tr> </table> ### Networking @@ -2321,6 +2605,16 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.4.1</td> </tr> +<tr> + <td><code>spark.standalone.submit.waitAppCompletion</code></td> + <td>false</td> + <td> + In standalone cluster mode, controls whether the client waits to exit until the application completes. + If set to true, the client process will stay alive polling the application's status. + Otherwise, the client process will exit after submission.
+ </td> + <td>3.1.0</td> +</tr> <tr> <td><code>spark.excludeOnFailure.enabled</code></td> <td> @@ -3342,6 +3636,15 @@ Push-based shuffle helps improve the reliability and performance of spark shuffl </td> <td>3.2.0</td> </tr> +<tr> + <td><code>spark.shuffle.push.numPushThreads</code></td> + <td>(none)</td> + <td> + Specifies the number of threads in the block pusher pool. These threads assist in creating connections and pushing blocks to remote external shuffle services. + By default, the threadpool size is equal to the number of Spark executor cores. + </td> + <td>3.2.0</td> +</tr> <tr> <td><code>spark.shuffle.push.maxBlockSizeToPush</code></td> <td><code>1m</code></td> <td> @@ -3360,6 +3663,15 @@ Push-based shuffle helps improve the reliability and performance of spark shuffl </td> <td>3.2.0</td> </tr> +<tr> + <td><code>spark.shuffle.push.merge.finalizeThreads</code></td> + <td>8</td> + <td> + Number of threads used by the driver to finalize shuffle merge. Since it could potentially take seconds for a large shuffle to finalize, + having multiple threads helps the driver handle concurrent shuffle merge finalize requests when push-based shuffle is enabled. + </td> + <td>3.3.0</td> +</tr> <tr> <td><code>spark.shuffle.push.minShuffleSizeToWait</code></td> <td><code>500m</code></td> --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
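As an aside on the `spark.redaction.regex` change in this patch: the updated default pattern can be sanity-checked with plain Python, since it is an ordinary case-insensitive regex matched against configuration keys. The `redact` helper below is a hypothetical illustration of the matching behavior, not Spark's actual implementation; the `*********(redacted)` placeholder mirrors the dummy value Spark is believed to use, but treat it as an assumption.

```python
import re

# Updated default of spark.redaction.regex from this patch.
# (?i) makes it case-insensitive; [.] matches a literal dot,
# so keys like "fs.s3a.access.key" now match.
REDACTION_REGEX = re.compile(r"(?i)secret|password|token|access[.]key")

def redact(conf: dict) -> dict:
    """Loose illustration only: replace values whose keys match the
    redaction regex with a dummy value, as Spark does for sensitive
    properties in its UI and logs."""
    return {
        k: "*********(redacted)" if REDACTION_REGEX.search(k) else v
        for k, v in conf.items()
    }

conf = {
    "spark.hadoop.fs.s3a.access.key": "AKIA...",   # matches access[.]key
    "spark.ssl.keyPassword": "hunter2",            # matches password
    "spark.executor.memory": "4g",                 # left untouched
}
print(redact(conf))
```

Note that the regex is applied with `search`, not `fullmatch`: any key merely containing one of the sensitive words is redacted, which is why the broadened `access[.]key` alternative was worth adding.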