This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 8d599972872 [SPARK-45257][CORE] Enable `spark.eventLog.compress` by default
8d599972872 is described below

commit 8d599972872225e336467700715b1d4771624efe
Author: Dongjoon Hyun <dh...@apple.com>
AuthorDate: Thu Sep 21 20:09:16 2023 -0700

    [SPARK-45257][CORE] Enable `spark.eventLog.compress` by default

    ### What changes were proposed in this pull request?

    This PR aims to enable `spark.eventLog.compress` by default for Apache Spark 4.0.0.

    ### Why are the changes needed?

    - To save the event log storage cost by compressing the logs with ZStandard codec by default

    ### Does this PR introduce _any_ user-facing change?

    Although we added a migration guide, the old Spark history servers are able to read the compressed logs.

    ### How was this patch tested?

    Pass the CIs.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43036 from dongjoon-hyun/SPARK-45257.

    Lead-authored-by: Dongjoon Hyun <dh...@apple.com>
    Co-authored-by: Dongjoon Hyun <dongj...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 core/src/main/scala/org/apache/spark/internal/config/package.scala | 2 +-
 docs/configuration.md                                              | 2 +-
 docs/core-migration-guide.md                                       | 4 ++++
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala
index 05b2624b403..2dcd3af7a52 100644
--- a/core/src/main/scala/org/apache/spark/internal/config/package.scala
+++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -165,7 +165,7 @@ package object config {
     ConfigBuilder("spark.eventLog.compress")
       .version("1.0.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
 
   private[spark] val EVENT_LOG_BLOCK_UPDATES =
     ConfigBuilder("spark.eventLog.logBlockUpdates.enabled")
diff --git a/docs/configuration.md b/docs/configuration.md
index 8fda9317bc7..e9ed2a8aa37 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1311,7 +1311,7 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 <tr>
   <td><code>spark.eventLog.compress</code></td>
-  <td>false</td>
+  <td>true</td>
   <td>
     Whether to compress logged events, if <code>spark.eventLog.enabled</code> is true.
   </td>
diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 3f97a484e1a..765c3494f66 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -22,6 +22,10 @@ license: |
 * Table of contents
 {:toc}
 
+## Upgrading from Core 3.4 to 4.0
+
+- Since Spark 4.0, Spark will compress event logs. To restore the behavior before Spark 4.0, you can set `spark.eventLog.compress` to `false`.
+
 ## Upgrading from Core 3.3 to 3.4
 
 - Since Spark 3.4, Spark driver will own `PersistentVolumnClaim`s and try to reuse if they are not assigned to live executors. To restore the behavior before Spark 3.4, you can set `spark.kubernetes.driver.ownPersistentVolumeClaim` to `false` and `spark.kubernetes.driver.reusePersistentVolumeClaim` to `false`.


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org