This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c6696cdcd611 [SPARK-47671][CORE] Enable structured logging in log4j2.properties.template and update docs
c6696cdcd611 is described below

commit c6696cdcd611a682ebf5b7a183e2970ecea3b58c
Author: Gengliang Wang <gengli...@apache.org>
AuthorDate: Thu May 2 19:45:48 2024 -0700

    [SPARK-47671][CORE] Enable structured logging in log4j2.properties.template and update docs

    ### What changes were proposed in this pull request?

    - Rename the current log4j2.properties.template as log4j2.properties.pattern-layout-template
    - Enable structured logging in log4j2.properties.template
    - Update `configuration.md` on how to configure logging

    ### Why are the changes needed?

    Providing a structured logging template and document how to configure loggings in Spark 4.0.0

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Manual test

    ### Was this patch authored or co-authored using generative AI tooling?

    No

    Closes #46349 from gengliangwang/logTemplate.

    Authored-by: Gengliang Wang <gengli...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 ...template => log4j2.properties.pattern-layout-template} |  0
 conf/log4j2.properties.template                           | 10 ++--------
 docs/configuration.md                                     | 15 +++++++++------
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/conf/log4j2.properties.template b/conf/log4j2.properties.pattern-layout-template
similarity index 100%
copy from conf/log4j2.properties.template
copy to conf/log4j2.properties.pattern-layout-template
diff --git a/conf/log4j2.properties.template b/conf/log4j2.properties.template
index ab96e03baed2..876724531444 100644
--- a/conf/log4j2.properties.template
+++ b/conf/log4j2.properties.template
@@ -19,17 +19,11 @@
 rootLogger.level = info
 rootLogger.appenderRef.stdout.ref = console
 
-# In the pattern layout configuration below, we specify an explicit `%ex` conversion
-# pattern for logging Throwables. If this was omitted, then (by default) Log4J would
-# implicitly add an `%xEx` conversion pattern which logs stacktraces with additional
-# class packaging information. That extra information can sometimes add a substantial
-# performance overhead, so we disable it in our default logging config.
-# For more information, see SPARK-39361.
 appender.console.type = Console
 appender.console.name = console
 appender.console.target = SYSTEM_ERR
-appender.console.layout.type = PatternLayout
-appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n%ex
+appender.console.layout.type = JsonTemplateLayout
+appender.console.layout.eventTemplateUri = classpath:org/apache/spark/SparkLayout.json
 
 # Set the default spark-shell/spark-sql log level to WARN. When running the
 # spark-shell/spark-sql, the log level for these classes is used to overwrite
diff --git a/docs/configuration.md b/docs/configuration.md
index 2e612ffd9ab9..a3b4e731f057 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -3670,14 +3670,17 @@ Note: When running Spark on YARN in `cluster` mode, environment variables need t
 # Configuring Logging
 
 Spark uses [log4j](http://logging.apache.org/log4j/) for logging. You can configure it by adding a
-`log4j2.properties` file in the `conf` directory. One way to start is to copy the existing
-`log4j2.properties.template` located there.
+`log4j2.properties` file in the `conf` directory. One way to start is to copy the existing templates `log4j2.properties.template` or `log4j2.properties.pattern-layout-template` located there.
 
-By default, Spark adds 1 record to the MDC (Mapped Diagnostic Context): `mdc.taskName`, which shows something
-like `task 1.0 in stage 0.0`. You can add `%X{mdc.taskName}` to your patternLayout in
-order to print it in the logs.
+## Structured Logging
+Starting from version 4.0.0, Spark has adopted the [JSON Template Layout](https://logging.apache.org/log4j/2.x/manual/json-template-layout.html) for logging, which outputs logs in JSON format. This format facilitates querying logs using Spark SQL with the JSON data source. Additionally, the logs include all Mapped Diagnostic Context (MDC) information for search and debugging purposes.
+
+To implement structured logging, start with the `log4j2.properties.template` file.
+
+## Plain Text Logging
+If you prefer plain text logging, you can use the `log4j2.properties.pattern-layout-template` file as a starting point. This is the default configuration used by Spark before the 4.0.0 release. This configuration uses the [PatternLayout](https://logging.apache.org/log4j/2.x/manual/layouts.html#PatternLayout) to log all the logs in plain text. MDC information is not included by default. In order to print it in the logs, you can update the patternLayout in the file. For example, you can ad [...]
 Moreover, you can use `spark.sparkContext.setLocalProperty(s"mdc.$name", "value")` to add user specific data into MDC.
-The key in MDC will be the string of "mdc.$name".
+The key in MDC will be the string of `mdc.$name`.
 
 # Overriding configuration directory
 
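
As a rough illustration of the plain text option described in the updated docs, the fragment below sketches how the pattern layout kept in `log4j2.properties.pattern-layout-template` might be extended to print the task name from the MDC. It combines the pattern that this commit removes from `log4j2.properties.template` with the `%X{mdc.taskName}` conversion mentioned in the earlier wording of `configuration.md`. Where `%X{mdc.taskName}` sits within the pattern is an assumption made for this sketch, not something taken from the commit.

```properties
# Sketch only: plain text console appender with the MDC task name added.
appender.console.type = Console
appender.console.name = console
appender.console.target = SYSTEM_ERR
appender.console.layout.type = PatternLayout
# The position of %X{mdc.taskName} is an assumption; the rest of the pattern
# is the pre-4.0.0 default shown in the diff above.
appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %X{mdc.taskName} %c{1}: %m%n%ex
```

Keys added through `setLocalProperty(s"mdc.$name", "value")` could presumably be printed the same way with `%X{mdc.<name>}`, where `<name>` stands for whatever key the application chose.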