This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new fbe6b1dba2ab [SPARK-47721][DOC] Guidelines for the Structured Logging Framework
fbe6b1dba2ab is described below

commit fbe6b1dba2abab03fef2fbbac4640c4c41153e71
Author: Gengliang Wang <gengli...@apache.org>
AuthorDate: Thu Apr 4 08:46:21 2024 +0900

    [SPARK-47721][DOC] Guidelines for the Structured Logging Framework
    
    ### What changes were proposed in this pull request?
    
    As suggested in https://github.com/apache/spark/pull/45834/files#r1549565157, I am creating initial guidelines for the structured logging framework.
    
    ### Why are the changes needed?
    
    We need guidelines to align the logging migration work across the community.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No

    ### How was this patch tested?
    
    It's just a doc change.

    ### Was this patch authored or co-authored using generative AI tooling?
    
    Yes. Generated-by: GitHub Copilot 1.2.17.2887
    
    Closes #45862 from gengliangwang/logREADME.
    
    Authored-by: Gengliang Wang <gengli...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 .../src/main/scala/org/apache/spark/internal/README.md      | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/common/utils/src/main/scala/org/apache/spark/internal/README.md b/common/utils/src/main/scala/org/apache/spark/internal/README.md
new file mode 100644
index 000000000000..ed3d77333806
--- /dev/null
+++ b/common/utils/src/main/scala/org/apache/spark/internal/README.md
@@ -0,0 +1,13 @@
+# Guidelines for the Structured Logging Framework
+
+## LogKey
+
+LogKeys serve as identifiers for mapped diagnostic contexts (MDC) within logs. Follow these guidelines when adding new LogKeys:
+* Define all structured logging keys in `LogKey.scala`, and sort them alphabetically for ease of search.
+* Use `UPPER_SNAKE_CASE` for key names.
+* Key names should be both simple and broad, yet include specific identifiers like `STAGE_ID`, `TASK_ID`, and `JOB_ID` when needed for clarity. For instance, use `MAX_ATTEMPTS` as a general key instead of creating separate keys for each scenario such as `EXECUTOR_STATE_SYNC_MAX_ATTEMPTS` and `MAX_TASK_FAILURES`. This balances simplicity with the detail needed for effective logging.
+* Use abbreviations in names if they are widely understood, such as `APP_ID` for APPLICATION_ID, and `K8S` for KUBERNETES.
+
+## Exceptions
+
+To ensure logs are compatible with Spark SQL and log analysis tools, avoid `Exception.printStackTrace()`. Use `logError`, `logWarning`, and `logInfo` methods from the `Logging` trait to log exceptions, maintaining structured and parsable logs.
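For context, the MDC-keyed pattern described in the README above can be sketched in plain Scala. This is a minimal, self-contained illustration only, not the actual Spark `Logging` API: the `LogKey` enumeration values, the `LogEntry` type, and the `logEntry` helper here are all hypothetical names invented for this sketch.

```scala
// Hypothetical sketch of MDC-keyed structured logging -- NOT the real
// Spark internals; all names below are illustrative assumptions.
object StructuredLoggingSketch {
  // Keys defined centrally (cf. LogKey.scala), UPPER_SNAKE_CASE, sorted
  // alphabetically, and general rather than scenario-specific
  // (e.g. MAX_ATTEMPTS, not EXECUTOR_STATE_SYNC_MAX_ATTEMPTS).
  object LogKey extends Enumeration {
    val APP_ID, JOB_ID, MAX_ATTEMPTS, STAGE_ID, TASK_ID = Value
  }

  // A structured entry: a human-readable message plus an MDC map, so log
  // analysis tools can query by key instead of parsing free text.
  final case class LogEntry(message: String, mdc: Map[LogKey.Value, String])

  def logEntry(message: String, mdc: (LogKey.Value, Any)*): LogEntry =
    LogEntry(message, mdc.map { case (k, v) => k -> v.toString }.toMap)

  def main(args: Array[String]): Unit = {
    // Per the Exceptions guideline: route failures through the structured
    // logger instead of calling ex.printStackTrace().
    val ex = new RuntimeException("task failed")
    val entry = logEntry(
      s"Task failed after 4 attempts: ${ex.getMessage}",
      LogKey.TASK_ID -> 17L,
      LogKey.MAX_ATTEMPTS -> 4)
    println(entry.mdc(LogKey.MAX_ATTEMPTS))
  }
}
```

The point of the sketch is that each value worth querying travels under a shared key, so a downstream tool can filter on `MAX_ATTEMPTS` across all call sites instead of regex-matching message strings.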

