panbingkun commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1606373207


##########
common/utils/src/main/scala/org/apache/spark/internal/README.md:
##########
@@ -45,3 +45,29 @@ logger.error("Failed to abort the writer after failing to 
write map output.", e)
 ## Exceptions
 
 To ensure logs are compatible with Spark SQL and log analysis tools, avoid 
`Exception.printStackTrace()`. Use `logError`, `logWarning`, and `logInfo` 
methods from the `Logging` trait to log exceptions, maintaining structured and 
parsable logs.
+
+## External third-party ecosystem access
+
+* If you want to output logs in `scala code` through the structured log 
framework, you can define `custom LogKey` and use it in `scala` code as follows:
+
+```scala
+// External third-party ecosystem `custom LogKey` must be `extends LogKey`
+case object CUSTOM_LOG_KEY extends LogKey
+```
+```scala
+import org.apache.spark.internal.MDC;
+
+logInfo(log"${MDC(CUSTOM_LOG_KEY, "key")}")
+```
+
+* If you want to output logs in `java code` through the structured log 
framework, you can define `custom LogKey` and use it in `java` code as follows:
+
+```java
+// External third-party ecosystem `custom LogKey` must be `implements LogKey`
+public static class CUSTOM_LOG_KEY implements LogKey { }
+```
+```java
+import org.apache.spark.internal.MDC;
+
+logger.error("Unable to delete key {} for cache", MDC.of(CUSTOM_LOG_KEY, 
"key"));
+```

Review Comment:
   1. Let me give an example. Suppose a production application named 
`SparkPi` wants to print logs through `log4j`. There are two ways to do it.
   The first is as follows:
   <img width="519" alt="image" 
src="https://github.com/apache/spark/assets/15246973/2f0c1cca-122c-4d3d-b67d-f64eae3f0ca1">
   
   The second is as follows (the application does not need to be placed under 
the `org.apache.spark` namespace, and `Logging` can still be used):
   <img width="497" alt="image" 
src="https://github.com/apache/spark/assets/15246973/04ba4b07-2442-4630-8aea-4e6e632bcaf4">
   <img width="503" alt="image" 
src="https://github.com/apache/spark/assets/15246973/2e92c77c-5f82-45b5-b3cd-e1aee5a9bed3">
   
   
   I often use the second way myself, and that is just for a single application.
   
   If a third-party system such as `iceberg` wants to collect its logs through 
the structured logging framework, it should be able to do so easily with the 
current framework.
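   
   As a rough sketch of the second way (not the exact code from the 
screenshots), an application or third-party library outside the 
`org.apache.spark` namespace could mix in `Logging` and define its own key, 
following the README snippet in this PR. The package `com.example.app`, the 
object `ExampleApp`, and the key `CUSTOM_LOG_KEY` are hypothetical names, and 
the exact `LogKey` and `MDC` signatures may differ between Spark versions:

   ```scala
   // Hypothetical package: deliberately NOT under org.apache.spark
   package com.example.app

   import org.apache.spark.internal.{LogKey, Logging, MDC}

   // Hypothetical custom key, declared as in the README diff above
   case object CUSTOM_LOG_KEY extends LogKey

   object ExampleApp extends Logging {
     def main(args: Array[String]): Unit = {
       // Illustrative value only; any string the application wants to tag
       val input = if (args.nonEmpty) args(0) else "default"

       // The `log` interpolator comes from the Logging trait; the MDC
       // wrapper attaches the value to the structured log context
       logInfo(log"Processing input ${MDC(CUSTOM_LOG_KEY, input)}")
     }
   }
   ```

   This assumes Spark's `common/utils` classes are on the classpath. Whether 
third-party code should rely on `org.apache.spark.internal` at all is the open 
question in this thread.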
   
   2. In fact, nothing in our `Logging` trait indicates that it is an internal 
API (we simply did not mark it as `DeveloperApi`), as shown below:
   
https://github.com/apache/spark/blob/fa8aa571ad18441622bb7e3ac66032ab9e7cbc0a/common/utils/src/main/scala/org/apache/spark/internal/Logging.scala#L84
   
   I am worried that many users already use it the way I do, because it is 
direct and convenient (it works together with `conf/log4j.properties`).
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

