panbingkun commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1606373207
##########
common/utils/src/main/scala/org/apache/spark/internal/README.md:
##########
@@ -45,3 +45,29 @@ logger.error("Failed to abort the writer after failing to write map output.", e)
## Exceptions
To ensure logs are compatible with Spark SQL and log analysis tools, avoid
`Exception.printStackTrace()`. Use `logError`, `logWarning`, and `logInfo`
methods from the `Logging` trait to log exceptions, maintaining structured and
parsable logs.
+
+## External third-party ecosystem access
+
+* If you want to emit logs from Scala code in an external project through the structured log framework, you can define a custom `LogKey` and use it in your Scala code as follows:
+
+```scala
+// A custom LogKey defined outside Spark must extend the LogKey trait
+case object CUSTOM_LOG_KEY extends LogKey
+```
+```scala
+import org.apache.spark.internal.MDC
+
+logInfo(log"${MDC(CUSTOM_LOG_KEY, "key")}")
+```
+
+* If you want to emit logs from Java code in an external project through the structured log framework, you can define a custom `LogKey` and use it in your Java code as follows:
+
+```java
+// A custom LogKey defined outside Spark must implement the LogKey interface
+public static class CUSTOM_LOG_KEY implements LogKey { }
+```
+```java
+import org.apache.spark.internal.MDC;
+
+logger.error("Unable to delete key {} for cache", MDC.of(CUSTOM_LOG_KEY, "key"));
+```
Review Comment:
1. Let me give an example. In a production system, suppose an application named `SparkPi` wants to use `log4j` to print logs; there are two ways.
One is as follows (the class does not need to live under the `org.apache.spark` namespace; the `Logging` trait can still be used):
<img width="519" alt="image" src="https://github.com/apache/spark/assets/15246973/2f0c1cca-122c-4d3d-b67d-f64eae3f0ca1">
The other is:
<img width="497" alt="image" src="https://github.com/apache/spark/assets/15246973/04ba4b07-2442-4630-8aea-4e6e632bcaf4">
I often use the latter, and that is just a single application.
If a third-party system such as `iceberg` wants to aggregate its logs through a structured log framework, it should be easy for it to do so with the current structured framework.
2. In fact, nothing about our logging indicates that it is an internal API (we just never annotated it as a Developer API), as shown below:
https://github.com/apache/spark/blob/fa8aa571ad18441622bb7e3ac66032ab9e7cbc0a/common/utils/src/main/scala/org/apache/spark/internal/Logging.scala#L84
I am worried that many users already use it the way I do, because it is direct and convenient (and `conf/log4j.properties` can be used with it).
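
The two ways in point 1 are shown only as screenshots, so here is a minimal sketch of both, assuming Spark's `common/utils` module and Log4j 2 are on the classpath. The object names and log message are illustrative, not the screenshots' actual contents:

```scala
import org.apache.logging.log4j.LogManager
import org.apache.spark.internal.Logging

// Way 1: obtain a Log4j 2 logger directly through its own API.
object SparkPiDirect {
  private val logger = LogManager.getLogger(getClass)

  def run(): Unit = logger.info("Pi is roughly 3.14")
}

// Way 2: mix in Spark's Logging trait. As noted above, the enclosing
// object does not need to live under the org.apache.spark namespace.
object SparkPiWithTrait extends Logging {
  def run(): Unit = logInfo("Pi is roughly 3.14")
}
```

Either way, both loggers can be configured together through `conf/log4j.properties`, which is what makes the trait convenient for user applications.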
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]