Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-24 Thread via GitHub


panbingkun commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1614383361


##
common/utils/src/main/scala/org/apache/spark/internal/README.md:
##


Review Comment:
   Okay, I have adopted the plan of deleting `README.md`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-24 Thread via GitHub


mridulm commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1614353283


##
common/utils/src/main/scala/org/apache/spark/internal/README.md:
##


Review Comment:
   IIRC @panbingkun included the contents into the existing code; which is also 
a nice addition to the code documentation.
   IMO that is a better overall approach given this is internal to spark.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-24 Thread via GitHub


gengliangwang commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1613860819


##
common/utils/src/main/scala/org/apache/spark/internal/README.md:
##


Review Comment:
   I will wait for @mridulm's response until this weekend. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-23 Thread via GitHub


panbingkun commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1612809775


##
common/utils/src/main/scala/org/apache/spark/internal/README.md:
##


Review Comment:
   I'm fine to it, WDYT @mridulm ?
   we can just put the new description in the `scala/java` doc.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-23 Thread via GitHub


gengliangwang commented on PR #46634:
URL: https://github.com/apache/spark/pull/46634#issuecomment-2127700362

   LGTM except for one comment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-23 Thread via GitHub


gengliangwang commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1612074023


##
common/utils/src/main/scala/org/apache/spark/internal/README.md:
##


Review Comment:
   @mridulm @panbingkun This README is for Spark developers. As long as it 
doesn't mention 3rd party supports, can we just keep it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-23 Thread via GitHub


panbingkun commented on PR #46634:
URL: https://github.com/apache/spark/pull/46634#issuecomment-2126374603

   > Thanks for the changes @panbingkun. Looks good to me - I will let 
@gengliangwang review and merge; he is much more knowledgeable about this in 
general !
   
   Yeah, Thank you very much! ❤️


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-22 Thread via GitHub


panbingkun commented on PR #46634:
URL: https://github.com/apache/spark/pull/46634#issuecomment-2126146078

   > Please replace all references to `external` and `external system` - we 
dont want to set expectations that external system integrations are encouraged. 
Keep it as custom integrations instead - if power users and frameworks want to 
leverage the integration, this would be the path for them as it is not 
explicitly `private [spark]`.
   > 
   > I added a few cases below, but I would expect @gengliangwang to be better 
aware and give much better guidance than what I can of what else might need to 
be changed - as he is more actively reviewing the PR's :-)
   
   Thank you very much for your patient and careful review. ❤️
   I have updated it again. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-22 Thread via GitHub


mridulm commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1610164692


##
common/utils/src/main/java/org/apache/spark/internal/SparkLogger.java:
##
@@ -28,6 +28,50 @@
 import org.slf4j.Logger;
 // checkstyle.on: RegexpSinglelineJava
 
+// checkstyle.off: RegexpSinglelineJava
+/**
+ * Guidelines for the Structured Logging Framework - Java Logging
+ * 
+ *
+ * Use the `org.apache.spark.internal.SparkLoggerFactory` to get the logger 
instance in Java code:
+ * Getting Logger Instance:
+ *   Instead of using `org.slf4j.LoggerFactory`, use 
`org.apache.spark.internal.SparkLoggerFactory`
+ *   to ensure structured logging.
+ * 
+ *
+ * import org.apache.spark.internal.SparkLogger;
+ * import org.apache.spark.internal.SparkLoggerFactory;
+ * private static final SparkLogger logger = 
SparkLoggerFactory.getLogger(JavaUtils.class);
+ * 
+ *
+ * Logging Messages with Variables:
+ *   When logging messages with variables, wrap all the variables with `MDC`s 
and they will be
+ *   automatically added to the Mapped Diagnostic Context (MDC).
+ * 
+ *
+ * import org.apache.spark.internal.LogKeys;
+ * import org.apache.spark.internal.MDC;
+ * logger.error("Unable to delete file for partition {}", 
MDC.of(LogKeys.PARTITION_ID$.MODULE$, i));
+ * 
+ *
+ * Constant String Messages:
+ *   For logging constant string messages, use the standard logging methods.
+ * 
+ *
+ * logger.error("Failed to abort the writer after failing to write map 
output.", e);
+ * 
+ *
+ * External third-party ecosystem access:

Review Comment:
   ```suggestion
   ```



##
common/utils/src/main/java/org/apache/spark/internal/SparkLogger.java:
##
@@ -28,6 +28,50 @@
 import org.slf4j.Logger;
 // checkstyle.on: RegexpSinglelineJava
 
+// checkstyle.off: RegexpSinglelineJava
+/**
+ * Guidelines for the Structured Logging Framework - Java Logging
+ * 
+ *
+ * Use the `org.apache.spark.internal.SparkLoggerFactory` to get the logger 
instance in Java code:
+ * Getting Logger Instance:
+ *   Instead of using `org.slf4j.LoggerFactory`, use 
`org.apache.spark.internal.SparkLoggerFactory`
+ *   to ensure structured logging.
+ * 
+ *
+ * import org.apache.spark.internal.SparkLogger;
+ * import org.apache.spark.internal.SparkLoggerFactory;
+ * private static final SparkLogger logger = 
SparkLoggerFactory.getLogger(JavaUtils.class);
+ * 
+ *
+ * Logging Messages with Variables:
+ *   When logging messages with variables, wrap all the variables with `MDC`s 
and they will be
+ *   automatically added to the Mapped Diagnostic Context (MDC).
+ * 
+ *
+ * import org.apache.spark.internal.LogKeys;
+ * import org.apache.spark.internal.MDC;
+ * logger.error("Unable to delete file for partition {}", 
MDC.of(LogKeys.PARTITION_ID$.MODULE$, i));
+ * 
+ *
+ * Constant String Messages:
+ *   For logging constant string messages, use the standard logging methods.
+ * 
+ *
+ * logger.error("Failed to abort the writer after failing to write map 
output.", e);
+ * 
+ *
+ * External third-party ecosystem access:
+ *   If you want to output logs in `java code` through the structured log 
framework,
+ *   you can define `custom LogKey` and use it in `java` code as follows:
+ * 
+ *
+ * // External third-party ecosystem `custom LogKey` must be `implements 
LogKey`

Review Comment:
   ```suggestion
* // To add a `custom LogKey`, implement `LogKey`
   ```



##
common/utils/src/test/java/org/apache/spark/util/SparkLoggerSuiteBase.java:
##
@@ -90,6 +94,9 @@ private String basicMsg() {
   // test for external system custom LogKey
   abstract String expectedPatternForExternalSystemCustomLogKey(Level level);
 
+  // test for external system java custom LogKey
+  abstract String expectedPatternForExternalSystemJavaCustomLogKey(Level 
level);

Review Comment:
   ```suggestion
 abstract String expectedPatternForJavaCustomLogKey(Level level);
   ```



##
common/utils/src/test/java/org/apache/spark/util/StructuredSparkLoggerSuite.java:
##
@@ -161,5 +161,19 @@ String expectedPatternForExternalSystemCustomLogKey(Level 
level) {
 "logger": ""
   }""");
   }
+
+  @Override
+  String expectedPatternForExternalSystemJavaCustomLogKey(Level level) {

Review Comment:
   ```suggestion
 String expectedPatternForJavaCustomLogKey(Level level) {
   ```



##
common/utils/src/test/java/org/apache/spark/util/PatternSparkLoggerSuite.java:
##
@@ -87,4 +87,10 @@ String expectedPatternForMsgWithMDCValueIsNull(Level level) {
   String expectedPatternForExternalSystemCustomLogKey(Level level) {
 return toRegexPattern(level, ".* : External system 
custom log message.\n");
   }
+
+  @Override
+  String expectedPatternForExternalSystemJavaCustomLogKey(Level level) {
+return toRegexPattern(level,
+ ".* : External system custom log message.\n");

Review Comment:
   ```suggestion
".* : custom log message.\n");
   ```



##
common/utils/src/test/java/org/apa

Re: [PR] [SPARK-48320][CORE][DOCS] Add external third-party ecosystem access guide to the `scala/java` doc [spark]

2024-05-21 Thread via GitHub


panbingkun commented on code in PR #46634:
URL: https://github.com/apache/spark/pull/46634#discussion_r1609203297


##
common/utils/src/main/scala/org/apache/spark/internal/README.md:
##
@@ -45,3 +45,29 @@ logger.error("Failed to abort the writer after failing to 
write map output.", e)
 ## Exceptions
 
 To ensure logs are compatible with Spark SQL and log analysis tools, avoid 
`Exception.printStackTrace()`. Use `logError`, `logWarning`, and `logInfo` 
methods from the `Logging` trait to log exceptions, maintaining structured and 
parsable logs.
+
+## External third-party ecosystem access
+
+* If you want to output logs in `scala code` through the structured log 
framework, you can define `custom LogKey` and use it in `scala` code as follows:
+
+```scala
+// External third-party ecosystem `custom LogKey` must be `extends LogKey`
+case object CUSTOM_LOG_KEY extends LogKey
+```
+```scala
+import org.apache.spark.internal.MDC;
+
+logInfo(log"${MDC(CUSTOM_LOG_KEY, "key")}")
+```
+
+* If you want to output logs in `java code` through the structured log 
framework, you can define `custom LogKey` and use it in `java` code as follows:
+
+```java
+// External third-party ecosystem `custom LogKey` must be `implements LogKey`
+public static class CUSTOM_LOG_KEY implements LogKey { }
+```
+```java
+import org.apache.spark.internal.MDC;
+
+logger.error("Unable to delete key {} for cache", MDC.of(CUSTOM_LOG_KEY, 
"key"));
+```

Review Comment:
   @mridulm @gengliangwang 
   The `common/utils/src/main/scala/org/apache/spark/internal/README.md` file 
has been dropped, and the content of the original document has been distributed 
to `java/scala` docs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org