[GitHub] [hudi] leesf commented on a change in pull request #4222: [HUDI-2849] improve SparkUI job description for write path

2021-12-08 Thread GitBox


leesf commented on a change in pull request #4222:
URL: https://github.com/apache/hudi/pull/4222#discussion_r765382289



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##
@@ -441,9 +441,10 @@ public void refreshTimeline() throws IOException {
   return null;
 }
 
+jssc.setJobGroup(this.getClass().getSimpleName(), "Checking if input is empty");
 if ((!avroRDDOptional.isPresent()) || (avroRDDOptional.get().isEmpty())) {
   LOG.info("No new data, perform empty commit.");
-  return Pair.of(schemaProvider, Pair.of(checkpointStr, jssc.emptyRDD()));
+  return Pair.of(schemaProvider, Pair.of(checkpointStr, null));

Review comment:
   > It saves one call to isEmpty() in `DeltaSync::writeToSink`, which could further eliminate one spark job.
   
   Could you split this out into a separate PR?
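
   To make the quoted point concrete, here is a minimal sketch of the idea, not the actual `DeltaSync` code. It assumes a hypothetical `CountingDataset` that stands in for an RDD and counts each `isEmpty()` call as one job, and it uses `Optional` in place of the `null` from the diff. `readBatch` and `writeBatch` are illustrative names only.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class EmptyCheckSketch {

  /** Hypothetical stand-in for an RDD: every isEmpty() call is counted as one job. */
  public static final class CountingDataset {
    private final List<String> records;
    private int jobsTriggered = 0;

    public CountingDataset(List<String> records) {
      this.records = records;
    }

    public boolean isEmpty() {
      jobsTriggered++; // in Spark, an emptiness check on an RDD can launch a job
      return records.isEmpty();
    }

    public int jobsTriggered() {
      return jobsTriggered;
    }
  }

  /** Read side: checks emptiness once and signals "no data" with Optional.empty(). */
  public static Optional<CountingDataset> readBatch(CountingDataset input) {
    return input.isEmpty() ? Optional.empty() : Optional.of(input);
  }

  /** Write side: relies on the signal from the read side instead of calling isEmpty() again. */
  public static String writeBatch(Optional<CountingDataset> batch) {
    return batch.isPresent() ? "write" : "empty commit";
  }

  public static void main(String[] args) {
    CountingDataset input = new CountingDataset(Collections.emptyList());
    String action = writeBatch(readBatch(input));
    System.out.println(action + ", jobs=" + input.jobsTriggered()); // prints "empty commit, jobs=1"
  }
}
```

   If the read side returned an empty dataset instead of an explicit "no data" marker, the write side would have to call `isEmpty()` a second time to tell the two cases apart, which is the extra job the quoted comment refers to.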
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] leesf commented on a change in pull request #4222: [HUDI-2849] improve SparkUI job description for write path

2021-12-08 Thread GitBox


leesf commented on a change in pull request #4222:
URL: https://github.com/apache/hudi/pull/4222#discussion_r764625760



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##
@@ -441,9 +441,10 @@ public void refreshTimeline() throws IOException {
   return null;
 }
 
+jssc.setJobGroup(this.getClass().getSimpleName(), "Checking if input is empty");
 if ((!avroRDDOptional.isPresent()) || (avroRDDOptional.get().isEmpty())) {
   LOG.info("No new data, perform empty commit.");
-  return Pair.of(schemaProvider, Pair.of(checkpointStr, jssc.emptyRDD()));
+  return Pair.of(schemaProvider, Pair.of(checkpointStr, null));

Review comment:
   Why do we need to modify this?



