bvaradar commented on a change in pull request #1964:
URL: https://github.com/apache/hudi/pull/1964#discussion_r475895369



##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java
##########
@@ -296,6 +300,42 @@ public boolean isBeforeTimelineStarts(String instant) {
     return details.apply(instant);
   }
 
+  /**

Review comment:
       Timeline APIs are only about instants  in general. I think adding 
partitions here is breaking that abstraction. Can you move this to some helper 
class

##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java
##########
@@ -232,6 +233,12 @@
    */
   Option<byte[]> getInstantDetails(HoodieInstant instant);
 
+  /**
+   * Returns partitions that have been modified in the timeline. This includes 
internal operations such as clean.
+   * Note that this only returns data for completed instants.
+   */
+  List<String> getPartitionsMutated();

Review comment:
       +1 on abstraction point. I think having a separate helper class would be 
better.

##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java
##########
@@ -296,6 +300,42 @@ public boolean isBeforeTimelineStarts(String instant) {
     return details.apply(instant);
   }
 
+  /**
+   * Returns partitions that have been modified in the timeline. This includes 
internal operations such as clean.
+   * Note that this only returns data for completed instants.
+   */
+  public List<String> getPartitionsMutated() {
+    return filterCompletedInstants().getInstants().flatMap(s -> {
+      switch (s.getAction()) {
+        case HoodieTimeline.COMMIT_ACTION:
+        case HoodieTimeline.DELTA_COMMIT_ACTION:
+          try {
+            HoodieCommitMetadata commitMetadata = 
HoodieCommitMetadata.fromBytes(getInstantDetails(s).get(), 
HoodieCommitMetadata.class);
+            return commitMetadata.getPartitionToWriteStats().keySet().stream();
+          } catch (IOException e) {
+            throw new HoodieIOException("Failed to get partitions written 
between " + firstInstant() + " " + lastInstant(), e);
+          }
+        case HoodieTimeline.CLEAN_ACTION:

Review comment:
       We dont need to look at clean and (event compaction) for figuring out 
changed partitions for the case of hive syncing. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to