97harsh commented on code in PR #14964:
URL: https://github.com/apache/iceberg/pull/14964#discussion_r2662381012


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java:
##########
@@ -157,13 +158,19 @@ public RewriteDataFilesSparkAction filter(Expression expression) {
     return this;
   }
 
+  public RewriteDataFilesSparkAction toBranch(String targetBranch) {
+    this.branch = targetBranch;
+    return this;
+  }
+
   @Override
   public RewriteDataFiles.Result execute() {
     if (table.currentSnapshot() == null) {
       return EMPTY_RESULT;
     }
 
-    long startingSnapshotId = table.currentSnapshot().snapshotId();
+    long startingSnapshotId =
+        branch != null ? table.snapshot(branch).snapshotId() : table.currentSnapshot().snapshotId();

Review Comment:
   Thank you, that's a valid suggestion.
   
   I considered adding the check in the procedure, but I think it makes more sense to add it in the action class, for two reasons:
   - The action can be used directly, via `SparkActions.get().rewriteDataFiles(table).toBranch("branch").execute()`, without going through the procedure. Validating in the action ensures all callers are protected.
   - It stays consistent with the other validations in this class (e.g., `maxConcurrentFileGroupRewrites >= 1` and the `partialProgressEnabled` checks), which are done in the action, not the procedure.
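
To illustrate the first point, here is a minimal, self-contained sketch of validating inside the action. It assumes the check under discussion is "the target branch must exist", which this thread does not spell out. `FakeTable` and `RewriteAction` are hypothetical stand-ins, not Iceberg classes, and the error message is made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins, NOT Iceberg classes: the "table" is reduced to a
// map from ref name to snapshot ID so the sketch runs without dependencies.
public class BranchValidationSketch {

  static final class FakeTable {
    private final Map<String, Long> refs = new HashMap<>();

    FakeTable withRef(String name, long snapshotId) {
      refs.put(name, snapshotId);
      return this;
    }

    // Returns null when the ref is unknown.
    Long snapshotId(String ref) {
      return refs.get(ref);
    }
  }

  static final class RewriteAction {
    private final FakeTable table;
    private String branch; // null means "use the main ref"

    RewriteAction(FakeTable table) {
      this.table = table;
    }

    RewriteAction toBranch(String targetBranch) {
      this.branch = targetBranch;
      return this;
    }

    // The validation lives here, so it runs whether the caller is a
    // procedure wrapper or code that builds the action directly.
    long execute() {
      String ref = branch != null ? branch : "main";
      Long startingSnapshotId = table.snapshotId(ref);
      if (startingSnapshotId == null) {
        throw new IllegalArgumentException(
            "Cannot rewrite data files: ref does not exist: " + ref);
      }
      return startingSnapshotId;
    }
  }

  public static void main(String[] args) {
    FakeTable table = new FakeTable().withRef("main", 1L).withRef("audit", 7L);

    // Direct use of the action, with no procedure involved.
    System.out.println(new RewriteAction(table).toBranch("audit").execute());

    try {
      new RewriteAction(table).toBranch("missing").execute();
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

If the check lived only in a procedure wrapper, the second call in `main` would instead fail later with a less helpful error (in this sketch, a `NullPointerException` on unboxing).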
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

