coderfender commented on code in PR #12824:
URL: https://github.com/apache/iceberg/pull/12824#discussion_r2049724630
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java:
##########
@@ -407,15 +409,49 @@ private Builder doExecuteWithPartialProgress(
Stream<RewriteFileGroup> toGroupStream(
RewriteExecutionContext ctx, Map<StructLike, List<List<FileScanTask>>>
groupsByPartition) {
- return groupsByPartition.entrySet().stream()
+ if (maxFilesToRewrite == null) {
Review Comment:
Good point! My motivation here is to keep the option as simple and
straightforward as possible. Adding an upper bound (I am assuming you meant
defaulting the MAX_FILES_TO_REWRITE param to Long.MAX_VALUE) would introduce a
side effect by limiting the number of files to 2^63-1. However unlikely that
limit is to be reached, I don't believe optional parameters should interfere
with the default behavior; they should stay isolated for the sake of
consistency and clarity.
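
To illustrate the distinction, here is a minimal, self-contained sketch, not the code in this PR. The class `MaxFilesSketch`, the method `selectGroups`, and the use of plain string lists as file groups are all hypothetical, and the capping rule shown (truncate once the budget is spent) is only an assumption for the example. The point is that a null option takes the untouched default path, with no sentinel bound involved:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names and capping rule are assumptions.
class MaxFilesSketch {

  // Each inner list stands in for one file group; strings stand in for files.
  static List<List<String>> selectGroups(List<List<String>> groups, Long maxFilesToRewrite) {
    if (maxFilesToRewrite == null) {
      // Option not set: default behavior, input is returned as-is.
      return groups;
    }

    // Option set: keep files in order until the budget is used up.
    List<List<String>> selected = new ArrayList<>();
    long remaining = maxFilesToRewrite;
    for (List<String> group : groups) {
      if (remaining <= 0) {
        break;
      }
      int take = (int) Math.min(remaining, group.size());
      selected.add(new ArrayList<>(group.subList(0, take)));
      remaining -= take;
    }
    return selected;
  }
}
```

With the null check, callers that never set the option get back exactly what they passed in, so the optional limit cannot change the default path.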
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]