RussellSpitzer commented on code in PR #12692:
URL: https://github.com/apache/iceberg/pull/12692#discussion_r2039899219
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java:
##########
@@ -196,69 +183,29 @@ public RewriteDataFiles.Result execute() {
return resultBuilder.build();
}
- StructLikeMap<List<List<FileScanTask>>> planFileGroups(long startingSnapshotId) {
- CloseableIterable<FileScanTask> fileScanTasks =
- table
- .newScan()
- .useSnapshot(startingSnapshotId)
- .caseSensitive(caseSensitive)
- .filter(filter)
- .ignoreResiduals()
- .planFiles();
+ private void init(long startingSnapshotId) {
- try {
- StructType partitionType = table.spec().partitionType();
- StructLikeMap<List<FileScanTask>> filesByPartition =
- groupByPartition(partitionType, fileScanTasks);
- return fileGroupsByPartition(filesByPartition);
- } finally {
- try {
- fileScanTasks.close();
- } catch (IOException io) {
- LOG.error("Cannot properly close file iterable while planning for rewrite", io);
- }
- }
- }
+ this.planner =
+ shufflingPlanner
Review Comment:
As I noted above, I think we can just check whether our rewrite extends a
class that requires shuffles, rather than tracking that separately?
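
A rough sketch of what that check could look like, not the actual code in this PR. Every name below (`ShufflingRunner`, `SortRunner`, `BinPackRunner`, `PlannerKind`, `plannerFor`) is a hypothetical stand-in chosen for illustration; the real class hierarchy in `RewriteDataFilesSparkAction` may differ.

```java
public class PlannerSelection {
  // Hypothetical base class: any rewrite that needs a shuffle extends it.
  public abstract static class ShufflingRunner {}

  // Hypothetical rewrite that shuffles data (for example, a sort-based rewrite).
  public static class SortRunner extends ShufflingRunner {}

  // Hypothetical rewrite that does not shuffle.
  public static class BinPackRunner {}

  public enum PlannerKind { SHUFFLING, DEFAULT }

  // Derive the planner choice from the runner's type instead of a separate flag.
  public static PlannerKind plannerFor(Object runner) {
    return runner instanceof ShufflingRunner ? PlannerKind.SHUFFLING : PlannerKind.DEFAULT;
  }

  public static void main(String[] args) {
    System.out.println(plannerFor(new SortRunner()));    // prints "SHUFFLING"
    System.out.println(plannerFor(new BinPackRunner())); // prints "DEFAULT"
  }
}
```

With a type check like this, the planner choice follows from the class hierarchy, so there would be no separate boolean to keep in sync with the selected rewrite.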
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]