wypoon commented on code in PR #4817:
URL: https://github.com/apache/iceberg/pull/4817#discussion_r1107867650
##########
core/src/main/java/org/apache/iceberg/actions/BaseDeleteOrphanFilesActionResult.java:
##########
@@ -19,16 +19,26 @@
package org.apache.iceberg.actions;
+import java.util.stream.Collectors;
+import java.util.stream.StreamSupport;
+
public class BaseDeleteOrphanFilesActionResult implements
DeleteOrphanFiles.Result {
- private final Iterable<String> orphanFileLocations;
+ private final Iterable<DeleteOrphanFiles.OrphanFileStatus>
orphanFileLocations;
Review Comment:
I agree that `orphanFileStatuses` would be better.
##########
api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java:
##########
@@ -85,8 +85,33 @@ public interface DeleteOrphanFiles extends
Action<DeleteOrphanFiles, DeleteOrpha
*/
interface Result {
/**
+ * @deprecated since 0.14.0, will be removed in 0.15.0; use {@link
#orphanFiles()} instead.
* Returns locations of orphan files.
*/
+ @Deprecated
Iterable<String> orphanFileLocations();
+
+ /**
+ * Returns orphan files.
+ */
+ Iterable<OrphanFileStatus> orphanFiles();
Review Comment:
It could be helpful to have a `failures()` method that returns
`statuses().filter(status -> status.deleted() == false)`, but where/how do you
envision such a method being used, @szehon-ho?
It would be helpful if the SparkAction or SQL procedure could take the
failed deletes and rerun without computing the orphan files again. Of course,
if only a few files failed to be deleted, one could delete them directly from
the storage layer one by one.
##########
api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java:
##########
@@ -85,8 +85,33 @@ public interface DeleteOrphanFiles extends
Action<DeleteOrphanFiles, DeleteOrpha
*/
interface Result {
/**
+ * @deprecated since 0.14.0, will be removed in 0.15.0; use {@link
#orphanFiles()} instead.
* Returns locations of orphan files.
*/
+ @Deprecated
Iterable<String> orphanFileLocations();
+
+ /**
+ * Returns orphan files.
+ */
+ Iterable<OrphanFileStatus> orphanFiles();
Review Comment:
I agree that `orphanFileStatuses` is a better name for the method, but since
it is a method in `DeleteOrphanFiles`, I think `statuses` is a simpler name and
is clear enough from the context of the class.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]