dramaticlly commented on code in PR #12278:
URL: https://github.com/apache/iceberg/pull/12278#discussion_r1958833411
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java:
##########
@@ -84,11 +86,11 @@
 * comparing the actual files in that location with content and metadata files referenced by all
 * valid snapshots. The location must be accessible for listing via the Hadoop {@link FileSystem}.
 *
- * <p>By default, this action cleans up the table location returned by {@link Table#location()} and
- * removes unreachable files that are older than 3 days using {@link Table#io()}. The behavior can
- * be modified by passing a custom location to {@link #location} and a custom timestamp to {@link
- * #olderThan(long)}. For example, someone might point this action to the data folder to clean up
- * only orphan data files.
+ * <p>By default, this action cleans up data and metadata directory under the table location
+ * returned by {@link Table#location()} and removes unreachable files that are older than 3 days
+ * using {@link Table#io()}. The behavior can be modified by passing a custom location to {@link
+ * #location} and a custom timestamp to {@link #olderThan(long)}. For example, someone might point
+ * this action to the data folder to clean up only orphan data files.
Review Comment:
My $0.02: I think this introduces a behaviour change for removing orphan files,
so it would be great to send an email to dev@ highlighting the proposal and the
changes.
Also, I think we can just mention that orphan removal honors
`write.data.path` and `write.metadata.path`, but allows an action/procedure
level override if `location` is provided. (The default values of
`write.data.path` and `write.metadata.path` can change independently, so we
don't need to mention Table#location.)
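
A minimal sketch of the precedence described above, for illustration only. The class `OrphanCleanupLocations` and its `resolve` method are hypothetical and not part of Iceberg or this PR; only the property names `write.data.path` and `write.metadata.path` come from the comment. The fallback to `<table location>/data` and `<table location>/metadata` is an assumption about the current defaults, which, as noted, could change independently.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not Iceberg API): which locations orphan removal would scan. */
public class OrphanCleanupLocations {

  // Table property names mentioned in the review comment.
  static final String WRITE_DATA_PATH = "write.data.path";
  static final String WRITE_METADATA_PATH = "write.metadata.path";

  /**
   * Resolves the locations to scan.
   *
   * @param tableLocation the table's base location
   * @param props table properties
   * @param locationOverride action/procedure-level location, or null if not provided
   */
  static List<String> resolve(
      String tableLocation, Map<String, String> props, String locationOverride) {
    // An explicit action/procedure-level location wins over table properties.
    if (locationOverride != null) {
      return List.of(locationOverride);
    }
    // Otherwise honor the write paths. The fallbacks below are ASSUMED defaults.
    String dataPath = props.getOrDefault(WRITE_DATA_PATH, tableLocation + "/data");
    String metadataPath = props.getOrDefault(WRITE_METADATA_PATH, tableLocation + "/metadata");
    return List.of(dataPath, metadataPath);
  }

  public static void main(String[] args) {
    String table = "s3://bucket/db/tbl"; // hypothetical table location

    // No properties set, no override: both assumed default directories.
    System.out.println(resolve(table, Map.of(), null));

    // Custom data path set via table property, still no override.
    System.out.println(resolve(table, Map.of(WRITE_DATA_PATH, "s3://other/data"), null));

    // Explicit location provided to the action: only that location is scanned.
    System.out.println(resolve(table, Map.of(), table + "/data"));
  }
}
```

Documenting the behavior in terms of the two write paths, rather than `Table#location`, keeps the Javadoc accurate even when a table stores data and metadata outside its base location.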
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]