steveloughran commented on PR #14501: URL: https://github.com/apache/iceberg/pull/14501#issuecomment-3711735385
@jordepic @danielcweeks joining in very late here.

That trash API really exists to stop users doing destructive things on the command line, so that `hadoop fs -rm -rf /` doesn't wipe everything (on s3a:// we don't let you delete root as an alternative). Database operations tend not to go through trash, on the basis that databases can do their own thing and/or you need disaster recovery mechanisms at this point. I do see Hive has it as a safety check; presumably someone did a DROP TABLE and changed their mind. I suspect it is not used on every file deletion though, more the whole-table operations, because one aspect of trash is that it likes to be atomic: moving a whole table in there gives you that.

S3aFs doesn't like trash, as the PoV there is "S3 versioning may not be atomic but it's a lot faster than renaming". We've discussed having a plugin policy here where each fs could have its own: [HADOOP-18013. ABFS: add cloud trash policy with per-schema policy selection](https://issues.apache.org/jira/browse/HADOOP-18013), superseded by something with active development, https://github.com/apache/hadoop/pull/8063. I'll see about getting that in.

Regarding this patch:
* It's going to cause problems in HDInsight, as Microsoft don't put the HDFS JARs on the classpath and this has explicit references to the classes.
* It doesn't let people turn on trash on Azure storage or elsewhere if they want it.

What about just a configuration option "iceberg.hadoop.trash.schemas" to take a list of filesystem schemas "hdfs, viewfs, file, abfs" for which trash is enabled?
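
A rough sketch of how a scheme-list option like "iceberg.hadoop.trash.schemas" could be checked, using only string comparison on the path's URI scheme so that no HDFS classes are referenced. The class `TrashSchemas` and its methods are hypothetical, invented here for illustration; nothing like this is confirmed to be in the patch or in Iceberg.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Iceberg or Hadoop.
final class TrashSchemas {
  // Option name as proposed in the comment; assumed, not an existing setting.
  static final String TRASH_SCHEMAS_KEY = "iceberg.hadoop.trash.schemas";

  private TrashSchemas() {}

  /** Parses a comma-separated list such as "hdfs, viewfs, file, abfs". */
  static Set<String> parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      return Set.of(); // unset or blank: trash is enabled nowhere
    }
    return Arrays.stream(value.split(","))
        .map(s -> s.trim().toLowerCase(Locale.ROOT))
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toUnmodifiableSet());
  }

  /** True if the path's URI scheme is in the enabled set; scheme-less paths never match. */
  static boolean trashEnabled(Set<String> schemas, URI path) {
    String scheme = path.getScheme();
    return scheme != null && schemas.contains(scheme.toLowerCase(Locale.ROOT));
  }
}
```

In a real patch the value would presumably be read from the Hadoop `Configuration` (for example `conf.get(TrashSchemas.TRASH_SCHEMAS_KEY)`), and a path whose scheme is not listed would be deleted directly rather than moved to trash.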
