dongjoon-hyun commented on PR #40128:
URL: https://github.com/apache/spark/pull/40128#issuecomment-1480402673

   @shrprasa.
   
   1. It seems you assume that the shutdown hook is magically reliable. However, shutdown hooks have a well-known limitation: the JVM can be destroyed abruptly, and a K8s Pod can likewise be deleted without giving the processes inside enough time to run their cleanup logic.
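
   Point 1 can be illustrated with a minimal Java sketch (the class name and messages are illustrative, not from the PR):

   ```java
   public class ShutdownHookDemo {
       // Register a cleanup hook and return it so callers can deregister it.
       // The hook runs only on orderly shutdown (normal exit, or SIGTERM with
       // enough grace time); it is skipped on SIGKILL, Runtime.halt(), or when
       // K8s force-deletes the Pod -- exactly the limitation described above.
       public static Thread registerCleanupHook(Runnable cleanup) {
           Thread hook = new Thread(cleanup, "cleanup-hook");
           Runtime.getRuntime().addShutdownHook(hook);
           return hook;
       }

       public static void main(String[] args) {
           registerCleanupHook(() ->
                   System.out.println("cleanup: deleting upload directory"));
           System.out.println("work done");
           // A normal exit here fires the hook; `kill -9` on this JVM would not.
       }
   }
   ```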
   
   2. As I mentioned above, public cloud storage systems already provide a better and more complete TTL-based solution for this issue. In that context, this PR only partially mitigates an HDFS limitation, https://issues.apache.org/jira/browse/HDFS-6382.
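
   To make the TTL-based approach in point 2 concrete, here is a hypothetical S3 lifecycle rule (the rule ID and prefix are made up for illustration) that expires staged upload objects after 7 days, regardless of how the driver JVM died:

   ```json
   {
     "Rules": [
       {
         "ID": "expire-spark-upload-dir",
         "Filter": { "Prefix": "spark-upload/" },
         "Status": "Enabled",
         "Expiration": { "Days": 7 }
       }
     ]
   }
   ```

   Because the storage service enforces the rule server-side, cleanup does not depend on the client process shutting down gracefully.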
   
   > The change to clean up the upload directory is not specific to HDFS. The reason we should do cleanup is that if the Spark job is creating new directories/files, it should clean them up too, just as is done in YARN and for other files like shuffle spill.

   Also, can you please explain why the approach seems to be incomplete? How is it unable to prevent leftovers in the upload directory?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

