[ 
https://issues.apache.org/jira/browse/HIVE-20259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560444#comment-16560444
 ] 

Jason Dere commented on HIVE-20259:
-----------------------------------

Yeah, I had originally proposed using the ClearDanglingScratchDir functionality 
for cleanup, though [~hagleitn] had some reservations about this approach, since 
it depends on very specific HDFS behavior (file leases).

Another option, rather than relying on the file lease, is to simply have each 
Hive process periodically write a file with a known naming convention 
(.cleanup.timestamp_val?) to its cache directory. A cleanup thread would look 
for the .cleanup file in each cache directory under the base cache directory, 
and delete any cache directory whose file is too old. As long as the Hive 
process is still alive and creating new versions of the cleanup file, its cache 
directory would not be removed by the cleanup thread.
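
To make the marker-file idea concrete, here is a rough sketch. It is not Hive 
code: the class name, method names, and the ".cleanup.<epoch millis>" format are 
all made up for illustration, and it uses java.nio.file on a local file system 
for brevity, whereas a real implementation would presumably go through the 
Hadoop FileSystem API. It also assumes that a directory with no marker at all 
is left alone, which the proposal above does not specify.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not Hive code. Local file system only.
final class CacheDirCleanupSketch {
    // Assumed marker naming convention: ".cleanup.<epoch millis>".
    static final String MARKER_PREFIX = ".cleanup.";

    // Lists the marker files currently present in one cache directory.
    static List<Path> markers(Path cacheDir) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> ds =
                 Files.newDirectoryStream(cacheDir, MARKER_PREFIX + "*")) {
            for (Path p : ds) {
                found.add(p);
            }
        }
        return found;
    }

    // Called periodically by the live process that owns cacheDir.
    static void heartbeat(Path cacheDir, long nowMillis) throws IOException {
        Path fresh = cacheDir.resolve(MARKER_PREFIX + nowMillis);
        Files.write(fresh, new byte[0]); // create (or truncate) the new marker
        // Drop older markers so the directory keeps a single current one.
        for (Path p : markers(cacheDir)) {
            if (!p.getFileName().equals(fresh.getFileName())) {
                Files.delete(p);
            }
        }
    }

    // Newest timestamp encoded in a marker name, or -1 if there is no marker.
    static long latestHeartbeat(Path cacheDir) throws IOException {
        long latest = -1L;
        for (Path p : markers(cacheDir)) {
            String suffix = p.getFileName().toString().substring(MARKER_PREFIX.length());
            try {
                latest = Math.max(latest, Long.parseLong(suffix));
            } catch (NumberFormatException ignored) {
                // Not one of our markers; skip it.
            }
        }
        return latest;
    }

    // Deletes a directory tree, children before parents.
    static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    // One pass of the cleanup thread: remove cache directories whose newest
    // marker is older than maxAgeMillis. Directories without any marker are
    // skipped (an assumption made here to stay conservative).
    static List<Path> cleanStale(Path baseDir, long nowMillis, long maxAgeMillis)
            throws IOException {
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(baseDir)) {
            for (Path d : dirs) {
                if (Files.isDirectory(d)) {
                    candidates.add(d);
                }
            }
        }
        List<Path> removed = new ArrayList<>();
        for (Path dir : candidates) {
            long latest = latestHeartbeat(dir);
            if (latest >= 0 && nowMillis - latest > maxAgeMillis) {
                deleteRecursively(dir);
                removed.add(dir);
            }
        }
        return removed;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("results-cache");
        Path live = Files.createDirectory(base.resolve("live"));
        Path dead = Files.createDirectory(base.resolve("dead"));
        long now = System.currentTimeMillis();
        heartbeat(live, now);
        heartbeat(dead, now - 3_600_000L); // last heartbeat an hour ago
        System.out.println("removed: " + cleanStale(base, now, 600_000L));
    }
}
```

Encoding the timestamp in the file name, rather than relying on the file's 
modification time, follows the naming convention suggested above and avoids 
depending on how a particular file system reports mtime. The maximum age would 
need to be comfortably larger than the heartbeat interval so that a slow but 
live process is not cleaned up.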


> Cleanup of results cache directory
> ----------------------------------
>
>                 Key: HIVE-20259
>                 URL: https://issues.apache.org/jira/browse/HIVE-20259
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>
> The query results cache directory is currently deleted at process exit. This 
> does not work in the case of a kill -9 or a sudden process exit of Hive. 
> There should be some cleanup mechanism in place to take care of any old cache 
> directories that were not deleted at process exit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
