[ 
https://issues.apache.org/jira/browse/HIVE-23196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085737#comment-17085737
 ] 

Attila Magyar commented on HIVE-23196:
--------------------------------------

[~ashutoshc], [~rajesh.balamohan],

The API allows us to set paths that are not within the scratch directory. 
In general usage, resFile and resDir are always under the scratch dir, but 
they are set externally in 30-40 different places. Similarly, the stagingDir 
is within the scratchDir, but none of this is an enforced rule. So I think 
it's not safe to remove these deletions completely. I simplified the patch so 
that it only adds a few guard clauses before removing the directories.
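
As a rough illustration of what such a guard clause could look like (a sketch 
only, not the code in the patch): a path is deleted on its own only when it is 
not already covered by the recursive delete of the scratch dir. The class and 
method names below are hypothetical, plain java.net.URI stands in for Hadoop's 
Path so that the example is self-contained, and the external result path is 
made up for the demonstration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch; names are illustrative and not taken from Hive.
public class ScratchCleanupSketch {

    /** True if child equals parent or lies beneath it on the same file system. */
    static boolean isUnder(URI child, URI parent) {
        if (!Objects.equals(child.getScheme(), parent.getScheme())
                || !Objects.equals(child.getAuthority(), parent.getAuthority())) {
            return false;
        }
        String parentPath = parent.normalize().getPath();
        String childPath = child.normalize().getPath();
        if (!parentPath.endsWith("/")) {
            parentPath = parentPath + "/";
        }
        // The trailing separator keeps /tmp/q10 from matching /tmp/q1.
        return (childPath + "/").startsWith(parentPath);
    }

    /** Returns the paths that still need their own delete call. */
    static List<URI> pathsToDelete(URI scratchDir, List<URI> others) {
        List<URI> result = new ArrayList<>();
        for (URI path : others) {
            // Guard clause: the recursive delete of scratchDir covers this path.
            if (isUnder(path, scratchDir)) {
                continue;
            }
            // The path was set outside the scratch dir, so it is deleted explicitly.
            result.add(path);
        }
        result.add(scratchDir);
        return result;
    }

    public static void main(String[] args) {
        URI scratch = URI.create("hdfs://nn1:8020/tmp/hive/xyz/session/query");
        URI resDir = URI.create("hdfs://nn1:8020/tmp/hive/xyz/session/query/-mr-10000");
        // Assumed example of a result dir that was set outside the scratch dir.
        URI external = URI.create("hdfs://nn1:8020/user/xyz/results");
        for (URI path : pathsToDelete(scratch, Arrays.asList(resDir, external))) {
            System.out.println("delete " + path);
        }
    }
}
```

With this shape, the common case (everything under the scratch dir) costs a 
single delete call, while paths set elsewhere are still cleaned up.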

Moving the deletion to a different thread seems like overkill to me.

Please see the updated version: https://reviews.apache.org/r/72371/

> Reduce number of delete calls to NN during Context::clear
> ---------------------------------------------------------
>
>                 Key: HIVE-23196
>                 URL: https://issues.apache.org/jira/browse/HIVE-23196
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Attila Magyar
>            Priority: Major
>         Attachments: HIVE-23196.1.patch, HIVE-23196.2.patch
>
>
> {{Context::clear()}} ends up deleting the same directories (or their subdirs) 
> multiple times. It would be good to reduce the number of delete calls to the NN 
> for latency-sensitive queries. This also has an impact on concurrent queries.
> {noformat}
> 2020-04-14T04:22:28,703 DEBUG [7c6a6b09-ab37-4bc8-93a5-5da6fb154899 
> HiveServer2-Handler-Pool: Thread-378] ql.Context: Deleting result dir: 
> hdfs://nn1:8020/tmp/hive/xyz/7c6a6b09-ab37-4bc8-93a5-5da6fb154899/hive_2020-04-14_04-22-24_335_8573832618972595103-13/-mr-10000
> 2020-04-14T04:22:28,721 DEBUG [7c6a6b09-ab37-4bc8-93a5-5da6fb154899 
> HiveServer2-Handler-Pool: Thread-378] ql.Context: Deleting scratch dir: 
> hdfs://nn1:8020/tmp/hive/xyz/7c6a6b09-ab37-4bc8-93a5-5da6fb154899/hive_2020-04-14_04-22-24_335_8573832618972595103-13
> 2020-04-14T04:22:28,737 DEBUG [7c6a6b09-ab37-4bc8-93a5-5da6fb154899 
> HiveServer2-Handler-Pool: Thread-378] ql.Context: Deleting scratch dir: 
> hdfs://nn1:8020/tmp/hive/xyz/7c6a6b09-ab37-4bc8-93a5-5da6fb154899/hive_2020-04-14_04-22-24_335_8573832618972595103-13/-mr-10000/.hive-staging_hive_2020-04-14_04-22-24_335_8573832618972595103-13{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
