[ 
https://issues.apache.org/jira/browse/HIVE-15068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534321#comment-17534321
 ] 

royal commented on HIVE-15068:
------------------------------

[~daijy] [~leftyl] [~hiveqa] [~thejas] [~h_o] Could you help me solve this
problem? Thank you.

https://stackoverflow.com/questions/72186228/hive-fetch-result-from-hdfs-is-too-slow-because-of-too-many-the-map-only-task

> Run ClearDanglingScratchDir periodically inside HS2
> ---------------------------------------------------
>
>                 Key: HIVE-15068
>                 URL: https://issues.apache.org/jira/browse/HIVE-15068
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Major
>              Labels: TODOC1.3, TODOC2.2
>             Fix For: 1.3.0, 2.2.0
>
>         Attachments: HIVE-15068.1.patch, HIVE-15068.2.patch, 
> HIVE-15068.3.patch, HIVE-15068.4.branch-1.patch, HIVE-15068.4.patch
>
>
> In HIVE-13429, we introduced a tool that clears dangling scratch directories.
> In this ticket, we want to invoke the tool automatically on a Hive cluster.
> The options are:
> 1. As a cron job, which would require manual cron job setup.
> 2. As a metastore thread. However, we may run the metastore without HDFS in
> the future (e.g., managing S3 files). ClearDanglingScratchDir needs support
> that only exists in HDFS, so it would not work in that scenario.
> 3. As an HS2 thread. The downside is that if no HS2 is running, the tool will
> not run automatically. But we expect HS2 to be a required component down the
> road.
> I chose approach 3 for the implementation.
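
For illustration only, approach 3 (a background thread inside a long-running server process) could be sketched roughly as below. This is not the code from the attached patches: the class name PeriodicCleanupSketch, the thread name, and the cleanupTask parameter are hypothetical, and the actual ClearDanglingScratchDir invocation is left to the caller-supplied Runnable.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: runs a cleanup task periodically on a background thread.
public class PeriodicCleanupSketch {
    private final ScheduledExecutorService scheduler;

    public PeriodicCleanupSketch() {
        // Use a daemon thread so the cleaner never keeps the server process alive.
        ThreadFactory daemonFactory = r -> {
            Thread t = new Thread(r, "scratch-dir-cleaner");
            t.setDaemon(true);
            return t;
        };
        this.scheduler = Executors.newSingleThreadScheduledExecutor(daemonFactory);
    }

    // cleanupTask is an assumption: it stands in for whatever removes dangling scratch dirs.
    public void start(Runnable cleanupTask, long interval, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                cleanupTask.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                System.err.println("Cleanup run failed: " + e);
            }
        }, 0, interval, unit);
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}
```

A fixed delay (rather than a fixed rate) keeps runs from piling up if a single cleanup pass takes longer than the interval.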



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
