[ 
https://issues.apache.org/jira/browse/HBASE-26320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424087#comment-17424087
 ] 

Pankaj Kumar commented on HBASE-26320:
--------------------------------------

Thanks [~zyork] for bringing this up. We recently faced this problem in one of 
our production environments, where the archived and old WAL files grew to 
several TBs because of slow directory scans. We had to modify multiple configs 
to speed up the cleaning.

> Separate Log Cleaner DirScanPool to prevent the OLDWALs from filling up the 
> disk when archive is large
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-26320
>                 URL: https://issues.apache.org/jira/browse/HBASE-26320
>             Project: HBase
>          Issue Type: Improvement
>          Components: Operability
>    Affects Versions: 1.7.1, 2.4.6
>            Reporter: Zach York
>            Assignee: Zach York
>            Priority: Major
>
> We currently share the DirScanPool (the thread pool that scans the OldLogs 
> and archive directories for files to delete) between the LogCleaner and 
> HFileCleaner. If the archive directory is large or contains many files and 
> directories, the threads can get stuck scanning it, starving the LogCleaner. 
> This is especially apparent on S3, where listing can be slower than on HDFS.
> This JIRA creates separate DirScanPools for the LogCleaner and HFileCleaner.
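
The starvation described above can be reproduced in a small standalone sketch. 
This is not HBase code: the class DirScanPoolSketch, the method 
logScanCompletes, the pool size, and the 500 ms timeout are all hypothetical 
names and values chosen for illustration. It only assumes that both cleaners 
submit scan tasks to a fixed-size thread pool, and models a slow archive 
listing as a task blocked on a latch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: plain java.util.concurrent, not the HBase DirScanPool.
public class DirScanPoolSketch {

    /**
     * Returns true if one old-WAL scan task finishes within 500 ms while
     * every thread of the archive pool is busy with a slow directory scan.
     */
    static boolean logScanCompletes(boolean separatePools, int poolSize) throws Exception {
        ExecutorService archivePool = Executors.newFixedThreadPool(poolSize);
        // Shared case: the log cleaner reuses the archive pool.
        ExecutorService logPool =
            separatePools ? Executors.newFixedThreadPool(poolSize) : archivePool;
        CountDownLatch slowListing = new CountDownLatch(1);
        try {
            // Occupy every archive thread with a scan that does not return
            // until the latch is released (stand-in for a slow listing).
            for (int i = 0; i < poolSize; i++) {
                archivePool.submit(() -> {
                    try {
                        slowListing.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The log cleaner's scan is trivial, but it still needs a free thread.
            Future<?> logScan = logPool.submit(() -> { });
            try {
                logScan.get(500, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException starved) {
                return false;
            }
        } finally {
            slowListing.countDown();
            archivePool.shutdownNow();
            logPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared pool:    " + logScanCompletes(false, 2));
        System.out.println("separate pools: " + logScanCompletes(true, 2));
    }
}
```

With a shared pool the log scan stays queued behind the blocked archive scans 
and the call times out; with its own pool it completes right away, which is 
the behaviour the separate DirScanPools are meant to provide.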



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
