[ 
https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243332#comment-16243332
 ] 

Reid Chan commented on HBASE-18309:
-----------------------------------

bq. for a reasonable test please use a larger scale and include your reasoning, 
10 doesn't seem like enough to simulate what will happen in a deployment. e.g. 
X regions per server, Y servers means Z directories to clean up.
There is no need for a truly large deployment; it can be simulated by creating 
1000 subdirectories under the root dir, each containing up to 1000 files and 
subdirectories. WDYT? I will provide statistics later.
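
A minimal sketch of how such a tree could be generated, assuming a local 
filesystem stands in for HDFS. The class and method names are hypothetical 
and not part of HBase, and this version creates only one level of 
subdirectories containing empty files:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of HBase: builds a local directory tree
// to approximate a large directory layout for a cleaner test.
final class CleanerTestTree {

  /**
   * Creates subDirs directories under root, each holding filesPerDir
   * empty files, and returns the total number of files created.
   */
  static long build(Path root, int subDirs, int filesPerDir) throws IOException {
    long created = 0;
    for (int d = 0; d < subDirs; d++) {
      Path dir = Files.createDirectories(root.resolve("dir-" + d));
      for (int f = 0; f < filesPerDir; f++) {
        Files.createFile(dir.resolve("file-" + f));
        created++;
      }
    }
    return created;
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("cleaner-test");
    // 1000 x 1000 would match the scale proposed above; small values
    // keep a local trial run quick.
    long n = build(root, 10, 10);
    System.out.println("created " + n + " files under " + root);
  }
}
```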
bq. At what point will tuning this parameter cause a NameNode to fall over? How 
do we stop folks from doing that accidentally?
I'm not sure, and that is why the parameter's upper limit is the machine's 
available cores. But in my production cluster (1000+ nodes), the NameNode (24 
cores) has been running for months and handling hundreds of jobs with 
deletions and creations every day, which suggests it is not easy for the 
cleaner chore to make a NameNode fall over, XD. And I would suggest setting it 
to less than or equal to the NameNode's core count, to be safe.
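
A minimal sketch of the cap described above, not the code in the patch. The 
class and method names are hypothetical, and the lower bound of one thread is 
an added assumption:

```java
// Hypothetical helper, not part of HBase: limits a configured cleaner
// thread count to the number of cores available on the local machine.
final class CleanerThreads {

  /** Clamps configured into the range [1, availableCores]. */
  static int effectiveThreads(int configured, int availableCores) {
    // Assumption: always keep at least one thread so cleaning never stalls.
    return Math.max(1, Math.min(configured, availableCores));
  }

  /** Same clamp, using the cores the JVM reports for this machine. */
  static int effectiveThreads(int configured) {
    return effectiveThreads(configured, Runtime.getRuntime().availableProcessors());
  }
}
```

With a 24-core machine, a configured value of 64 would be reduced to 24, while 
a value of 8 would be left unchanged.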
bq. These details should probably be in the documentation about the config.
Got it, I will document it in hbase-default.xml with a description.

> Support multi threads in CleanerChore
> -------------------------------------
>
>                 Key: HBASE-18309
>                 URL: https://issues.apache.org/jira/browse/HBASE-18309
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: binlijin
>            Assignee: Reid Chan
>         Attachments: HBASE-18309.master.001.patch, 
> HBASE-18309.master.002.patch
>
>
> There is only one thread in LogCleaner to clean oldWALs, and in our big 
> cluster we find this is not enough. The number of files under oldWALs reaches 
> the max-directory-items limit of HDFS and causes region servers to crash, so 
> we use multiple threads for LogCleaner and the crash no longer happens.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
