[
https://issues.apache.org/jira/browse/HBASE-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Kyle Purtell resolved HBASE-12626.
-----------------------------------------
Assignee: (was: Michael Stack)
Resolution: Fixed
This has been resolved by subsequent work on parallelizing the cleaner task
workers.
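For illustration only, a minimal sketch of that general approach: fan candidate
files out to a fixed pool of delete workers instead of deleting serially from a
single chaser thread. This is not the committed HBase change; the
ParallelArchiveCleaner class, the FileDeleter interface, and the pool sizing
are hypothetical stand-ins.
{code:java}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch: delete archive candidates with a worker pool rather
 * than one serial chaser thread. Not the actual HBase patch.
 */
public class ParallelArchiveCleaner {

  /** Placeholder for whatever performs a single delete (e.g. an HDFS call). */
  public interface FileDeleter {
    boolean delete(String path);
  }

  private final ExecutorService pool;
  private final FileDeleter deleter;

  public ParallelArchiveCleaner(int workers, FileDeleter deleter) {
    this.pool = Executors.newFixedThreadPool(workers);
    this.deleter = deleter;
  }

  /**
   * Fan the candidate list out across the pool. A serial loop at ~9ms per
   * delete caps out near 400k deletes/hour; N workers raise that ceiling
   * roughly N-fold, until the filesystem itself becomes the bottleneck.
   */
  public void deleteAll(List<String> candidates) {
    for (String path : candidates) {
      pool.submit(() -> deleter.delete(path));
    }
  }

  public void shutdown() throws InterruptedException {
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
  }
}
{code}
The design point is simply that per-delete latency, not delete work, was the
limit, so overlapping deletes buys throughput.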
> Archive cleaner cannot keep up; it maxes out at about 400k deletes/hour
> -----------------------------------------------------------------------
>
> Key: HBASE-12626
> URL: https://issues.apache.org/jira/browse/HBASE-12626
> Project: HBase
> Issue Type: Improvement
> Components: master, scaling
> Affects Versions: 0.94.25
> Reporter: Michael Stack
> Priority: Critical
>
> On big clusters, it is possible to overrun the archive cleaning thread. Make
> it able to do more work per cycle when needed.
> We saw this on a user's cluster. The rate at which files were being moved to
> the archive exceeded our delete rate, such that the archive accumulated tens
> of millions of files, putting friction on all cluster ops.
> The cluster had ~500 nodes and was RAM constrained (other processes on the
> boxes also needed RAM). Over a period of days, the loading was thrown off
> kilter because it started taking double writes while migrating from one
> schema to another (the cluster was running hot before the double loading).
> The master was deleting an archived file every 9ms on average, about 400k
> deletes an hour. The constrained RAM and their having 4-5 column families
> had them creating files in excess of this rate, so the archive backed up.
> For some helpful background/input, see the dev thread
> http://search-hadoop.com/m/DHED4UYSF9
--
This message was sent by Atlassian Jira
(v8.20.7#820007)