[ https://issues.apache.org/jira/browse/HBASE-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HBASE-27590 started by Peter Somogyi.
---------------------------------------------
> Change Iterable to List in CleanerChore
> ---------------------------------------
>
>                 Key: HBASE-27590
>                 URL: https://issues.apache.org/jira/browse/HBASE-27590
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Peter Somogyi
>            Assignee: Peter Somogyi
>            Priority: Minor
>
> The HFileCleaners can perform poorly on a large /archive area when used
> with slow storage such as S3. The snapshot write lock in SnapshotFileCache is
> held while the file metadata is fetched from S3. Because of this, even with
> multiple cleaner threads, only a single cleaner can effectively delete files
> from the archive.
> By changing the parameter type in FileCleanerDelegate from Iterable to List,
> the file metadata collection is performed before SnapshotHFileCleaner runs.
> Running with the cleaner configuration below, I observed that the time the
> lock in SnapshotFileCache was held dropped from 45000ms to 100ms for a
> directory containing 1000 files. The complete evaluation and deletion for
> this folder took the same time, but since the file metadata fetch from S3 was
> done outside of the lock, the multiple cleaner threads were able to run
> concurrently.
> {noformat}
> hbase.cleaner.directory.sorting=false
> hbase.cleaner.scan.dir.concurrent.size=0.75
> hbase.regionserver.hfilecleaner.small.thread.count=16
> hbase.regionserver.hfilecleaner.large.thread.count=8
> {noformat}
> The files to evaluate are already passed as a List to
> CleanerChore.checkAndDeleteFiles, but they are converted to an Iterable to
> run the checks with the configured cleaners.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
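The parameter-type change the issue describes can be sketched as below. This is a minimal, hypothetical illustration, not HBase's actual code: FileCleanerDelegate and checkAndDeleteFiles are the names from the issue, but the interface here uses plain Strings in place of Hadoop FileStatus objects and the bodies are invented. The point is that a delegate receiving an already-materialized List no longer forces the (slow) metadata iteration to happen under any lock the delegate holds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for
// org.apache.hadoop.hbase.master.cleaner.FileCleanerDelegate.
// The patch changes the parameter from Iterable<FileStatus> to
// List<FileStatus>; plain Strings represent file metadata here.
interface FileCleanerDelegate {
    // Because the argument is a List, all file metadata is already
    // materialized before this method (and any lock it takes) runs.
    List<String> getDeletableFiles(List<String> files);
}

public class CleanerChoreSketch {
    // Loosely mimics CleanerChore.checkAndDeleteFiles: the candidates
    // already arrive as a List, so no Iterable conversion is needed.
    static List<String> checkFiles(List<String> candidates,
                                   FileCleanerDelegate cleaner) {
        return cleaner.getDeletableFiles(candidates);
    }

    public static void main(String[] args) {
        // Toy delegate: keep only files ending in ".hfile".
        FileCleanerDelegate delegate = files -> {
            List<String> deletable = new ArrayList<>();
            for (String f : files) {
                if (f.endsWith(".hfile")) {
                    deletable.add(f);
                }
            }
            return deletable;
        };
        List<String> result =
            checkFiles(List.of("a.hfile", "snapshot.ref"), delegate);
        System.out.println(result);  // [a.hfile]
    }
}
```

With the old Iterable signature, a delegate could only learn each file's metadata by iterating inside its own critical section; with a List, the chore can fetch everything from S3 first, so several cleaner threads proceed concurrently and each delegate's lock is held only briefly.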