ruanhui created HBASE-28116:
-------------------------------

             Summary: Move snapshot storage from filesystem to a separated 
HBase table
                 Key: HBASE-28116
                 URL: https://issues.apache.org/jira/browse/HBASE-28116
             Project: HBase
          Issue Type: New Feature
          Components: snapshots
            Reporter: ruanhui


As we know, rename and list are very expensive operations on object storage. 
Currently, snapshots in HBase rely on both of these operations. For example, 
when taking a snapshot, we first write the snapshot description and data 
manifest file to a temporary directory, then commit it with a rename operation. 
When listing all snapshots, we scan the snapshot directory to find all 
completed snapshots.
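
To make the pattern above concrete, here is a minimal sketch of a 
write-to-temp-then-rename commit and a directory-listing lookup. It uses 
java.nio.file on a local filesystem instead of the Hadoop FileSystem API, and 
the class name, the ".tmp" directory and the file names are assumptions made 
for illustration, not the actual HBase layout.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

final class FsSnapshotSketch {
    // Hypothetical layout: <root>/.tmp/<name> holds in-progress snapshots,
    // <root>/<name> holds completed ones.
    static final String TMP_DIR = ".tmp";

    static void takeSnapshot(Path root, String name, String description, String manifest)
            throws IOException {
        Path working = root.resolve(TMP_DIR).resolve(name);
        Files.createDirectories(working);
        Files.write(working.resolve("description"), description.getBytes(StandardCharsets.UTF_8));
        Files.write(working.resolve("manifest"), manifest.getBytes(StandardCharsets.UTF_8));
        // Commit step: one rename makes the snapshot visible. This is cheap on
        // HDFS or a local disk, but object stores typically emulate it with
        // copy + delete.
        Files.move(working, root.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    static List<String> listSnapshots(Path root) throws IOException {
        List<String> names = new ArrayList<>();
        // Listing step: every lookup of "all completed snapshots" is a
        // directory listing, the other expensive operation on object storage.
        try (Stream<Path> children = Files.list(root)) {
            children.filter(Files::isDirectory)
                    .map(p -> p.getFileName().toString())
                    .filter(n -> !n.equals(TMP_DIR))
                    .sorted()
                    .forEach(names::add);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("snapshots");
        takeSnapshot(root, "snap1", "table=t1", "region-a,region-b");
        System.out.println(listSnapshots(root));
    }
}
```

The point of the sketch is only that both the commit and the lookup depend on 
directory-level operations, which is what makes this design costly on stores 
such as S3.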

So maybe we can introduce a new snapshot storage that uses an HBase table to 
store snapshots.
Here are a few points where we may gain benefits:
1. It makes HBase easier to deploy on object storage, such as S3.
2. It makes snapshots faster and more lightweight. In the current 
filesystem-based snapshot implementation, when consolidating the snapshot 
manifest, we first list all region manifests with a thread pool, read their 
content, and then delete them. When the number of regions is large, this 
process may take a lot of time. In comparison, reads and writes on HBase 
tables are more lightweight than reads and writes on HDFS files.
3. It is likely to reduce the number of small files on HDFS.
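
As a rough illustration of the table-based idea, the sketch below uses an 
in-memory sorted map as a stand-in for an HBase table. The row-key layout 
("m/<snapshot>/<region>" for region manifests, "d/<snapshot>" for the 
description that marks completion) and all class and method names are 
hypothetical; the issue does not specify a schema. It also assumes snapshot 
names contain no "/".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

final class TableSnapshotSketch {
    // Stand-in for an HBase table: rows kept sorted by row key.
    private final NavigableMap<String, String> table = new ConcurrentSkipListMap<>();

    // Each region writes its manifest as one row; no per-region file is created.
    void putRegionManifest(String snapshot, String region, String manifest) {
        table.put("m/" + snapshot + "/" + region, manifest);
    }

    // Commit: a single put of the description row replaces the rename.
    void complete(String snapshot, String description) {
        table.put("d/" + snapshot, description);
    }

    // Listing completed snapshots is a prefix scan over the "d/" rows.
    List<String> listCompleted() {
        List<String> names = new ArrayList<>();
        for (String key : prefixScan("d/").keySet()) {
            names.add(key.substring("d/".length()));
        }
        return names;
    }

    // Consolidation reads the region rows with one prefix scan instead of
    // listing, reading and deleting many small files.
    List<String> regionManifests(String snapshot) {
        return new ArrayList<>(prefixScan("m/" + snapshot + "/").values());
    }

    private NavigableMap<String, String> prefixScan(String prefix) {
        // Start key inclusive, stop key exclusive, as in a range scan.
        return table.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    public static void main(String[] args) {
        TableSnapshotSketch store = new TableSnapshotSketch();
        store.putRegionManifest("snap1", "region-a", "manifest-a");
        store.putRegionManifest("snap1", "region-b", "manifest-b");
        store.complete("snap1", "table=t1");
        System.out.println(store.listCompleted());
        System.out.println(store.regionManifests("snap1"));
    }
}
```

In a real implementation the map would be replaced by puts and scans against 
an actual HBase table, and questions such as where the snapshot table itself 
lives would still need to be answered.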



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
