[
https://issues.apache.org/jira/browse/HBASE-27826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076859#comment-18076859
]
Hudson commented on HBASE-27826:
--------------------------------
Results for branch master
[build #53 on
builds.a.o|https://ci-hbase.apache.org/job/HBase-Integration-Test/job/master/53/]:
(/) *{color:green}+1 overall{color}*
----
details (if available):
(/) {color:green}+1 client integration test for 3.3.5 {color}
(/) {color:green}+1 client integration test for 3.3.5 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.3.6 {color}
(/) {color:green}+1 client integration test for 3.3.6 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.4.0 {color}
(/) {color:green}+1 client integration test for 3.4.0 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.4.1 {color}
(/) {color:green}+1 client integration test for 3.4.1 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.4.2 {color}
(/) {color:green}+1 client integration test for 3.4.2 with shaded hadoop
client{color}
(/) {color:green}+1 client integration test for 3.4.3 {color}
(/) {color:green}+1 client integration test for 3.4.3 with shaded hadoop
client{color}
> Region split and merge time while offline is O(n) with respect to number of
> store files
> ---------------------------------------------------------------------------------------
>
> Key: HBASE-27826
> URL: https://issues.apache.org/jira/browse/HBASE-27826
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.5.4
> Reporter: Andrew Kyle Purtell
> Assignee: Prathyusha
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0
>
>
> This is a significant availability issue when HFiles are on S3.
> HBASE-26079 ({_}Use StoreFileTracker when splitting and merging{_}) changed
> the split and merge table procedure implementations to indirect through the
> StoreFileTracker implementation when selecting HFiles to be merged or split,
> rather than directly listing those using file system APIs. It also changed
> the commit logic in HRegionFileSystem to add the link/ref files on resulting
> split or merged regions to the StoreFileTracker. However, the creation of a
> link file is still a filesystem operation and creating a “file” on S3 can
> take well over a second. If, for example, there are 20 store files in a
> region, which is not uncommon, after the region is taken offline for a split
> (or merge) it may require more than 20 seconds to create the link files
> before the results can be brought back online, creating a severe availability
> problem. Splits and merges are supposed to be fast, completing in less than a
> second, certainly less than a few seconds. This has been true when HFiles are
> stored on HDFS only because file creation operations there are nearly
> instantaneous.
> There are two issues but both can be handled with modifications to the store
> file tracker interface and the file based store file tracker implementation.
> When the file based store file tracker is enabled the HFile links should
> be virtual entities that only exist in the file manifest. We do not require
> physical files in the filesystem to serve as links now. That is the magic of
> this file tracker: the manifest file replaces the need to list the
> filesystem.
> Then, when splitting or merging, the HFile links should be collected into a
> list and committed in one batch using a new FILE file tracker interface,
> requiring only one update of the manifest file in S3, bringing the time
> requirement for this operation down to O(1) from O(n).
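
The difference between the two commit strategies described above can be
illustrated with a minimal sketch. Everything in it is hypothetical: the
CountingStore class, the commitPerFile and commitBatch methods, and the key
layout are stand-ins invented for illustration and are not HBase or
StoreFileTracker APIs. The sketch only counts writes to a simulated object
store, assuming each write is one slow round trip.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are HBase classes.
final class LinkCommitSketch {

  /** Stand-in for an object store where every write is one slow round trip. */
  static final class CountingStore {
    private final Map<String, String> objects = new HashMap<>();
    private int writes = 0;

    void put(String key, String value) {
      objects.put(key, value);
      writes++;
    }

    String get(String key) {
      return objects.get(key);
    }

    int writes() {
      return writes;
    }
  }

  /** Behaviour described in the issue: one physical link file per store file. */
  static void commitPerFile(CountingStore store, String region, List<String> storeFiles) {
    for (String f : storeFiles) {
      store.put(region + "/links/" + f, "link->" + f); // n writes for n store files
    }
  }

  /** Proposed behaviour: collect virtual links, then write the manifest once. */
  static void commitBatch(CountingStore store, String region, List<String> storeFiles) {
    List<String> links = new ArrayList<>();
    for (String f : storeFiles) {
      links.add("link->" + f); // in memory only, no store round trip
    }
    store.put(region + "/manifest", String.join("\n", links)); // single write
  }

  public static void main(String[] args) {
    List<String> files = new ArrayList<>();
    for (int i = 0; i < 20; i++) {
      files.add("hfile-" + i);
    }
    CountingStore perFile = new CountingStore();
    commitPerFile(perFile, "daughter-a", files);
    CountingStore batched = new CountingStore();
    commitBatch(batched, "daughter-a", files);
    System.out.println("per-file writes: " + perFile.writes()); // 20
    System.out.println("batched writes: " + batched.writes());  // 1
  }
}
```

With 20 store files the per-file path issues 20 writes while the batched path
issues one, which is the O(n) versus O(1) behaviour the description refers to.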
--
This message was sent by Atlassian Jira
(v8.20.10#820010)