[ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537060#comment-13537060 ]
Ted Yu commented on HBASE-7403:
-------------------------------

Nice work, Chunhui.

This is related to HBASE-5487: Generic framework for Master-coordinated tasks

Slides 12 to 14 give the flow chart. It would be nice if the true / false conditions were labeled on the branch nodes.

'hbase.master.thread.merge': would 'hbase.master.merge.threads' be a better name for the config?

For the trunk patch, the new classes should be annotated for audience and stability.

Will take a closer look at the patch.

> Online Merge
> ------------
>
>                 Key: HBASE-7403
>                 URL: https://issues.apache.org/jira/browse/HBASE-7403
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.94.3
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0, 0.94.5
>
>         Attachments: hbase-7403-94v1.patch, hbase-7403-trunkv1.patch, merge region.pdf
>
>
> We need merge in the following cases:
> 1. A region hole or region overlap that can't be fixed by hbck
> 2. Regions that become empty because of TTL and an unreasonable rowkey design
> 3. Regions that are always empty or very small because of presplitting when the table was created
> 4. Too many empty or small regions, which reduce system performance (e.g. mslab)
> The current merge tools only work offline and cannot redo the merge if an
> exception is thrown partway through, which leaves dirty data behind.
> For an online system, we need an online merge.
> The implementation logic of this patch for Online Merge is:
> For example, to merge regionA and regionB into regionC:
> 1. Offline the two regions A and B
> 2. Merge the two regions in HDFS (create regionC's directory, move
> regionA's and regionB's files to regionC's directory, delete regionA's and
> regionB's directories)
> 3. Add the merged regionC to .META.
> 4. Assign the merged regionC
> By design of this patch, once we do the merge work in HDFS, we can redo
> it until it succeeds if an exception is thrown, the process aborts, or the
> server restarts, but it cannot be rolled back.
> It depends on:
> Using zookeeper to record the transaction journal state, which makes redo easier
> Using zookeeper to send/receive merge requests
> Executing the merge transaction on the master
> Supporting merge requests through the API or the shell tool
> About the merge process, please see the attachment and patch
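
For readers following the redo-only design in the description, here is a minimal sketch of a journaled, forward-only transaction. It is not code from the attached patches: the class MergeJournalSketch, the Step names, and the in-memory Journal (standing in for the zookeeper-backed journal) are all hypothetical, and the sketch simplifies by resuming from whichever step was last recorded.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names come from the HBASE-7403 patch. */
final class MergeJournalSketch {

    /** One journal state per step of the description, in execution order. */
    enum Step { OFFLINE_REGIONS, MERGE_IN_HDFS, ADD_TO_META, ASSIGN_MERGED_REGION, DONE }

    /** In-memory stand-in for the zookeeper journal: merge id -> next step to run. */
    static final class Journal {
        private final Map<String, Step> nextStep = new HashMap<>();
        Step load(String mergeId) { return nextStep.getOrDefault(mergeId, Step.OFFLINE_REGIONS); }
        void save(String mergeId, Step step) { nextStep.put(mergeId, step); }
    }

    private final Journal journal;
    private final Map<Step, Runnable> actions; // each action must be safe to run again

    MergeJournalSketch(Journal journal, Map<Step, Runnable> actions) {
        this.journal = journal;
        this.actions = actions;
    }

    /**
     * Runs from the journaled step through DONE. If a step throws, the journal
     * still points at that step, so a later call redoes it; nothing is rolled back.
     */
    void run(String mergeId) {
        Step step = journal.load(mergeId);
        while (step != Step.DONE) {
            actions.get(step).run();
            step = Step.values()[step.ordinal() + 1];
            journal.save(mergeId, step); // record progress only after the step succeeded
        }
    }

    /** Simulates one failure during the HDFS step, then a redo; returns the executed steps. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        int[] failuresLeft = {1};
        Map<Step, Runnable> actions = new EnumMap<>(Step.class);
        actions.put(Step.OFFLINE_REGIONS, () -> log.add("offline A and B"));
        actions.put(Step.MERGE_IN_HDFS, () -> {
            log.add("merge files into C");
            if (failuresLeft[0]-- > 0) {
                throw new IllegalStateException("simulated crash");
            }
        });
        actions.put(Step.ADD_TO_META, () -> log.add("add C to .META."));
        actions.put(Step.ASSIGN_MERGED_REGION, () -> log.add("assign C"));

        MergeJournalSketch tx = new MergeJournalSketch(new Journal(), actions);
        try {
            tx.run("A+B->C");
        } catch (IllegalStateException e) {
            // First attempt failed mid-merge; the journal still says MERGE_IN_HDFS.
        }
        tx.run("A+B->C"); // redo: resumes at the failed step, skipping the offline step
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In the simulated run, the offline step executes once and the HDFS step executes twice, which is why each step in such a scheme would need to be idempotent.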