[ 
https://issues.apache.org/jira/browse/HBASE-12425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317737#comment-14317737
 ] 

Hadoop QA commented on HBASE-12425:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12698314/HBASE-12425.patch
  against master branch at commit b7f6a45803d6b56a2ff56ebcac6a78aee100b409.
  ATTACHMENT ID: 12698314

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.
    {color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0).

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning message.

    {color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
    +As write requests are handled by the region server, they accumulate in an 
in-memory storage system called the _memstore_. Once the memstore fills, its 
contents are written to disk as additional store files. This event is called a 
_memstore flush_. As store files accumulate, the RegionServer will 
<<compaction,compact>> them into fewer, larger files. After each flush or 
compaction finishes, a region split request is enqueued if the _region split 
policy_ determines that the region should be split into two.
+Since all data files in HBase are immutable, when a split happens, the newly 
created _daughter regions_ do not rewrite all the data into new files 
immediately. Instead, they create small files similar to symbolic link files, 
named 
link:http://www.google.com/url?q=http%3A%2F%2Fhbase.apache.org%2Fapidocs%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fio%2FReference.html&sa=D&sntz=1&usg=AFQjCNEkCbADZ3CgKHTtGYI8bJVwp663CA[Reference
 files], which point to either the top or bottom part of the parent store file 
according to the split point. The reference file is used just like a regular 
data file, but only half of the records are considered. The region can only be 
split if there are no more references to the immutable data files of the parent 
region. Those reference files are cleaned gradually by compactions, so that the 
region will stop referring to its parent's files, and can be split further.
+Although splitting the region is a local decision made by the RegionServer, 
the split process itself must coordinate with many actors. The RegionServer 
notifies the Master before and after the split, updates the `.META.` table so 
that clients can discover the new daughter regions, and rearranges the 
directory structure and data files in HDFS. Splitting is a multi-task process. 
To enable rollback in case of an error, the RegionServer keeps an in-memory 
journal about the execution state. The steps taken by the RegionServer to 
execute the split are illustrated in <<regionserver_split_process_image>>. Each 
step is labeled with its step number. Actions from RegionServers or Master are 
shown in red, while actions from the clients are shown in green.
+. The RegionServer decides locally to split the region, and prepares the 
split. *THE SPLIT TRANSACTION IS STARTED.* As a first step, the RegionServer 
acquires a shared read lock on the table to prevent schema modifications during 
the splitting process. Then it creates a znode in ZooKeeper under 
`/hbase/region-in-transition/region-name`, and sets the znode's state to 
`SPLITTING`.
+. The Master learns about this znode, since it has a watcher for the parent 
`region-in-transition` znode.
+. The RegionServer creates a sub-directory named `.splits` under the 
parent’s `region` directory in HDFS.
+. The RegionServer closes the parent region and marks the region as offline in 
its local data structures. *THE SPLITTING REGION IS NOW OFFLINE.* At this 
point, client requests coming to the parent region will throw 
`NotServingRegionException`. The client will retry with some backoff. The 
closing region is flushed.
+. The RegionServer creates region directories under the `.splits` directory, 
for daughter regions A and B, and creates necessary data structures. Then it 
splits the store files, in the sense that it creates two 
link:http://www.google.com/url?q=http%3A%2F%2Fhbase.apache.org%2Fapidocs%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fio%2FReference.html&sa=D&sntz=1&usg=AFQjCNEkCbADZ3CgKHTtGYI8bJVwp663CA[Reference]
 files per store file in the parent region. Those reference files will point to 
the parent region's files.
+. The RegionServer creates the actual region directory in HDFS, and moves the 
reference files for each daughter.
+. The RegionServer sends a `Put` request to the `.META.` table, to set the 
parent as offline in the `.META.` table and add information about daughter 
regions. At this point, there won’t be individual entries in `.META.` for the 
daughters. Clients will see that the parent region is split if they scan 
`.META.`, but won’t know about the daughters until they appear in `.META.`. 
Also, if this `Put` to `.META.` succeeds, the parent will be effectively split. 
If the RegionServer fails before this RPC succeeds, Master and the next 
RegionServer opening the region will clean dirty state about the region split. After 
the `.META.` update, though, the region split will be rolled-forward by Master.
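
For anyone reproducing the lineLengths result locally, the check can be approximated with the minimal sketch below. It assumes the check looks only at added lines of a unified diff (those starting with a single `+`) and measures the text after the `+` marker; the actual precommit script may count differently. The function name `find_long_added_lines` and the sample patch are hypothetical, used only for illustration.

```python
MAX_LINE_LENGTH = 100  # limit quoted in the lineLengths result above


def find_long_added_lines(patch_text, limit=MAX_LINE_LENGTH):
    """Return the added lines of a unified diff that exceed `limit` characters.

    Assumption: the leading '+' diff marker is not counted toward the length.
    """
    long_lines = []
    for line in patch_text.splitlines():
        # Added lines start with '+'; '+++' is the new-file header, not content.
        if line.startswith("+") and not line.startswith("+++"):
            content = line[1:]
            if len(content) > limit:
                long_lines.append(content)
    return long_lines


# Hypothetical patch: one short added line, one over-long added line,
# and one long context line that must be ignored.
sample_patch = "\n".join([
    "--- a/doc.adoc",
    "+++ b/doc.adoc",
    "+short line",
    "+" + "x" * 101,
    " unchanged " + "y" * 200,
])

offenders = find_long_added_lines(sample_patch)
for text in offenders:
    print(len(text), text[:40] + "...")
```

For the AsciiDoc paragraphs listed above, wrapping each sentence or clause onto its own source line would likely bring every added line under the limit without changing the rendered output.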

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

    {color:red}-1 core tests{color}.  The patch failed these unit tests:
      org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/checkstyle-aggregate.html

Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12792//console

This message is automatically generated.

> Document the phases of the split transaction
> --------------------------------------------
>
>                 Key: HBASE-12425
>                 URL: https://issues.apache.org/jira/browse/HBASE-12425
>             Project: HBase
>          Issue Type: Sub-task
>          Components: documentation
>            Reporter: Andrew Purtell
>            Assignee: Misty Stanley-Jones
>             Fix For: 2.0.0
>
>         Attachments: HBASE-12425.patch, region_split_process.png
>
>
> See PDF document attached to parent issue



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)