[jira] [Commented] (HBASE-5196) Failure in region split after PONR could cause region hole

jirapos...@reviews.apache.org (Commented) (JIRA) Fri, 13 Jan 2012 11:33:16 -0800

    [ https://issues.apache.org/jira/browse/HBASE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185787#comment-13185787 ]

jirapos...@reviews.apache.org commented on HBASE-5196:
------------------------------------------------------

bq.  On 2012-01-13 19:18:26, Michael Stack wrote:
bq.  > +1 on patch so far.  In the issue, when you say 'if master does not get a 
chance to fix it', when is that?  Doesn't the master do it when it comes online?  
Good stuff Jimmy.
bq.  
bq.  Jimmy Xiang wrote:
bq.      There are only 3 threads to do the cleanup.  If lots of region servers 
(most of the cluster) die, the shutdown handler may be stuck in log splitting 
for quite some time. During this period,
bq.      if the master dies somehow, it won't be able to finish the cleanup.  
In my case, I ran testLoadAndVerify and it brought HDFS to its knees. So I 
restarted the cluster and
bq.      ended up with lots of holes in the region chain.

Makes sense.

- Michael

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3488/#review4363
-----------------------------------------------------------

On 2012-01-13 19:11:36, Jimmy Xiang wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3488/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-01-13 19:11:36)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  When the master starts up, this patch tries to scan all offline split 
parents and fix up missing daughters as the ServerShutdownHandler does.
bq.  
bq.  
bq.  This addresses bug HBASE-5196.
bq.      https://issues.apache.org/jira/browse/HBASE-5196
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java cb2f084 
bq.    src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java 41f5dff 
bq.  
bq.  Diff: https://reviews.apache.org/r/3488/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  I tested the fix on my real cluster and it does fix the problem.
bq.  
bq.  I am working on a unit test now.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jimmy
bq.  
bq.

> Failure in region split after PONR could cause region hole
> ----------------------------------------------------------
>
>                 Key: HBASE-5196
>                 URL: https://issues.apache.org/jira/browse/HBASE-5196
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>
> If a region split fails after the point of no return (PONR), it relies on 
> the master's ServerShutdownHandler to fix it.  However, if the master doesn't 
> get a chance to fix it, there will be a hole in the region chain.
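
To make "hole in the region chain" concrete, here is a minimal sketch of how such a gap could be detected. It is not the patch under review and does not use HBase's API: the `RegionChainCheck` and `Region` classes, the `offlineSplitParent` flag, and the string row keys are all hypothetical stand-ins for catalog entries. The assumption is that an offline split parent no longer serves its key range, so if its daughters were never added, that range is left uncovered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of a region catalog; not HBase code. */
final class RegionChainCheck {

    /** Illustrative catalog entry covering the key range [startKey, endKey). */
    static final class Region {
        final String startKey;            // "" means start of table
        final String endKey;              // "" means end of table
        final boolean offlineSplitParent; // parent taken offline by a split

        Region(String startKey, String endKey, boolean offlineSplitParent) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.offlineSplitParent = offlineSplitParent;
        }
    }

    /** Returns each uncovered key range as a "[from, to)" string. */
    static List<String> findHoles(List<Region> catalog) {
        // Offline split parents do not serve data, so skip them.
        List<Region> online = new ArrayList<>();
        for (Region r : catalog) {
            if (!r.offlineSplitParent) {
                online.add(r);
            }
        }
        online.sort((a, b) -> a.startKey.compareTo(b.startKey));

        List<String> holes = new ArrayList<>();
        String expectedStart = "";   // next key that must be covered
        boolean coveredToEnd = false;
        for (Region r : online) {
            if (coveredToEnd) {
                break;
            }
            if (r.startKey.compareTo(expectedStart) > 0) {
                holes.add("[" + expectedStart + ", " + r.startKey + ")");
            }
            if (r.endKey.isEmpty()) {
                coveredToEnd = true;
            } else if (r.endKey.compareTo(expectedStart) > 0) {
                expectedStart = r.endKey;
            }
        }
        if (!coveredToEnd) {
            holes.add("[" + expectedStart + ", )");
        }
        return holes;
    }

    public static void main(String[] args) {
        // Parent [b, d) went offline at the PONR, but its daughters are missing.
        List<Region> catalog = Arrays.asList(
                new Region("", "b", false),
                new Region("b", "d", true),
                new Region("d", "", false));
        System.out.println(findHoles(catalog));
    }
}
```

If the two daughters (for example `[b, c)` and `[c, d)`) were present as online entries, the same check would report no holes, which is the state the fix-up is meant to restore.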

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
