[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203921#comment-13203921 ]

stack commented on HBASE-5270:
------------------------------

I was taking a look through HBASE-5179 and HBASE-4748 again, the two issues 
that spawned this one (both are, in synopsis, about master failover with a 
concurrent ServerShutdownHandler running).  I have also been looking at 
HBASE-5344 "[89-fb] Scan unassigned region directory on master failover".

HBASE-5179 starts out with the observation that we can miss edits if a server 
is discovered to be dead AFTER master failover has started splitting logs: 
we'll notice it is dead and so assign out its regions before we've had a chance 
to split its logs.  The way fb deals with this in HBASE-5344 is to not process 
zookeeper events that come in during master failover.  They queue them instead 
and only start in on the processing after the master is up.
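
Roughly, that queueing idea is something like the sketch below (class and 
method names are made up for illustration; this is not the actual 89-fb code):

{code:java}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import org.apache.zookeeper.WatchedEvent;

// Illustrative only: defer zk callbacks that arrive while master failover
// is still splitting logs, then drain them once failover completes.
public class DeferredZKEventProcessor {
  private final Queue<WatchedEvent> deferred = new ConcurrentLinkedQueue<WatchedEvent>();
  private volatile boolean failoverInProgress = true;

  public void process(WatchedEvent event) {
    if (failoverInProgress) {
      deferred.add(event);   // remember it; do not expire/assign yet
      return;
    }
    handle(event);
  }

  public void failoverComplete() {
    failoverInProgress = false;
    WatchedEvent event;
    while ((event = deferred.poll()) != null) {
      handle(event);         // now it is safe to process what queued up
    }
  }

  private void handle(WatchedEvent event) {
    // normal expired-server / unassigned-znode handling would go here
  }
}
{code}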

Chunhui does something like this in his original patch by adding any server 
currently being processed by the server shutdown handler to the list of 
regionservers whose logs we should not split.  The fb way of temporarily 
halting the callback processing seems more airtight.
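
For contrast, the Chunhui-style check amounts to something like this (made-up 
method and variable names, just to show the shape of it):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only: on failover, skip splitting logs for any server that a
// ServerShutdownHandler is already processing, since that handler splits them.
public class FailoverLogSplitFilter {
  public List<String> serversToSplit(List<String> deadServersWithLogs,
      Set<String> serversBeingShutdown) {
    List<String> toSplit = new ArrayList<String>();
    for (String serverName : deadServersWithLogs) {
      if (serversBeingShutdown.contains(serverName)) {
        continue; // ServerShutdownHandler owns this server's log splitting
      }
      toSplit.add(serverName);
    }
    return toSplit;
  }
}
{code}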

HBASE-5179 was then extended to include in its scope the processing of servers 
carrying root and meta (HBASE-4748) that crash during master failover.  We need 
to consider the cases where a server crashes AFTER master-failover distributed 
log splitting has started but before we run the verifications of the meta and 
root locations.

Currently we'll expire the server that is unresponsive when we go to verify the 
root and meta locations.  The notion is that the meta regions will then be 
assigned by the server shutdown handler.  The fb technique of turning off 
processing of zk events would mess with our existing handling code here -- and 
I'm not too confident that code is going to do the right thing anyway, since it 
has no tests of this predicament and the scenarios look like they could be 
pretty varied (only root is offline, only the meta server has crashed, a server 
carrying both root and meta has crashed, etc).  In HBASE-5344, fb will go query 
each regionserver for the regions it is currently hosting (and look in zk to 
see which regionservers are up).  Maybe we need some of this from 89-fb in 
trunk, but I'm not clear on it just yet; it would need more study of the 
current state of trunk and then of what is happening over in 89-fb.
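
As I understand it, that 89-fb style census amounts to something like the 
sketch below (RegionServerQuery here is a made-up stand-in for whatever RPC 
interface the real code uses; it is not an actual HBase interface):

{code:java}
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: ask each regionserver that zk says is alive what regions
// it currently hosts; anything not reported back is a candidate for assignment.
public class FailoverRegionCensus {

  // Made-up stand-in for the real RPC interface to a regionserver.
  public interface RegionServerQuery {
    List<String> getOnlineRegionNames() throws IOException;
  }

  public Map<String, List<String>> collect(Map<String, RegionServerQuery> liveServers)
      throws IOException {
    Map<String, List<String>> hostedByServer = new HashMap<String, List<String>>();
    for (Map.Entry<String, RegionServerQuery> entry : liveServers.entrySet()) {
      hostedByServer.put(entry.getKey(), entry.getValue().getOnlineRegionNames());
    }
    return hostedByServer;
  }
}
{code}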

One thing I think we should do to lessen the number of code paths we can take 
on failover is the long-talked-of purge of the root region.  That should cut 
down on the number of states we need to deal with and make failure states on 
failover easier to reason about.
                
> Handle potential data loss due to concurrent processing of processFaileOver 
> and ServerShutdownHandler
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5270
>                 URL: https://issues.apache.org/jira/browse/HBASE-5270
>             Project: HBase
>          Issue Type: Sub-task
>          Components: master
>            Reporter: Zhihong Yu
>             Fix For: 0.94.0, 0.92.1
>
>
> This JIRA continues the effort from HBASE-5179. Starting with Stack's 
> comments about patches for 0.92 and TRUNK:
> Reviewing 0.92v17
> isDeadServerInProgress is a new public method in ServerManager but it does 
> not seem to be used anywhere.
> Does isDeadRootServerInProgress need to be public? Ditto for meta version.
> This method's param names are not right: 'definitiveRootServer'; what is meant 
> by definitive? Do they need this qualifier?
> Is there anything in place to stop us expiring a server twice if it's carrying 
> root and meta?
> What is the difference between asking the assignment manager isCarryingRoot 
> and this variable that is passed in? Should be doc'd at least. Ditto for meta.
> I think I've asked for this a few times - onlineServers needs to be 
> explained... either in javadoc or in comment. This is the param passed into 
> joinCluster. How does it arise? I think I know but am unsure. God love the 
> poor noob that comes awandering this code trying to make sense of it all.
> It looks like we get the list by trawling zk for regionserver znodes that 
> have not checked in. Don't we do this operation earlier in master setup? Are 
> we doing it again here?
> Though distributed log splitting is configured, we will do single-process 
> splitting in the master under some conditions with this patch. It's not 
> explained in the code why we would do this. Why do we think master log 
> splitting 'high priority' when it could very well be slower? Should we only go 
> this route if distributed splitting is not going on? Do we know if concurrent 
> distributed log splitting and master splitting works?
> Why would we have dead servers in progress here in master startup? Because a 
> servershutdownhandler fired?
> This patch is different from the patch for 0.90. It should go into trunk first 
> with tests, then 0.92. Should it be in this issue? This issue is really hard 
> to follow now. Maybe this issue is for 0.90.x and a new issue for more work on 
> this trunk patch?
> This patch needs to have the v18 differences applied.


        
