[ 
https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080402#comment-13080402
 ] 

ramkrishna.s.vasudevan commented on HBASE-4015:
-----------------------------------------------

@Ted,
I got your point.  Actually, we are planning to look up the state once again 
after setting it, so that we can confirm the intended state has really been 
set in ZK.
If the master tries to change the state to RE_ALLOCATE, it will issue the 
command to ZK.  By that time the RS may already have changed it to OPENING.

Now the master will check once again whether the state is RE_ALLOCATE.  If 
yes, the operation is successful; if not (the RS has changed it to OPENING), 
the master will update its in-memory state to OPENING and wait for this state 
to change.
The same applies to the RS.
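
To make the set-then-look-up-again idea concrete, here is a minimal sketch.  
It does not use the real ZK client: FakeZNode, MasterView and setAndVerify are 
hypothetical names invented for illustration, and the "interleaved" callback 
only models whatever the RS might do between the master's write and its 
second lookup.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ReadBackSketch {
    enum State { RE_ALLOCATE, OPENING }

    // Hypothetical in-memory stand-in for the region's znode data (not the ZK API).
    static final class FakeZNode {
        private final AtomicReference<State> data = new AtomicReference<>();
        void set(State s) { data.set(s); }
        State get() { return data.get(); }
    }

    // Hypothetical master-side view of a single region.
    static final class MasterView {
        State inMemory;

        // Writes the intended state, then looks the state up again.
        // Whatever is found becomes the in-memory state; the return value
        // says whether the intended state is the one that is really set.
        boolean setAndVerify(FakeZNode node, State intended, Runnable interleaved) {
            node.set(intended);
            interleaved.run();          // models a concurrent change by the RS
            State actual = node.get();  // the second lookup
            inMemory = actual;
            return actual == intended;
        }
    }

    public static void main(String[] args) {
        FakeZNode node = new FakeZNode();
        MasterView master = new MasterView();
        // The RS moves the node to OPENING right after the master's write.
        boolean ok = master.setAndVerify(node, State.RE_ALLOCATE,
                () -> node.set(State.OPENING));
        System.out.println(ok + " " + master.inMemory); // prints "false OPENING"
    }
}
```

In this sketch the master ends up with OPENING in memory and would then wait 
on that state, as described above.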

(The check can also be done on versions, by comparing the version read back 
from ZK with the version that the Master or RS intended to set.)
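
A rough sketch of the version comparison, again with an in-memory stand-in 
rather than the real client.  VersionedNode and writeWasLast are hypothetical 
names; the only behaviour assumed from ZK is that every successful write bumps 
the node's data version by one.

```java
public class VersionCheckSketch {
    // Hypothetical stand-in for a znode whose version grows by one on every write.
    static final class VersionedNode {
        private String data = "";
        private int version = 0;

        // Stores the data and returns the version produced by this write.
        synchronized int set(String newData) {
            data = newData;
            return ++version;
        }

        synchronized int version() { return version; }
        synchronized String data() { return data; }
    }

    // Writes 'intended', lets 'interleaved' run (models the other party),
    // then compares the version now on the node with the version our own
    // write produced.  A mismatch means someone else wrote after us.
    static boolean writeWasLast(VersionedNode node, String intended, Runnable interleaved) {
        int mine = node.set(intended);
        interleaved.run();
        return node.version() == mine;
    }

    public static void main(String[] args) {
        VersionedNode node = new VersionedNode();
        boolean last = writeWasLast(node, "RE_ALLOCATE", () -> node.set("OPENING"));
        System.out.println(last + " " + node.data()); // prints "false OPENING"
    }
}
```

The real ZooKeeper client also accepts an expected version on setData, so a 
conditional write is another way to get a similar guarantee.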

We did something similar in one of the ZK leader election algorithms.  One guy 
creates a sequential node and knows which node he created.
If some other guy created another sequential node under the same path just 
before the first guy did, the first guy's node id will be greater than the 
second guy's, so the second guy wins the race.
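
The sequential-node race can be sketched like this.  FakeParent, 
createSequential and isLeader are hypothetical names used only for 
illustration; a counter stands in for the sequence numbers that ZK assigns to 
sequential nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SequentialElectionSketch {
    // Hypothetical stand-in for the parent path under which sequential nodes are created.
    static final class FakeParent {
        private int nextSeq = 0;
        private final List<Integer> children = new ArrayList<>();

        // Creates a child with the next sequence number and returns its id,
        // so the creator knows exactly which node is its own.
        synchronized int createSequential() {
            int id = nextSeq++;
            children.add(id);
            return id;
        }

        synchronized int lowestChild() {
            return Collections.min(children);
        }
    }

    // A participant wins only if its own node has the smallest id.
    static boolean isLeader(FakeParent parent, int myId) {
        return parent.lowestChild() == myId;
    }

    public static void main(String[] args) {
        FakeParent parent = new FakeParent();
        int secondGuy = parent.createSequential(); // got in just before
        int firstGuy = parent.createSequential();  // so this id is greater
        System.out.println(isLeader(parent, firstGuy));  // prints "false"
        System.out.println(isLeader(parent, secondGuy)); // prints "true"
    }
}
```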

Is it fine, Ted?  Do correct me if this is wrong.  Also, if I can figure out 
something better I will post it.  Thanks, Ted.


> Refactor the TimeoutMonitor to make it less racy
> ------------------------------------------------
>
>                 Key: HBASE-4015
>                 URL: https://issues.apache.org/jira/browse/HBASE-4015
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 0.90.3
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: Timeoutmonitor with state diagrams.pdf
>
>
> The current implementation of the TimeoutMonitor acts like a race condition 
> generator, mostly making things worse rather than better. It does its own 
> thing for a while without caring for what's happening in the rest of the 
> master.
> The first thing that needs to happen is that the regions should not be 
> processed in one big batch, because that sometimes can take minutes to 
> process (meanwhile a region that timed out opening might have opened, then 
> what happens is it will be reassigned by the TimeoutMonitor generating the 
> never ending PENDING_OPEN situation).
> Those operations should also be done more atomically, although I'm not sure 
> how to do it in a scalable way in this case.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
