[ 
https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082559#comment-13082559
 ] 

stack commented on HBASE-4015:
------------------------------

bq. But we thought of introducing a new state so that there is a clear 
distinction whether reallocation has happened or not and also handling of the 
new state may be cleaner than changing the behaviour in the existing state.

I would not worry about changing the current states.  All of this stuff is 
transient in zk, and the change will be in 0.92, which requires a restart 
anyway, so go ahead and change the states.

Adding a new REALLOC state seems gratuitous (I don't see the difference from 
OFFLINE; OFFLINE+servername might help in some cases).  More states make it 
harder to chase down all transition scenarios.

Looking at the diagram again, I'm not sure it addresses the issue.

Do we even need the new state to address the core timeout monitor race issue?  
The regionserver already is careful about checking states AND version number; 
i.e. if not expected state it will give up on opening or if not expected 
version it will close a region it has already opened.
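
To make the state AND version check concrete, here is a minimal sketch in 
Java.  It is not the HBase code: VersionedZNode, Snapshot, force and 
transition are hypothetical names, and the znode is modelled in memory rather 
than through the ZooKeeper client.  The only real behaviour it leans on is 
that a znode's version goes up on every write, so a stale writer can be 
detected.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory stand-in for a region's znode: a state string plus a
// version that increases by one on every successful write.
final class VersionedZNode {

    // Immutable snapshot of the znode contents.
    static final class Snapshot {
        final String state;
        final int version;

        Snapshot(String state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> current;

    VersionedZNode(String initialState) {
        current = new AtomicReference<>(new Snapshot(initialState, 0));
    }

    Snapshot read() {
        return current.get();
    }

    // Unconditional write (assumption: models the master forcing a state).
    void force(String newState) {
        current.updateAndGet(s -> new Snapshot(newState, s.version + 1));
    }

    // Conditional write: succeeds only if BOTH the state and the version are
    // what the caller last saw.  Returns the new version, or -1 on mismatch.
    int transition(String expectedState, int expectedVersion, String newState) {
        Snapshot seen = current.get();
        if (!seen.state.equals(expectedState) || seen.version != expectedVersion) {
            return -1;
        }
        Snapshot next = new Snapshot(newState, seen.version + 1);
        return current.compareAndSet(seen, next) ? next.version : -1;
    }
}
```

In this sketch a regionserver that moved the node OFFLINE to OPENING at 
version 1 and then finds its OPENING to OPENED transition rejected (because 
something bumped the version in between) knows it has been preempted and can 
give up or close the region.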

The core problem, as per J-D above, is that state transitions happen fine out 
on the regionserver but the master lags in processing them.  Meantime the 
timeout monitor runs and, since it has not seen the transition (which is 
likely still in the queue to be processed), presumes the open has stalled and 
preempts the znode, setting it OFFLINE.
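
The lag can be shown with a deliberately simplified, single-threaded sketch. 
Everything here is hypothetical (TimeoutRaceSketch and its fields and methods 
are made up for illustration, and the real master is multi-threaded with far 
more state); the point is only that the monitor decides from the master's 
view while the znode has already moved on.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical model of the race: the znode, the master's in-memory view of
// the region, and the queue of transitions the master has not yet processed.
final class TimeoutRaceSketch {
    String znodeState = "OFFLINE";       // what is actually in zk
    String masterView = "PENDING_OPEN";  // what the master currently believes
    final Queue<String> unprocessed = new ArrayDeque<>();

    // The regionserver updates the znode; the master only learns of it later.
    void regionServerTransition(String newState) {
        znodeState = newState;
        unprocessed.add(newState);
    }

    // The master handles one queued transition and updates its view.
    void masterProcessOne() {
        String event = unprocessed.poll();
        if (event != null) {
            masterView = event;
        }
    }

    // The monitor consults only the master's view.  If that view is stale it
    // forces the znode OFFLINE even though the region has already opened.
    // Returns true if it preempted the znode.
    boolean timeoutMonitorRun() {
        if (masterView.equals("PENDING_OPEN")) {
            znodeState = "OFFLINE";
            return true;
        }
        return false;
    }
}
```

If the regionserver reports OPENING and then OPENED and the monitor runs 
before masterProcessOne() has drained the queue, the znode is reset to 
OFFLINE.  If the master drains the queue first, the monitor leaves the region 
alone.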





> Refactor the TimeoutMonitor to make it less racy
> ------------------------------------------------
>
>                 Key: HBASE-4015
>                 URL: https://issues.apache.org/jira/browse/HBASE-4015
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 0.90.3
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4015_1_trunk.patch, Timeoutmonitor with state 
> diagrams.pdf
>
>
> The current implementation of the TimeoutMonitor acts like a race condition 
> generator, mostly making things worse rather than better. It does its own 
> thing for a while without caring for what's happening in the rest of the 
> master.
> The first thing that needs to happen is that the regions should not be 
> processed in one big batch, because that sometimes can take minutes to 
> process (meanwhile a region that timed out opening might have opened, and 
> it will then be reassigned by the TimeoutMonitor, generating the 
> never-ending PENDING_OPEN situation).
> Those operations should also be done more atomically, although I'm not sure 
> how to do it in a scalable way in this case.


        
