[ 
https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554617#comment-13554617
 ] 

Chris Trezzo commented on HBASE-2611:
-------------------------------------

[~hvash...@cs.ualberta.ca] Hmm, I may have misspoken... atomic was not the 
right word choice.

bq. But the retries in RecoverableZookeeper are not atomic... if the region 
server fails in the middle of RecoverableZooKeeper.multi, the queues will not 
get transferred.

I see that as long as a multi hasn't succeeded, all region servers will 
continue to try to fail over the queues. So the real problem seems to be that 
if all region servers exhaust their multi retries, the queues would get lost.
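
To make that scenario concrete, here is a rough sketch of how I am picturing 
it. This is not the actual HBase or RecoverableZooKeeper code: QueueStore, 
claim, tryFailover, failover and maxRetries are all hypothetical names, and an 
in-memory compare-and-set stands in for the ZooKeeper multi.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

public class QueueFailoverSketch {

    /** Hypothetical stand-in for the ZooKeeper state holding a dead server's queues. */
    public static final class QueueStore {
        private final AtomicReference<String> owner = new AtomicReference<>(null);
        private final BooleanSupplier available; // assumption: models whether the store responds

        public QueueStore(BooleanSupplier available) {
            this.available = available;
        }

        /**
         * All-or-nothing claim, loosely modelling a multi: either the caller
         * becomes the owner of all queues, or nothing changes.
         */
        public boolean claim(String server) {
            if (!available.getAsBoolean()) {
                throw new IllegalStateException("store unavailable");
            }
            return owner.compareAndSet(null, server);
        }

        public String owner() {
            return owner.get();
        }
    }

    /** One survivor retries a bounded number of times; true means the queues now have an owner. */
    public static boolean tryFailover(QueueStore store, String server, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            try {
                store.claim(server);
                return true; // either this server won, or another one already had
            } catch (IllegalStateException e) {
                // treated as a transient failure: fall through and retry
            }
        }
        return false; // this survivor has exhausted its retries
    }

    /** Returns the new owner, or null if every survivor exhausted its retries. */
    public static String failover(QueueStore store, String[] survivors, int maxRetries) {
        for (String server : survivors) {
            if (tryFailover(store, server, maxRetries)) {
                return store.owner();
            }
        }
        return null; // the lost-queue case: nobody managed to claim the queues
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Store fails its first five calls, then recovers.
        QueueStore flaky = new QueueStore(() -> ++calls[0] > 5);
        System.out.println(failover(flaky, new String[] {"rs1", "rs2"}, 3));

        // Store never responds: every survivor gives up.
        QueueStore down = new QueueStore(() -> false);
        System.out.println(failover(down, new String[] {"rs1", "rs2"}, 3));
    }
}
```

In this toy model the queues are only lost when the store stays unavailable 
for longer than the combined retry budget of every surviving server.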

Is there ever a case in practice where we would run into this while ZooKeeper 
is not down?
                
> Handle RS that fails while processing the failure of another one
> ----------------------------------------------------------------
>
>                 Key: HBASE-2611
>                 URL: https://issues.apache.org/jira/browse/HBASE-2611
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Replication
>            Reporter: Jean-Daniel Cryans
>            Assignee: Himanshu Vashishtha
>             Fix For: 0.94.5
>
>         Attachments: HBase-2611-upstream-v1.patch, HBASE-2611-v2.patch
>
>
> HBASE-2223 doesn't manage region servers that fail while doing the transfer 
> of HLogs queues from other region servers that failed. Devise a reliable way 
> to do it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
