[jira] [Updated] (HADOOP-9183) Potential deadlock in ActiveStandbyElector

Tom White (JIRA) Mon, 07 Jan 2013 09:20:12 -0800

     [ 
https://issues.apache.org/jira/browse/HADOOP-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Tom White updated HADOOP-9183:
------------------------------

    Attachment: HADOOP-9183.patch

This patch fixes the problem with two changes. First, the queue of events 
in WatcherWithClientRef is removed; instead, the process method blocks 
until the ZK object has been set on the watcher. This should be acceptable 
because the set operation is a simple method call, so the overhead is 
minimal. Second, the locking order ActiveStandbyElector -> 
WatcherWithClientRef is enforced to prevent cycles.
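
As a rough illustration of the first change, here is a minimal sketch of a 
watcher whose process method waits for its client reference instead of 
queueing events. It is not the code from the patch: FakeClient, 
BlockingWatcher, setClient and demo are hypothetical names, and the real 
ZooKeeper types are replaced by stand-ins so the example runs on its own.

```java
// Hypothetical stand-in for the ZooKeeper client object; not the real API.
class FakeClient {
    final String name;

    FakeClient(String name) {
        this.name = name;
    }
}

// Sketch of a watcher that blocks in process() until its client is set,
// rather than buffering events in a queue. Names are assumptions.
class BlockingWatcher {
    private FakeClient client; // guarded by this

    // Called by the owner once the client object has been constructed.
    synchronized void setClient(FakeClient c) {
        client = c;
        notifyAll(); // wake any event thread waiting in process()
    }

    // Called on the event thread. Blocks until setClient() has run.
    String process(String event) throws InterruptedException {
        FakeClient c;
        synchronized (this) {
            while (client == null) {
                wait();
            }
            c = client;
        }
        // The event is handled outside the watcher lock.
        return event + "@" + c.name;
    }

    // Small driver: deliver an event, then set the client.
    static String demo() {
        BlockingWatcher w = new BlockingWatcher();
        String[] result = new String[1];
        Thread eventThread = new Thread(() -> {
            try {
                result[0] = w.process("SyncConnected");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        eventThread.start();
        w.setClient(new FakeClient("zk-1"));
        try {
            eventThread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}
```

Whether the event arrives before or after setClient runs, process returns 
only once the client reference is available, so no event is handled against 
a missing client and no queue is needed.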

Note also that the CountDownLatch can safely have its countDown() method called 
outside the synchronized section (which exists to protect the ZK field). Indeed 
it must, since getNewZooKeeper holds the ActiveStandbyElector object lock 
while it waits for the ZK connection event. This means that the event cannot be 
processed until the lock is released (this is already the current behaviour), 
but we still need to signal that the connect event was received.
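
The ordering described above can be sketched as follows. This is an 
illustration under assumptions, not the patch itself: ConnectSketch, connect, 
watcherLock and run are hypothetical names, the object's own monitor stands 
in for the ActiveStandbyElector lock, and a String stands in for the ZK 
object.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative and not taken from the patch.
class ConnectSketch {
    private final CountDownLatch connectEvent = new CountDownLatch(1);
    private final Object watcherLock = new Object();
    private String client;    // guarded by watcherLock (stands in for the ZK field)
    private String processed; // guarded by this (stands in for the elector lock)
    private Thread eventThread;

    // Runs on the event thread.
    void process(String event) throws InterruptedException {
        // Signal receipt first, holding no lock, so connect() can proceed.
        connectEvent.countDown();
        String c;
        synchronized (watcherLock) {
            while (client == null) {
                watcherLock.wait();
            }
            c = client;
        }
        // Taken only after watcherLock is released, so the two locks are
        // never nested in the order watcher -> elector.
        synchronized (this) {
            processed = event + "@" + c;
        }
    }

    // Holds the elector lock for the whole call, like the situation the
    // comment describes for getNewZooKeeper.
    synchronized boolean connect(long timeoutMs) throws InterruptedException {
        eventThread = new Thread(() -> {
            try {
                process("SyncConnected");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        eventThread.start();
        boolean signalled = connectEvent.await(timeoutMs, TimeUnit.MILLISECONDS);
        synchronized (watcherLock) { // lock order: elector -> watcher
            client = "zk-1";
            watcherLock.notifyAll();
        }
        return signalled;
    }

    synchronized String processed() {
        return processed;
    }

    static String run() {
        ConnectSketch s = new ConnectSketch();
        try {
            if (!s.connect(2000)) {
                return "timeout";
            }
            s.eventThread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
        return s.processed();
    }
}
```

If countDown() were instead called only after the event thread had acquired 
the elector lock, connect would wait out its full timeout, because it holds 
that lock for as long as it waits on the latch.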
                
> Potential deadlock in ActiveStandbyElector
> ------------------------------------------
>
>                 Key: HADOOP-9183
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9183
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.0.2-alpha
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: 2_jcarder_result_1.png, 3_jcarder_result_0.png, 
> HADOOP-9183.patch
>
>
> A jcarder run found a potential deadlock in the locking of 
> ActiveStandbyElector and ActiveStandbyElector.WatcherWithClientRef. No 
> deadlock has been seen in practice; this is just a theoretical possibility at 
> the moment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
