[jira] [Commented] (HADOOP-8163) Improve ActiveStandbyElector to provide hooks for fencing old active

Bikas Saha (Commented) (JIRA) Thu, 22 Mar 2012 16:02:45 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236151#comment-13236151
 ]


Bikas Saha commented on HADOOP-8163:
------------------------------------

Thanks for clarifying! I was afraid I would have to look out for code changes 
that might have been covered by the "cleaned up" tests. The new test code looks 
cleaner. Thanks! :) I still have to get my head around the changes to the real 
ZK test :P

- consider renaming zkMostRecentFilePath to something like 
zkMostRecentActiveFilePath. zkMostRecentFilePath is open to being 
misunderstood. Same for MOST_RECENT_FILENAME.

- actually MostRecent seems to be a misnomer to me. I think it actually is 
LockOwnerInfo/LeaderInfo. zkLockOwnerInfoPath/tryDeleteLeaderInfo etc.

- why is ensureBaseNode() needed? In it we are creating a new set of znodes 
with the given zkAcls which may or may not be the correct thing. eg. if the 
admin simply forgot to create the appropriate znode path before starting the 
service it might be ok to fail. Instead of trying to create the path ourselves 
with permissions that may or may not be appropriate for the entire path. I 
would be wary of doing this. What is the use case?

- consider renaming baseNodeExists() to parentNodeExists() or renaming the 
parentZnodeName parameter in the constructor to baseNode for consistency. 
Perhaps this could be called in the constructor to check that the znode exists 
and be done with config issues. No need for ensureBaseNode() above.

- this must be my newbie java skills but I find something like - 
prefixPath.append("/").append(pathParts[index]) or znodeWorkingDir.subString(0, 
znodeWorkingDir.nextIndexOf('/')) - more readable than prefixPath = 
Joiner.on("/").join(Arrays.asList(pathParts).subList(0, i)). It might also be 
more efficient but thats not relevant for this situation.

- public synchronized void quitElection(boolean needFence) - Dont we want to 
delete the permanent znode for standby's too? Why check if state is active. It 
anyways calls a tryDelete* method that should be harmless.

- tryDeleteMostRecentNode() - From my understanding of "tryFunction" - this 
function should be not really be asserting that some state holds. If it should 
assert then we should remove "try" from the name.

- in zkDoWithRetries there is a NUM_RETRIES field that could be used instead of 
3.

- why are we exposing "public synchronized ZooKeeper getZKClient()"? What 
operations does the HA service need to perform that needs direct access to the 
zkClient? I am stressing on this because the Elector is trying to provide an 
interface for leader election and management protocol. Hence, any necessary 
activity should be available through the interface. The client of the Elector 
should not manipulate/read/interact directly with ZK. If it needs ZK for some 
other activity then it could create another ZK client. Making this a public 
method adds it to the public API of the elector and I am wary of doing it.

- the following code seems to have issues
    LOG.info("Old node exists: " + StringUtils.byteToHexString(data));

    appClient.fenceOldActive(data);

    try {
      deleteWithRetries(zkMostRecentFilePath, stat.getVersion());
    } catch (KeeperException e) {
      LOG.error("Hit error " + e + " trying to delete info znode " +
          "after fencing!", e);
      throw e;
    }
say appClient.fenceOldActive(data) is a long running operation (and likely to 
be so). While that is happening, the state of the world changes and this 
elector is not longer the lock owner. When appClient.fenceOldActive(data) will 
complete then the code will go ahead and delete the lockOwnerZnode at 
zkMostRecentFilePath. This node could be from the new leader who had 
successfully fenced and become active. The version number parameter might 
accidentally save us but would likely be 0 all the time.

- what happens if the leader lost the lock, tried to delete its znode, failed 
to do so, exited anyways, then became the next owner and found the existing 
mostrecent znode. I think it will try to fence itself. We could add service 
side logic that checks against this but IMO since the fencing strategy via 
permanent znodes is an artifact of ActiveStandbyElector, the elector should 
itself be guarding against this.

- It would be great to see the functional test with RealZK enhanced to cover 
some of the permanent znode use cases. I remember fixing some issues in my own 
code that were exposed to the reality of the real ZK :P

                
> Improve ActiveStandbyElector to provide hooks for fencing old active
> --------------------------------------------------------------------
>
>                 Key: HADOOP-8163
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8163
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ha
>    Affects Versions: 0.23.3, 0.24.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-8163.txt, hadoop-8163.txt, hadoop-8163.txt
>
>
> When a new node becomes active in an HA setup, it may sometimes have to take 
> fencing actions against the node that was formerly active. This JIRA extends 
> the ActiveStandbyElector which adds an extra non-ephemeral node into the ZK 
> directory, which acts as a second copy of the active node's information. 
> Then, if the active loses its ZK session, the next active to be elected may 
> easily locate the unfenced node to take the appropriate actions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8163) Improve ActiveStandbyElector to provide hooks for fencing old active

Reply via email to