[jira] [Commented] (HADOOP-8163) Improve ActiveStandbyElector to provide hooks for fencing old active

Todd Lipcon (Commented) (JIRA) Thu, 22 Mar 2012 17:20:45 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236213#comment-13236213
 ]


Todd Lipcon commented on HADOOP-8163:
-------------------------------------

ilePath. zkMostRecentFilePath is open to being misunderstood. Same for 
MOST_RECENT_FILENAME.
Done

bq. actually MostRecent seems to be a misnomer to me. I think it actually is 
LockOwnerInfo/LeaderInfo. zkLockOwnerInfoPath/tryDeleteLeaderInfo etc.

It's not always the lock owner, though. Basically, we go through the following 
states:

||Time step||Lock node||MostRecentActive||Description||
|1|-|-|Startup|
|2|Node A|-|Node A acquires active lock
|3|Node A|Node A|..and writes its own info|
|4|-|Node A|A loses its ZK lease|
|5|Node B|Node A|Node B acquires active lock
|6|Node B|-|Node B fences node A|
|7|Node B|Node B|Node B writes its info|

So, in steps 3 and 7, calling it "LeaderInfo" or "LockOwnerInfo" makes sense. 
But in steps 4 and 5, it's the "PreviousLeaderInfo".

Perhaps just renaming to "LeaderBreadcrumb" or something makes more sense, 
since it's basically a bread crumb left around by the previous leader so that 
future leaders know its info.

bq. why is ensureBaseNode() needed? In it we are creating a new set of znodes 
with the given zkAcls which may or may not be the correct thing. eg. if the 
admin simply forgot to create the appropriate znode path before starting the 
service it might be ok to fail. Instead of trying to create the path ourselves 
with permissions that may or may not be appropriate for the entire path. I 
would be wary of doing this. What is the use case?

The use case is a "ZKFailoverController -formatZK" command line tool that I'm 
building into the ZKFC code. The thinking is that most administrators won't 
want to go into the ZK CLI to manually create the parent znode while installing 
HDFS. Instead, they'd rather just issue this simple command. In the case that 
they want to have varying permissions across the path, or some more complicated 
ACL, then they'll have to use the ZK CLI, but for the common case I think this 
will make deployment much simpler.

bq. consider renaming baseNodeExists() to parentNodeExists() or renaming the 
parentZnodeName parameter in the constructor to baseNode for consistency. 
Perhaps this could be called in the constructor to check that the znode exists 
and be done with config issues. No need for ensureBaseNode() above.

Renamed to parentZNodeExists and ensureParentZNode

bq. this must be my newbie java skills but I find something like - 
prefixPath.append("/").append(pathParts[index]) or znodeWorkingDir.subString(0, 
znodeWorkingDir.nextIndexOf('/')) - more readable than prefixPath = 
Joiner.on("/").join(Arrays.asList(pathParts).subList(0, i)). It might also be 
more efficient but thats not relevant for this situation.
Agreed, fixed.

bq. public synchronized void quitElection(boolean needFence) - Dont we want to 
delete the permanent znode for standby's too? Why check if state is active. It 
anyways calls a tryDelete* method that should be harmless.

If the node is standby, then the permanent znode refers to the current 
lockholder. So deleting it would incorrectly signify that whoever is active 
doesn't need to be fenced if it crashes.

bq. tryDeleteMostRecentNode() - From my understanding of "tryFunction" - this 
function should be not really be asserting that some state holds. If it should 
assert then we should remove "try" from the name.

The difference here is this: the assert() guards against programmer error. It 
is a mistake to call this function when you aren't active (see above comment). 
But if there is a ZK error (like the session got lost) it's OK to fail to 
delete it, since it just means that the node will get fenced.

bq. in zkDoWithRetries there is a NUM_RETRIES field that could be used instead 
of 3.
Fixed

bq. why are we exposing "public synchronized ZooKeeper getZKClient()"? 
Removed

bq. the following code seems to have issues... <snip>... While that is 
happening, the state of the world changes and this elector is not longer the 
lock owner. When appClient.fenceOldActive(data) will complete then the code 
will go ahead and delete the lockOwnerZnode at zkMostRecentFilePath. This node 
could be from the new leader who had successfully fenced and become active. The 
version number parameter might accidentally save us but would likely be 0 all 
the time.

This scenario is impossible for the following reason: If the state of the world 
changed and this node was no longer active, the only possible reason for that 
is that the node lost its ZK session lease. If that's the case, then it won't 
be able to issue any further commands from that client (see my conversation 
with Hari above)

bq. what happens if the leader lost the lock, tried to delete its znode, failed 
to do so, exited anyways, then became the next owner and found the existing 
mostrecent znode. I think it will try to fence itself

I'm currently handling this situation at the ZKFailoverController level. Are 
you suggesting we determine "self" by comparing the actual bytewise data of the 
znode, and skip the fence call if it's the same as this elector instance's 
appData? Seems reasonable, just want to clarify.


bq. It would be great to see the functional test with RealZK enhanced to cover 
some of the permanent znode use cases. I remember fixing some issues in my own 
code that were exposed to the reality of the real ZK 

Added a small change to the test to make sure the fencing code got exercised. I 
also plan to add some more integration-level tests using real ZK and the 
ZKFailoverController with that patch.

                
> Improve ActiveStandbyElector to provide hooks for fencing old active
> --------------------------------------------------------------------
>
>                 Key: HADOOP-8163
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8163
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ha
>    Affects Versions: 0.23.3, 0.24.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-8163.txt, hadoop-8163.txt, hadoop-8163.txt
>
>
> When a new node becomes active in an HA setup, it may sometimes have to take 
> fencing actions against the node that was formerly active. This JIRA extends 
> the ActiveStandbyElector which adds an extra non-ephemeral node into the ZK 
> directory, which acts as a second copy of the active node's information. 
> Then, if the active loses its ZK session, the next active to be elected may 
> easily locate the unfenced node to take the appropriate actions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-8163) Improve ActiveStandbyElector to provide hooks for fencing old active

Reply via email to