[ 
https://issues.apache.org/jira/browse/HDFS-14961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978250#comment-16978250
 ] 

Ayush Saxena commented on HDFS-14961:
-------------------------------------

In this case, when the Namenode joined the election it was only a Standby 
Namenode, and a Standby is very much allowed to participate in the election. 
At the time it starts participating, we can't predict that the Namenode will 
have turned to OBSERVER by the time the result arrives.
I think the case isn't that an OBSERVER participated in the election; it just 
received the result of its previous participation.
I think this is very much doable on the Namenode side itself, by checking 
whether the state is OBSERVER and the instruction is from ZKFC, since an 
OBSERVER doesn't need to participate. The ZKFC worked as normal and as 
intended. Staying on the culprit side (the Namenode side) feels a little safer 
to me, as it states the idea directly: the ONN ignores any ZKFC calls because 
it is prohibited from participating in the ZK election.
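
To illustrate the Namenode-side check, here is a minimal, self-contained 
sketch. It is not the attached patch and does not use the real Hadoop classes: 
HAState, RequestSource, NameNodeStub and transitionTo are hypothetical names 
made up for this example. It only shows the rule that a state change is 
ignored when the node is an OBSERVER and the request comes from ZKFC.

```java
import java.util.Objects;

// Hypothetical stand-in types; these are NOT the Hadoop HA classes.
final class ObserverGuardSketch {

    enum HAState { ACTIVE, STANDBY, OBSERVER }

    // Who issued the state-change request (assumed to be known to the node).
    enum RequestSource { USER, ZKFC }

    static final class NameNodeStub {
        private HAState state;

        NameNodeStub(HAState initial) {
            this.state = Objects.requireNonNull(initial);
        }

        HAState getState() {
            return state;
        }

        // Returns true if the transition was applied, false if it was ignored.
        boolean transitionTo(HAState target, RequestSource source) {
            // An Observer takes no part in the ZK election, so a late
            // election result delivered by ZKFC must not change its state.
            if (state == HAState.OBSERVER && source == RequestSource.ZKFC) {
                return false;
            }
            state = Objects.requireNonNull(target);
            return true;
        }
    }

    public static void main(String[] args) {
        NameNodeStub nn = new NameNodeStub(HAState.STANDBY);
        // The admin turns the Standby into an Observer.
        nn.transitionTo(HAState.OBSERVER, RequestSource.USER);
        // A stale election result arrives through ZKFC and is ignored.
        boolean applied = nn.transitionTo(HAState.STANDBY, RequestSource.ZKFC);
        System.out.println("applied=" + applied + ", state=" + nn.getState());
    }
}
```

In this sketch a user-issued request can still move the Observer back to 
Standby; only ZKFC-originated requests are dropped, and the ZKFC itself is 
left unchanged.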

> Prevent ZKFC changing Observer Namenode state
> ---------------------------------------------
>
>                 Key: HDFS-14961
>                 URL: https://issues.apache.org/jira/browse/HDFS-14961
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Íñigo Goiri
>            Assignee: Ayush Saxena
>            Priority: Major
>         Attachments: HDFS-14961-01.patch, HDFS-14961-02.patch
>
>
> HDFS-14130 made ZKFC aware of the Observer Namenode and hence allows ZKFC 
> to run alongside the Observer node.
> The Observer Namenode isn't supposed to be part of the ZKFC election process.
> But if the Namenode was part of the election before being turned into an 
> Observer by the transitionToObserver command, the ZKFC still sends 
> instructions to the Namenode as a result of the previous participation, and 
> these sometimes change the state from Observer to Standby.
> This is also the reason for the failure in TestDFSZKFailoverController.
> TestDFSZKFailoverController has been consistently failing with a timeout 
> waiting in testManualFailoverWithDFSHAAdmin(), in particular at 
> {{waitForHAState(1, HAServiceState.OBSERVER);}}.


