[ 
https://issues.apache.org/jira/browse/HDFS-14059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687093#comment-16687093
 ] 

Plamen Jeliazkov commented on HDFS-14059:
-----------------------------------------

New finding. Let's call it (4):

With `dfs.ha.automatic-failover.enabled=true` still set, when I manually 
transition a Standby that has a co-located ZKFC to Observer, the ZKFC 
automatically converts the Observer back to Standby. The logs look like this:
{code}
2018-11-14 12:29:00,466 ERROR org.apache.hadoop.ha.ZKFailoverController: Local service NameNode at instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:8030 has changed the serviceState to observer. Expected was standby. Quitting election marking fencing necessary.
2018-11-14 12:29:00,466 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election
2018-11-14 12:29:00,468 INFO org.apache.zookeeper.ZooKeeper: Session: 0x1000acb2b350012 closed
2018-11-14 12:29:00,468 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down for session: 0x1000acb2b350012
2018-11-14 12:29:01,469 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com:2181 sessionTimeout=10000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@2992f4e4
2018-11-14 12:29:01,471 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:2181. Will not attempt to authenticate using SASL (unknown error)
2018-11-14 12:29:01,471 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:2181, initiating session
2018-11-14 12:29:01,474 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:2181, sessionid = 0x1000acb2b350013, negotiated timeout = 10000
2018-11-14 12:29:01,475 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2018-11-14 12:29:01,479 INFO org.apache.hadoop.ha.ZKFailoverController: ZK Election indicated that NameNode at instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:8030 should become standby
2018-11-14 12:29:01,503 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at instance-3.pp-devcos-myhadoop.us-central1.gcp.dev.paypalinc.com/10.176.1.207:8030 to standby state
{code}
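
To make the sequence in the log easier to follow, here is a minimal toy model 
of it in Python. This is only a sketch of the behavior observed above, not 
Hadoop's ZKFC code: the names `HAState`, `ToyNameNode`, and 
`ToyFailoverController` are hypothetical, and it assumes another NameNode 
already holds the active lock, so rejoining the election always ends in 
standby.

```python
from dataclasses import dataclass, field
from enum import Enum


class HAState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    OBSERVER = "observer"


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for a NameNode; it only tracks its HA state."""
    state: HAState = HAState.STANDBY


@dataclass
class ToyFailoverController:
    """Toy model of the logged behavior; not the real ZKFailoverController."""
    node: ToyNameNode
    expected: HAState = HAState.STANDBY
    events: list = field(default_factory=list)

    def check_service_state(self) -> None:
        """Compare the reported state with the state the controller expects."""
        if self.node.state == self.expected:
            return
        self.events.append(
            f"state changed to {self.node.state.value}, "
            f"expected {self.expected.value}: quit election"
        )
        self.rejoin_election()

    def rejoin_election(self) -> None:
        # Assumption: another node holds the active lock, so this node
        # loses the election and is told to become standby again.
        self.node.state = HAState.STANDBY
        self.expected = HAState.STANDBY
        self.events.append("rejoined election: transitioned to standby")


nn = ToyNameNode()
zkfc = ToyFailoverController(nn)
nn.state = HAState.OBSERVER   # manual Standby->Observer transition
zkfc.check_service_state()    # controller notices the mismatch
print(nn.state.value)         # prints "standby"
```

In this model the manual transition is undone on the next state check because 
the controller only knows about the state it last assigned, which matches the 
"Expected was standby" message in the log.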

With the ZKFC on the Standby killed, I can transition the NameNode to Observer 
and then create directories and files, list status, cat, etc., as usual.

It seems we need to decide whether to support automatic failover, which means 
changing ZKFC and, possibly, the ZK states, or to not support automatic 
failover but still support ConfiguredFailoverProxyProvider.

> Test reads from standby on a secure cluster with Configured failover
> --------------------------------------------------------------------
>
>                 Key: HDFS-14059
>                 URL: https://issues.apache.org/jira/browse/HDFS-14059
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: test
>            Reporter: Konstantin Shvachko
>            Assignee: Plamen Jeliazkov
>            Priority: Major
>
> Run standard HDFS tests to verify reading from ObserverNode on a secure HA 
> cluster with {{ConfiguredFailoverProxyProvider}}.


