[
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18045677#comment-18045677
]
ASF GitHub Bot commented on HDFS-14963:
---------------------------------------
github-actions[bot] commented on PR #1700:
URL: https://github.com/apache/hadoop/pull/1700#issuecomment-3663022050
We're closing this stale PR because it has been open for 100 days with no
activity. This isn't a judgement on the merit of the PR in any way. It's just a
way of keeping the PR queue manageable.
If you feel like this was a mistake, or you would like to continue working
on it, please feel free to re-open it and ask for a committer to remove the
stale tag and review again.
Thanks all for your contribution.
> Add HDFS Client machine caching active namenode index mechanism.
> ----------------------------------------------------------------
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 3.1.3
> Reporter: Xudong Cao
> Assignee: Xudong Cao
> Priority: Minor
> Labels: multi-sbnn
>
> In a multi-NameNode deployment, a new HDFS client always sends its first RPC
> call to the 1st NameNode, then simply polls the NameNodes in order until it
> finds the current Active NameNode.
> This causes at least two problems:
> # Extra failover overhead, especially when clients are created frequently.
> # Unnecessary log output. Suppose there are 3 NNs and the 3rd is the ANN. A
> client that starts its RPC with the 1st NN is silent when it fails over from
> the 1st NN to the 2nd NN, but when it fails over from the 2nd NN to the 3rd
> NN it prints unnecessary logs. In some scenarios these logs are very
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
>     at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>     ...{code}
> We can address this as follows: on the client machine, for every HDFS
> cluster, cache the index of its current Active NameNode in a separate cache
> file named after the cluster's URI. *Note that these cache files are shared by
> all HDFS client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client
> machine's cache file directory is /tmp. Then:
> # the ns1 cluster related cache file is /tmp/ns1
> # the ns2 cluster related cache file is /tmp/ns2
> And then:
> # When a client starts, it reads the current Active NameNode index from the
> cache file corresponding to the target HDFS URI, and then makes its RPC call
> directly to the ANN.
> # After each failover, the client writes the latest Active NameNode index to
> the cache file corresponding to the target HDFS URI.
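
The read and write steps described in the issue can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the code
from PR #1700 and not an existing Hadoop API: the class
ActiveNameNodeIndexCache, its method names, and the nameNodeCount parameter
are hypothetical. Writing through a temporary file followed by a rename is
also an assumption, chosen here because the cache files are shared by all
client processes on the machine.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration; not part of the Hadoop client. */
final class ActiveNameNodeIndexCache {
    private final Path cacheDir;

    ActiveNameNodeIndexCache(Path cacheDir) {
        this.cacheDir = cacheDir;
    }

    /** Maps e.g. hdfs://ns1 to <cacheDir>/ns1 (assumes a logical nameservice URI). */
    Path cacheFileFor(URI clusterUri) {
        return cacheDir.resolve(clusterUri.getAuthority());
    }

    /**
     * Returns the cached Active NameNode index, or 0 (the 1st NameNode) when
     * the cache file is missing, unreadable, or holds an out-of-range value.
     */
    int readActiveIndex(URI clusterUri, int nameNodeCount) {
        try {
            byte[] bytes = Files.readAllBytes(cacheFileFor(clusterUri));
            int index = Integer.parseInt(new String(bytes, StandardCharsets.UTF_8).trim());
            return (index >= 0 && index < nameNodeCount) ? index : 0;
        } catch (IOException | NumberFormatException e) {
            return 0; // fall back to the existing behaviour: start from the 1st NN
        }
    }

    /** Best-effort write after a failover; a failure here must not break the RPC path. */
    void writeActiveIndex(URI clusterUri, int activeIndex) {
        Path target = cacheFileFor(clusterUri);
        try {
            Files.createDirectories(cacheDir);
            // Write to a temp file in the same directory, then rename it into place,
            // so that other client processes do not observe a partially written file.
            Path tmp = Files.createTempFile(cacheDir, target.getFileName().toString(), ".tmp");
            Files.write(tmp, Integer.toString(activeIndex).getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, target,
                    StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            // Ignored in this sketch: the cache is only an optimisation.
        }
    }
}
```

With a cache directory of /tmp, a client for hdfs://ns1 would call
readActiveIndex(URI.create("hdfs://ns1"), 3) before its first RPC and
writeActiveIndex(URI.create("hdfs://ns1"), newIndex) after each failover. A
stale cached value costs at most the same polling the client does today.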