Krzysztof Adamski created HDFS-14013: ----------------------------------------
Summary: HDFS ZKFC on standby NN not starting with credentials stored in hdfs Key: HDFS-14013 URL: https://issues.apache.org/jira/browse/HDFS-14013 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Affects Versions: 3.1.1 Reporter: Krzysztof Adamski Attachments: hadoop-hdfs-zkfc-server1.log HDFS ZKFailoverController is not starting correctly on strandby NameNode when credential provideris stored in hdfs. Removing credential provider entry from core-site helps. See full exception stack attached. It looks like if it only checks the credentials on the same host namenode, not redirects to the active one. It may make sense to delay credential check after active namenode is elected and redirect to the active namenode as well. {code:java} 2018-10-22 08:17:09,251 FATAL tools.DFSZKFailoverController (DFSZKFailoverController.java:main(197)) - DFSZKFailOverController exiting due to earlier exception java.io.IOException: Configuration problem with provider path. 2018-10-22 08:17:09,252 DEBUG util.ExitUtil (ExitUtil.java:terminate(209)) - Exiting with status 1: java.io.IOException: Configuration problem with provider path. 1: java.io.IOException: Configuration problem with provider path. at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:199) Caused by: java.io.IOException: Configuration problem with provider path. at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2363) at org.apache.hadoop.conf.Configuration.getPassword(Configuration.java:2282) at org.apache.hadoop.security.SecurityUtil.getZKAuthInfos(SecurityUtil.java:732) at org.apache.hadoop.ha.ZKFailoverController.initZK(ZKFailoverController.java:343) at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:194) at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:60) at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:175) at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:480) at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:171) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:195) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1951) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1427) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3100) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1154) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:966) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497) at org.apache.hadoop.ipc.Client.call(Client.java:1443) at org.apache.hadoop.ipc.Client.call(Client.java:1353) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:900) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1580) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1595) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1734) at org.apache.hadoop.security.alias.JavaKeyStoreProvider.keystoreExists(JavaKeyStoreProvider.java:65) at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.locateKeystore(AbstractJavaKeyStoreProvider.java:319) at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.<init>(AbstractJavaKeyStoreProvider.java:86) at org.apache.hadoop.security.alias.JavaKeyStoreProvider.<init>(JavaKeyStoreProvider.java:49) at org.apache.hadoop.security.alias.JavaKeyStoreProvider.<init>(JavaKeyStoreProvider.java:41) at org.apache.hadoop.security.alias.JavaKeyStoreProvider$Factory.createProvider(JavaKeyStoreProvider.java:100) at org.apache.hadoop.security.alias.CredentialProviderFactory.getProviders(CredentialProviderFactory.java:73) at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2344) ... 13 more{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org