[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769806#comment-16769806 ]
Íñigo Goiri edited comment on HDFS-14284 at 2/15/19 10:33 PM: -------------------------------------------------------------- We are getting the remote exception as this: {code} Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): No namenode available under nameservice ns0 at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.shouldRetry(RouterRpcClient.java:309) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:464) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:471) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeMethod(RouterRpcClient.java:367) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeSequential(RouterRpcClient.java:734) at org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.getFileInfo(RouterClientProtocol.java:699) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getFileInfo(RouterRpcServer.java:731) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:881) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2621) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2040) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1449) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3076) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1127) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:881) {code} Maybe we can extend RemoteException to include the source (e.g., IP) of the exception. was (Author: elgoiri): We are getting the remote exception as this: {code} Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): No namenode available under nameservice BN2 at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.shouldRetry(RouterRpcClient.java:309) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:464) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:471) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeMethod(RouterRpcClient.java:367) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeSequential(RouterRpcClient.java:734) at org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.getFileInfo(RouterClientProtocol.java:699) at org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getFileInfo(RouterRpcServer.java:731) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:881) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2621) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2040) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1449) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3076) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1127) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:881) {code} Maybe we can extend RemoteException to include the source (e.g., IP) of the exception. > RBF: Log Router identifier when reporting exceptions > ---------------------------------------------------- > > Key: HDFS-14284 > URL: https://issues.apache.org/jira/browse/HDFS-14284 > Project: Hadoop HDFS > Issue Type: Sub-task > Reporter: Íñigo Goiri > Priority: Major > > The typical setup is to use multiple Routers through > ConfiguredFailoverProxyProvider. > In a regular HA Namenode setup, it is easy to know which NN was used. > However, in RBF, any Router can be the one reporting the exception and it is > hard to know which was the one. > We should have a way to identify which Router/Namenode was the one triggering > the exception. > This would also apply with Observer Namenodes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org