[ https://issues.apache.org/jira/browse/HDFS-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Chen updated HDFS-13040: ----------------------------- Attachment: HDFS-13040.02.patch > Kerberized inotify client fails despite kinit properly > ------------------------------------------------------ > > Key: HDFS-13040 > URL: https://issues.apache.org/jira/browse/HDFS-13040 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode > Affects Versions: 2.6.0 > Environment: Kerberized, HA cluster, iNotify client, CDH5.10.2 > Reporter: Wei-Chiu Chuang > Assignee: Wei-Chiu Chuang > Priority: Major > Attachments: HDFS-13040.001.patch, HDFS-13040.02.patch, > TestDFSInotifyEventInputStreamKerberized.java, TransactionReader.java > > > This issue is similar to HDFS-10799. > HDFS-10799 turned out to be a client side issue where client is responsible > for renewing kerberos ticket actively. > However we found in a slightly setup even if client has valid Kerberos > credentials, inotify still fails. > Suppose client uses principal h...@example.com, > namenode 1 uses server principal hdfs/nn1.example....@example.com > namenode 2 uses server principal hdfs/nn2.example....@example.com > *After Namenodes starts for longer than kerberos ticket lifetime*, the client > fails with the following error: > {noformat} > 18/01/19 11:23:02 WARN security.UserGroupInformation: > PriviledgedActionException as:h...@gce.cloudera.com (auth:KERBEROS) > cause:org.apache.hadoop.ipc.RemoteException(java.io.IOException): We > encountered an error reading > https://nn2.example.com:8481/getJournal?jid=ns1&segmentTxId=8662&storageInfo=-60%3A353531113%3A0%3Acluster3, > > https://nn1.example.com:8481/getJournal?jid=ns1&segmentTxId=8662&storageInfo=-60%3A353531113%3A0%3Acluster3. > During automatic edit log failover, we noticed that all of the remaining > edit log streams are shorter than the current one! The best remaining edit > log ends at transaction 8683, but we thought we could read up to transaction > 8684. If you continue, metadata will be lost forever! > at > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213) > at > org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1701) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1763) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1011) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1490) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210) > {noformat} > Typically if NameNode has an expired Kerberos ticket, the error handling for > the typical edit log tailing would let NameNode to relogin with its own > Kerberos principal. However, when inotify uses the same code path to retrieve > edits, since the current user is the inotify client's principal, unless > client uses the same principal as the NameNode, NameNode can't do it on > behalf of the client. > Therefore, a more appropriate approach is to use proxy user so that NameNode > can retrieving edits on behalf of the client. > I will attach a patch to fix it. This patch has been verified to work for a > CDH5.10.2 cluster, however it seems impossible to craft a unit test for this > fix because the way Hadoop UGI handles Kerberos credentials (I can't have a > single process that logins as two Kerberos principals simultaneously and let > them establish connection) > A possible workaround is for the inotify client to use the active NameNode's > server principal. However, that's not going to work when there's a namenode > failover, because then the client's principal will not be consistent with the > active NN's one, and then fails to authenticate. > Credit: this bug was confirmed and reproduced by [~pifta] and [~r1pp3rj4ck] -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org