[ https://issues.apache.org/jira/browse/HADOOP-15487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489733#comment-16489733 ]

Wei-Chiu Chuang commented on HADOOP-15487:
------------------------------------------

FYI, here is another failure that looks eerily similar. This one is from a
NameNode on a different cluster (CDH5.13.2, jdk1.8.0_74):
{noformat}
2018-05-20 14:01:35,314 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8020: readAndProcess from client 192.168.30.37 threw exception [java.lang.IllegalStateException: This ticket is no longer valid]
java.lang.IllegalStateException: This ticket is no longer valid
        at javax.security.auth.kerberos.KerberosTicket.toString(KerberosTicket.java:638)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:171)
        at sun.security.jgss.krb5.SubjectComber.find(SubjectComber.java:61)
        at sun.security.jgss.krb5.ServiceCreds.getInstance(ServiceCreds.java:127)
        at sun.security.jgss.krb5.Krb5Util.getServiceCreds(Krb5Util.java:203)
        at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:74)
        at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:72)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.security.jgss.krb5.Krb5AcceptCredential.getInstance(Krb5AcceptCredential.java:71)
        at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:127)
        at sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
        at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427)
        at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:62)
        at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:154)
        at com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
        at com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
        at org.apache.hadoop.security.SaslRpcServer$FastSaslServerFactory.createSaslServer(SaslRpcServer.java:398)
        at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:164)
        at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:161)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
        at org.apache.hadoop.security.SaslRpcServer.create(SaslRpcServer.java:160)
        at org.apache.hadoop.ipc.Server$Connection.createSaslServer(Server.java:1742)
        at org.apache.hadoop.ipc.Server$Connection.processSaslMessage(Server.java:1522)
        at org.apache.hadoop.ipc.Server$Connection.saslProcess(Server.java:1433)
        at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1396)
        at org.apache.hadoop.ipc.Server$Connection.processRpcOutOfBandRequest(Server.java:2080)
        at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1920)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1682)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:896)
        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:752)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:723)
2018-05-20 14:01:35,385 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for u...@example.com (auth:KERBEROS)
2018-05-20 14:01:35,411 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for u...@example.com (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
2018-05-20 14:01:35,545 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/nn1.example....@example.com (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2018-05-20 14:01:35,545 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/nn1.example....@example.com (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2018-05-20 14:01:35,545 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/nn1.example....@example.com (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2018-05-20 14:01:35,561 WARN org.apache.hadoop.security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1526850095545
2018-05-20 14:01:35,562 WARN org.apache.hadoop.security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1526850095545
{noformat}

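Maybe UGI.reloginFromKeytab() and
SaslRpcServer$FastSaslServerFactory.createSaslServer() race with each other?
The "This ticket is no longer valid" error would be consistent with a relogin
destroying the old ticket while createSaslServer() is still scanning the
Subject's private credentials.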
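
To illustrate the kind of interleaving I have in mind, here is a minimal,
single-threaded sketch. It is not Hadoop or JDK code: the class and method
names are made up, and plain strings stand in for Kerberos tickets. It only
shows that the credential set returned by Subject.getPrivateCredentials()
fails fast if it is modified while an iterator over it is in use, which is
what a relogin running concurrently with SubjectComber could cause, assuming
nothing serializes the two.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.Set;
import javax.security.auth.Subject;

// Hypothetical demo class, not part of Hadoop or the JDK.
final class SubjectCredentialRace {

    /** Returns true if modifying the credential set mid-iteration fails fast. */
    static boolean iterationFailsAfterMutation() {
        Subject subject = new Subject();
        Set<Object> creds = subject.getPrivateCredentials();
        // Strings stand in for KerberosTicket / KerberosKey objects.
        creds.add("ticket-1");
        creds.add("ticket-2");

        // Stand-in for the reader side (e.g. a scan of the credentials).
        Iterator<Object> it = creds.iterator();
        it.next();

        // Stand-in for the writer side (e.g. a relogin adding a fresh ticket).
        creds.add("ticket-3");

        try {
            it.next(); // the backing list changed under the iterator
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME observed: " + iterationFailsAfterMutation());
    }
}
```

In the real server the two sides would run on different threads, so the
failure would be intermittent rather than deterministic as it is here.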

> ConcurrentModificationException resulting in Kerberos authentication error.
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-15487
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15487
>             Project: Hadoop Common
>          Issue Type: Bug
>         Environment: CDH 5.13.3. Kerberized, Hadoop-HA, jdk1.8.0_152
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> We found the following exception in a NameNode log. The
> ConcurrentModificationException seems to have caused a Kerberos
> authentication error.
> It appears to be a JDK bug similar to HADOOP-13433 (Race in
> UGI.reloginFromKeytab), but this version of Hadoop (CDH5.13.3) already
> includes the HADOOP-13433 patch, and the stack trace also differs. This
> cluster runs on JDK 1.8.0_152.
> {noformat}
> 2018-05-19 04:00:00,182 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/no...@example.com (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2018-05-19 04:00:00,183 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8020: readAndProcess from client 10.16.20.122 threw exception [java.util.ConcurrentModificationException]
> java.util.ConcurrentModificationException
>         at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
>         at java.util.LinkedList$ListItr.next(LinkedList.java:888)
>         at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1070)
>         at javax.security.auth.Subject$ClassSet$1.run(Subject.java:1401)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1399)
>         at javax.security.auth.Subject$ClassSet.<init>(Subject.java:1372)
>         at javax.security.auth.Subject.getPrivateCredentials(Subject.java:767)
>         at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:127)
>         at sun.security.jgss.krb5.SubjectComber.findMany(SubjectComber.java:69)
>         at sun.security.jgss.krb5.ServiceCreds.getInstance(ServiceCreds.java:96)
>         at sun.security.jgss.krb5.Krb5Util.getServiceCreds(Krb5Util.java:203)
>         at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:74)
>         at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:72)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at sun.security.jgss.krb5.Krb5AcceptCredential.getInstance(Krb5AcceptCredential.java:71)
>         at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:127)
>         at sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
>         at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427)
>         at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:62)
>         at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:154)
>         at com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>         at com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>         at org.apache.hadoop.security.SaslRpcServer$FastSaslServerFactory.createSaslServer(SaslRpcServer.java:398)
>         at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:164)
>         at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:161)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>         at org.apache.hadoop.security.SaslRpcServer.create(SaslRpcServer.java:160)
>         at org.apache.hadoop.ipc.Server$Connection.createSaslServer(Server.java:1742)
>         at org.apache.hadoop.ipc.Server$Connection.processSaslMessage(Server.java:1522)
>         at org.apache.hadoop.ipc.Server$Connection.saslProcess(Server.java:1433)
>         at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1396)
>         at org.apache.hadoop.ipc.Server$Connection.processRpcOutOfBandRequest(Server.java:2080)
>         at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1920)
>         at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1682)
>         at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:896)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:752)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:723)
> {noformat}
> We saw a few GSSExceptions in the NN log, but only one of them came with the
> ConcurrentModificationException. This NN also failed over, and the failover
> was caused by ZKFC hitting a GSSException as well. I suspect the two issues
> are related.


