[ https://issues.apache.org/jira/browse/KAFKA-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajini Sivaram resolved KAFKA-7902.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.1.1
                   2.2.0

> SASL/OAUTHBEARER can become unable to connect: javax.security.sasl.SaslException: Unable to find OAuth Bearer token in Subject's private credentials (size=2)
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-7902
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7902
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.0.0, 2.0.1, 2.1.0
>            Reporter: Ron Dagostino
>            Assignee: Ron Dagostino
>            Priority: Major
>             Fix For: 2.2.0, 2.1.1
>
>
> It is possible for a Java SASL/OAUTHBEARER client (either a non-broker producer/consumer client or a broker acting as an inter-broker client) to end up in a state where it cannot connect to a new broker (or, if re-authentication as implemented by KIP-368 and merged for v2.2.0 is deployed and enabled, cannot re-authenticate). The error message looks like this:
>
> {{Connection to node 1 failed authentication due to: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: Unable to find OAuth Bearer token in Subject's private credentials (size=2) [Caused by java.io.IOException: Unable to find OAuth Bearer token in Subject's private credentials (size=2)]) occurred when evaluating SASL token received from the Kafka Broker. Kafka Client will go to AUTHENTICATION_FAILED state.}}
>
> The root cause of the problem begins at this point in the code:
>
> [https://github.com/apache/kafka/blob/2.0/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/internals/expiring/ExpiringCredentialRefreshingLogin.java#L378]:
>
> The {{loginContext}} field is not restored to the old version stored in the {{optionalLoginContextToLogout}} variable when the {{loginContext.login()}} call on line 381 throws an exception. *This is an unusual event* (the OAuth authorization server must be unavailable at the moment the token refresh occurs), but when it does happen it leaves the refresher thread instance in an invalid state: its {{loginContext}} field now refers to the login context that failed instead of the original one, which is lost. The current {{loginContext}} can't be logged out, because there is no token associated with it and it will throw an {{InvalidStateException}} if that is attempted. The token associated with the lost login context can never be logged out and removed from the Subject's private credentials, because we don't retain a reference to that context. The net effect is an extra token on the Subject's private credentials, which eventually results in the exception mentioned above when the client tries to authenticate to a broker.
>
> So the chain of events is:
>
> 1) A login failure upon token refresh leaves the refresher thread's login context field incorrect, and the existing token on the Subject's private credentials will never be logged out/removed.
> 2) A retry occurs in 10 seconds, potentially repeatedly, until the authorization server is back online.
> 3) Login succeeds, adding a second token to the Subject's private credentials. (Logout is then called on the login context set incorrectly in the most recent failure -- e.g. in step 1 -- which results in an exception, but this is not the real issue; the 2 tokens on the Subject's private credentials are the issue.)
> 4) At this point we have 2 tokens on the Subject. When the client later tries to make a new connection, it sees the 2 tokens and throws an exception. The client is now unable to connect (or re-authenticate, if applicable) going forward.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
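
The chain of events described in the issue can be reproduced in a small, self-contained model. The sketch below is illustrative only and is not Kafka's code: {{SketchLogin}}, {{TokenRefreshSketch}}, {{refreshBuggy}}, {{refreshSafe}} and {{tokensAfterOutage}} are hypothetical names, a plain {{Set<String>}} stands in for the Subject's private credentials, and login failures are modeled as unchecked exceptions for brevity. The "safe" variant shows one possible way to avoid the stale field (adopt the new context only after its login succeeds); it is not necessarily how the committed fix is written.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a JAAS login context; not a Kafka or JDK class.
class SketchLogin {
    private final Set<String> credentials; // models the Subject's private credentials
    private final String token;
    private final boolean serverAvailable;
    private boolean loggedIn = false;

    SketchLogin(Set<String> credentials, String token, boolean serverAvailable) {
        this.credentials = credentials;
        this.token = token;
        this.serverAvailable = serverAvailable;
    }

    void login() {
        if (!serverAvailable) {
            throw new RuntimeException("authorization server unavailable");
        }
        credentials.add(token);
        loggedIn = true;
    }

    void logout() {
        if (!loggedIn) {
            throw new IllegalStateException("no token associated with this login");
        }
        credentials.remove(token);
        loggedIn = false;
    }
}

class TokenRefreshSketch {
    private SketchLogin current;

    TokenRefreshSketch(SketchLogin initial) {
        this.current = initial;
    }

    // Models the bug: the field is replaced before login() is known to succeed.
    void refreshBuggy(SketchLogin next) {
        SketchLogin toLogout = current;
        current = next;
        try {
            current.login();
        } catch (RuntimeException e) {
            return; // field now points at the failed login; the old one is lost
        }
        try {
            toLogout.logout();
        } catch (IllegalStateException e) {
            // logging out a context that never logged in fails; its token stays behind
        }
    }

    // One possible alternative: keep the old context until the new login succeeds.
    void refreshSafe(SketchLogin next) {
        try {
            next.login();
        } catch (RuntimeException e) {
            return; // current is untouched, so a later retry can still log it out
        }
        SketchLogin old = current;
        current = next;
        old.logout();
    }

    // Initial login, one failed refresh (outage), then one successful refresh.
    static int tokensAfterOutage(boolean buggy) {
        Set<String> creds = new HashSet<>();
        SketchLogin first = new SketchLogin(creds, "token-1", true);
        first.login();
        TokenRefreshSketch refresher = new TokenRefreshSketch(first);
        SketchLogin failed = new SketchLogin(creds, "token-2", false);
        SketchLogin ok = new SketchLogin(creds, "token-3", true);
        if (buggy) {
            refresher.refreshBuggy(failed);
            refresher.refreshBuggy(ok);
        } else {
            refresher.refreshSafe(failed);
            refresher.refreshSafe(ok);
        }
        return creds.size();
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + tokensAfterOutage(true));  // prints "buggy: 2"
        System.out.println("safe: " + tokensAfterOutage(false));  // prints "safe: 1"
    }
}
```

In the buggy path the credential set ends up holding two tokens, which corresponds to the (size=2) in the reported error; in the alternative path only the newest token remains.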