Peter Bacsko created MAPREDUCE-7273:
---------------------------------------

             Summary: JHS: make sure that Kerberos relogin is performed when 
KDC becomes offline then online again
                 Key: MAPREDUCE-7273
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7273
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Peter Bacsko
            Assignee: Peter Bacsko


In the JHS (JobHistory Server), if the KDC goes offline, the IPC layer does 
attempt a relogin, but that is not always sufficient: the next retry happens 
only after 60 seconds. If the KDC comes back online in the meantime, the 
following error can occur:

{noformat}
2020-04-09 03:27:52,075 DEBUG ipc.Server (Server.java:processSaslToken(1952)) - Have read input token of size 708 for processing by saslServer.evaluateResponse()
2020-04-09 03:27:52,077 DEBUG ipc.Server (Server.java:saslProcess(1829)) - javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Invalid argument (400) - Cannot find key of appropriate type to decrypt AP REP - AES128 CTS mode with HMAC SHA1-96)]
        at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199)
...
{noformat}

When this happens, JHS has to be restarted.
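
To illustrate the timing window described above, here is a minimal sketch. It 
is a simplified model, not Hadoop code: the class and method names 
(ReloginThrottle, tryRelogin) are hypothetical, and it assumes that relogin 
attempts are gated by a fixed minimum interval, using the 60 seconds mentioned 
in the description. It only shows how an attempt made shortly after the KDC 
returns can be skipped.

```java
// Hypothetical model of a throttled relogin. Names are illustrative only
// and do not correspond to Hadoop APIs.
final class ReloginThrottle {
    enum Outcome { SKIPPED_TOO_SOON, FAILED_KDC_DOWN, SUCCEEDED }

    private final long minIntervalMs;   // assumed minimum gap between attempts
    private long lastAttemptMs;
    private boolean hasAttempted = false;

    ReloginThrottle(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    // Records the attempt time even when the attempt fails, so a failure
    // blocks further attempts until the interval has elapsed (an assumption
    // of this model).
    Outcome tryRelogin(long nowMs, boolean kdcReachable) {
        if (hasAttempted && nowMs - lastAttemptMs < minIntervalMs) {
            return Outcome.SKIPPED_TOO_SOON;
        }
        hasAttempted = true;
        lastAttemptMs = nowMs;
        return kdcReachable ? Outcome.SUCCEEDED : Outcome.FAILED_KDC_DOWN;
    }
}

final class ReloginWindowDemo {
    public static void main(String[] args) {
        ReloginThrottle throttle = new ReloginThrottle(60_000L);
        // t = 0 s: KDC is offline, the attempt fails.
        System.out.println(throttle.tryRelogin(0L, false));
        // t = 10 s: KDC is back, but the attempt is skipped as too soon.
        System.out.println(throttle.tryRelogin(10_000L, true));
        // t = 60 s: the interval has elapsed, the attempt goes through.
        System.out.println(throttle.tryRelogin(60_000L, true));
    }
}
```

In this model, requests arriving between the 10-second and 60-second marks 
are served with stale credentials even though the KDC is reachable again, 
which is the window in which the error above might appear.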



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

