Josh Elser created ACCUMULO-3497:
------------------------------------
Summary: Poor error when bind-address of server doesn't match with
kerberos principal
Key: ACCUMULO-3497
URL: https://issues.apache.org/jira/browse/ACCUMULO-3497
Project: Accumulo
Issue Type: Improvement
Components: rpc
Reporter: Josh Elser
Assignee: Josh Elser
Fix For: 1.7.0
I used the generated configuration (in
{{assemble/accumulo-$VERSION-dev/accumulo-$VERSION}}) and got errors in the
master and tserver:
{panel:title=TServer}
{code}
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:356)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1608)
    at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
    at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
    at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
    at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
    at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
    ... 10 more
{code}
{panel}
{panel:title=Master}
{code}
2015-01-19 17:07:55,505 [transport.TSaslTransport] ERROR: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:53)
    at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.accumulo.core.rpc.UGIAssumingTransport.open(UGIAssumingTransport.java:49)
    at org.apache.accumulo.core.rpc.ThriftUtil.createClientTransport(ThriftUtil.java:358)
    at org.apache.accumulo.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:478)
    at org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:411)
    at org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:389)
    at org.apache.accumulo.core.rpc.ThriftUtil.getClient(ThriftUtil.java:122)
    at org.apache.accumulo.server.master.LiveTServerSet$TServerConnection.halt(LiveTServerSet.java:118)
    at org.apache.accumulo.master.Master.gatherTableInformation(Master.java:1009)
    at org.apache.accumulo.master.Master.access$600(Master.java:160)
    at org.apache.accumulo.master.Master$StatusThread.updateStatus(Master.java:911)
    at org.apache.accumulo.master.Master$StatusThread.run(Master.java:901)
Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)
    at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:710)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
    ... 19 more
Caused by: KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER
    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
    at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:192)
    at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:203)
    at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:309)
    at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:115)
    at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:454)
    at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:641)
    ... 22 more
{code}
{panel}
This error occurs because Kerberos authentication is so closely tied to DNS.
The default configuration used {{localhost}} instead of the FQDN in the hosts
files (masters, monitors, slaves, tracers, gc). This ultimately created a
mismatch between the instance component of the Kerberos principal (I used the
FQDN) and the address the Thrift server was bound to ({{localhost}}).
We should detect when this happens and throw an intuitive error.
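One way such a check could look is sketched below. This is only an illustration, not the actual fix: the class {{PrincipalBindCheck}} and its methods {{instanceOf}} and {{checkBindAddress}} are hypothetical names, and the sketch assumes the principal has the usual {{primary/instance@REALM}} form and that a plain case-insensitive string comparison against the configured bind host is enough. A real check would probably also need to consider hostname canonicalization.

```java
import java.util.Locale;

// Hypothetical helper, not part of Accumulo: compares the instance component of a
// Kerberos principal (primary/instance@REALM) with the host a server binds to.
final class PrincipalBindCheck {

  // Returns the instance component of the principal, or null if it has none.
  static String instanceOf(String principal) {
    int slash = principal.indexOf('/');
    if (slash < 0) {
      return null; // e.g. "user@REALM" has no instance component
    }
    int at = principal.indexOf('@', slash);
    String instance = at < 0
        ? principal.substring(slash + 1)
        : principal.substring(slash + 1, at);
    return instance.isEmpty() ? null : instance;
  }

  // Throws a descriptive error when the principal's instance and the bind host differ.
  // Assumption: a case-insensitive string comparison is sufficient for this sketch.
  static void checkBindAddress(String principal, String bindHost) {
    String instance = instanceOf(principal);
    if (instance == null) {
      return; // nothing to compare against
    }
    if (!instance.toLowerCase(Locale.ROOT).equals(bindHost.toLowerCase(Locale.ROOT))) {
      throw new IllegalStateException("Kerberos principal '" + principal
          + "' has instance '" + instance + "' but the server is bound to '" + bindHost
          + "'; clients will not be able to authenticate. Use the same hostname"
          + " in the hosts files and in the principal.");
    }
  }
}
```

With the setup described above, a call such as {{PrincipalBindCheck.checkBindAddress("accumulo/host.example.com@EXAMPLE.COM", "localhost")}} (placeholder values) would fail at startup with a message naming both hostnames, instead of the later "GSS initiate failed".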