[ https://issues.apache.org/jira/browse/ACCUMULO-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Elser resolved ACCUMULO-3497.
----------------------------------
    Resolution: Fixed

Verified this by using a clean build, generating configs with {{bootstrap_config.sh -k}} and seeing the expected exception in the server processes. Placing the FQDN into the hosts files prevents the exception from being thrown.

> Poor error when bind-address of server doesn't match with kerberos principal
> ----------------------------------------------------------------------------
>
>                 Key: ACCUMULO-3497
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3497
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: rpc
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 1.7.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I used the generated configuration (in {{assemble/accumulo-$VERSION-dev/accumulo-$VERSION}}) and got errors in the master and tserver:
> {panel:title=TServer}
> {code}
> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:356)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1608)
>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
>     at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
>     at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>     at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>     ... 10 more
> {code}
> {panel}
> {panel:title=Master}
> {code}
> 2015-01-19 17:07:55,505 [transport.TSaslTransport] ERROR: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]
>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>     at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>     at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>     at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:53)
>     at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:49)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.accumulo.core.rpc.UGIAssumingTransport.open(UGIAssumingTransport.java:49)
>     at org.apache.accumulo.core.rpc.ThriftUtil.createClientTransport(ThriftUtil.java:358)
>     at org.apache.accumulo.core.client.impl.ThriftTransportPool.createNewTransport(ThriftTransportPool.java:478)
>     at org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:411)
>     at org.apache.accumulo.core.client.impl.ThriftTransportPool.getTransport(ThriftTransportPool.java:389)
>     at org.apache.accumulo.core.rpc.ThriftUtil.getClient(ThriftUtil.java:122)
>     at org.apache.accumulo.server.master.LiveTServerSet$TServerConnection.halt(LiveTServerSet.java:118)
>     at org.apache.accumulo.master.Master.gatherTableInformation(Master.java:1009)
>     at org.apache.accumulo.master.Master.access$600(Master.java:160)
>     at org.apache.accumulo.master.Master$StatusThread.updateStatus(Master.java:911)
>     at org.apache.accumulo.master.Master$StatusThread.run(Master.java:901)
> Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)
>     at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:710)
>     at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
>     at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
>     ... 19 more
> Caused by: KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER
>     at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
>     at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:192)
>     at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:203)
>     at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:309)
>     at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:115)
>     at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:454)
>     at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:641)
>     ... 22 more
> {code}
> {panel}
> This error occurs due to the fact that DNS is so closely tied to authentication.
> The default configuration used {{localhost}} instead of the FQDN in the hosts files (masters, monitors, slaves, tracers, gc). This ultimately created a mismatch between the instance component of the Kerberos principal (I used the FQDN) and the address the Thrift server was using ({{localhost}}).
> We should detect when this happens and throw an intuitive error.
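
The kind of check the issue asks for could look roughly like the sketch below. It is only an illustration and not the change that was committed for ACCUMULO-3497: the class {{PrincipalHostCheck}} and its methods are hypothetical names, and it assumes the server principal has the usual {{primary/instance@REALM}} form with the host name as the instance component.

```java
/**
 * Hypothetical helper, not part of Accumulo: fails fast with a readable
 * message when the host in a Kerberos principal differs from the host
 * a server is about to bind to.
 */
final class PrincipalHostCheck {
  private PrincipalHostCheck() {}

  /** Returns the instance part of "primary/instance@REALM", or null if there is none. */
  static String instanceOf(String principal) {
    int slash = principal.indexOf('/');
    if (slash < 0) {
      return null; // principal such as "user@REALM" has no instance component
    }
    int at = principal.indexOf('@', slash + 1);
    String instance = at < 0
        ? principal.substring(slash + 1)
        : principal.substring(slash + 1, at);
    return instance.isEmpty() ? null : instance;
  }

  /** Throws IllegalStateException if the principal's host and the bind host differ. */
  static void verify(String principal, String bindHost) {
    String instance = instanceOf(principal);
    if (instance == null) {
      return; // nothing to compare against
    }
    // Host names are case-insensitive, so compare without regard to case.
    if (!instance.equalsIgnoreCase(bindHost)) {
      throw new IllegalStateException(
          "Kerberos principal '" + principal + "' names host '" + instance
              + "' but the server is binding to '" + bindHost
              + "'; use the same FQDN in the hosts files and in the principal");
    }
  }
}
```

Calling {{PrincipalHostCheck.verify("accumulo/host1.example.com@EXAMPLE.COM", "localhost")}} at server startup would then throw right away with a message naming both hosts, rather than surfacing later as "GSS initiate failed". The principal and host names in that call are made-up examples.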