jerqi commented on issue #196:
URL: 
https://github.com/apache/incubator-uniffle/issues/196#issuecomment-1256062111

   > The stack trace is as follows:
   > 
   > ```
   > 2022-09-01 07:31:40,131 ERROR [FlushEventThreadPool] server.ShuffleFlushManager (ShuffleFlushManager.java:flushToFile(209)) - Exception happened when process flush shuffle data for ShuffleDataFlushEvent: eventId=0, appId=complexWriteTest_appId1, shuffleId=1, startPartition=0, endPartition=1
   > java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)]; Host Details : local host is: "fv-az489-314/10.1.1.91"; destination host is: "localhost":37279;
   >    at org.apache.uniffle.storage.common.HdfsStorage.newWriteHandler(HdfsStorage.java:113)
   >    at org.apache.uniffle.storage.common.AbstractStorage.lambda$getOrCreateWriteHandler$2(AbstractStorage.java:50)
   >    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
   >    at org.apache.uniffle.storage.common.AbstractStorage.getOrCreateWriteHandler(AbstractStorage.java:50)
   >    at org.apache.uniffle.server.ShuffleFlushManager.flushToFile(ShuffleFlushManager.java:168)
   >    at org.apache.uniffle.server.ShuffleFlushManager.lambda$null$0(ShuffleFlushManager.java:100)
   >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   >    at java.lang.Thread.run(Thread.java:750)
   > Caused by: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)]; Host Details : local host is: "fv-az489-314/10.1.1.91"; destination host is: "localhost":37279;
   >    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
   >    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
   >    at org.apache.hadoop.ipc.Client.call(Client.java:1435)
   >    at org.apache.hadoop.ipc.Client.call(Client.java:1345)
   >    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
   >    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
   >    at com.sun.proxy.$Proxy68.getFileInfo(Unknown Source)
   >    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:796)
   >    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
   >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   >    at java.lang.reflect.Method.invoke(Method.java:498)
   >    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
   >    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
   >    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
   >    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
   >    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
   >    at com.sun.proxy.$Proxy69.getFileInfo(Unknown Source)
   >    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1649)
   >    at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1440)
   >    at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1437)
   >    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
   >    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1437)
   >    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1437)
   >    at org.apache.uniffle.storage.handler.impl.HdfsShuffleWriteHandler.initialize(HdfsShuffleWriteHandler.java:89)
   >    at org.apache.uniffle.storage.handler.impl.HdfsShuffleWriteHandler.<init>(HdfsShuffleWriteHandler.java:81)
   >    at org.apache.uniffle.storage.common.HdfsStorage.newWriteHandler(HdfsStorage.java:108)
   >    ... 8 more
   > Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)]
   >    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:755)
   >    at java.security.AccessController.doPrivileged(Native Method)
   >    at javax.security.auth.Subject.doAs(Subject.java:422)
   >    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
   >    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:718)
   >    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:811)
   >    at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410)
   >    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550)
   >    at org.apache.hadoop.ipc.Client.call(Client.java:1381)
   >    ... 31 more
   > Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)]
   >    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
   >    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:406)
   >    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:614)
   >    at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:410)
   >    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:798)
   >    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:794)
   >    at java.security.AccessController.doPrivileged(Native Method)
   >    at javax.security.auth.Subject.doAs(Subject.java:422)
   >    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
   >    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:793)
   >    ... 34 more
   > Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - Server not found in Kerberos database)
   >    at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:772)
   >    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
   >    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
   >    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
   >    ... 43 more
   > Caused by: KrbException: Server not found in Kerberos database (7) - Server not found in Kerberos database
   >    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
   >    at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:226)
   >    at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:237)
   >    at sun.security.krb5.internal.CredentialsUtil.serviceCredsSingle(CredentialsUtil.java:477)
   >    at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:340)
   >    at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:314)
   >    at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:169)
   >    at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:490)
   >    at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
   >    ... 46 more
   > Caused by: KrbException: Identifier doesn't match expected value (906)
   >    at sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
   >    at sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
   >    at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
   >    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
   >    ... 54 more
   > ```
   
   It occurred again in this CI run:
   
https://github.com/apache/incubator-uniffle/actions/runs/3111709653/jobs/5044304701
   This test is still flaky.

