[ https://issues.apache.org/jira/browse/NIFI-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607428#comment-16607428 ]

ASF GitHub Bot commented on NIFI-5557:
--------------------------------------

Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r216031690

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -269,13 +272,15 @@ public Object run() {
                         }
                         changeOwner(context, hdfs, configuredRootDirPath, flowFile);
                     } catch (IOException e) {
    -                    if (!Strings.isNullOrEmpty(e.getMessage()) && e.getMessage().contains(String.format("Couldn't setup connection for %s", ugi.getUserName()))) {
    -                        getLogger().error(String.format("An error occured while connecting to HDFS. Rolling back session, and penalizing flowfile %s",
    -                                flowFile.getAttribute(CoreAttributes.UUID.key())));
    -                        session.rollback(true);
    -                    } else {
    -                        throw e;
    -                    }
    +                    boolean tgtExpired = hasCause(e, GSSException.class, gsse -> "Failed to find any Kerberos tgt".equals(gsse.getMinorString()));
    +                    if (tgtExpired) {
    +                        getLogger().error(String.format("An error occured while connecting to HDFS. Rolling back session, and penalizing flow file %s",
    +                                putFlowFile.getAttribute(CoreAttributes.UUID.key())));
    +                        session.rollback(true);
    +                    } else {
    +                        getLogger().error("Failed to access HDFS due to {}", new Object[]{e});
    +                        session.transfer(session.penalize(putFlowFile), REL_FAILURE);
    --- End diff --

    @ekovacs I don't think we need to penalize on the transfer to failure here.

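The hunk above calls `hasCause(e, GSSException.class, ...)`, but the helper's definition is not part of the quoted diff. As a rough sketch only, assuming it walks the exception's cause chain and applies the predicate to the first matching type, it might look like the following. The class name `CauseChain`, the signature, and the depth limit are assumptions for illustration, not the code from the pull request.

```java
import java.io.IOException;
import java.util.function.Predicate;
import org.ietf.jgss.GSSException;

public class CauseChain {

    // Hypothetical helper: returns true if any throwable in the cause chain
    // of t is an instance of type and satisfies the given predicate.
    public static <T extends Throwable> boolean hasCause(
            Throwable t, Class<T> type, Predicate<? super T> test) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        int depth = 0;
        for (Throwable c = t; c != null && depth < 64; c = c.getCause(), depth++) {
            if (type.isInstance(c) && test.test(type.cast(c))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulate the nesting seen in the issue: IOException wrapping a GSSException.
        GSSException gsse = new GSSException(
                GSSException.NO_CRED, 0, "Failed to find any Kerberos tgt");
        IOException wrapped = new IOException(
                "Failed on local exception",
                new IOException("Couldn't setup connection", gsse));

        boolean tgtExpired = hasCause(wrapped, GSSException.class,
                g -> "Failed to find any Kerberos tgt".equals(g.getMinorString()));
        System.out.println(tgtExpired); // prints true
    }
}
```

Matching on the exception type and its minor string, rather than on the outer `IOException` message as the removed lines did, keeps the check independent of the principal name and of how many layers wrap the `GSSException`.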
> PutHDFS "GSSException: No valid credentials provided" when krb ticket expires
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-5557
>                 URL: https://issues.apache.org/jira/browse/NIFI-5557
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.5.0
>            Reporter: Endre Kovacs
>            Assignee: Endre Kovacs
>            Priority: Major
>
> When using the *PutHDFS* processor in a kerberized environment, with flow
> "traffic" that arrives about as often as, or less often than, the lifetime
> of the principal's ticket, we see this in the log:
> {code:java}
> INFO [Timer-Driven Process Thread-4] o.a.h.io.retry.RetryInvocationHandler Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over host2/ip2:8020 after 13 fail over attempts. Trying to fail over immediately.
> java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for princi...@example.com to host2.example.com/ip2:8020; Host Details : local host is: "host1.example.com/ip1"; destination host is: "host2.example.com":8020;
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1479)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>   at com.sun.proxy.$Proxy134.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
>   at sun.reflect.GeneratedMethodAccessor344.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy135.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
>   at org.apache.nifi.processors.hadoop.PutHDFS$1.run(PutHDFS.java:254)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:360)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1678)
>   at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:222)
> {code}
> and the flowfile is routed to the failure relationship.
>
> *To reproduce:*
> Create a principal in your KDC with a two-minute ticket lifetime,
> and set up a similar flow:
> {code:java}
> GetFile => putHDFS ----- success----- -> logAttributes
>                 \
>                  fail
>                    \
>                     -> logAttributes
> {code}
> Copy a file to the input directory of the GetFile processor. If flowfiles
> arrive much more frequently than the ticket expires:
> {code:java}
> watch -n 5 "cp book.txt /path/to/input"
> {code}
> then the flow runs without issue.
> If we change the interval to slightly longer than the ticket lifetime:
> {code:java}
> watch -n 121 "cp book.txt /path/to/input"
> {code}
> then we observe this issue.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)