[ https://issues.apache.org/jira/browse/NIFI-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597766#comment-16597766 ]

ASF GitHub Bot commented on NIFI-5557:
--------------------------------------

Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r214136451
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -266,6 +268,13 @@ public Object run() {
                                 throw new IOException(configuredRootDirPath.toString() + " could not be created");
                             }
                             changeOwner(context, hdfs, configuredRootDirPath, flowFile);
    +                    } catch (IOException e) {
    +                        if (!Strings.isNullOrEmpty(e.getMessage()) && e.getMessage().contains(String.format("Couldn't setup connection for %s", ugi.getUserName()))) {
    --- End diff --
    
    @ekovacs I think we should be more selective in this check. I don't think there's a better way to detect this error scenario than string matching at this point, but the exception stack should be inspected to see if you can find the GSSException as the root cause:
    `Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)`
    If you iterate through the causes when PutHDFS encounters an IOException and find that GSSException, we can penalize the flowfile and roll back the session; a sketch of that cause-chain check follows.
    Otherwise, we'd want to pass the flowfile to the failure relationship.
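
    For illustration only, a minimal sketch of such a cause-chain check (the class and method names here are hypothetical, not part of the PR; only `org.ietf.jgss.GSSException` and `java.io.IOException` are standard):

    ```java
    import java.io.IOException;

    import org.ietf.jgss.GSSException;

    // Hypothetical helper: walk an exception's cause chain and report whether
    // a GSSException (e.g. from an expired Kerberos TGT) is buried inside it.
    public class GssCauseInspector {

        public static boolean hasGssCause(final Throwable t) {
            for (Throwable cause = t; cause != null; cause = cause.getCause()) {
                if (cause instanceof GSSException) {
                    return true;
                }
            }
            return false;
        }

        public static void main(final String[] args) {
            // Simulate the nesting seen in the stack trace quoted below:
            // IOException -> IOException -> GSSException
            final GSSException gss = new GSSException(GSSException.NO_CRED);
            final IOException wrapped = new IOException("Failed on local exception",
                    new IOException("Couldn't setup connection", gss));
            System.out.println(hasGssCause(wrapped));              // true
            System.out.println(hasGssCause(new IOException("x"))); // false
        }
    }
    ```

    In the PutHDFS catch block, a positive match could then trigger `session.rollback(true)` (rollback with penalization) instead of transferring the flowfile to `REL_FAILURE`.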


> PutHDFS "GSSException: No valid credentials provided" when krb ticket expires
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-5557
>                 URL: https://issues.apache.org/jira/browse/NIFI-5557
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.5.0
>            Reporter: Endre Kovacs
>            Assignee: Endre Kovacs
>            Priority: Major
>
> when using the *PutHDFS* processor in a kerberized environment, with flow 
> traffic that arrives at roughly the same rate as, or less frequently than, the 
> lifetime of the principal's ticket, we see this in the log:
> {code:java}
> INFO [Timer-Driven Process Thread-4] o.a.h.io.retry.RetryInvocationHandler Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over host2/ip2:8020 after 13 fail over attempts. Trying to fail over immediately.
> java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for princi...@example.com to host2.example.com/ip2:8020; Host Details : local host is: "host1.example.com/ip1"; destination host is: "host2.example.com":8020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
> at org.apache.hadoop.ipc.Client.call(Client.java:1479)
> at org.apache.hadoop.ipc.Client.call(Client.java:1412)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> at com.sun.proxy.$Proxy134.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
> at sun.reflect.GeneratedMethodAccessor344.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy135.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
> at org.apache.nifi.processors.hadoop.PutHDFS$1.run(PutHDFS.java:254)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:360)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1678)
> at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:222)
> {code}
> and the flowfile is routed to the failure relationship.
> *To reproduce:*
> Create a principal in your KDC with a two-minute ticket lifetime,
> and set up a similar flow:
> {code:java}
> GetFile => putHDFS ----- success----- -> logAttributes
>                     \
>                      fail
>                        \
>                      -> logAttributes
> {code}
> Copy a file to the input directory of the GetFile processor. If flowfiles 
> arrive much more frequently than the ticket expires:
> {code:java}
> watch -n 5 "cp book.txt /path/to/input"
> {code}
> then the flow will run without issue.
> If we adjust this to:
> {code:java}
> watch -n 121 "cp book.txt /path/to/input"
> {code}
> then we will observe this issue.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
