[ 
https://issues.apache.org/jira/browse/NIFI-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155657#comment-15155657
 ] 

Oleg Zhurakousky commented on NIFI-1536:
----------------------------------------

Here is the issue:
Interaction with authentication support in Hadoop is accomplished through 
_UserGroupInformation_ (UGI), and we are using it in 
_AbstractHadoopProcessor.resetHDFSResources(..)_.
The problem is that UGI maintains its state in static variables, which are 
effectively shared by everything loaded through a single ClassLoader. Given 
that NiFi creates a ClassLoader per bundle (not per instance of a component 
from that bundle), no matter how many instances of Put/GetHDFS one has, the 
UGI is shared across all of them. This creates a rather nasty issue: one can 
still provide different configurations pointing to different HDFS clusters, 
and those configurations will be honored (as described in the first comment), 
yet only one configuration will be set in UGI, and that UGI is shared across 
all instances of Put/GetHDFS Processors.
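
To illustrate the sharing, here is a minimal sketch. It is not Hadoop or NiFi 
code: _StaticAuthState_ and _FakeHdfsProcessor_ are hypothetical stand-ins, and 
the "last writer wins" behavior is a simplification that says nothing about 
which configuration actually wins inside UGI. It only shows that per-instance 
configuration stays distinct while state kept in a static field is visible to 
every instance loaded by the same ClassLoader.

```java
// Hypothetical stand-in for a class that keeps configuration in a static
// field, as the comment describes for UGI. This is NOT the Hadoop API.
final class StaticAuthState {
    private static String authMode = "unset";

    static synchronized void setConfiguration(String mode) {
        authMode = mode;
    }

    static synchronized String currentMode() {
        return authMode;
    }
}

// Hypothetical processor: each instance holds its own configuration, but
// pushes it into the shared static holder when it starts.
final class FakeHdfsProcessor {
    private final String configuredMode;

    FakeHdfsProcessor(String configuredMode) {
        this.configuredMode = configuredMode;
    }

    void start() {
        // Assumption for the sketch: starting a processor overwrites the
        // static state. The real UGI logic is more involved.
        StaticAuthState.setConfiguration(configuredMode);
    }

    String configuredMode() {
        return configuredMode;
    }

    String effectiveMode() {
        // Reads the static state, not this instance's own configuration.
        return StaticAuthState.currentMode();
    }
}

class SharedStaticDemo {
    public static void main(String[] args) {
        FakeHdfsProcessor secure = new FakeHdfsProcessor("kerberos");
        FakeHdfsProcessor plain = new FakeHdfsProcessor("simple");

        secure.start();
        plain.start();

        // The configured modes differ, but both instances observe the same
        // effective mode because the static field exists once per ClassLoader.
        System.out.println("secure: configured=" + secure.configuredMode()
                + " effective=" + secure.effectiveMode());
        System.out.println("plain:  configured=" + plain.configuredMode()
                + " effective=" + plain.effectiveMode());
    }
}
```

Because both processor classes come from the same ClassLoader, there is only 
one copy of the static field, so stopping or restarting one instance cannot 
give it a private copy of that state.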

The issue would be easily solvable by simply introducing a ClassLoader per 
bundle instance (e.g., for each instance of a processor from such a bundle), 
but ClassLoaders are created way too early in the process of NiFi startup - 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-runtime/src/main/java/org/apache/nifi/NiFi.java#L111.
 Furthermore, ServiceLoader gets in the way later on as well, since it 
pre-instantiates all components regardless of their actual use, essentially 
loading each component and everything that it requires. 

Need to think about it a bit.

> multiple putHDFS processors can result in put failures.
> -------------------------------------------------------
>
>                 Key: NIFI-1536
>                 URL: https://issues.apache.org/jira/browse/NIFI-1536
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 0.5.0
>         Environment: Java 1.8.0_60
> OSX Yosemite 10.10.5
>            Reporter: Matthew Clarke
>            Assignee: Oleg Zhurakousky
>             Fix For: 0.5.1
>
>
> When multiple putHDFS processors exist, the first to run loads some config 
> that is then used by other putHDFS processors that are started. 
> I have two dataflows set up.  One pushes data to a kerberized HDFS cluster 
> while the other pushes data to a totally different non-kerberized HDFS 
> cluster.
> Each putHDFS is configured to use its own core-site.xml.  If I start the 
> putHDFS that sends to the kerberized HDFS cluster first, the other putHDFS 
> will throw an error when it tries to send data to the non-kerberized HDFS 
> cluster.
> ERROR [Timer-Driven Process Thread-7] o.apache.nifi.processors.hadoop.PutHDFS
> java.io.IOException: Failed on local exception: java.io.IOException: Server 
> asks us to fall back to SIMPLE auth, but this client is configured to only 
> allow secure connections.; Host Details : local host is: "<client 
> hostname>/<client IP>"; destination host is: "<hdfs hostname>":<hdfs port>
> ....
> Caused by: java.io.IOException: Server asks us to fall back to SIMPLE auth, 
> but this client is configured to only allow secure connections.
> Even if I stop the putHDFS that is sending to the kerberized HDFS and/or 
> restart the putHDFS sending to the non-kerberized HDFS, the above error 
> persists.
> I need to restart NiFi to clear the condition.  After the NiFi restart, if I 
> run the putHDFS that sends to the non-kerberized HDFS first, the putHDFS to 
> the kerberized HDFS will still work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
