Hi Bryan,

That indeed did the trick (2.8.0 has been out for a while).

All the tests passed. For future reference, and for anybody trying this on an
HDInsight-managed node, this is what I had to do:

Create a wasb-site.xml containing the following:

<configuration>
<property>
    <name>fs.wasb.impl</name>
    <value>org.apache.hadoop.fs.azure.NativeAzureFileSystem</value>
</property>
</configuration>

In the Hadoop Configuration Resources property, I’ve put:

/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml,/path/to/wasb-site.xml

And in the Additional Classpath Resources property:

/usr/hdp/current/hadoop-client,/usr/hdp/current/hadoop-client/lib

That way, if the edge node is managed by HDInsight/Ambari, there’s *no* need to 
pass an unencrypted storage key, as it will automatically use the (encrypted) 
one provided by HDInsight.
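
For anyone who wants to script the steps above, here is a minimal Python sketch. It writes the wasb-site.xml shown earlier and builds the comma-separated value for the Hadoop Configuration Resources property. The function names and the output location are hypothetical placeholders, and the /etc/hadoop/conf paths are simply the ones listed above; adjust them for your node.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_wasb_site(path):
    """Write a wasb-site.xml that maps the wasb:// scheme to NativeAzureFileSystem."""
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "fs.wasb.impl"
    ET.SubElement(prop, "value").text = (
        "org.apache.hadoop.fs.azure.NativeAzureFileSystem"
    )
    ET.ElementTree(configuration).write(path, encoding="utf-8", xml_declaration=True)
    return Path(path)


def config_resources(wasb_site_path):
    """Build the comma-separated value for 'Hadoop Configuration Resources'."""
    return ",".join([
        "/etc/hadoop/conf/core-site.xml",
        "/etc/hadoop/conf/hdfs-site.xml",
        str(wasb_site_path),
    ])


if __name__ == "__main__":
    # Hypothetical output location; use any path the NiFi user can read.
    site = write_wasb_site("wasb-site.xml")
    print(config_resources(site))
```

The printed string can be pasted into the processor property as-is; nothing in it contains a storage key.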

Thanks for the pointers everyone!

Cheers,

Giovanni Lanzani
Chief Science Officer GoDataDriven
T: @gglanzani
M: +31 6 5120 6163

From: Bryan Bende<mailto:bbe...@gmail.com>
Sent: Tuesday, April 4, 2017 5:41 PM
To: users@nifi.apache.org<mailto:users@nifi.apache.org>
Subject: Re: GetHDFS from Azure Blob

Giovanni,

In the pom.xml at the root of the NiFi source tree:

<hadoop.version>2.7.3</hadoop.version>

You can change that to 2.8.0 (if 2.8.0 is released) and then run a
full build, assuming 2.8.0 doesn't break any code that NiFi is using.
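
If you want to make that edit reproducibly rather than by hand, a small script can do it. This is only a sketch: the function name is hypothetical, and it assumes the root pom.xml contains exactly one <hadoop.version> property, which it refuses to guess about otherwise.

```python
import re

# Matches the property element and captures its opening tag, value, and closing tag.
HADOOP_VERSION_TAG = re.compile(r"(<hadoop\.version>)([^<]+)(</hadoop\.version>)")


def set_hadoop_version(pom_text, new_version):
    """Return pom_text with the <hadoop.version> property set to new_version."""
    updated, count = HADOOP_VERSION_TAG.subn(
        lambda m: m.group(1) + new_version + m.group(3), pom_text
    )
    if count != 1:
        # Assumption: a single property drives the Hadoop client version.
        raise ValueError(f"expected exactly one <hadoop.version>, found {count}")
    return updated


if __name__ == "__main__":
    sample = "<properties><hadoop.version>2.7.3</hadoop.version></properties>"
    print(set_hadoop_version(sample, "2.8.0"))
```

After writing the result back to pom.xml, the full build is the usual Maven one (mvn clean install from the source root).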

I don't really view this as the same issue as NIFI-1922... even if the
solution to NIFI-1922 was to directly bundle the Azure/Wasb JARs in
NiFi, we would still have to bundle the JARs that are compatible with
the Hadoop client we are using, which is currently 2.7.3.
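
A quick way to spot this kind of mismatch is to compare the hadoop-* JARs on the additional classpath against the client version NiFi bundles. The sketch below is an illustration, not a NiFi tool: the function names are hypothetical, and it assumes JARs are named hadoop-<module>-<version>.jar (vendor builds that append their own suffix to the version still count as a match).

```python
import re
from pathlib import Path

# Assumed naming pattern: hadoop-<module>-<version>.jar, e.g. hadoop-azure-2.7.3.jar
HADOOP_JAR = re.compile(r"^(hadoop-[a-z-]+?)-(\d[\w.-]*)\.jar$")


def mismatched_hadoop_jars(jar_names, expected_version):
    """Return (name, version) pairs for hadoop-* JARs not built for expected_version."""
    mismatches = []
    for name in jar_names:
        match = HADOOP_JAR.match(name)
        # Non-Hadoop JARs (azure-storage, guava, ...) are ignored by this check.
        if match and not match.group(2).startswith(expected_version):
            mismatches.append((name, match.group(2)))
    return mismatches


def scan_directory(directory, expected_version):
    """Apply the check to every .jar file directly inside directory."""
    names = sorted(p.name for p in Path(directory).glob("*.jar"))
    return mismatched_hadoop_jars(names, expected_version)
```

For example, scan_directory("/usr/hdp/current/hadoop-client", "2.7.3") would list any Hadoop JARs in that directory built for a different client version. It only inspects file names, so it cannot detect version-specific dependencies inside third-party JARs.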

In the future when we have an extension registry, we could presumably
publish variations of the nifi-hadoop-nar + nifi-hadoop-libraries-nar
built against different versions of the Hadoop client (2.7.x, 2.8.x,
HDP, CDH, MapR, etc), and with the component versioning work currently
going on in master, it would be easy for people to run as many of
these in parallel as they want.

For now I think the easiest thing to do is maintain your own build of
nifi-hadoop-libraries-nar by changing the version mentioned above.

At some point the NiFi community will likely move to a newer Hadoop
client as they come out (we fairly recently moved from 2.6.x to
2.7.x), but this is a bigger decision that depends on how stable the
client is and what (if any) ramifications it has for compatibility.

Thanks,

Bryan



On Tue, Apr 4, 2017 at 11:20 AM, Giovanni Lanzani
<giovannilanz...@godatadriven.com> wrote:
> Hi Bryan,
>
> Thanks for the reply.
>
> Is there a way to compile NiFi using the Hadoop 2.8.0 libraries?
>
> It's of course unfortunate, but the libraries you mentioned before work only
> in their very specific versions. Once you use a newer version (like
> azure-storage-2.2.0), things seem to break.
>
> Maybe this jira [^1] could be reopened then? 😊
>
> Cheers,
>
> Giovanni
>
> [^1]: https://issues.apache.org/jira/browse/NIFI-1922
>
>> -----Original Message-----
>> From: Bryan Bende [mailto:bbe...@gmail.com]
>> Sent: Tuesday, April 4, 2017 3:59 PM
>> To: users@nifi.apache.org
>> Subject: Re: GetHDFS from Azure Blob
>>
>> Giovanni,
>>
>> I'm not that familiar with using a key provider, but NiFi currently bundles
>> the Hadoop 2.7.3 client, and looking at ProviderUtils from 2.7.3, there
>> doesn't appear to be a method "excludeIncompatibleCredentialProviders":
>>
>> https://github.com/apache/hadoop/blob/release-2.7.3-RC2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ProviderUtils.java
>>
>> It looks like it is introduced in 2.8.0:
>>
>> https://github.com/apache/hadoop/blob/release-2.8.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ProviderUtils.java
>>
>> Most likely some code that is present in one of the JARs specified through
>> Additional Resources is dependent on Hadoop 2.8.0, and since NiFi is bundling
>> 2.7.3, there are some things not lining up.
>>
>> -Bryan
>>
>>
>> On Tue, Apr 4, 2017 at 9:50 AM, Giovanni Lanzani
>> <giovannilanz...@godatadriven.com> wrote:
>> > Bryan,
>> >
>> > Allow me to chime in (to ask for help).
>> >
>> > What about when I'm using an encrypted key?
>> >
>> > In my case I have (in core-site.xml)
>> >
>> >    <property>
>> >       <name>fs.azure.account.keyprovider.nsanalyticsstorage.blob.core.windows.net</name>
>> >       <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value>
>> >    </property>
>> >
>> > Everything works from the command line (hdfs dfs).
>> >
>> > But NiFi complains with:
>> >
>> > java.lang.NoSuchMethodError:
>> > org.apache.hadoop.security.ProviderUtils.excludeIncompatibleCredentialProviders
>> >
>> > Any ideas? I've already linked hadoop-commons.jar as well (besides what
>> > you suggested below).
>> >
>> > Cheers,
>> >
>> > Giovanni
>> >
>> >
>> >> -----Original Message-----
>> >> From: Bryan Bende [mailto:bbe...@gmail.com]
>> >> Sent: Tuesday, March 28, 2017 7:41 PM
>> >> To: users@nifi.apache.org
>> >> Subject: Re: GetHDFS from Azure Blob
>> >>
>> >> Austin,
>> >>
>> >> Can you provide the full error message and stacktrace for  the
>> >> IllegalArgumentException from nifi-app.log?
>> >>
>> >> When you start the processor it creates a FileSystem instance based
>> >> on the config files provided to the processor, which in turn causes
>> >> all of the corresponding classes to load.
>> >>
>> >> I'm not that familiar with Azure, but if "Azure blob store" is WASB,
>> >> then I have successfully done the following...
>> >>
>> >> In core-site.xml:
>> >>
>> >> <configuration>
>> >>
>> >>     <property>
>> >>       <name>fs.defaultFS</name>
>> >>       <value>wasb://YOUR_USER@YOUR_HOST/</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>fs.azure.account.key.nifi.blob.core.windows.net</name>
>> >>       <value>YOUR_KEY</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>fs.AbstractFileSystem.wasb.impl</name>
>> >>       <value>org.apache.hadoop.fs.azure.Wasb</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>fs.wasb.impl</name>
>> >>       <value>org.apache.hadoop.fs.azure.NativeAzureFileSystem</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>fs.azure.skip.metrics</name>
>> >>       <value>true</value>
>> >>     </property>
>> >>
>> >> </configuration>
>> >>
>> >> In Additional Resources property of an HDFS processor, point to a
>> >> directory with:
>> >>
>> >> azure-storage-2.0.0.jar
>> >> commons-codec-1.6.jar
>> >> commons-lang3-3.3.2.jar
>> >> commons-logging-1.1.1.jar
>> >> guava-11.0.2.jar
>> >> hadoop-azure-2.7.3.jar
>> >> httpclient-4.2.5.jar
>> >> httpcore-4.2.4.jar
>> >> jackson-core-2.2.3.jar
>> >> jsr305-1.3.9.jar
>> >> slf4j-api-1.7.5.jar
>> >>
>> >>
>> >> Thanks,
>> >>
>> >> Bryan
>> >>
>> >>
>> >> On Tue, Mar 28, 2017 at 1:15 PM, Austin Heyne <ahe...@ccri.com> wrote:
>> >> > Hi all,
>> >> >
>> >> > Thanks for all the help you've given me so far. Today I'm trying to
>> >> > pull files from an Azure blob store. I've done some reading on this
>> >> > and from previous tickets [1] and guides [2] it seems the
>> >> > recommended approach is to place the jars required for the HDFS
>> >> > Azure protocol in 'Additional Classpath Resources' and the Hadoop
>> >> > core-site and hdfs-site configs in 'Hadoop Configuration
>> >> > Resources'. I have my local HDFS properly configured to access wasb
>> >> > URLs. I'm able to ls, copy to and from, etc. without problems.
>> >> > Using the same HDFS config files, and trying both all the jars in my
>> >> > hadoop-client/lib directory (HDP) and the jars recommended in
>> >> > [1], I'm still seeing the "java.lang.IllegalArgumentException: Wrong FS: "
>> >> > error in my NiFi logs and am unable to pull files from Azure blob
>> >> > storage.
>> >> >
>> >> > Interestingly, it seems the processor is spinning up way too fast:
>> >> > the errors appear in the log as soon as I start the processor. I'm
>> >> > not sure how it could be loading all of those jars that quickly.
>> >> >
>> >> > Does anyone have any experience with this or recommendations to try?
>> >> >
>> >> > Thanks,
>> >> > Austin
>> >> >
>> >> > [1] https://issues.apache.org/jira/browse/NIFI-1922
>> >> > [2] https://community.hortonworks.com/articles/71916/connecting-to-azure-data-lake-from-a-nifi-dataflow.html
>> >> >
>> >> >
