Yes, I can run hadoop fs -ls against the MapR distribution from the NiFi node. The NiFi node has the full MapR client setup.

All the JAR files that are part of the re-compiled Hadoop-bundles-nar are also on the class path I provided. The NiFi processor doesn’t seem to be resolving classes from the overridden class path.

Thanks,
Ravi Papisetti

From: Andre <andre-li...@fucs.org>
Reply-To: "andre-li...@fucs.org" <andre-li...@fucs.org>
Date: Tuesday, 27 March 2018 at 8:27 PM
To: Cisco Employee <rpapi...@cisco.com>
Cc: "users@nifi.apache.org" <users@nifi.apache.org>, "andre-li...@fucs.org" <andre-li...@fucs.org>
Subject: Re: PutHDFS with mapr

Ravi,

I assume the MapR client package is working and operational, and that you can log in as the uid running NiFi and issue the following successfully:

$ maprlogin authtest
$ maprlogin print
$ hdfs dfs -ls /

If those fail, fix them before you proceed.

If those work, I would say the issue is likely caused by the additional class path not being complete.

From the documentation:

A comma-separated list of paths to files and/or directories that will be added to the classpath. When specifying a directory, all files with in the directory will be added to the classpath, but further sub-directories will not be included.

I don't have a mapr-client instance handy, but my next step would be ensuring the list of directories and subdirectories is complete and, if not, adding individual JAR files.

It should work.

On Tue, Mar 27, 2018 at 1:56 AM, Ravi Papisetti (rpapiset) <rpapi...@cisco.com> wrote:

Hi Andre,

I have tried pointing PutHDFS to the lib class path with:

/opt/mapr/lib,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/lib

I gave this value for the “Additional Classpath Resources” property of the PutHDFS processor and got the exception below. Please note that I tried this with NiFi 1.5.

2018-03-26 14:47:51,305 ERROR [StandardProcessScheduler Thread-6] o.a.n.controller.StandardProcessorNode Failed to invoke @OnScheduled method due to java.lang.RuntimeException: Failed while executing one of processor's OnScheduled task.
java.lang.RuntimeException: Failed while executing one of processor's OnScheduled task.
    at org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1504)
    at org.apache.nifi.controller.StandardProcessorNode.initiateStart(StandardProcessorNode.java:1330)
    at org.apache.nifi.controller.StandardProcessorNode.lambda$initiateStart$1(StandardProcessorNode.java:1358)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at org.apache.nifi.controller.StandardProcessorNode.invokeTaskAsCancelableFuture(StandardProcessorNode.java:1487)
    ... 9 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137)
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:125)
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:70)
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:47)
    at org.apache.nifi.controller.StandardProcessorNode$1.call(StandardProcessorNode.java:1334)
    at org.apache.nifi.controller.StandardProcessorNode$1.call(StandardProcessorNode.java:1330)
    ... 6 common frames omitted
Caused by: java.io.IOException: No FileSystem for scheme: maprfs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172)
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor$1.run(AbstractHadoopProcessor.java:322)
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor$1.run(AbstractHadoopProcessor.java:319)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.getFileSystemAsUser(AbstractHadoopProcessor.java:319)
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.resetHDFSResources(AbstractHadoopProcessor.java:281)
    at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.abstractOnScheduled(AbstractHadoopProcessor.java:205)

Thanks,
Ravi Papisetti

From: Andre <andre-li...@fucs.org>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>, "andre-li...@fucs.org" <andre-li...@fucs.org>
Date: Sunday, 25 March 2018 at 3:56 PM
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: Re: PutHDFS with mapr

Joey,

Yes. The client must be installed and set up (this is a requirement for the compiled NiFi as well).

Without the client installed and configured, the MapR libraries (Java and native) would not know which ZK to connect to in order to get information about the CLDB (their alternative to the NameNode).

Cheers

On Mon, Mar 26, 2018 at 1:20 AM, Joey Frazee <joey.fra...@icloud.com> wrote:

I'm kinda going on memory here because I lost some notes I had about doing this, but I think compiling against the MapR libs presumes you also have the C-based MapR client libs on your machine at compile time and run time.

I skimmed that blog post, albeit very quickly, and didn't see that explicitly mentioned in there.

Using the additional jars in PutHDFS would presumably require them too. Andre, that's correct, isn't it?

On Mar 24, 2018, 8:26 AM -0500, Mark Payne <marka...@hotmail.com>, wrote:

Andre,

I knew this was possible but had no idea how. Thanks for the great explanation and associated caveats!

-Mark

On Mar 24, 2018, at 1:04 AM, Andre <andre-li...@fucs.org> wrote:

Ravi,

There are two ways of solving this.
One of them (suggested to me by MapR representatives) is to deploy MapR's FUSE client to your NiFi nodes, use the PutFile processor instead of PutHDFS, and let the MapR client coordinate the API engagement with MapR-FS. This is a very clean and robust approach; however, it may have licensing implications, as the FUSE client is licensed (per node, if I recall correctly).

The other one is to use the out-of-the-box PutHDFS processor with a bit of configuration (it works on both Secure and Insecure clusters). Try this out:

Instead of recompiling PutHDFS, simply point it to the mapr-client jars and use a core-site.xml with the following content:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>maprfs:///</value>
  </property>
</configuration>

Please note that the MapR clients don't play nicely with Kerberos, and you will be required to use a MapR ticket to access the system. This can be easily done by:

sudo -u <whatever_user_nifi_uses> sh -c "kinit -kt /path/to/your/keytab && maprlogin kerberos"

Cheers

[1] https://lists.apache.org/thread.html/af9244266e89990618152bb59b5bf95c9a49dc2428ea3fa0e6aaa682@%3Cusers.nifi.apache.org%3E
[2] https://cwiki.apache.org/confluence/x/zI5zAw

On Fri, Mar 23, 2018 at 5:05 AM, Ravi Papisetti (rpapiset) <rpapi...@cisco.com> wrote:

Hi,

I have re-compiled NiFi with MapR dependencies as per the instructions at http://hariology.com/integrating-mapr-fs-and-apache-nifi/

I created a process flow with ListFile > FetchFile > PutHDFS. As soon as I start this process group, nifi-bootstrap.log starts filling with:

2018-03-21 22:56:26,806 ERROR [NiFi logging handler] org.apache.nifi.StdErr 2018-03-21 22:56:26,8003 select failed(-1) error Invalid argument
2018-03-21 22:56:26,806 ERROR [NiFi logging handler] org.apache.nifi.StdErr 2018-03-21 22:56:26,8003 select failed(-1) error Invalid argument

This log grows to GBs in minutes. I had to stop NiFi to stop the flooding.
I found a similar issue in the Pentaho Jira: https://jira.pentaho.com/browse/PDI-16270

Does anyone have any thoughts on why this error might be occurring? I'd appreciate any help.

Thanks,
Ravi Papisetti
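
Following up on Andre's point in this thread that a directory given in "Additional Classpath Resources" is not searched recursively: a minimal shell sketch for building the complete comma-separated list might look like the one below. The function name expand_classpath_dirs is made up for illustration and is not part of NiFi or the MapR client. It prints every directory under the given roots that directly contains at least one .jar file. It has not been verified against a MapR client install, and nothing in this thread confirms that a complete list resolves the "No FileSystem for scheme: maprfs" error.

```shell
#!/bin/sh
# Hypothetical helper (not part of NiFi or MapR): print a comma-separated
# list of every directory under the given roots that directly holds a .jar.
# Assumption: the output is pasted into the PutHDFS
# "Additional Classpath Resources" property, which does not recurse
# into sub-directories on its own.
expand_classpath_dirs() {
    for root in "$@"; do
        # Ignore roots that do not exist on this node.
        [ -d "$root" ] || continue
        # Emit one line per jar: the directory that contains it.
        find "$root" -type f -name '*.jar' -exec dirname {} \;
    done | sort -u | paste -sd, -
}
```

For the locations mentioned in this thread, the call would be: expand_classpath_dirs /opt/mapr/lib /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop. Listing directories rather than individual JAR files keeps the property value short; paths containing commas or newlines are not handled.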