need help with Async code

2015-09-27 Thread Sumanth Chinthagunta

Hi All,
I am new to NiFi and I'm stuck on a couple of issues:

1. Unable to get hold of a ControllerService from the Processor's init method. 
I wanted to pre-set some dependencies during the init phase instead of 
querying them repeatedly in the onTrigger method. 
I am getting null for the service and am not sure what I have to pass for 
'serviceIdentifier'. I couldn't find documentation or examples on how to give 
an identifier to a service.


final VertxServiceInterface vertxService = (VertxServiceInterface)
    context.getControllerServiceLookup().getControllerService("VertxService");


https://github.com/xmlking/nifi-websocket/blob/master/src/main/java/com/crossbusiness/nifi/processors/PutEventBus.java#L55
 


2. For my use case, I get data published to a topic from the EventBus with the
following code.

EventBus eb = vertx.eventBus();

eb.consumer("news.uk.sport", message -> {
    System.out.println("I have received a message: " + message.body());
});

I am working on a data ingest processor (push-based) that needs to 
listen for new messages on a topic and send them into the flow as FlowFiles. 
In my case the data source is the EventBus, which emits messages via a 
callback API. 
I am looking for ideas on how to call the Processor's onTrigger method when 
the above callback is invoked. 
Should I use my own intermediate queue and poll it in the onTrigger 
method? 
Is there a better way to trigger the onTrigger method programmatically?

Thanks 
Sumo
 

nifi question

2015-09-27 Thread Sumanth Chinthagunta

Hi All,
I am new to NiFi and I'm stuck on a couple of issues:

1. Unable to get hold of a ControllerService from the Processor's init method. 
I wanted to pre-set some dependencies in my Processor during the init phase 
instead of querying them repeatedly in the onTrigger method. 
I am getting null for the service and am not sure what I have to pass for 
'serviceIdentifier'. I couldn't find documentation or examples on how to give 
an identifier to a service.


final VertxServiceInterface vertxService = (VertxServiceInterface)
    context.getControllerServiceLookup().getControllerService("VertxService");


https://github.com/xmlking/nifi-websocket/blob/master/src/main/java/com/crossbusiness/nifi/processors/PutEventBus.java#L55
 


2. For my use case, I get data published to a topic from the EventBus with the
following code.

EventBus eb = vertx.eventBus();

eb.consumer("news.uk.sport", message -> {
    System.out.println("I have received a message: " + message.body());
});

I am working on a data ingest processor (push-based) that needs to 
listen for new messages on a topic and send them into the flow as FlowFiles. 
In my case the data source is the EventBus, which emits messages via a 
callback API. 
I am looking for ideas on how to call the Processor's onTrigger method when 
the above callback is invoked. 
Should I use my own intermediate queue and poll it in the onTrigger 
method? 
Is there a better way to trigger the onTrigger method programmatically?

Thanks 
Sumo

Re: nifi question

2015-09-28 Thread Sumanth Chinthagunta
Thanks Mark for pointing me to the GetTwitter example. I will try to follow this 
pattern. 
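
For reference, a minimal sketch of that queue-based pattern, assuming a Vert.x
EventBus handler is registered when the processor is scheduled (the class name,
the vertx reference, and REL_SUCCESS are illustrative; NiFi imports omitted):

import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class GetEventBus extends AbstractProcessor {

    // bounded queue, so the Java heap cannot fill up while the processor is stopped
    private final BlockingQueue<String> messages = new LinkedBlockingQueue<>(1000);

    @OnScheduled
    public void register(final ProcessContext context) {
        // hypothetical: 'vertx' would come from the Vert.x controller service;
        // offer() silently drops messages when the queue is full
        vertx.eventBus().consumer("news.uk.sport",
                message -> messages.offer(String.valueOf(message.body())));
    }

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) {
        final String body = messages.poll();
        if (body == null) {
            context.yield();   // nothing queued right now
            return;
        }
        FlowFile ff = session.create();
        ff = session.write(ff, out -> out.write(body.getBytes(StandardCharsets.UTF_8)));
        session.transfer(ff, REL_SUCCESS);
    }
}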
The PropertyDescriptor is only available in the onTrigger method via ProcessContext. I 
needed to get hold of the service from the initialize method using 
ProcessorInitializationContext, as documented in the Developer Guide. This 
helps me make the expensive service calls once instead of every time the onTrigger 
method is invoked. 
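
And a minimal sketch of the PropertyDescriptor-based lookup Mark describes below,
assuming a VertxServiceInterface controller-service interface (the property name
is illustrative):

public static final PropertyDescriptor VERTX_SERVICE = new PropertyDescriptor.Builder()
        .name("Vertx Service")
        .description("The Controller Service that provides the Vert.x EventBus")
        .identifiesControllerService(VertxServiceInterface.class)
        .required(true)
        .build();

// later, e.g. in an @OnScheduled method or in onTrigger:
final VertxServiceInterface vertxService = context.getProperty(VERTX_SERVICE)
        .asControllerService(VertxServiceInterface.class);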

Thanks 
Sumo
 

> On Sep 28, 2015, at 5:32 AM, Mark Payne  wrote:
> 
> Sumo,
> 
> The preferred mechanism for obtaining a Controller Service is to do so via a 
> PropertyDescriptor. You would
> specify that the property represents a Controller Service by using the 
> identifiesControllerService(Class) method.
> This is discussed in the "Interacting with a Controller Service" section of 
> the Developer's Guide [1].
> 
> In terms of communicating with some asynchronous process, generally the best 
> solution is to use a queue and
> then read from that queue in the onTrigger method. You will not be able to 
> call the onTrigger method properly
> yourself, as it is designed to be called by the framework at the appropriate 
> time. I would recommend using
> a BlockingQueue with a small capacity, as you do not want your Java Heap to 
> fill up with objects from this
> Processor if the Processor is stopped for a while. You can look at how this 
> is done in the GetTwitter processor,
> if you would like to have an example to look at.
> 
> Let us know if you have any more questions, or if anything is still not clear!
> 
> Thanks
> -Mark
> 
> 
> [1] 
> http://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#interacting-with-controller-service
> 
> 
> 
>> On Sep 27, 2015, at 3:53 PM, Sumanth Chinthagunta  wrote:
>> 
>> 
>> Hi All,
>> I am new to NiFi and I'm stuck on a couple of issues:
>> 
>> 1. Unable to get hold of a ControllerService from the Processor's init method. 
>> I wanted to pre-set some dependencies in my Processor during the init phase 
>> instead of querying them repeatedly in the onTrigger method. 
>> I am getting null for the service and am not sure what I have to pass for 
>> 'serviceIdentifier'. I couldn't find documentation or examples on how to 
>> give an identifier to a service.
>> 
>> final VertxServiceInterface vertxService = (VertxServiceInterface)
>>     context.getControllerServiceLookup().getControllerService("VertxService");
>> 
>> https://github.com/xmlking/nifi-websocket/blob/master/src/main/java/com/crossbusiness/nifi/processors/PutEventBus.java#L55
>> 
>> 2. For my use case, I get data published to a topic from the EventBus with 
>> the following code.
>> 
>> EventBus eb = vertx.eventBus();
>> 
>> eb.consumer("news.uk.sport", message -> {
>>     System.out.println("I have received a message: " + message.body());
>> });
>> 
>> I am working on a data ingest processor (push-based) that needs to 
>> listen for new messages on a topic and send them into the flow as FlowFiles. 
>> In my case the data source is the EventBus, which emits messages via a 
>> callback API. 
>> I am looking for ideas on how to call the Processor's onTrigger method when 
>> the above callback is invoked. 
>> Should I use my own intermediate queue and poll it in the onTrigger 
>> method? 
>> Is there a better way to trigger the onTrigger method programmatically?
>> 
>> Thanks 
>> Sumo
> 



Re: nifi question

2015-09-28 Thread Sumanth Chinthagunta
Great. That is exactly what I needed.

> On Sep 28, 2015, at 2:42 PM, Mark Payne  wrote:
> 
> Sumo,
> 
> Generally, the approach that we take is to perform that type of operation in 
> a method that has the @OnScheduled annotation.
> 
> Then you do it only once when the Processor is scheduled.
> 
> If you want to ensure that it happens only one time for the lifecycle of the 
> JVM, you could use a boolean to keep track of whether or
> not the action has been performed. For example:
> 
> private volatile boolean expensiveActionPerformed = false;
> 
> @OnScheduled
> public void doExpensiveSetup(final ProcessContext context) {
>     if (!expensiveActionPerformed) {
>         // do expensive action
>         expensiveActionPerformed = true;
>     }
> }
> 
> 
> Thanks
> -Mark
> 
> 
>> On Sep 28, 2015, at 5:38 PM, Sumanth Chinthagunta  wrote:
>> 
>> Thanks Mark for pointing me to the GetTwitter example. I will try to follow this 
>> pattern. 
>> The PropertyDescriptor is only available in the onTrigger method via ProcessContext. 
>> I needed to get hold of the service from the initialize method using 
>> ProcessorInitializationContext, as documented in the Developer Guide. This 
>> helps me make the expensive service calls once instead of every time the onTrigger 
>> method is invoked. 
>> 
>> Thanks 
>> Sumo
>> 
>> 
>>> On Sep 28, 2015, at 5:32 AM, Mark Payne  wrote:
>>> 
>>> Sumo,
>>> 
>>> The preferred mechanism for obtaining a Controller Service is to do so via 
>>> a PropertyDescriptor. You would
>>> specify that the property represents a Controller Service by using the 
>>> identifiesControllerService(Class) method.
>>> This is discussed in the "Interacting with a Controller Service" section of 
>>> the Developer's Guide [1].
>>> 
>>> In terms of communicating with some asynchronous process, generally the 
>>> best solution is to use a queue and
>>> then read from that queue in the onTrigger method. You will not be able to 
>>> call the onTrigger method properly
>>> yourself, as it is designed to be called by the framework at the 
>>> appropriate time. I would recommend using
>>> a BlockingQueue with a small capacity, as you do not want your Java Heap to 
>>> fill up with objects from this
>>> Processor if the Processor is stopped for a while. You can look at how this 
>>> is done in the GetTwitter processor,
>>> if you would like to have an example to look at.
>>> 
>>> Let us know if you have any more questions, or if anything is still not 
>>> clear!
>>> 
>>> Thanks
>>> -Mark
>>> 
>>> 
>>> [1] 
>>> http://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#interacting-with-controller-service
>>> 
>>> 
>>> 
>>>> On Sep 27, 2015, at 3:53 PM, Sumanth Chinthagunta  
>>>> wrote:
>>>> 
>>>> 
>>>> Hi All,
>>>> I am new to NiFi and I'm stuck on a couple of issues:
>>>> 
>>>> 1. Unable to get hold of a ControllerService from the Processor's init method. 
>>>> I wanted to pre-set some dependencies in my Processor during the init phase 
>>>> instead of querying them repeatedly in the onTrigger method. 
>>>> I am getting null for the service and am not sure what I have to pass for 
>>>> 'serviceIdentifier'. I couldn't find documentation or examples on how to 
>>>> give an identifier to a service.
>>>> 
>>>> final VertxServiceInterface vertxService = (VertxServiceInterface)
>>>>     context.getControllerServiceLookup().getControllerService("VertxService");
>>>> 
>>>> https://github.com/xmlking/nifi-websocket/blob/master/src/main/java/com/crossbusiness/nifi/processors/PutEventBus.java#L55
>>>> 
>>>> 2. For my use case, I get data published to a topic from the EventBus with 
>>>> the following code.
>>>> 
>>>> EventBus eb = vertx.eventBus();
>>>> 
>>>> eb.consumer("news.uk.sport", message -> {
>>>>     System.out.println("I have received a message: " + message.body());
>>>> });
>>>> 
>>>> I am working on a data ingest processor (push-based) that needs to 
>>>> listen for new messages on a topic and send them into the flow as FlowFiles. 
>>>> In my case the data source is the EventBus, which emits messages via a 
>>>> callback API. 
>>>> I am looking for ideas on how to call the Processor's onTrigger method when 
>>>> the above callback is invoked. 
>>>> Should I use my own intermediate queue and poll it in the onTrigger 
>>>> method? 
>>>> Is there a better way to trigger the onTrigger method programmatically?
>>>> 
>>>> Thanks 
>>>> Sumo
>>> 
>> 
> 



Re: need help with Async code

2015-09-30 Thread Sumanth Chinthagunta
Thanks for clarifying the getControllerServiceIdentifiers API.
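
For reference, a minimal sketch of that init-time lookup (assuming at least one
VertxServiceInterface instance is configured; any returned identifier could then be
passed to getControllerService):

// inside init(ProcessorInitializationContext context)
final Set<String> ids = context.getControllerServiceLookup()
        .getControllerServiceIdentifiers(VertxServiceInterface.class);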
I have another question:
If I have a processor that is designed to have no side effect on the FlowFile, what 
is the best/cleanest way to read the content of the FlowFile?
E.g., my processor's only job is to log the content of the FlowFile. Is there a method 
like FlowFile.getContentAsString(), or do I have to do what I am doing here? 
https://github.com/xmlking/nifi-websocket/blob/master/src/main/java/com/crossbusiness/nifi/processors/PublishEventBus.java#L102
 
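
For reference, there is no FlowFile.getContentAsString(); content is read through the
session. A minimal sketch, assuming UTF-8 content and commons-io on the classpath:

final AtomicReference<String> content = new AtomicReference<>();
session.read(flowFile, new InputStreamCallback() {
    @Override
    public void process(final InputStream in) throws IOException {
        content.set(IOUtils.toString(in, StandardCharsets.UTF_8));
    }
});
getLogger().info("FlowFile content: {}", new Object[]{content.get()});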

Thanks
Sumo 

> On Sep 30, 2015, at 2:52 PM, Aldrin Piri  wrote:
> 
> Sumo,
> 
> I did some digging around on your Github Repo and see that you've migrated
> your ControllerService lookup to your @OnScheduled method, making use of
> the ProcessContext.  This approach is certainly more preferred in terms of
> allowing configuration of the Processor than the prior method you outlined
> above.  That identifier is the unique ID generated by the framework for a
> new controller service.  However, to close the trail on the previous path,
> from within init, you would have needed to do something along the lines of:
> 
> context.getControllerServiceLookup().getControllerServiceIdentifiers(
> VertxServiceInterface.class)
> 
> to find all the instances available and then choose one of those
> identifiers, if any were present, at the time the processor was initialized.
> 
> To your second question, I believe you are on the right track, although I
> am not overly familiar with Vertx.  This seems to map quite closely with
> the JMS family of processors (GetJMSTopic [1] in conjunction with its
> abstract parent JmsConsumer [2]). If you find you need more granular
> control of the session, you can create a Processor that extends
> AbstractSessionFactoryProcessor instead of AbstractProcessor.
> 
> Feel free to follow up with any additional questions or details you may
> have.
> 
> Thanks!
> 
> [1]
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/GetJMSTopic.java
>  
> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/GetJMSTopic.java>
> [2]
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/JmsConsumer.java
>  
> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/JmsConsumer.java>
> 
> On Sun, Sep 27, 2015 at 3:24 PM, Sumanth Chinthagunta  <mailto:xmlk...@gmail.com>>
> wrote:
> 
>> 
>> Hi All,
>> I am new to NiFi and I'm stuck on a couple of issues:
>> 
>> 1. Unable to get hold of a ControllerService from the Processor's init method.
>> I wanted to pre-set some dependencies during the init phase instead
>> of querying them repeatedly in the onTrigger method.
>> I am getting null for the service and am not sure what I have to pass for
>> 'serviceIdentifier'. I couldn't find documentation or examples on how to
>> give an identifier to a service.
>> 
>> final VertxServiceInterface vertxService = (VertxServiceInterface)
>>     context.getControllerServiceLookup().getControllerService("VertxService");
>> 
>> https://github.com/xmlking/nifi-websocket/blob/master/src/main/java/com/crossbusiness/nifi/processors/PutEventBus.java#L55
>> 
>> 2. For my use case, I get data published to a topic from the EventBus with
>> the following code.
>> 
>> EventBus eb = vertx.eventBus();
>> 
>> eb.consumer("news.uk.sport", message -> {
>>     System.out.println("I have received a message: " + message.body());
>> });
>> 
>> I am working on a data ingest processor (push-based) that needs to
>> listen for new messages on a topic and send them into the flow as FlowFiles.
>> In my case the data source is the EventBus, which emits messages via a
>> callback API.
>> I am looking for ideas on how to call the Processor's onTrigger method
>> when the above callback is invoked.
>> Should I use my own intermediate queue and poll it in the onTrigger
>> method?
>> Is there a better way to trigger the onTrigger method programmatically?
>> 
>> Thanks
>> Sumo



attributes to json

2015-10-03 Thread Sumanth Chinthagunta
Do we have an attributes-to-JSON processor? 
I am thinking of using it along with ExtractText, where the matching data is 
stored in attributes. Now I need to convert those attributes into a JSON FlowFile.
If we don't have such a processor, any ideas on how to compose what I need from 
existing processors?
thanks
sumo 

Re: attributes to json

2015-10-04 Thread Sumanth Chinthagunta
Thanks Bryan. 
I guess I am not reading the docs. I wish we had more examples and blog posts for 
each existing processor.
Developers prefer to learn from examples rather than reading docs :-)

I am trying to consolidate all flow examples on GitHub. 
I just finished a flow that connects NiFi to a streaming dashboard:  
https://github.com/xmlking/nifi-examples/tree/master/collect-stream-logs

sumo
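
For reference, Bryan's ReplaceText suggestion below extends naturally to several
attributes (the attribute names here are illustrative):

Replacement Text = { "database" : "${database}", "table" : "${table}", "id" : "${id}" }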

> On Oct 3, 2015, at 5:46 PM, Bryan Bende  wrote:
> 
> Hello,
> 
> You could use ReplaceText to build a new Json document from flow file
> attributes. Something like...
> 
> Replacement Text = { "myField" : "${attribute1}" }
> 
> -Bryan
> 
> On Saturday, October 3, 2015, Sumanth Chinthagunta 
> wrote:
> 
>> Do we have an attributes-to-JSON processor?
>> I am thinking of using it along with ExtractText, where the matching data is
>> stored in attributes. Now I need to convert those attributes into a JSON
>> FlowFile.
>> If we don't have such a processor, any ideas on how to compose what I need
>> from existing processors?
>> thanks
>> sumo
> 
> 
> 
> -- 
> Sent from Gmail Mobile



Re: NiFi Authentication Mechanisms

2015-10-05 Thread Sumanth Chinthagunta
JSON Web Tokens (JWT) could be an option.
They provide the claims required for authorization without needing
verification with the issuer.
Auth0.com has more info on this method.
A JWT can be used to propagate identity along the flow so that it can be used
later by processors.

On Mon, Oct 5, 2015, 7:41 PM Rick Braddy  wrote:

> SSO is another important consideration.
>
> Spring Security looks like a winner. Very impressive list of support.
>
>
> > On Oct 5, 2015, at 9:34 PM, larry mccay  wrote:
> >
> > The wiki page seems to describe continuing to use Spring Security.
> > I believe this to be a wise choice.
> >
> > I would encourage you to try and expose the capabilities of that
> framework
> > as much as possible rather than providing support for a constrained set
> of
> > providers.
> >
> > SSO integrations are becoming important for a number of ecosystem
> projects
> > and UIs for instance.
> > The ability to add a custom authentication provider will be important for
> > such usecases.
> >
> >> On Mon, Oct 5, 2015 at 10:10 PM, Tony Kurc  wrote:
> >>
> >> I'd like to see Duo Web two-factor
> https://www.duosecurity.com/docs/duoweb
> >>
> >>> On Mon, Oct 5, 2015 at 10:00 PM, Rick Braddy 
> wrote:
> >>>
> >>> 1) Basic password authentication with Recaptcha after N failed logins
> >>> (encrypted password storage)
> >>>
> >>> 2) 2-factor Google Auth option to supplement password logins
> >>>
> >>> 3) Active Directory / Kerberos auth (with 2-factor option as well)
> >>>
>  On Oct 5, 2015, at 8:56 PM, Joe Witt  wrote:
> 
>  Thanks Rick.  If you were to say which of that you'd want 'first' and
>  then which you can see coming later please advise.
> 
>  All: Please do just that - let us know which you need 'now' and which
>  you can wait on.
> 
>  Thanks
>  Joe
> 
> > On Mon, Oct 5, 2015 at 9:53 PM, Rick Braddy 
> >>> wrote:
> > Matt,
> >
> > Here you go:
> >
> > -  2-factor Google Authenticator to supplement password auth (e.g. to
> >>> strengthen password with mobile phone onetime ID or other support
> strong
> >>> auth options)
> >
> > - Recaptcha required after N failed password login attempts to block
> >>> brute force attacks (e.g. 5 failed logins, then captcha required)
> >
> > - Password strength policies
> >
> > - PAM support provides pluggable authentication options, at least for
> >>> Linux (better than locally stored passwords)
> >
> > - Active Directory Kerberos integration (Windows native and Linux)
> >
> > If passwords to be stored locally, must be encrypted.
> >
> > Hope that helps.
> >
> > Rick
> >
> >> On Oct 5, 2015, at 8:34 PM, Matt Gilman 
> >>> wrote:
> >>
> >> All,
> >>
> >> I've started working on providing additional authentication
> >> mechanisms
> >>> for
> >> the NiFi user interface. Currently, only two way SSL using client
> >> certificates is supported to authenticate users. I would like to
> >>> inquire
> >> about which other mechanisms the community would like to see
> >>> implemented.
> >>
> >> We have created a feature proposal discussing some of the options
> >> [1].
> >>> At a
> >> high level, in addition to PKI, we are looking at
> >>
> >> - Username/password
> >> -- stored in a local configuration file (ie authorized-users.xml)
> >> -- stored in a configurable LDAP
> >> -- stored in a configurable database
> >> - Kerberos
> >> - OpenId Connect
> >>
> >> What other options are important and should be added to the list?
> >>> Thanks!
> >>
> >> Matt
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/display/NIFI/Pluggable+Authentication
> >>
>


MonitorActivity bug?

2015-10-17 Thread Sumanth Chinthagunta
Noticed that "copy attributes" is implemented only for the REL_ACTIVITY_RESTORED
case.
I needed the original attributes for the REL_INACTIVE case as well.
I wonder why it is limited to the activity-restored relationship only.


https://github.com/apache/nifi/blob/31fba6b3332978ca2f6a1d693f6053d719fb9daa/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/MonitorActivity.java#L208

-- 
Thanks,
Sumanth


Re: NiFi won't start with my Processor

2015-10-21 Thread Sumanth Chinthagunta
I noticed a similar error when I had a flow using an older version of a custom
nar and then deployed a new version of the nar with package refactoring.
My workaround to bring the server back was to redeploy the old nar, start NiFi,
delete the flow, and then recreate the flow after deploying the refactored nar.
NiFi should start even if a nar file is replaced or removed.

On Wed, Oct 21, 2015, 9:21 AM Joe Witt  wrote:

> Ok so in seeing this part of the stackdump we know there is a
> 'classpath thing' going on
>
>"Caused by: java.lang.ClassNotFoundException:
> org.mitre.nifi.NiFiNITFReader"
>
> And we see from your provided info that your jar does contain such a class
>
>   "22243 Wed Oct 21 10:36:46 EDT 2015 org/mitre/nifi/NiFiNITFReader.class"
>
> And we see that you have a service loader manifest
>
>   "808 Wed Oct 21 10:36:42 EDT 2015
> META-INF/services/org.apache.nifi.processor.Processor"
>
> QUESTIONS:
> ---
> Can you please provide content of that service manifest file?
>
> Also can you please provide the log information that writes out on
> startup which shows the processors/extensions that load?
>
> It looks like this...
>
> 2015-10-21 09:20:32,520 INFO [main]
> org.apache.nifi.nar.ExtensionManager Extension Type Mapping to
> Classloader:
> === ProvenanceEventRepository type || Classloader ===
> org.apache.nifi.provenance.PersistentProvenanceRepository ||
>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-provenance-repository-nar-0.3.1-SNAPSHOT.nar-unpacked]
> org.apache.nifi.provenance.VolatileProvenanceRepository ||
>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-provenance-repository-nar-0.3.1-SNAPSHOT.nar-unpacked]
> === End ProvenanceEventRepository types ===
> === Processor type || Classloader ===
> org.apache.nifi.processors.hl7.RouteHL7 ||
>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-hl7-nar-0.3.1-SNAPSHOT.nar-unpacked]
> org.apache.nifi.processors.standard.MergeContent ||
>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-standard-nar-0.3.1-SNAPSHOT.nar-unpacked]
> org.apache.nifi.processors.standard.EncryptContent ||
>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-standard-nar-0.3.1-SNAPSHOT.nar-unpacked]
> org.apache.nifi.processors.aws.s3.PutS3Object ||
>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-aws-nar-0.3.1-SNAPSHOT.nar-unpacked]
> .
>
>
> Thanks!
> Joe
>
> On Wed, Oct 21, 2015 at 8:51 AM, Jones, Patrick L.  wrote:
> > Howdy,
> >
> >I have created my own processor which I have been testing
> with NiFi for a while now. This stuff used to work for me, now it doesn't.
> Today nifi won't start.  The nifi-app.log is below
> > The processor reads in an image type NITF and does a few things with
> it.  I do a:
> > $ mvn install
> >
> > from ~/nifi-0.3.0/nifi-mitre-bundle/ which is where my software is. I
> then
> > $ cp
> ~/nifi-0.3.0/nifi-mitre-bundle/nifi-mitre-nar/target/nifi-mitre-nar-1.0-SNAPSHOT.nar
> ../lib/
> >
> > When I start nifi the nifi-app.log shows the below exception with the
> bottom line being
> > Caused by: java.lang.ClassNotFoundException:
> org.mitre.nifi.NiFiNITFReader
> >
> > I looked in my lib/nifi-mitre-nar-1.0-SNAPSHOT.nar it contains
> > 0 Wed Oct 21 10:36:50 EDT 2015 META-INF/
> >152 Wed Oct 21 10:36:48 EDT 2015 META-INF/MANIFEST.MF
> >  0 Wed Oct 21 10:36:48 EDT 2015 META-INF/bundled-dependencies/
> > ...
> > 22201 Wed Oct 21 10:36:48 EDT 2015
> META-INF/bundled-dependencies/nifi-mitre-processors-1.0-SNAPSHOT.jar
> > ...
> >   2963 Wed Oct 21 10:36:48 EDT 2015 META-INF/DEPENDENCIES
> > 11358 Wed Oct 21 10:36:48 EDT 2015 META-INF/LICENSE
> >155 Wed Oct 21 10:36:48 EDT 2015 META-INF/NOTICE
> >  0 Wed Oct 21 10:36:50 EDT 2015 META-INF/maven/
> >  0 Wed Oct 21 10:36:50 EDT 2015 META-INF/maven/mitre/
> >  0 Wed Oct 21 10:36:50 EDT 2015 META-INF/maven/mitre/nifi-mitre-nar/
> >   1562 Fri Oct 09 11:19:10 EDT 2015
> META-INF/maven/mitre/nifi-mitre-nar/pom.xml
> >111 Wed Oct 21 10:36:48 EDT 2015
> META-INF/maven/mitre/nifi-mitre-nar/pom.properties
> >
> > I then unjarred the .nar file and looked at
> META-INF/bundled-dependencies/nifi-mitre-processors-1.0-SNAPSHOT.jar
> >
> > $ jar tvf
> ~/dum/META-INF/bundled-dependencies/nifi-mitre-processors-1.0-SNAPSHOT.jar
> >  0 Wed Oct 21 10:36:48 EDT 2015 META-INF/
> >412 Wed Oct 21 10:36:46 EDT 2015 META-INF/MANIFEST.MF
> >  0 Wed Oct 21 10:36:42 EDT 2015 META-INF/services/
> >  0 Wed Oct 21 10:36:44 EDT 2015 org/
> >  0 Wed Oct 21 10:36:44 EDT 2015 org/mitre/
> >  0 Wed Oct 21 10:36:46 EDT 2015 org/mitre/nifi/
> >   2730 Wed Oct 21 10:36:42 EDT 2015 META-INF/DEPENDENCIES
> > 11358 Wed Oct 21 10:36:42 EDT 2015 META-INF/LICENSE
> >162 Wed Oct 21 10:36:42 EDT 2015 META-INF/NOTICE
> >808 Wed Oct 21 10:36:42 EDT 2015
> META-INF/services/org.apache.nifi.processor.Processor

Re: NiFi won't start with my Processor

2015-10-21 Thread Sumanth Chinthagunta
My case is simple and easy to reproduce:
1. Create a flow with a custom processor (e.g., abc.MyProcessor).
2. Make sure the flow is working, then stop the server.
3. Replace the nar in lib with a new nar after refactoring the processor's package
(e.g., xyz.MyProcessor).
4. When you try to start NiFi, it fails with a class-not-found error in the log
(obviously because the deployed flow is still looking for the old class).

My expectation is that the system should at least start, and only throw an error when
the user tries to start a flow that has a reference to a class (processor) which
is no longer available in the newly deployed nar.
I can send my error log tonight.

On Wed, Oct 21, 2015, 5:58 PM Joe Witt  wrote:

> Sumanth,
>
> Do you happen to still have the stack traces around?  Unless we do
> some breaking/API unfriendly thing such issues should simply not
> occur.  Want to make sure we understand what is happening here.
>
> Thanks
> Joe
>
> On Wed, Oct 21, 2015 at 8:56 PM, Sumanth Chinthagunta 
> wrote:
> > I noticed a similar error when I had a flow using an older version of a custom
> > nar and then deployed a new version of the nar with package refactoring.
> > My workaround to bring the server back was to redeploy the old nar, start NiFi,
> > delete the flow, and then recreate the flow after deploying the refactored nar.
> > NiFi should start even if a nar file is replaced or removed.
> >
> > On Wed, Oct 21, 2015, 9:21 AM Joe Witt  wrote:
> >
> >> Ok so in seeing this part of the stackdump we know there is a
> >> 'classpath thing' going on
> >>
> >>"Caused by: java.lang.ClassNotFoundException:
> >> org.mitre.nifi.NiFiNITFReader"
> >>
> >> And we see from your provided info that your jar does contain such a
> class
> >>
> >>   "22243 Wed Oct 21 10:36:46 EDT 2015
> org/mitre/nifi/NiFiNITFReader.class"
> >>
> >> And we see that you have a service loader manifest
> >>
> >>   "808 Wed Oct 21 10:36:42 EDT 2015
> >> META-INF/services/org.apache.nifi.processor.Processor"
> >>
> >> QUESTIONS:
> >> ---
> >> Can you please provide content of that service manifest file?
> >>
> >> Also can you please provide the log information that writes out on
> >> startup which shows the processors/extensions that load?
> >>
> >> It looks like this...
> >>
> >> 2015-10-21 09:20:32,520 INFO [main]
> >> org.apache.nifi.nar.ExtensionManager Extension Type Mapping to
> >> Classloader:
> >> === ProvenanceEventRepository type || Classloader ===
> >> org.apache.nifi.provenance.PersistentProvenanceRepository ||
> >>
> >>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-provenance-repository-nar-0.3.1-SNAPSHOT.nar-unpacked]
> >> org.apache.nifi.provenance.VolatileProvenanceRepository ||
> >>
> >>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-provenance-repository-nar-0.3.1-SNAPSHOT.nar-unpacked]
> >> === End ProvenanceEventRepository types ===
> >> === Processor type || Classloader ===
> >> org.apache.nifi.processors.hl7.RouteHL7 ||
> >>
> >>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-hl7-nar-0.3.1-SNAPSHOT.nar-unpacked]
> >> org.apache.nifi.processors.standard.MergeContent ||
> >>
> >>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-standard-nar-0.3.1-SNAPSHOT.nar-unpacked]
> >> org.apache.nifi.processors.standard.EncryptContent ||
> >>
> >>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-standard-nar-0.3.1-SNAPSHOT.nar-unpacked]
> >> org.apache.nifi.processors.aws.s3.PutS3Object ||
> >>
> >>
> org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/nifi-aws-nar-0.3.1-SNAPSHOT.nar-unpacked]
> >> .
> >>
> >>
> >> Thanks!
> >> Joe
> >>
> >> On Wed, Oct 21, 2015 at 8:51 AM, Jones, Patrick L. 
> wrote:
> >> > Howdy,
> >> >
> >> >I have created my own processor which I have been
> testing
> >> with NiFi for a while now. This stuff used to work for me, now it
> doesn't.
> >> Today nifi won't start.  The nifi-app.log is below
> >> > The processor reads in an image type NITF and does a few things with
> >> it.  I do a:
> >> > $ mvn install
> >> >
> >> > from ~/nifi-0.3.0/nifi-mitre-bundle/ which is where my software is

EvaluateJsonPath error: Unable to return a scalar value for the expression

2015-11-16 Thread Sumanth Chinthagunta
I am trying to extract data into an attribute using EvaluateJsonPath. When the 
JsonPath returns a complex type, I am getting the error: Unable to return a scalar 
value for the expression $['data'] for FlowFile 152. Evaluated value was 
{id=1…..}. Transferring to failure

data  -  $.data     <-- error 
id    -  $.data.id  <-- works 
{
    "database": "test",
    "table": "guests",
    "type": "insert",
    "ts": 1446422524,
    "xid": 1800,
    "commit": true,
    "data": {
        "reg_date": "2015-11-02 00:02:04",
        "firstname": "sumo",
        "id": 1,
        "lastname": "demo"
    }
}

Is it possible to extract a JSON object from a FlowFile using EvaluateJsonPath? 
If not, please advise what options I have.

Thanks
Sumo









Re: EvaluateJsonPath error: Unable to return a scalar value for the expression

2015-11-16 Thread Sumanth Chinthagunta
Thanks Aldrin. 
it works after I changed Return Type to JSON.
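
For reference, the working EvaluateJsonPath configuration looks something like this
(a sketch; exact property labels may vary by version):

Destination:  flowfile-attribute
Return Type:  json
data:         $.data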

> On Nov 16, 2015, at 12:47 PM, Aldrin Piri  wrote:
> 
> Sumo,
> 
> The scalar option has the processor looking for the resultant value of the
> expression to provide a non-Map/List representation of the targeted
> expression.  In this case, if you change the property to json, it should
> work as anticipated.  The property itself is more of a validation of the
> data that is being extracted (in that it is an object/array or a simple
> value).
> 
> On Mon, Nov 16, 2015 at 3:20 PM, Sumanth Chinthagunta 
> wrote:
> 
>> I am trying to extract data into an attribute using EvaluateJsonPath. When the
>> JsonPath returns a complex type, I am getting the error: Unable to return a
>> scalar value for the expression $['data'] for FlowFile 152. Evaluated value
>> was {id=1…..}. Transferring to failure
>> 
>> data  -  $.data     <-- error
>> id    -  $.data.id  <-- works
>> {
>>     "database": "test",
>>     "table": "guests",
>>     "type": "insert",
>>     "ts": 1446422524,
>>     "xid": 1800,
>>     "commit": true,
>>     "data": {
>>         "reg_date": "2015-11-02 00:02:04",
>>         "firstname": "sumo",
>>         "id": 1,
>>         "lastname": "demo"
>>     }
>> }
>> 
>> Is it possible to extract a JSON object from a FlowFile using EvaluateJsonPath?
>> If not, please advise what options I have.
>> 
>> Thanks
>> Sumo
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 



remote command execution via SSH?

2015-11-24 Thread Sumanth Chinthagunta
Are there any NiFi processors to execute remote commands via SSH? 

I need to SSH to a remote server and run a shell script on a scheduled basis, 
using NiFi's scheduling and argument-passing capability.

I found a lib that could be used if no such processor exists: 
https://github.com/int128/groovy-ssh 

-Sumo

Re: remote command execution via SSH?

2015-11-24 Thread Sumanth Chinthagunta
Thanks Adam and Oleg.
My case is similar to Adam's.
NiFi is running on a node that doesn't have a Hadoop client configured. It has 
to invoke a script on another edge node that has the Hadoop client set up. 

I think you guys may have configured passwordless login for SSH (keys?). 
In my case the edge node is managed by a different team and they don't allow me 
to add my SSH key. 

I am thinking we need an ExecuteRemoteCommand processor (based on 
https://github.com/int128/groovy-ssh) that will take care of key- or password-based
SSH login. 


ExecuteRemoteCommand should have configurable attributes and return the command 
output as a FlowFile:

host : Hostname or IP address.
port : Port. Defaults to 22.
user : User name.
password : A password for password authentication.
identity : A private key file for public-key authentication.
execute : Execute a command.
executeBackground : Execute a command in the background.
executeSudo : Execute a command with sudo support.
shell : Execute a shell.

If there is enough interest, I can code this extension :)

-Sumo 

  
> On Nov 24, 2015, at 10:28 AM, Adam Taft  wrote:
> 
> Right.  As an example, I'm currently using ExecuteCommand to transfer data
> over an SSH pipe.  I'm actually transferring data from a NiFi dataflow* to
> an HDFS cluster, using something like:
> 
> Command Path:  bash
> Command Arguments:  -c; ssh $host 'hadoop fs -appendToFile -
> /path/to/hdfs/file'
> 
> For some reason (that I don't remember), I liked "bash -c" better than
> calling ssh straight in the "Command Path" property.  I think maybe it had
> a better environment configuration that I needed.  You might be able to
> just call ssh directly.
> 
> * note, the nifi in question isn't directly on the hdfs network segment, so
> this was an easy/quick way to transfer data into hdfs from outside.
> 
> ** ssh -X is not required, contrary to Oleg's comment.  The -X option is
> for forwarding X Windows sessions, probably not what you need.
> 
> 
> On Tue, Nov 24, 2015 at 12:06 PM, Oleg Zhurakousky <
> ozhurakou...@hortonworks.com> wrote:
> 
>> Sumo
>> 
>> You may also want to consider ExecuteCommand processor. Your command could
>> be ‘ssh’ with ‘-X’ option which would allow you to invoke remote process
>> over SSH.
>> Not necessarily sure it would address your use case fully, but give it a
>> shot and see what happens. May be you would discover some more details that
>> you can feed back that would eventually lead into a first class support for
>> such case.
>> 
>> Cheers
>> Oleg
>> 
>>> On Nov 24, 2015, at 11:51 AM, Joe Witt  wrote:
>>> 
>>> Hello Sumo,
>>> 
>>> At present there are no such processors to do this.  I know it has
>>> been done in the past but that was not in an open source environment
>>> so don't have anything to show for it.
>>> 
>>> It could be a great contrib.
>>> 
>>> Thanks
>>> Joe
>>> 
>>> On Tue, Nov 24, 2015 at 11:32 AM, Sumanth Chinthagunta
>>>  wrote:
>>>> Are there any NiFi processors to execute remote commands via SSH?
>>>> 
>>>> I need to SSH to a remote server and run a shell script on a scheduled basis.
>>>> thinking of using NiFi’s scheduling and argument passing capability.
>>>> 
>>>> I find this lib can be used, if no such processor exist.
>>>> https://github.com/int128/groovy-ssh
>>>> 
>>>> -Sumo
>>> 
>> 
>> 



bundling shared jars

2015-11-24 Thread Sumanth Chinthagunta

Sorry if this question has been answered previously:

I am implementing two custom processors (bundled separately as NAR files), both of 
which depend on the Groovy jar.

Instead of bundling the Groovy jars in each NAR, is there a way to share jars that 
don't contain any processors?

Should we consider NiFi support for adding optional language packs (Groovy, 
Scala, etc.)?
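
For reference, a sketch of one way to share jars: a processor NAR can declare another
NAR as its single Maven dependency, which makes that NAR's classloader the parent, so
a shared jar such as Groovy can live in one place (artifact names are illustrative):

<dependency>
    <groupId>com.example</groupId>
    <artifactId>groovy-bundle-nar</artifactId>
    <version>1.0</version>
    <type>nar</type>
</dependency>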

Thanks 
Sumo

Re: remote command execution via SSH?

2015-11-25 Thread Sumanth Chinthagunta
I have a first-cut implementation of the ExecuteRemoteProcess processor at: 

https://github.com/xmlking/nifi-scripting/releases 

I tried to provide all the capabilities offered by groovy-ssh 
(https://gradle-ssh-plugin.github.io/docs/) to the ExecuteRemoteProcess user.
It takes three attributes: 
1. SSH Config DSL (run once on OnScheduled)

remotes {
    web01 {
        role 'masterNode'
        host = '192.168.1.5'
        user = 'sumo'
        password = 'fake'
        knownHosts = allowAnyHosts
    }
    web02 {
        host = '192.168.1.5'
        user = 'sumo'
        knownHosts = allowAnyHosts
    }
}
2. Run DSL (run on each onTrigger)

ssh.run {
    session(ssh.remotes.web01) {
        result = execute 'uname -a'
    }
}

3. User-supplied arguments, which will be available in the Run DSL.

Anything that is assigned to 'result' in the Run DSL is passed as a FlowFile to 
the success relationship.

Any suggestions for improvements are welcome.

-Sumo

> On Nov 24, 2015, at 8:19 PM, Adam Taft  wrote:
> 
> Sumo,
> 
> On Tue, Nov 24, 2015 at 10:27 PM, Sumanth Chinthagunta 
> wrote:
> 
>> I think you guys may have configured password less login for  SSH (keys?)
>> 
> 
> Correct.  I'm using SSH key exchange for authentication.  It's usually
> done password-less, true, but it doesn't necessarily have to be (if using
> ssh-agent).
> 
>> In my case the edge node is managed by a different team and they don't
>> allow me to add my SSH key.
> 
> Yikes.  Someone should teach them the benefits of ssh keys!  :)
> 
> 
> 
>> I am thinking we need ExecuteRemoteCommand processor (based on
>> https://github.com/int128/groovy-ssh) that will take care of key or
>> password base SSH login.
>> 
> 
> +1 - this would be a pretty nice contribution.  Recommend building the
> processor and then posting here for review. I'm sure this would be a useful
> processor for many people.
> 
> 
> ExecuteRemoteCommand should have configurable attributes and return command
>> output as flowfile
>> 
>> host : Hostname or IP address.
>> port : Port. Defaults to 22.
>> user : User name.
>> password: A password for password authentication.
>> identity : A private key file for public-key authentication.
>> execute - Execute a command.
>> executeBackground - Execute a command in background.
>> executeSudo - Execute a command with sudo support.
>> shell - Execute a shell.
>> 
>> 
> As we do for SSL contexts, it might make sense to bury some of these
> properties in an SSH key controller service.  I'm thinking username,
> password, identity might make sense to have configured externally as a
> service so they could be reused by multiple processors.  Unsure though,
> there might not be enough re-usability to really get the benefit.
> 
> Also, I'm thinking that the "background", "sudo" and "shell" options should
> possibly be a multi-valued option of the processor, not separate
> properties, and definitely not separate "commands."  i.e. I'd probably
> recommend property configuration similar to ExecuteCommand, with options
> for specifying the background, sudo, shell preference.
> 
> Good idea, I hope this works out.
> 
> Adam



Re: remote command execution via SSH?

2015-11-30 Thread Sumanth Chinthagunta
Sure, Joe. I will create Jira tickets for those processors. I am also working 
on moving the groovy lib dependency to the parent nar level to keep the processor 
nars sleek.
Sumo 

Sent from my iPhone

> On Nov 30, 2015, at 7:25 AM, Joe Percivall  
> wrote:
> 
> Hey Sumo,
> 
> I don't know much about this use-case but just taking a quick look the 
> processors in that github repo they seem to be potentially a great addition 
> to NiFi!
> 
> I think you should consider creating a Jira and working it there. It would be 
> a lot easier to get feedback and have a record of it on Jira than just on the 
> Dev list.
> 
> Joe
> - - - - - - 
> Joseph Percivall
> linkedin.com/in/Percivall
> e: joeperciv...@yahoo.com
> 
> 
> 
> 
> On Wednesday, November 25, 2015 2:12 PM, Sumanth Chinthagunta 
>  wrote:
> I have a first-cut implementation of the ExecuteRemoteProcess processor at: 
> 
> https://github.com/xmlking/nifi-scripting/releases
> 
> I tried to provide all the capabilities offered by groovy-ssh 
> (https://gradle-ssh-plugin.github.io/docs/) to the ExecuteRemoteProcess user.
> it takes three attributes: 
> 1. SSH Config DSL (run once on OnScheduled)
> remotes {
>     web01 {
>         role 'masterNode'
>         host = '192.168.1.5'
>         user = 'sumo'
>         password = 'fake'
>         knownHosts = allowAnyHosts
>     }
>     web02 {
>         host = '192.168.1.5'
>         user = 'sumo'
>         knownHosts = allowAnyHosts
>     }
> }
> 2. Run DSL ( run on each onTrigger)
> ssh.run {
>     session(ssh.remotes.web01) {
>         result = execute 'uname -a'
>     }
> }
> 3. User supplied Arguments which will be available in Run DSL 
> 
> anything that is assigned to ‘result’ in RunDSL  is passed as flowfile to 
> success relationship.
> 
> Any suggestions for improvements are welcome.
> 
> -Sumo
> 
> 
>> On Nov 24, 2015, at 8:19 PM, Adam Taft  wrote:
>> 
>> Sumo,
>> 
>> On Tue, Nov 24, 2015 at 10:27 PM, Sumanth Chinthagunta 
>> wrote:
>> 
>>> I think you guys may have configured password less login for  SSH (keys?)
>>> 
>> 
>> Correct.  I'm using SSH key exchange for authentication.  It's usually
>> done password-less, true, but it doesn't necessarily have to be (if using
>> ssh-agent).
>> 
>>> In my case the edge node is managed by a different team and they don't
>>> allow me to add my SSH key.
>> 
>> Yikes.  Someone should teach them the benefits of ssh keys!  :)
>> 
>> 
>> 
>>> I am thinking we need ExecuteRemoteCommand processor (based on
>>> https://github.com/int128/groovy-ssh) that will take care of key or
>>> password base SSH login.
>>> 
>> 
>> +1 - this would be a pretty nice contribution.  Recommend building the
>> processor and then posting here for review. I'm sure this would be a useful
>> processor for many people.
>> 
>> 
>> ExecuteRemoteCommand should have configurable attributes and return command
>>> output as flowfile
>>> 
>>> host : Hostname or IP address.
>>> port : Port. Defaults to 22.
>>> user : User name.
>>> password: A password for password authentication.
>>> identity : A private key file for public-key authentication.
>>> execute - Execute a command.
>>> executeBackground - Execute a command in background.
>>> executeSudo - Execute a command with sudo support.
>>> shell - Execute a shell.
>>> 
>>> 
>> As we do for SSL contexts, it might make sense to bury some of these
>> properties in an SSH key controller service.  I'm thinking username,
>> password, identity might make sense to have configured externally as a
>> service so they could be reused by multiple processors.  Unsure though,
>> there might not be enough re-usability to really get the benefit.
>> 
>> Also, I'm thinking that the "background", "sudo" and "shell" options should
>> possibly be a multi-valued option of the processor, not separate
>> properties, and definitely not separate "commands."  i.e. I'd probably
>> recommend property configuration similar to ExecuteCommand, with options
>> for specifying the background, sudo, shell preference.
>> 
>> Good idea, I hope this works out.
>> 
>> Adam


Re: [GitHub] nifi pull request: NIFI-1218 upgraded Kafka to 0.9.0.0 client API

2015-12-16 Thread Sumanth Chinthagunta
Hi Oleg,

For my case, NiFi (the consumer) has to talk to a Kafka cluster behind a firewall, 
and it was difficult to get the Kafka broker and ZooKeeper ports opened. 
I noticed that 0.9.0.0 supports consumers communicating directly with the Kafka 
brokers via the `bootstrap.servers` setting. 
It might be a good idea to consider upgrading GetKafka to the new consumer 
API as well, to avoid the firewall hassle.
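
For reference, a minimal sketch of the 0.9 new-consumer configuration (broker
addresses, group id, and topic are illustrative); note that only the brokers are
contacted, not ZooKeeper:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092,broker2:9092");
props.put("group.id", "nifi-consumer");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("my-topic"));
ConsumerRecords<String, String> records = consumer.poll(100);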
Thanks 
Sumanth  


> On Dec 16, 2015, at 9:50 AM, olegz  wrote:
> 
> GitHub user olegz opened a pull request:
> 
>https://github.com/apache/nifi/pull/143
> 
>NIFI-1218 upgraded Kafka to 0.9.0.0 client API
> 
>Tested and validated that it is still compatible with 0.8.* Kafka brokers
> 
> You can merge this pull request into a Git repository by running:
> 
>$ git pull https://github.com/olegz/nifi NIFI-1218
> 
> Alternatively you can review and apply these changes as the patch at:
> 
>https://github.com/apache/nifi/pull/143.patch
> 
> To close this pull request, make a commit to your master/trunk branch
> with (at least) the following in the commit message:
> 
>This closes #143
> 
> 
> commit 10cbc92873784a4a3871ae6937b8a43ac3e0abe8
> Author: Oleg Zhurakousky 
> Date:   2015-12-16T17:49:44Z
> 
>NIFI-1218 upgraded Kafka to 0.9.0.0 client API
>Tested and validated that it is still compatible with 0.8.* Kafka brokers
> 
> 
> 
> 
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---



Cluster Start Error: Apache NiFi is already running, listening to Bootstrap on port

2015-12-18 Thread Sumanth Chinthagunta

Hi, 
I am following the clustering instructions per the link below: NCM and Node1 on 
Server1, and Node2 on Server2.
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#clustering

I created two copies of the NiFi 0.4.0 folder on Server1 and started the NCM, then 
tried to start Node1 and got the following error: 

org.apache.nifi.bootstrap.Command Apache NiFi is already running, listening to 
Bootstrap on port 43010

Am I missing any steps?
Thanks 
Sumo



Re: Cluster Start Error: Apache NiFi is already running, listening to Bootstrap on port

2015-12-18 Thread Sumanth Chinthagunta
Thanks Corey and Mark for the quick response.

You are right; I was sharing some folders using symbolic links between the NCM and 
Node1. After removing the sharing for the bin folder, it works fine :)

NCM/Node1
bin -> /software/nifi/latest/bin/
conf
content_repository
database_repository
docs -> /software/nifi/latest/docs/
flowfile_repository
lib -> /software/nifi/latest/lib/
logs
provenance_repository
work

Thanks
Sumo

> On Dec 18, 2015, at 10:05 AM, Mark Payne  wrote:
> 
> Sumo,
> 
> When you say "I created two copies of Nifi 0.4.0 folders on Server1" does 
> that mean that you made a copy of the first directory, or
> that you untarred the .tar.gz again?
> 
> It looks like the same nifi.pid file is in the bin/ directory of both 
> instances. You should be able to delete the nifi.pid file from the node
> and then start it up.
> 
> Thanks
> -Mark
> 
> 
>> On Dec 18, 2015, at 12:59 PM, Sumanth Chinthagunta  wrote:
>> 
>> 
>> Hi 
>> I am following clustering instructions as per the link below:  NCM, Node1 in 
>> Server1 and Node2 on Server2.
>> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#clustering
>> 
>> I created two copies of Nifi 0.4.0 folders on Server1 and started NCM, then 
>> tried to start Node1 and got the following error. 
>> Error: 
>> org.apache.nifi.bootstrap.Command Apache NiFi is already running, listening 
>> to Bootstrap on port 43010
>> 
>> Am I missing any steps?
>> Thanks 
>> Sumo
>> 
> 



Re: Cluster Start Error: Apache NiFi is already running, listening to Bootstrap on port

2015-12-18 Thread Sumanth Chinthagunta
Issue 1:
In my NiFi installation I granted only read access to NIFI_HOME/lib, and I got 
this error while trying to start NiFi.

It was resolved when I gave read+write access to NIFI_HOME/lib. This makes me 
wonder why the NiFi start process needs write access to the lib folder!

INFO [main] org.apache.nifi.BootstrapListener Successfully initiated 
communication with Bootstrap
WARN [main] org.apache.nifi.nar.NarUnpacker Unable to load NAR library bundles 
due to java.io.IOException: /software/nifi-node1/./lib directory does not have 
read/write privilege Will proceed without loading any further Nar bundles
ERROR [main] org.apache.nifi.NiFi Failure to launch NiFi due to 
java.lang.IllegalStateException: Unable to find the framework NAR ClassLoader.
java.lang.IllegalStateException: Unable to find the framework NAR ClassLoader.
at org.apache.nifi.NiFi.(NiFi.java:116) 
~[nifi-runtime-0.4.0.jar:0.4.0]
at org.apache.nifi.NiFi.main(NiFi.java:227) 
~[nifi-runtime-0.4.0.jar:0.4.0]
INFO [Thread-1] org.apache.nifi.NiFi Initiating shutdown of Jetty web server...
INFO [Thread-1] org.apache.nifi.NiFi Jetty web server shutdown completed 
(nicely or otherwise).


Issue 2:
Server 1 : NCM , node1
Server 2 : node2

Case 1: I started the NiFi NCM on Server1 and then Node2 on Server2. When you 
access the admin UI, you see an error even though Node2 and the NCM are communicating. 
Case 2: I started the NiFi NCM and then Node1 on Server1. The admin UI works as 
expected. 
My workaround was to set nifi.web.http.host to the hostname instead of leaving it at 
the default 'localhost' on Server2. This should be documented. 
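
For reference, that workaround is a one-line change in conf/nifi.properties on the
remote node (the hostname is illustrative):

nifi.web.http.host=server2.example.com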

Issue 3:
After the above issues were resolved, I now see both Node1 and Node2 connected to 
the NCM. When I try to add any processors or process groups, I get the following 
error in the browser UI: 

Cluster is unable to service request to change flow: Received a mutable request 
[PUT -- 
http://xyz:8090/nifi-api/controller/process-groups/77afb88b-5f7d-45d7-a1f4-f9e9269b489b/processors/bceff62d-435d-4c47-a907-bb2a26cd0e56]
while in safe mode

please let me know if you find anything wrong with my setup.

Thanks 
Sumo

> On Dec 18, 2015, at 11:37 AM, Sumanth Chinthagunta  wrote:
> 
> Thanks Corey and Mark for quick response.
> 
> You are right, I was sharing some folders using symbolic links between  NCM 
> and Node1. After removing sharing for bin folder, it works fine :)
> 
> NCM/Node1
> bin -> /software/nifi/latest/bin/
> conf
> content_repository
> database_repository
> docs -> /software/nifi/latest/docs/
> flowfile_repository
> lib -> /software/nifi/latest/lib/
> logs
> provenance_repository
> work
> 
> Thanks
> Sumo
> 
>> On Dec 18, 2015, at 10:05 AM, Mark Payne  wrote:
>> 
>> Sumo,
>> 
>> When you say "I created two copies of Nifi 0.4.0 folders on Server1" does 
>> that mean that you made a copy of the first directory, or
>> that you untarred the .tar.gz again?
>> 
>> It looks like the same nifi.pid file is in the bin/ directory of both 
>> instances. You should be able to delete the nifi.pid file from the node
>> and then start it up.
>> 
>> Thanks
>> -Mark
>> 
>> 
>>> On Dec 18, 2015, at 12:59 PM, Sumanth Chinthagunta  
>>> wrote:
>>> 
>>> 
>>> Hi 
>>> I am following clustering instructions as per the link below:  NCM, Node1 
>>> in Server1 and Node2 on Server2.
>>> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#clustering
>>>  
>>> <https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#clustering>
>>> 
>>> I created two copies of Nifi 0.4.0 folders on Server1 and started NCM, then 
>>> tried to start Node1 and got the following error. 
>>> Error: 
>>> org.apache.nifi.bootstrap.Command Apache NiFi is already running, listening 
>>> to Bootstrap on port 43010
>>> 
>>> Am I missing any steps?
>>> Thanks 
>>> Sumo
>>> 
>> 
> 



Re: Cluster Start Error: Apache NiFi is already running, listening to Bootstrap on port

2015-12-18 Thread Sumanth Chinthagunta
Got it. Now my cluster is all set :) 
Thanks 

> On Dec 18, 2015, at 5:04 PM, Matt Gilman  wrote:
> 
> Sumo,
> 
> It appears that while getting all the nodes stood up, there is no node that
> is assigned the primary role. NiFi will be in safe mode until a primary is
> established. If you go to the cluster table (accessed from the cluster icon
> in the upper right hand corner) you should be able to assign the primary
> role to one of the nodes in your cluster. This is done by clicking the
> ribbon icon in the row for the desired node.
> 
> The primary node is responsible for running processors that are configured
> with the scheduling strategy of 'Primary Node Only'. When scheduled to run,
> these will only execute on that Node. This is useful if your using certain
> protocols that would be troublesome if multiple nodes executed concurrently
> (like [S]FTP for instance).
> 
> Matt
> 
> On Fri, Dec 18, 2015 at 7:26 PM, Sumanth Chinthagunta  <mailto:xmlk...@gmail.com>>
> wrote:
> 
>> Issue 1:
>> in my NiFi installation I granted only read access to NIFI_HOME/lib and I
>> got this error while trying to start NiFi.
>> 
>> This got resolved when I gave read+write access to NIFI_HOME/lib. this
>> makes me wonder why NiFi start process needs write access to lib folder!
>> 
>> INFO [main] org.apache.nifi.BootstrapListener Successfully initiated
>> communication with Bootstrap
>> WARN [main] org.apache.nifi.nar.NarUnpacker Unable to load NAR library
>> bundles due to java.io.IOException: /software/nifi-node1/./lib directory
>> does not have read/write privilege Will proceed without loading any further
>> Nar bundles
>> ERROR [main] org.apache.nifi.NiFi Failure to launch NiFi due to
>> java.lang.IllegalStateException: Unable to find the framework NAR
>> ClassLoader.
>> java.lang.IllegalStateException: Unable to find the framework NAR
>> ClassLoader.
>>at org.apache.nifi.NiFi.(NiFi.java:116)
>> ~[nifi-runtime-0.4.0.jar:0.4.0]
>>at org.apache.nifi.NiFi.main(NiFi.java:227)
>> ~[nifi-runtime-0.4.0.jar:0.4.0]
>> INFO [Thread-1] org.apache.nifi.NiFi Initiating shutdown of Jetty web
>> server...
>> INFO [Thread-1] org.apache.nifi.NiFi Jetty web server shutdown completed
>> (nicely or otherwise).
>> 
>> 
>> Issue 2:
>> Server 1 : NCM , node1
>> Server 2 : node2
>> 
>> Case 1: I started NiFi NCM on server 1 and  then node2 on server 2. when
>> you access admin UI, you will see error even  node2 and NCM are
>> communicating.
>> Case 2:  I started NiFi NCM and then node1 on server 1. Admin UI works as
>> expected.
>> My workaround was, to set nifi.web.http.host to hostname instead of
>> leaving to default ‘localhost’  on Server2. this should be documented.
>> 
>> Issue 3:
>> After above issues are resolved, now I see both node1 and node2 connected
>> to NCM. when I try to add any processors or process group, I am getting
>> following error in the browser UI:
>> 
>> Cluster is unable to service request to change flow: Received a mutable
>> request [PUT --
>> http://xyz:8090/nifi-api/controller/process-groups/77afb88b-5f7d-45d7-a1f4-f9e9269b489b/processors/bceff62d-435d-4c47-a907-bb2a26cd0e56]
>> while in safe mode
>> 
>> please let me know if you find anything wrong with my setup.
>> 
>> Thanks
>> Sumo
>> 
>>> On Dec 18, 2015, at 11:37 AM, Sumanth Chinthagunta 
>> wrote:
>>> 
>>> Thanks Corey and Mark for quick response.
>>> 
>>> You are right, I was sharing some folders using symbolic links between
>> NCM and Node1. After removing sharing for bin folder, it works fine :)
>>> 
>>> NCM/Node1
>>> bin -> /software/nifi/latest/bin/
>>> conf
>>> content_repository
>>> database_repository
>>> docs -> /software/nifi/latest/docs/
>>> flowfile_repository
>>> lib -> /software/nifi/latest/lib/
>>> logs
>>> provenance_repository
>>> work
>>> 
>>> Thanks
>>> Sumo
>>> 
>>>> On Dec 18, 2015, at 10:05 AM, Mark Payne  wrote:
>>>> 
>>>> Sumo,
>>>> 
>>>> When you say "I created two copies of Nifi 0.4.0 folders on S

Error: is not the most recent version of this flow file within this session

2015-12-21 Thread Sumanth Chinthagunta
I am creating a new flowFile in my processor’s onTrigger method and getting the 
following error!
Please advise if I am doing anything wrong.

My requirement is: I need to create a new outgoing flowFile with a set of 
derived attributes and route it to the success path.
 
```groovy
public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
    FlowFile ff = session.create()
    session.putAllAttributes(ff, [summ: 'demo'])
    session.transfer(ff, REL_SUCCESS)
}
```
org.apache.nifi.processor.exception.FlowFileHandlingException: 
FlowFile[0,54356692392022.mockFlowFile,131B] is not the most recent version of 
this flow file within this session

Thanks 
Sumanth 
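
(For reference, the cause of this error: ProcessSession.putAllAttributes returns a new, updated FlowFile reference, and only the most recent reference may be transferred. A corrected sketch:)

```groovy
public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
    FlowFile ff = session.create()
    // putAllAttributes returns the updated FlowFile; keep and transfer that reference
    ff = session.putAllAttributes(ff, [summ: 'demo'])
    session.transfer(ff, REL_SUCCESS)
}
```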

Re: [DISCUSS] Proposal for an Apache NiFi sub project - MiNiFi

2016-01-11 Thread Sumanth Chinthagunta
Good idea. There will be many possibilities if we can make MiNIFi run on 
android / iOS or other embedded devices.

I wonder how back-pressure works in this kind of distributed setup.
I was reading about the reactivesocket project, which is trying to solve the
reactive / back-pressure problem across network boundaries:

 http://reactivesocket.io

Sumo
Sent from my iPhone

> On Jan 9, 2016, at 4:29 PM, Joe Witt  wrote:
> 
> NiFi Community,
> 
> I'd like to initiate discussion around a proposal to create our first
> sub-project of NiFi.  A possible name for it is "MiNiFi" a sort of
> play on Mini-NiFi.
> 
> The idea is to provide a complementary data collection agent to NiFi's
> current approach of dataflow management.  As noted in our ASF TLP
> resolution NiFi is to provide "an automated and durable data broker
> between systems providing interactive command and control and detailed
> chain of custody for data."  MiNiFi would be consistent with that
> scope with a  specific focus on the first-mile challenge so common in
> dataflow.
> 
> Specific goals of MiNiFi would be to provide a small, lightweight,
> centrally managed  agent that natively generates data provenance and
> seamlessly integrates with NiFi for follow-on dataflow management and
> maintenance of the chain of custody provided by the powerful data
> provenance features of NiFi.
> 
> MiNiFi should be designed to operate directly on or adjacent to the
> source sensor, system, server generating the events as a resource
> sensitive tenant.  There are numerous agent models in existence today
> but they do not offer the command and control or provenance that is so
> important to the philosophy and scope of NiFi.
> 
> These agents would necessarily have a different interactive command
> and control model than NiFi as you'd not expect consistent behavior,
> capability, or accessibility of all instances of the agents at any
> given time.
> 
> Multiple implementations of MiNiFi are envisioned including those that
> operate on the JVM and those that do not.
> 
> As the discussion advances we can put together wiki pages, concept
> diagrams, and requirements to help better articulate how this might
> evolve.  We should also discuss the mechanics of how this might work
> in terms of infrastructure, code repository, and more.
> 
> Thanks
> Joe


Re: Git NiFi processor

2016-02-18 Thread Sumanth Chinthagunta
Sure. 

Sent from my iPhone

> On Feb 18, 2016, at 4:12 AM, Joe Witt  wrote:
> 
> Sumo,
> 
> Agreed.  If interested in helping with the discussion/idea formation
> for this can you add to this [1].
> 
> Looks like [2] gives us a potentially strong option for a Java-based
> Git client and [3] shows some interesting example code.
> 
> [1] 
> https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows
> [2] http://www.eclipse.org/jgit/
> [3] https://git-scm.com/book/en/v2/Embedding-Git-in-your-Applications-JGit
> 
>> On Wed, Feb 17, 2016 at 11:21 PM,   wrote:
>> I have a use case where I have to pull some configuration files from a Git
>> repository and store them in a server directory whenever the files are
>> updated via webhooks [automated deployment].
>> 
>> Checking if anybody has built a custom processor to pull files from Git.
>> 
>> By the way, it would be nice if the NiFi UI directly provided saving and
>> versioning of flow templates into Git.
>> Thanks
>> Sumo
>> 
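
(For a quick experiment with the JGit idea above, a minimal Groovy sketch; the repository URL and local path are placeholders, and org.eclipse.jgit is assumed to be on the classpath:)

```groovy
import org.eclipse.jgit.api.Git

def localDir = new File('/tmp/config-repo')   // placeholder target directory
if (!localDir.exists()) {
    // First run: clone the repository
    Git.cloneRepository()
       .setURI('https://example.com/team/config.git')   // placeholder URL
       .setDirectory(localDir)
       .call()
} else {
    // Subsequent runs (e.g. triggered by a webhook): pull the latest changes
    def git = Git.open(localDir)
    try {
        git.pull().call()
    } finally {
        git.close()
    }
}
```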


Scripted controllers

2016-02-20 Thread Sumanth Chinthagunta
Since we have scripted processors, are there any plans to support scripted controllers on 
the roadmap?

-sumo

NiFi error with embedded ZooKeeper for State Management

2016-03-14 Thread Sumanth Chinthagunta
I am getting the following error when the NiFi cluster is started with embedded 
ZooKeeper enabled.
I would like to configure an external ZooKeeper for NiFi State Management without 
interfering with the java.security.auth.login.config setting for MapR HDFS. Can 
somebody provide instructions on how to configure an external ZooKeeper for NiFi 
cluster State Management without security enabled?
 
Env :
NiFi 0.5.1 cluster (NCM, Node1, Node2)
I have following line in  bootstrap.conf
# ***For MapR HDFS***
java.arg.15=-Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf
 
/opt/mapr/conf/mapr.login.conf has
/**
* Used by Zookeeper
*/
Server {
  com.mapr.security.maprsasl.MaprSecurityLoginModule required
  checkUGI=false
  cldbkeylocation="/opt/mapr/conf/cldb.key"
  debug=true;
};
 
I have following in zookeeper.properties
server.2=myhost2:2888:3888
 
 
2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer Server environment:java.io.tmpdir=/tmp
2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer Server environment:java.compiler=
2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer Server environment:os.name=Linux
2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer Server environment:os.arch=amd64
2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer Server 
environment:os.version=2.6.32-573.3.1.el6.x86_64
2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer Server environment:user.name=sumo
2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer Server environment:user.home=/home/ sumo
2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer Server 
environment:user.dir=/app/runtime/nifi-node2
2016-03-12 16:39:07,519 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer tickTime set to 2000
2016-03-12 16:39:07,519 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer minSessionTimeout set to -1
2016-03-12 16:39:07,519 INFO [Framework Task Thread Thread-1] 
o.a.zookeeper.server.ZooKeeperServer maxSessionTimeout set to -1
2016-03-12 16:39:07,539 ERROR [Framework Task Thread Thread-1] 
o.apache.nifi.controller.FlowController NiFi was connected to the cluster but 
failed to start embedded ZooKeeper Server
java.io.IOException: Failed to start embedded ZooKeeper Server
at 
org.apache.nifi.controller.state.server.ZooKeeperStateServer.startStandalone(ZooKeeperStateServer.java:87)
 ~[na:na]
at 
org.apache.nifi.controller.state.server.ZooKeeperStateServer.start(ZooKeeperStateServer.java:60)
 ~[na:na]
at 
org.apache.nifi.controller.FlowController$5.run(FlowController.java:3145) 
~[na:na]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_65]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[na:1.8.0_65]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 [na:1.8.0_65]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 [na:1.8.0_65]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]
Caused by: java.io.IOException: Could not configure server because SASL 
configuration did not allow the  ZooKeeper server to authenticate itself 
properly: javax.security.auth.login.LoginException: unable to find LoginModule 
class: com.mapr.security.maprsasl.MaprSecurityLoginModule
at 
org.apache.zookeeper.server.ServerCnxnFactory.configureSaslLogin(ServerCnxnFactory.java:207)
 ~[na:na]
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:87)
 ~[na:na]
at 
org.apache.nifi.controller.state.server.ZooKeeperStateServer.startStandalone(ZooKeeperStateServer.java:81)
 ~[na:na]
... 9 common frames omitted
 
 Thanks 
Sumo

Re: NiFi error with embedded ZooKeeper for State Management

2016-03-14 Thread Sumanth Chinthagunta
Thanks Mark. 
I had java.arg.15 setting in bootstrap.conf  from NiFi 0.4.x to make putHDFS 
processor work with my MapR cluster. If I remove it, putHDFS will fail. 

Now with NiFi 0.5.1,
if I set nifi.state.management.embedded.zookeeper.start=false,
can I point to a dedicated external ZooKeeper in conf/state-management.xml
without interfering with the java.arg.15 setting that is used by PutHDFS?
Thanks 
Sumo 
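
(For reference, the external-ZooKeeper entry in conf/state-management.xml would look roughly like the following; the Connect String hosts are placeholders:)

```xml
<cluster-provider>
    <id>zk-provider</id>
    <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
    <!-- point at the external ZooKeeper ensemble -->
    <property name="Connect String">zk-host1:2181,zk-host2:2181</property>
    <property name="Root Node">/nifi</property>
    <property name="Session Timeout">10 seconds</property>
    <property name="Access Control">Open</property>
</cluster-provider>
```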


> On Mar 14, 2016, at 10:10 AM, Mark Payne  wrote:
> 
> Sumo,
> 
> If your intent is to use an external ZooKeeper, you should not be starting 
> the embedded ZooKeeper.
> You will also not want to set the java.arg.15 parameter there to point to a 
> login config file, as that is
> necessary only when enabling Kerberos - not for use when security disabled.
> 
> So you would want to change the following in your config:
> - Remove the java.arg.15 parameter from bootstrap.conf
> - Set the nifi.state.management.embedded.zookeeper.start property to false
> - Change the conf/state-management.xml to point to the external ZooKeeper via 
> the Connect String property.
> 
> Does all of this make sense?
> 
> Thanks
> -Mark
> 
> 
>> On Mar 14, 2016, at 12:59 PM, Sumanth Chinthagunta  wrote:
>> 
>> I am getting following error when  NiFi Cluster started with  embedded 
>> ZooKeeper enabled.  
>> I would like to configure external ZooKeeper for NiFi State Management  
>> without interfering with java.security.auth.login.config setting for MapR 
>> HDFS. Can somebody provide me instructions how to configure external 
>> ZooKeeper for NiFi cluster State Management without security enabled? 
>> 
>> Env :
>> NiFi 0.5.1 cluster (NCM, Node1, Node2)
>> I have following line in  bootstrap.conf
>> # ***For MapR HDFS***
>> java.arg.15=-Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf
>> 
>> /opt/mapr/conf/mapr.login.conf has
>> /**
>> * Used by Zookeeper
>> */
>> Server {
>> com.mapr.security.maprsasl.MaprSecurityLoginModule required
>> checkUGI=false
>> cldbkeylocation="/opt/mapr/conf/cldb.key"
>> debug=true;
>> };
>> 
>> I have following in zookeeper.properties
>> server.2=myhost2:2888:3888
>> 
>> 
>> 2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer Server environment:java.io.tmpdir=/tmp
>> 2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer Server environment:java.compiler=
>> 2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer Server environment:os.name=Linux
>> 2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer Server environment:os.arch=amd64
>> 2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer Server 
>> environment:os.version=2.6.32-573.3.1.el6.x86_64
>> 2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer Server environment:user.name=sumo
>> 2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer Server environment:user.home=/home/ sumo
>> 2016-03-12 16:39:07,518 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer Server 
>> environment:user.dir=/app/runtime/nifi-node2
>> 2016-03-12 16:39:07,519 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer tickTime set to 2000
>> 2016-03-12 16:39:07,519 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer minSessionTimeout set to -1
>> 2016-03-12 16:39:07,519 INFO [Framework Task Thread Thread-1] 
>> o.a.zookeeper.server.ZooKeeperServer maxSessionTimeout set to -1
>> 2016-03-12 16:39:07,539 ERROR [Framework Task Thread Thread-1] 
>> o.apache.nifi.controller.FlowController NiFi was connected to the cluster 
>> but failed to start embedded ZooKeeper Server
>> java.io.IOException: Failed to start embedded ZooKeeper Server
>>   at 
>> org.apache.nifi.controller.state.server.ZooKeeperStateServer.startStandalone(ZooKeeperStateServer.java:87)
>>  ~[na:na]
>>   at 
>> org.apache.nifi.controller.state.server.ZooKeeperStateServer.start(ZooKeeperStateServer.java:60)
>>  ~[na:na]
>>   at 
>> org.apache.nifi.controller.FlowController$5.run(FlowController.java:3145) 
>> ~[na:na]
>>   at 
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
>> [na:1.8.0_65]
>> 

TailFile Multiple files and Rolling Filename Pattern not working.

2017-03-11 Thread Sumanth Chinthagunta
Hi
I am trying to use the TailFile processor to aggregate logs.
My app produces three log files:
1. app-sumo-audit.log
2. app-sumo-api.log
3. app-sumo-logging.log

and corresponding rolling files

app-sumo-audit.log.2017-03-11.0
app-sumo-audit.log.2017-03-11.1
app-sumo-api.log.2017-03-11.0
app-sumo-api.log.2017-03-11.1
app-sumo-logging.log.2017-03-11.0
app-sumo-logging.log.2017-03-11.1
etc.

My TailFile processor configuration, as shown below, is not working for rolled
files.

Please guide me if I am doing something wrong.

Property                   Value
Tailing mode               Multiple files
File(s) to Tail            app-sumo-(audit|api|logging).log
Rolling Filename Pattern   ${filename}.log.*
Base directory             /Developer/Applications/hNiFi/app/logs
Initial Start Position     Beginning of File
State Location             Local
Recursive lookup           false
Rolling Strategy           Changing name
Lookup frequency           10 minutes
Maximum age                24 hours
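
(If I read the TailFile documentation correctly, ${filename} in the Rolling Filename Pattern already expands to the full tailed file name, including the .log extension, so the pattern above would look for app-sumo-audit.log.log.* and never match. Assuming that reading is right, a pattern matching the rolled files listed earlier would be:

Rolling Filename Pattern   ${filename}.*

which for app-sumo-audit.log expands to app-sumo-audit.log.*, matching app-sumo-audit.log.2017-03-11.0.)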




*logback.xml* (XML markup was stripped by the mail archive; the recoverable settings are):

- appender for ${LOGS_HOME}/app-sumo-logging.log, rolling to ${LOGS_HOME}/app-sumo-logging.log.%d{yyyy-MM-dd}.%i
- appender for ${LOGS_HOME}/app-sumo-audit.log, rolling to ${LOGS_HOME}/app-sumo-audit.log.%d{yyyy-MM-dd}.%i
- appender for ${LOGS_HOME}/app-sumo-api.log, rolling to ${LOGS_HOME}/app-sumo-api.log.%d{yyyy-MM-dd}.%i
- each rolling policy uses maxHistory ${MAX_HISTORY} and maxFileSize ${MAX_FILE_SIZE}

How to disable MiNiFi's provenance repository.

2017-05-14 Thread Sumanth Chinthagunta
How can I disable MiNiFi's (Java version) provenance repository?

Thanks 


Sent from my iPhone


MiNiFi logs not removed as per maxHistory in logback.xml

2017-05-15 Thread Sumanth Chinthagunta
We have an issue with MiNiFi (minifi-0.1.0 Java).

It is not rotating its logs as per the conf/logback.xml maxHistory value.

Over time, it is filling the drive and leaving many log files on disk.

Many "minifi-app.log15558502285619089.tmp" files are left on disk.
Please advise.



Thanks

Sumanth Chinthagunta



logback.xml (XML markup was stripped by the mail archive; the recoverable settings are):

- minifi-app appender: ${org.apache.nifi.minifi.bootstrap.config.log.dir}/minifi-app.log, rolling to minifi-app_%d{yyyy-MM-dd_HH}.%i.log.gz with maxHistory 10 and 1MB / 10MB size settings
- minifi-bootstrap appender: minifi-bootstrap.log, rolling to minifi-bootstrap_%d.log.gz with maxHistory 5
- encoder pattern: %date %level [%thread] %logger{40} %msg%n

-- 
Thanks,
Sumanth


Re: MiNiFi logs not removed as per maxHistory in logback.xml

2017-05-15 Thread Sumanth Chinthagunta
Thanks for the quick support, Aldrin.
Yes, I can evaluate 0.2.

Sent from my iPhone

> On May 15, 2017, at 12:38 PM, Aldrin Piri  wrote:
> 
> Just to close the loop, the issues referenced earlier are:
> 
> https://issues.apache.org/jira/browse/MINIFI-144
> https://issues.apache.org/jira/browse/MINIFI-173
> 
> The associated issues with logback, inclusive of what I believe you were
> experiencing, include:
> 
> https://jira.qos.ch/browse/LOGBACK-1166
> https://jira.qos.ch/browse/LOGBACK-1236
> 
>> On Mon, May 15, 2017 at 2:04 PM, Aldrin Piri  wrote:
>> 
>> Hi Sumanth,
>> 
>> This was a product of some bugs in the logback library itself and has
>> been remedied. I'm on my phone currently, but there are a couple of JIRAs related
>> to this. The new version currently under vote (0.2.0) should have these
>> issues resolved, since it includes newer releases of the logback lib. If you
>> are able to evaluate it, please let us know if there continue to be issues.
>> 
>>> On May 15, 2017, at 13:43, Sumanth Chinthagunta 
>> wrote:
>>> 
>>> We have an issue with MiNiFi (minifi-0.1.0 Java).
>>> 
>>> It is not rotating its logs as per conf/logback.xml maxHistory value.
>>> 
>>> Over the time, it is filling the drive and leaving many log files on
>> desk.
>>> 
>>> many "minifi-app.log15558502285619089.tmp" files are left on desk.
>>> please advice.
>>> 
>>> 
>>> 
>>> Thanks
>>> 
>>> Sumanth Chinthagunta
>>> 
>>> 
>>> 
>>> logback.xml (XML markup stripped by the mail archive; recoverable settings:
>>> minifi-app appender rolling to minifi-app_%d{yyyy-MM-dd_HH}.%i.log.gz with
>>> maxHistory 10 and 1MB / 10MB size settings; minifi-bootstrap appender
>>> rolling to minifi-bootstrap_%d.log.gz with maxHistory 5)
>>> --
>>> Thanks,
>>> Sumanth
>> 


VolatileProvenanceRepository for MiNiFi

2017-10-25 Thread Sumanth Chinthagunta
Hi,

In NiFi, I can set the following properties in nifi.properties:

nifi.provenance.repository.implementation=org.apache.nifi.provenance.VolatileProvenanceRepository
nifi.provenance.repository.buffer.size=1000


What is the equivalent for MiNiFi configuration?

-- 
Thanks,
Sumanth
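
(If the MiNiFi config.yml schema in use supports choosing the repository implementation, and I am not certain which release introduced this key, the equivalent would be roughly:)

```yaml
# config.yml sketch; assumes a MiNiFi version whose schema supports the
# "implementation" key under Provenance Repository
Provenance Repository:
  provenance rollover time: 1 min
  implementation: org.apache.nifi.provenance.VolatileProvenanceRepository
```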


Re: [DISCUSS] Apache NiFi distribution has grown too large

2018-01-17 Thread Sumanth Chinthagunta
Just an idea: can we also manage NARs (including custom NARs) in a NiFi 
repository and let fresh NiFi installs add required NARs and their dependencies 
on the fly via the NiFi UI?

-Sumanth

> On Jan 17, 2018, at 8:05 AM, Matt Burgess  wrote:
> 
> I'd like to echo many of the comments / discussion points here,
> including the extension registry (#3), NAR packs, and mixins. A couple
> of additional comments and caveats:
> 
> NAR package management:
> 
> - Grouping NAR packs based on functionality (Hadoop, RDBMS, etc.) is a
> good first start but it still seems like we'd want to end up with an a
> la carte capability at the end. An incremental approach might be to
> have a simple graphical tool (in the toolkit?) pointing at your NiFi
> install and some common repository, where you can add and delete NAR
> packs, but also delete individual NARs from your NiFi install. The use
> case here is when you download the Hadoop NAR pack for HBase and
> related components, but don't want things like the Hive NAR (which I
> think is the largest at ~93MB).
> 
> - Some NiFi installs will be located on systems that cannot contact an
> outside (or any external) repository. When we consider NAR
> repositories, we should consider providing a repo-to-go or something
> of that sort. At the very least I would think the Extension Registry
> itself would support such a thing; the ability to have an Extension
> Registry anywhere, not just attached to Bintray or Apache repo HTTP
> pages, etc.
> 
> - Murphy's Law says as soon as we pick NAR pack boundaries, there will
> be components that don't fit well into one or another, or they fit
> into more than one. For instance, a user might expect the Spark/Livy
> NAR to be in the Hadoop NAR pack but there is no requirement for Spark
> or Livy to run on Hadoop. Perhaps with a "Big Data" NAR pack (versus
> Hadoop) it would encompass the Hadoop and Spark stuff, but then where
> does Cassandra fit in? It certainly handles Big Data, but if there
> were a "NoSQL" NAR pack, which should it belong to (or can it be in
> both?).
> 
> - Because NARs are unpacked before use in NiFi, there are two related
> footprints, the footprint of the NARs in the lib/ folder, and the
> footprint of the unpacked NARs. As part of the "duplicate JARs"
> discussion, this also segues into another area, the runtime footprint
> (to include classloader hierarchies, etc.)
> 
> Optimized JARs/classloading
> 
> - Promoting JARs to the lib/ folder because they are common to many
> processors is not the right solution IMO. With parent-first
> classloaders (which is what NarClassLoaders are), if you had a NAR
> that needed a different version of a library, then it would find the
> parent version first and would likely cause issues.  We could make the
> NarClassLoader self-first (which we might want to do under other
> circumstances anyway), but then care would need to be taken to ensure
> that shared/API dependencies are indeed "provided".
> 
> - I do like the idea of "promotion" though, not just for JAR
> deduplication but also for better classloading. Here's an idea for how
> we might achieve this. When unpacking NARs, we would do something
> similar to a Maven install, where we build up a repository of
> artifacts. If two artifacts are the same (we'd likely want to verify
> checksums too, not just Maven coordinates), they'd install to the same
> place. At the end of NAR unpacking, the repo would contain unique
> (de-duplicated) JARs, and each NAR would have a bill-of-materials
> (BOM) from which to build its classloader.  An possible runtime
> improvement on top of that is to build a classloader hierarchy, where
> JARs shared by multiple NARs could be in their own classloader, which
> would be the parent of the NARs' classloaders. This way, instead of
> the same classes loaded into each NAR's classloader, they would only
> be loaded once into a shared parent. This "de-dupes" the memory
> footprint of the JARs as well. Hopefully the construction of the
> classloader graph would not be too computationally intensive, but we
> could have a best-effort algorithm rather than an optimal one if that
> were an issue.
> 
> Thoughts? Thanks,
> Matt
> 
> 
> 
>> On Tue, Jan 16, 2018 at 12:52 PM, Kevin Doran  wrote:
>> Nice discussion on this thread.
>> 
>> I'm also in favor of the long-term solution being publishing extension NARs 
>> to an extension registry (#3) and removing them from the NiFi convenience 
>> binary.
>> 
>> A few thoughts that build upon what others have said:
>> 
>> 1. Many decisions, such as the structure of the project/repo(s) and 
>> mechanics of the release, don't have to be made right away, though it is 
>> probably good to start considering the impacts of various approaches as 
>> people have. There is a lot that has to be done to make progress towards the 
>> long-term goal regardless of those decisions, some of which follows below.
>> 
>> 2. We can start adding support for extensions to the Registry project 
>> (obvious

Feature Request: NodeID of NiFi cluster nodes as FlowFile common attribute.

2016-05-16 Thread Sumanth Chinthagunta
In my NiFi cluster environment, for a given flow, each node in the cluster is 
producing a final FlowFile that is written to HDFS.
I needed to prepend the node ID to the filename attribute (e.g., node1_filename1.tar) 
so that the files written to HDFS have unique names and I can easily identify which 
node produced them.
It would be nice if NiFi had built-in support to add the node ID as a common 
attribute on each FlowFile that can be used by any processor.
Thanks
Sumanth 
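
(In the meantime, a workaround sketch using ExecuteScript with Groovy; the hostname stands in for a true node ID:)

```groovy
// ExecuteScript (Groovy): prefix the filename attribute with this node's hostname
def flowFile = session.get()
if (flowFile != null) {
    def nodeId = InetAddress.localHost.hostName   // stand-in for a real node ID
    flowFile = session.putAttribute(flowFile, 'filename',
            "${nodeId}_${flowFile.getAttribute('filename')}")
    session.transfer(flowFile, REL_SUCCESS)
}
```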



Re: MergeContent to group related files

2016-06-17 Thread Sumanth Chinthagunta
Hi Joe,
Thanks a lot for helping with the solution. I didn't understand before how the 
correlation identifier works.
I guess MergeContent may be pulling flowFiles with the same correlation identifier 
from the queue in a batch.

I don't really need 1_N in the file name, so this solution should work for my 
case. I will try it and let you know.
Thanks 
-Sumo

> On Jun 17, 2016, at 6:27 AM, Joe Witt  wrote:
> 
> Sumo,
> 
> Should be doable.  The only part that may be tricky is the filename showing 
> 1_N if that means the whole thing has to retain sequential ordering from 
> source through destination.  Otherwise...
> 
> When merging flowfiles together you need to decide 'how should content be 
> merged' and 'how should attributes be merged'.  The properties to control 
> that are 'Merge Strategy' and 'Attribute Strategy' respectively.  For merge 
> strategy you'll want to do binary merge.  For the attribute strategy the 
> default of keeping only common attributes should likely be sufficient.  The 
> reason it should be is the information that you'll need for writing to HDFS 
> then is the common databaseName, tableName, and action.  When merging you'll 
> merge by all three of these attributes combined.  You can do this by creating 
> an attribute that combines those three things right after your extract 
> attributes processor.
> 
> Let's say your extract attributes pulls out 'databaseName', 'tableName' and 
> 'action'.  If so then put an UpdateAttribute between your extract attributes 
> and MergeContent (or you could use HashAttribute as well).  In this, create an 
> attribute called 'correlation-identifier' and give it a value of 
> ${databaseName}-${tableName}-${action}
> 
> Then in merge content use that correlation-identifier attribute in the 
> 'Correlation Attribute Name' property.
> 
> Now, given that you'll be smashing JSON documents together keep in mind the 
> resulting smashed together thing would not be valid JSON itself.  You'd need 
> to either make sure when it is merged that the resulting output is also valid 
> JSON which you can do by using MergeContent's header/footer/demarcator 
> feature.  Or, you need the thing that reads these merged JSON documents to be 
> able to demarcate them for you.
> 
> If you want to end up with roughly 64MB bundles and these objects can be 
> quite small (between say 1 and 10KB) then you'd be bundling around 
> 6,500-65,000 objects each time, and that is not factoring in compression.  I'd recommend a 
> two phase merge with a GZIP compression step in between then.  GZIP is nice 
> as it compresses quite fast and it can be safely concatenated.  So the 'merge 
> step' would really be:
> - First Merge
> - GZIP Compress
> - Final Merge
> 
> In first merge do bundles of at least 800 objects but no more than 1000 and 
> set an age kick-out of say 1 minute or whatever is appropriate in your case
> In GZIP compress set level 1
> In final merge do bundles of at least 55MB but no more than 64MB with an age 
> kick-out of say 5 minutes or whatever is appropriate in your case
> 
> Since the common attributes you needed will be retained in this model you 
> will be able to write to hdfs using a path of something like 
> '/${databaseName}/${tableName}/${action}/${uuid}.whatever.
> 
> Now that I got here I just noticed you set 'tar' so presumably you are using 
> tar merging strategy and most likely this is to address how to keep these 
> objects separate and avoid the need for header/foot/demarcator/etc..  Good 
> choice as well.
> 
> There are a lot of ways to slice this up.
> 
> Thanks
> Joe
> 
> On Wed, Jun 15, 2016 at 6:04 PM, Sumanth Chinthagunta wrote:
> 
> Hi, 
> I have following flow that receives JSON data from Kafka and writes to HDFS.
> Each flowFile received from Kafka has following attributes and JSON payload.
> 1.  databaseName = db1 or db2 etc 
> 2.  tableName = customer or address etc 
> 3.  action = [insert, update, delete] 
> 
> My goal is to merge 1000 flowFiles into a single file and write it to HDFS 
> (because writing large files into HDFS is more efficient than writing small 
> JSON files).
> I also want to write into HDFS folder structure   like:
> 1_1000.tar
> 1000_2000.tar
> 
> With the default MergeContent configuration, I am losing the individual flowFiles' 
> attributes and cannot organize the binned files into a directory structure. Is it 
> possible to accomplish my goal with MergeContent?
>  
> 
> Thanks 
> -Sumo
> 



State sharing

2016-07-10 Thread Sumanth Chinthagunta
If I set state from one ExecuteScript processor via the StateManager, can I access 
the same state from another processor?
Thanks
Sumo 

Sent from my iPhone

Re: State sharing

2016-07-10 Thread Sumanth Chinthagunta
Andrew,
If I use cluster mode and one processor writes state, can a different processor 
read it?
I understand that the same processor on multiple nodes in the cluster can share 
state.
Thanks
Sumo 

Sent from my iPhone

> On Jul 10, 2016, at 7:23 PM, Andrew Grande  wrote:
> 
> Sumo,
> 
> IIRC there's a node one selects when setting state. If you invoke with a
> cluster mode, the state will be set to a ZK by default. Otherwise just
> local to this processor node.
> 
> Andrew
> 
> On Sun, Jul 10, 2016, 10:17 PM Sumanth Chinthagunta 
> wrote:
> 
>> If I set state from one ExecuteScript processor via stateManager , can I
>> access same state from other processor ?
>> Thanks
>> Sumo
>> 
>> Sent from my iPhone


Re: State sharing

2016-07-10 Thread Sumanth Chinthagunta
Thanks Bryan.
It would be nice if we got support for state sharing across different processors in 
the future.
-Sumo 

Sent from my iPhone

> On Jul 10, 2016, at 7:39 PM, Bryan Bende  wrote:
> 
> Sumo,
> 
> Two different processor instances (different UUIDs) can not share state
> that is stored through the state manager. For something like this you would
> likely use the distributed map cache.
> 
> As Andrew mentioned, the state is accessible across the cluster, so a
> given processor can access the state from any node because the processor
> will have the same UUID on each node.
> 
> -Bryan
> 
>> On Sunday, July 10, 2016, Andrew Grande  wrote:
>> 
>> Sumo,
>> 
>> IIRC there's a node one selects when setting state. If you invoke with a
>> cluster mode, the state will be set to a ZK by default. Otherwise just
>> local to this processor node.
>> 
>> Andrew
>> 
>> On Sun, Jul 10, 2016, 10:17 PM Sumanth Chinthagunta > >
>> wrote:
>> 
>>> If I set state from one ExecuteScript processor via stateManager , can I
>>> access same state from other processor ?
>>> Thanks
>>> Sumo
>>> 
>>> Sent from my iPhone
> 
> 
> -- 
> Sent from Gmail Mobile


Re: State sharing

2016-07-11 Thread Sumanth Chinthagunta
Andrew,
Thanks for the link. My current use case needs to store data in a centralized 
key/value store that can be updated by one processor and read by a couple of 
different processors in the cluster. The data rate is low (tracking database 
schema changes), so concurrency is not a big concern.
My use case is explained a little bit here: 
https://github.com/zendesk/maxwell/pull/344

+--------+---------+
| Table  | Version |
+--------+---------+
| Book   | v1      |
| Author | v2      |
| Sales  | v1      |
| Cart   | v1      |
+--------+---------+

One processor updates the schema version number (increments it) whenever a new 
schema-change event arrives.
Other processors start writing new DML events to a sub-folder based on the schema 
version number updated by the first processor.
 
Thanks
Sumanth 


> On Jul 11, 2016, at 10:33 AM, Andrew Grande  wrote:
> 
> Sumo,
> 
> Something lightweight I devised here, backed by a simple in mem concurrent
> map, for cases when a distributed map is too much
> https://github.com/aperepel/nifi-csv-bundle/tree/master/nifi-csv-processors/src/main/java/org/apache/nifi/processors/lookup
> 
> In a cluster, though, it's the true distributed replicated cache that one
> must explicitly design for and use. While state framework is ok, the
> default implementation backed by a ZK is not meant for high speed
> concurrent use. Distributed caches are, however.
> 
> Andrew
> 
> On Sun, Jul 10, 2016, 10:58 PM Sumanth Chinthagunta 
> wrote:
> 
>> Thanks Bryan.
>> Would be nice if we get support for state sharing across diffrent
>> processors in the future.
>> -Sumo
>> 
>> Sent from my iPhone
>> 
>>> On Jul 10, 2016, at 7:39 PM, Bryan Bende  wrote:
>>> 
>>> Sumo,
>>> 
>>> Two different processor instances (different UUIDs) can not share state
>>> that is stored through the state manager. For something like this you
>> would
>>> likely use the distributed map cache.
>>> 
>>> As Andrew mentioned, the state is accessible across the cluster, so a
>>> given processor can access the state from any node because the processor
>>> will have the same UUID on each node.
>>> 
>>> -Bryan
>>> 
>>>> On Sunday, July 10, 2016, Andrew Grande  wrote:
>>>> 
>>>> Sumo,
>>>> 
>>>> IIRC there's a node one selects when setting state. If you invoke with a
>>>> cluster mode, the state will be set to a ZK by default. Otherwise just
>>>> local to this processor node.
>>>> 
>>>> Andrew
>>>> 
>>>> On Sun, Jul 10, 2016, 10:17 PM Sumanth Chinthagunta >>> >
>>>> wrote:
>>>> 
>>>>> If I set state from one ExecuteScript processor via stateManager , can
>> I
>>>>> access same state from other processor ?
>>>>> Thanks
>>>>> Sumo
>>>>> 
>>>>> Sent from my iPhone
>>> 
>>> 
>>> --
>>> Sent from Gmail Mobile
>> 



DistributedMapCacheClient Groovy example

2016-07-13 Thread Sumanth Chinthagunta
Looking for an example script (for the ExecuteScript processor) that uses 
DistributedMapCacheClient to put/get a key/value pair.

Thanks 
-Sumo 

Re: DistributedMapCacheClient Groovy example

2016-07-13 Thread Sumanth Chinthagunta
Thanks Matt,
If I create a DistributedMapCacheServer controller service in a NiFi cluster (two 
node) environment, will it create one instance of the cache server, or two?

Sent from my iPhone

> On Jul 13, 2016, at 6:13 PM, Matt Burgess  wrote:
> 
> Sumo,
> 
> I have some example code at
> http://funnifi.blogspot.com/2016/04/inspecting-your-nifi.html.
> Although it's not expressly for ExecuteScript it should be pretty
> usable as-is.
> 
> Also you might be able to use the technique outlined in my other post:
> http://funnifi.blogspot.com/2016/04/sql-in-nifi-with-executescript.html.
> With this you would get a reference to the ControllerService for the
> DistributedMapCacheClient (although you likely won't be able to refer
> to the DistributedMapCacheClient class), but using dynamic method
> invocation (as outlined in the blog post) you can call its methods to
> get and put values.
> 
> Regards,
> Matt
> 
>> On Wed, Jul 13, 2016 at 8:26 PM, Sumanth Chinthagunta  
>> wrote:
>> looking for example script ( for ExecuteScript processor)  that uses 
>> DistributedMapCacheClient to put/get key/value pair.
>> 
>> Thanks
>> -Sumo


Re: DistributedMapCacheClient Groovy example

2016-07-13 Thread Sumanth Chinthagunta
Matt,
I set up the DistributedMapCacheServer controller service and am trying to run the 
following script. Am I doing this correctly?
I have a couple of issues:
1. How do I make the org.apache.nifi.distributed.cache.* package available for 
ExecuteScript?
2. When I try : cache.get("1",null,null), getting following error 
Failed to process session due to 
org.apache.nifi.processor.exception.ProcessException: 
javax.script.ScriptException: javax.script.ScriptException: 
groovy.lang.MissingMethodException: No signature of method: 
com.sun.proxy.$Proxy129.get() is applicable for argument types: 
(java.lang.String, null, null) values: [1, null, null]
Possible solutions: grep(), wait(), any(), getAt(java.lang.String), dump(), 
find(): org.apache.nifi.processor.exception.ProcessException: 
javax.script.ScriptException: javax.script.ScriptException: 
groovy.lang.MissingMethodException: No signature of method: 
com.sun.proxy.$Proxy129.get() is applicable for argument types: 
(java.lang.String, null, null) values: [1, null, null]
Possible solutions: grep(), wait(), any(), getAt(java.lang.String), dump(), 
find()


 

import org.apache.nifi.controller.ControllerService
import org.apache.nifi.distributed.cache.client.Deserializer;
import org.apache.nifi.distributed.cache.client.Serializer;
import org.apache.nifi.distributed.cache.client.exception.DeserializationException;
import org.apache.nifi.distributed.cache.client.exception.SerializationException;
import java.nio.charset.StandardCharsets;

static class StringSerializer implements Serializer<String> {

    @Override
    public void serialize(final String value, final OutputStream out) throws SerializationException, IOException {
        out.write(value.getBytes(StandardCharsets.UTF_8));
    }
}

static class CacheValueDeserializer implements Deserializer<byte[]> {

    @Override
    public byte[] deserialize(final byte[] input) throws DeserializationException, IOException {
        if (input == null || input.length == 0) {
            return null;
        }
        return input;
    }
}

private final Serializer<String> keySerializer = new StringSerializer();
private final Deserializer<byte[]> valueDeserializer = new CacheValueDeserializer();

def lookup = context.controllerServiceLookup
def cacheServerName = distributedMapCacheServerName.value

def cacheServerId = lookup.getControllerServiceIdentifiers(ControllerService).find {
    cs -> lookup.getControllerServiceName(cs) == cacheServerName
}

def cache = lookup.getControllerService(cacheServerId)

//log.error cache.get("1",keySerializer,valueDeserializer)
log.error cache.get("1",null,null)

 Thanks 
Sumo




> On Jul 13, 2016, at 6:13 PM, Matt Burgess  wrote:
> 
> Sumo,
> 
> I have some example code at
> http://funnifi.blogspot.com/2016/04/inspecting-your-nifi.html.
> Although it's not expressly for ExecuteScript it should be pretty
> usable as-is.
> 
> Also you might be able to use the technique outlined in my other post:
> http://funnifi.blogspot.com/2016/04/sql-in-nifi-with-executescript.html.
> With this you would get a reference to the ControllerService for the
> DistributedMapCacheClient (although you likely won't be able to refer
> to the DistributedMapCacheClient class), but using dynamic method
> invocation (as outlined in the blog post) you can call its methods to
> get and put values.
> 
> Regards,
> Matt
> 
> On Wed, Jul 13, 2016 at 8:26 PM, Sumanth Chinthagunta  
> wrote:
>> looking for example script ( for ExecuteScript processor)  that uses 
>> DistributedMapCacheClient to put/get key/value pair.
>> 
>> Thanks
>> -Sumo



Re: DistributedMapCacheClient Groovy example

2016-07-13 Thread Sumanth Chinthagunta
Realized I also need to set up a DistributedMapCacheClientService along with the 
DistributedMapCacheServer. (I wonder how many instances of the service will be 
running in a clustered env.)

But I am still looking for guidelines on how to make 
org.apache.nifi.distributed.cache.* available to the ExecuteScript processor!
Thanks 
Sumo 


> On Jul 13, 2016, at 9:12 PM, Sumanth Chinthagunta  wrote:
> 
> Matt,
> I setup DistributedMapCacheServer controller service and trying to run 
> following script. Am I doing correct? 
> I have couple of issues:
> 1. How do I make org.apache.nifi.distributed.cache.* package available  for 
> ExecuteScript?  
> 2. When I try : cache.get("1",null,null), getting following error 
> Failed to process session due to 
> org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: javax.script.ScriptException: 
> groovy.lang.MissingMethodException: No signature of method: 
> com.sun.proxy.$Proxy129.get() is applicable for argument types: 
> (java.lang.String, null, null) values: [1, null, null]
> Possible solutions: grep(), wait(), any(), getAt(java.lang.String), dump(), 
> find(): org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: javax.script.ScriptException: 
> groovy.lang.MissingMethodException: No signature of method: 
> com.sun.proxy.$Proxy129.get() is applicable for argument types: 
> (java.lang.String, null, null) values: [1, null, null]
> Possible solutions: grep(), wait(), any(), getAt(java.lang.String), dump(), 
> find()
> 
> 
>  
> 
> import org.apache.nifi.controller.ControllerService
> import org.apache.nifi.distributed.cache.client.Deserializer;
> import org.apache.nifi.distributed.cache.client.Serializer;
> import 
> org.apache.nifi.distributed.cache.client.exception.DeserializationException;
> import 
> org.apache.nifi.distributed.cache.client.exception.SerializationException;
> 
> static class StringSerializer implements Serializer {
> 
> @Override
> public void serialize(final String value, final OutputStream out) throws 
> SerializationException, IOException {
> out.write(value.getBytes(StandardCharsets.UTF_8));
> }
> }
> 
> static class CacheValueDeserializer implements Deserializer {
> 
> @Override
> public byte[] deserialize(final byte[] input) throws 
> DeserializationException, IOException {
> if (input == null || input.length == 0) {
> return null;
> }
> return input;
> }
> }
> 
> private final Serializer keySerializer = new StringSerializer();
> private final Deserializer valueDeserializer = new 
> CacheValueDeserializer();
> 
> def lookup = context.controllerServiceLookup
> def cacheServerName = distributedMapCacheServerName.value
> 
> def cacheServerId = 
> lookup.getControllerServiceIdentifiers(ControllerService).find {
> cs -> lookup.getControllerServiceName(cs) == cacheServerName
> }
> 
> def cache = lookup.getControllerService(cacheServerId)
> 
> //log.error cache.get("1",keySerializer,valueDeserializer)
> log.error cache.get("1",null,null)
> 
>  Thanks 
> Sumo
> 
> 
> 
> 
>> On Jul 13, 2016, at 6:13 PM, Matt Burgess wrote:
>> 
>> Sumo,
>> 
>> I have some example code at
>> http://funnifi.blogspot.com/2016/04/inspecting-your-nifi.html.
>> Although it's not expressly for ExecuteScript it should be pretty
>> usable as-is.
>> 
>> Also you might be able to use the technique outlined in my other post:
>> http://funnifi.blogspot.com/2016/04/sql-in-nifi-with-executescript.html.
>> With this you would get a reference to the ControllerService for the
>> DistributedMapCacheClient (although you likely won't be able to refer
>> to the DistributedMapCacheClient class), but using dynamic method
>> invocation (as outlined in the blog post) you can call its methods to
>> get and put values.
>> 
>> Regards,
>> Matt
>> 
>> On Wed, Jul 13, 2016 at 8:26 PM, Sumanth Chinthagunta wrote:
>>> looking for example script ( for ExecuteScript processor)  that uses 
>>> DistributedMapCacheClient to put/get key/value pair.
>>> 
>>> Thanks
>>> -Sumo
> 



Re: DistributedMapCacheClient Groovy example

2016-07-16 Thread Sumanth Chinthagunta
ripts because it is not available
> to ExecuteScript, that's what I meant about not being able to refer to
> the class directly :)  Instead you can get a reference to the
> ControllerService by name, then just call methods even though you
> aren't directly referencing the interface that defines them (thanks
> Groovy!)
> 
> However this approach may only work if the methods you're calling
> don't require classes you don't have access to. In this case you might
> not be able to use this approach as the get() and put() methods
> require a Serializer/Deserializer which are in the same package/NAR as
> the DistributedMapCacheClient.
> 
> I'll give it a try myself to see if I can find a way (using closures
> -- but not interface coercion since I can't reference the class) to
> achieve this, but otherwise you can use the first blog post from my
> last email to write a simple client of your own inside Groovy.
> 
> Regards,
> Matt
> 
> On Thu, Jul 14, 2016 at 1:04 AM, Sumanth Chinthagunta wrote:
>> Realized I also need to setup DistributedMapCacheClientService along with 
>> DistributedMapCacheServer. [ wonder how many instances service will be 
>> running in clustered env)
>> 
>> but still looks for guidelines on how to   make 
>> org.apache.nifi.distributed.cache.* available  for ExecuteScript processor !
>> Thanks
>> Sumo
>> 
>> 
>>> On Jul 13, 2016, at 9:12 PM, Sumanth Chinthagunta  wrote:
>>> 
>>> Matt,
>>> I setup DistributedMapCacheServer controller service and trying to run 
>>> following script. Am I doing correct?
>>> I have couple of issues:
>>> 1. How do I make org.apache.nifi.distributed.cache.* package available  for 
>>> ExecuteScript?
>>> 2. When I try : cache.get("1",null,null), getting following error
>>> Failed to process session due to 
>>> org.apache.nifi.processor.exception.ProcessException: 
>>> javax.script.ScriptException: javax.script.ScriptException: 
>>> groovy.lang.MissingMethodException: No signature of method: 
>>> com.sun.proxy.$Proxy129.get() is applicable for argument types: 
>>> (java.lang.String, null, null) values: [1, null, null]
>>> Possible solutions: grep(), wait(), any(), getAt(java.lang.String), dump(), 
>>> find(): org.apache.nifi.processor.exception.ProcessException: 
>>> javax.script.ScriptException: javax.script.ScriptException: 
>>> groovy.lang.MissingMethodException: No signature of method: 
>>> com.sun.proxy.$Proxy129.get() is applicable for argument types: 
>>> (java.lang.String, null, null) values: [1, null, null]
>>> Possible solutions: grep(), wait(), any(), getAt(java.lang.String), dump(), 
>>> find()
>>> 
>>> 
>>> 
>>> 
>>> import org.apache.nifi.controller.ControllerService
>>> import org.apache.nifi.distributed.cache.client.Deserializer;
>>> import org.apache.nifi.distributed.cache.client.Serializer;
>>> import 
>>> org.apache.nifi.distributed.cache.client.exception.DeserializationException;
>>> import 
>>> org.apache.nifi.distributed.cache.client.exception.SerializationException;
>>> 
>>> static class StringSerializer implements Serializer {
>>> 
>>>@Override
>>>public void serialize(final String value, final OutputStream out) throws 
>>> SerializationException, IOException {
>>>out.write(value.getBytes(StandardCharsets.UTF_8));
>>>}
>>> }
>>> 
>>> static class CacheValueDeserializer implements Deserializer {
>>> 
>>>@Override
>>>public byte[] deserialize(final byte[] input) throws 
>>> DeserializationException, IOException {
>>>if (input == null || input.length == 0) {
>>>return null;
>>>}
>>>return input;
>>>}
>>> }
>>> 
>>> private final Serializer keySerializer = new StringSerializer();
>>> private final Deserializer valueDeserializer = new 
>>> CacheValueDeserializer();
>>> 
>>> def lookup = context.controllerServiceLookup
>>> def cacheServerName = distributedMapCacheServerName.value
>>> 
>>> def cacheServerId = 
>>> lookup.getControllerServiceIdentifiers(ControllerService).find {
>>>cs -> lookup.getControllerServiceName(cs) == cacheServerName
>>> }
>>> 
>>> def cache = lookup.getControllerService(cacheServerId)
>>> 
>>> //log.error cache.get("1&

Re: DistributedMapCacheClient Groovy example

2016-07-17 Thread Sumanth Chinthagunta
I had to custom build NiFi to add distributed-cache support for scripting 
processors as described here: 
https://github.com/xmlking/mapr-nifi-hadoop-libraries-bundle/blob/master/nifi-mapr-build.md#add-optional
 
Please consider adding this support in the next release.

-Sumo 
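
(Once the cache-client API is on the scripting NAR's classpath, the improvement later filed as NIFI-2299, a minimal Groovy sketch using closure coercion for the serializers might look like this; the key/value names are illustrative:)

```groovy
import org.apache.nifi.distributed.cache.client.DistributedMapCacheClient
import org.apache.nifi.distributed.cache.client.Serializer
import org.apache.nifi.distributed.cache.client.Deserializer
import java.nio.charset.StandardCharsets

// Locate the DistributedMapCacheClientService by its API interface
def lookup = context.controllerServiceLookup
def clientId = lookup.getControllerServiceIdentifiers(DistributedMapCacheClient).first()
def cache = lookup.getControllerService(clientId)

// Closure coercion avoids writing separate serializer classes
def stringSer   = { String s, OutputStream out -> out.write(s.getBytes(StandardCharsets.UTF_8)) } as Serializer<String>
def stringDeser = { byte[] b -> b == null ? null : new String(b, StandardCharsets.UTF_8) } as Deserializer<String>

cache.put('schema.version.Book', 'v1', stringSer, stringSer)
log.info("cached value: ${cache.get('schema.version.Book', stringSer, stringDeser)}")
```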

> On Jul 16, 2016, at 10:17 PM, Sumanth Chinthagunta  wrote:
> 
> Hi Matt,
> 
> Did you find any solution to use DistributedMapCacheClientService from 
> Scripting processors ?
> 
> I tried to use module directory with nifi-sumo-common-0.7.0-SNAPSHOT-all.jar 
> bundling with org.apache.nifi:nifi-distributed-cache-client-service-api 
> dependency here: 
> https://github.com/xmlking/nifi-scripting/releases/tag/0.7.0
> 
> but getting weird error :(
> 
> Would be nice if this dependency is bundled with scripting processor Nar in 
> NiFi 0.7.1 :)
> 
>
> org.apache.nifi
> nifi-distributed-cache-client-service-api
> 
> 
> My Groovy script:
> 
> import org.apache.nifi.controller.ControllerService
> import com.crossbusiness.nifi.processors.StringSerDe
> 
> final StringSerDe stringSerDe = new StringSerDe();
> 
> def lookup = context.controllerServiceLookup
> def cacheServiceName = DistributedMapCacheClientServiceName.value
> 
> log.error  "cacheServiceName: ${cacheServiceName}"
> 
> def cacheServiceId = 
> lookup.getControllerServiceIdentifiers(ControllerService).find {
> cs -> lookup.getControllerServiceName(cs) == cacheServiceName
> }
> 
> log.error  "cacheServiceId:  ${cacheServiceId}"
> 
> def cache = lookup.getControllerService(cacheServiceId)
> log.error cache.get("aaa", stringSerDe, stringSerDe )
>  Error: 
>  
> 00:02:04 CDT
> ERROR
> 3886ddbc-1ccd-437e-8e34-f5a98602264b
> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] cacheServiceName: 
> DistributedMapCacheClientService
> 00:02:04 CDT
> ERROR
> 3886ddbc-1ccd-437e-8e34-f5a98602264b
> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] cacheServiceId:  
> 8971c8e8-6bc5-4e07-8e30-7189fa8a5252
> 00:02:04 CDT
> ERROR
> 3886ddbc-1ccd-437e-8e34-f5a98602264b
> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] 
> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] failed to process due 
> to org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: javax.script.ScriptException: 
> groovy.lang.MissingMethodException: No signature of method: 
> com.sun.proxy.$Proxy123.get() is applicable for argument types: 
> (java.lang.String, com.crossbusiness.nifi.processors.StringSerDe, 
> com.crossbusiness.nifi.processors.StringSerDe) values: [aaa, 
> com.crossbusiness.nifi.processors.StringSerDe@5b1ef80b, ...]
> Possible solutions: grep(), get(java.lang.Object, 
> org.apache.nifi.distributed.cache.client.Serializer, 
> org.apache.nifi.distributed.cache.client.Deserializer), wait(), any(), 
> getAt(java.lang.String), every(); rolling back session: 
> org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: javax.script.ScriptException: 
> groovy.lang.MissingMethodException: No signature of method: 
> com.sun.proxy.$Proxy123.get() is applicable for argument types: 
> (java.lang.String, com.crossbusiness.nifi.processors.StringSerDe, 
> com.crossbusiness.nifi.processors.StringSerDe) values: [aaa, 
> com.crossbusiness.nifi.processors.StringSerDe@5b1ef80b, ...]
> Possible solutions: grep(), get(java.lang.Object, 
> org.apache.nifi.distributed.cache.client.Serializer, 
> org.apache.nifi.distributed.cache.client.Deserializer), wait(), any(), 
> getAt(java.lang.String), every()
> 00:02:04 CDT
> ERROR
> 3886ddbc-1ccd-437e-8e34-f5a98602264b
> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] Failed to process 
> session due to org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: javax.script.ScriptException: 
> groovy.lang.MissingMethodException: No signature of method: 
> com.sun.proxy.$Proxy123.get() is applicable for argument types: 
> (java.lang.String, com.crossbusiness.nifi.processors.StringSerDe, 
> com.crossbusiness.nifi.processors.StringSerDe) values: [aaa, 
> com.crossbusiness.nifi.processors.StringSerDe@5b1ef80b, ...]
> Possible solutions: grep(), get(java.lang.Object, 
> org.apache.nifi.distributed.cache.client.Serializer,

ConsumeKafka vs getKafka

2016-07-17 Thread Sumanth Chinthagunta
Env: Linux, Kafka 0.9.x, NiFi 0.7

We'd like to use the new ConsumeKafka processor but found critical limitations 
compared to the old GetKafka processor.
The new ConsumeKafka is not writing critical Kafka attributes (i.e., kafka.key, 
kafka.offset, kafka.partition, etc.) into flowFile attributes.


Old getKafka processor: 

Standard FlowFile Attributes
Key: 'entryDate'
   Value: 'Sun Jul 17 15:17:00 CDT 2016'
Key: 'lineageStartDate'
   Value: 'Sun Jul 17 15:17:00 CDT 2016'
Key: 'fileSize'
   Value: '183'
FlowFile Attribute Map Content
Key: 'filename'
   Value: '19709945781167274'
Key: 'kafka.key'
   Value: '{"database":"test","table":"sc_job","pk.systemid":1}'
Key: 'kafka.offset'
   Value: '1184010261'
Key: 'kafka.partition'
   Value: '0'
Key: 'kafka.topic'
   Value: ‘data'
Key: 'path'
   Value: './'
Key: 'uuid'
   Value: '244059bb-9ad9-4d74-b1fb-312eee72124a'
 
—
new ConsumeKafka processor : 
 
Standard FlowFile Attributes
Key: 'entryDate'
   Value: 'Sun Jul 17 15:18:41 CDT 2016'
Key: 'lineageStartDate'
   Value: 'Sun Jul 17 15:18:41 CDT 2016'
Key: 'fileSize'
   Value: '183'
FlowFile Attribute Map Content
Key: 'filename'
   Value: '19710046870478139'
Key: 'path'
   Value: './'
Key: 'uuid'
   Value: '349fbeb3-e342-4533-be4c-424793fa5c59’

Thanks 
Sumo 

Re: ConsumeKafka vs getKafka

2016-07-17 Thread Sumanth Chinthagunta
Done.

https://issues.apache.org/jira/browse/NIFI-2298 

Thanks 
-Sumo

> On Jul 17, 2016, at 1:37 PM, Sumanth Chinthagunta  wrote:
> 
> Env: Linux , Kafka 9.0.x, NiFi: 0.7 
> 
> We like to use new ConsumeKafka processor  but found critical limitations 
> compared to old getKafka processor. 
> New ConsumeKafka is not writing critical Kafka attributes  i.e., kafka.key, 
> kafka.offset, kafka.partition etc into flowFile attributes. 
> 
> 
> Old getKafka processor: 
> 
> Standard FlowFile Attributes
> Key: 'entryDate'
>Value: 'Sun Jul 17 15:17:00 CDT 2016'
> Key: 'lineageStartDate'
>Value: 'Sun Jul 17 15:17:00 CDT 2016'
> Key: 'fileSize'
>Value: '183'
> FlowFile Attribute Map Content
> Key: 'filename'
>Value: '19709945781167274'
> Key: 'kafka.key'
>Value: '{"database":"test","table":"sc_job","pk.systemid":1}'
> Key: 'kafka.offset'
>Value: '1184010261'
> Key: 'kafka.partition'
>Value: '0'
> Key: 'kafka.topic'
>Value: ‘data'
> Key: 'path'
>Value: './'
> Key: 'uuid'
>Value: '244059bb-9ad9-4d74-b1fb-312eee72124a'
>  
> —
> new ConsumeKafka processor : 
>  
> Standard FlowFile Attributes
> Key: 'entryDate'
>Value: 'Sun Jul 17 15:18:41 CDT 2016'
> Key: 'lineageStartDate'
>Value: 'Sun Jul 17 15:18:41 CDT 2016'
> Key: 'fileSize'
>Value: '183'
> FlowFile Attribute Map Content
> Key: 'filename'
>Value: '19710046870478139'
> Key: 'path'
>Value: './'
> Key: 'uuid'
>Value: '349fbeb3-e342-4533-be4c-424793fa5c59’
> 
> Thanks 
> Sumo 



Re: DistributedMapCacheClient Groovy example

2016-07-17 Thread Sumanth Chinthagunta
Added Jira https://issues.apache.org/jira/browse/NIFI-2299 

Thanks 
-Sumo

> On Jul 17, 2016, at 4:56 AM, Matt Burgess  wrote:
> 
> Adding API JARs to the scripting NAR is a good idea since it extends the 
> capabilities as you have shown. Mind writing an improvement Jira to capture 
> this?
> 
> Thanks,
> Matt
> 
> 
>> On Jul 17, 2016, at 4:51 AM, Sumanth Chinthagunta  wrote:
>> 
>> I had to custom build NiFi to add distributed-cache support for scripting 
>> processors as described here: 
>> https://github.com/xmlking/mapr-nifi-hadoop-libraries-bundle/blob/master/nifi-mapr-build.md#add-optional
>>  
>> <https://github.com/xmlking/mapr-nifi-hadoop-libraries-bundle/blob/master/nifi-mapr-build.md#add-optional>
>> Please consider adding this support in next release. 
>> 
>> -Sumo 
>> 
>>> On Jul 16, 2016, at 10:17 PM, Sumanth Chinthagunta  
>>> wrote:
>>> 
>>> Hi Matt,
>>> 
>>> Did you find any solution to use DistributedMapCacheClientService from 
>>> Scripting processors ?
>>> 
>>> I tried to use module directory with 
>>> nifi-sumo-common-0.7.0-SNAPSHOT-all.jar bundling with 
>>> org.apache.nifi:nifi-distributed-cache-client-service-api dependency here: 
>>> https://github.com/xmlking/nifi-scripting/releases/tag/0.7.0
>>> 
>>> but getting weird error :(
>>> 
>>> Would be nice if this dependency were bundled with the scripting processor NAR in 
>>> NiFi 0.7.1 :)
>>> 
>>> <dependency>
>>>   <groupId>org.apache.nifi</groupId>
>>>   <artifactId>nifi-distributed-cache-client-service-api</artifactId>
>>> </dependency>
>>> 
>>> My Groovy script:
>>> 
>>> import org.apache.nifi.controller.ControllerService
>>> import com.crossbusiness.nifi.processors.StringSerDe
>>> 
>>> final StringSerDe stringSerDe = new StringSerDe();
>>> 
>>> def lookup = context.controllerServiceLookup
>>> def cacheServiceName = DistributedMapCacheClientServiceName.value
>>> 
>>> log.error  "cacheServiceName: ${cacheServiceName}"
>>> 
>>> def cacheServiceId = 
>>> lookup.getControllerServiceIdentifiers(ControllerService).find {
>>>   cs -> lookup.getControllerServiceName(cs) == cacheServiceName
>>> }
>>> 
>>> log.error  "cacheServiceId:  ${cacheServiceId}"
>>> 
>>> def cache = lookup.getControllerService(cacheServiceId)
>>> log.error cache.get("aaa", stringSerDe, stringSerDe )
>>> Error: 
>>> 
>>> 00:02:04 CDT
>>> ERROR
>>> 3886ddbc-1ccd-437e-8e34-f5a98602264b
>>> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] cacheServiceName: 
>>> DistributedMapCacheClientService
>>> 00:02:04 CDT
>>> ERROR
>>> 3886ddbc-1ccd-437e-8e34-f5a98602264b
>>> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] cacheServiceId:  
>>> 8971c8e8-6bc5-4e07-8e30-7189fa8a5252
>>> 00:02:04 CDT
>>> ERROR
>>> 3886ddbc-1ccd-437e-8e34-f5a98602264b
>>> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] 
>>> ExecuteScript[id=3886ddbc-1ccd-437e-8e34-f5a98602264b] failed to process 
>>> due to org.apache.nifi.processor.exception.ProcessException: 
>>> javax.script.ScriptException: javax.script.ScriptException: 
>>> groovy.lang.MissingMethodException: No signature of method: 
>>> com.sun.proxy.$Proxy123.get() is applicable for argument types: 
>>> (java.lang.String, com.crossbusiness.nifi.processors.StringSerDe, 
>>> com.crossbusiness.nifi.processors.StringSerDe) values: [aaa, 
>>> com.crossbusiness.nifi.processors.StringSerDe@5b1ef80b, ...]
>>> Possible solutions: grep(), get(java.lang.Object, 
>>> org.apache.nifi.distributed.cache.client.Serializer, 
>>> org.apache.nifi.distributed.cache.client.Deserializer), wait(), any(), 
>>> getAt(java.lang.String), every(); rolling back session: 
>>> org.apache.nifi.processor.exception.ProcessException: 
>>> javax.script.ScriptException: javax.script.ScriptException: 
>>> groovy.lang.MissingMethodException: No signature of method: 
>>> com.sun.proxy.$Proxy123.get() is applicable for argument types: 
>>> (java.lang.String, com.crossbusiness.nifi.processors.StringSerDe, 
>>> com.crossbusiness.nifi.processors.St
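
The MissingMethodException above suggests a classloader mismatch: the
StringSerDe bundled in the module JAR implements Serializer/Deserializer
loaded by a different classloader than the one backing the service proxy.
Once the nifi-distributed-cache-client-service-api JAR is visible to the
script (per NIFI-2299), the ser/de can instead be coerced from closures
directly against the API interfaces. A minimal sketch, reusing the lookup
and cacheServiceId variables from the script above:

import org.apache.nifi.distributed.cache.client.Serializer
import org.apache.nifi.distributed.cache.client.Deserializer
import java.nio.charset.StandardCharsets

// Coerce closures onto the API interfaces the cache proxy expects, so that
// get(String, Serializer, Deserializer) resolves correctly.
def stringSerializer   = { value, out -> out.write(value.getBytes(StandardCharsets.UTF_8)) } as Serializer<String>
def stringDeserializer = { bytes -> bytes == null ? null : new String(bytes, StandardCharsets.UTF_8) } as Deserializer<String>

def cache = lookup.getControllerService(cacheServiceId)
log.info(cache.get('aaa', stringSerializer, stringDeserializer))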

DB2 to MS SQL data transfer

2016-08-10 Thread Sumanth Chinthagunta
I have the same schema in DB2 and SQL Server.
Is there a simple solution to transfer data with NiFi? (one time) 
Hoping somebody has a better idea for building the flow :)
Thanks 
Sumo 
Sent from my iPhone


Re: DB2 to MS SQL data transfer

2016-08-11 Thread Sumanth Chinthagunta
I have a set of tables to transfer, but since it is a small set I can 
duplicate the flow for each table once the design works for one. 
Basically migrating an app from mainframe/DB2 to MS SQL Server.
Thanks 
Sumo
Sent from my iPhone

> On Aug 11, 2016, at 1:17 PM, Jeff  wrote:
> 
> Hello Sumo,
> 
> Are you looking at transferring data from a specific table between two
> databases, or all tables?
> 
> On Wed, Aug 10, 2016 at 3:17 PM Sumanth Chinthagunta 
> wrote:
> 
>> I have the same schema in DB2 and SQL Server.
>> Is there a simple solution to transfer data with NiFi? (one time)
>> Hoping somebody has a better idea for building the flow :)
>> Thanks
>> Sumo
>> Sent from my iPhone
>> 


Re: DB2 to MS SQL data transfer

2016-08-12 Thread Sumanth Chinthagunta
Thanks Matt,
I already have the schema replicated. Will try this method and let you know.
If the first full transfer works, QueryDatabaseTable should also let me 
replicate new records automatically, which is what I am planning to do.
Thanks,
-Sumo

Sent from my iPhone

> On Aug 12, 2016, at 8:46 AM, Matt Burgess  wrote:
> 
> Sumo,
> 
> One-time migrations are possible with NiFi (although probably not a
> common use case). Are the target tables created already? Some DBs
> (like MySQL) support queries that return the Create Table statement,
> if you get those you can use PutSQL to execute them on your target
> system.
> 
> If the target tables are created, you can use QueryDatabaseTable ->
> ConvertAvroToJSON -> ConvertJsonToSQL -> PutSQL, the conversions
> should leave you with flowfiles containing INSERT statements. If you
> already know the schema, you could probably use ReplaceText instead of
> ConvertJsonToSQL.
> 
> QueryDatabaseTable [1], if you configure it with a maximum-value
> column like the primary key column, will only fetch the data once. It
> keeps track of the largest value in the column you specify, so as long
> as more data (with higher values in that column) are not added to the
> source table, then QueryDatabaseTable can continue to "run" but it
> won't output any records after the first full fetch.
> 
> Another option might be to export the source table to a file, then
> transfer it with NiFi to the target system, then use PutSQL to do a
> BULK INSERT [2].
> 
> Regards,
> Matt
> 
> [1] 
> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.QueryDatabaseTable/index.html
> [2] https://msdn.microsoft.com/en-us/library/ms188365.aspx
> 
>> On Thu, Aug 11, 2016 at 7:09 PM, Sumanth Chinthagunta  
>> wrote:
>> I have a set of tables to transfer, but since it is a small set I 
>> can duplicate the flow for each table once the design works for one.
>> Basically migrating an app from mainframe/DB2 to MS SQL Server.
>> Thanks
>> Sumo
>> Sent from my iPhone
>> 
>>> On Aug 11, 2016, at 1:17 PM, Jeff  wrote:
>>> 
>>> Hello Sumo,
>>> 
>>> Are you looking at transferring data from a specific table between two
>>> databases, or all tables?
>>> 
>>> On Wed, Aug 10, 2016 at 3:17 PM Sumanth Chinthagunta 
>>> wrote:
>>> 
>>>> I have the same schema in DB2 and SQL Server.
>>>> Is there a simple solution to transfer data with NiFi? (one time)
>>>> Hoping somebody has a better idea for building the flow :)
>>>> Thanks
>>>> Sumo
>>>> Sent from my iPhone
>>>> 
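
For reference, here is a minimal Groovy ExecuteScript sketch of the
ConvertJsonToSQL step Matt describes, assuming one flat JSON object per
FlowFile and a hypothetical target table named SC_JOB; every column is bound
as VARCHAR for simplicity, which a real flow would want to type properly:

import groovy.json.JsonSlurper
import org.apache.nifi.processor.io.InputStreamCallback
import org.apache.nifi.processor.io.OutputStreamCallback

def flowFile = session.get()
if (flowFile == null) return

// Parse the incoming JSON record (one flat object per FlowFile assumed).
def record = null
session.read(flowFile, { inStream ->
    record = new JsonSlurper().parse(inStream)
} as InputStreamCallback)

// Build a parameterized INSERT plus the sql.args.N.* attributes PutSQL reads.
def cols = record.keySet().toList()
def sql = "INSERT INTO SC_JOB (${cols.join(', ')}) VALUES (${cols.collect { '?' }.join(', ')})"
def attrs = [:]
cols.eachWithIndex { col, i ->
    attrs["sql.args.${i + 1}.type".toString()]  = '12'   // java.sql.Types.VARCHAR
    attrs["sql.args.${i + 1}.value".toString()] = String.valueOf(record[col])
}

flowFile = session.write(flowFile, { out ->
    out.write(sql.getBytes('UTF-8'))
} as OutputStreamCallback)
flowFile = session.putAllAttributes(flowFile, attrs)
session.transfer(flowFile, REL_SUCCESS)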


Re: Odd inter-operation behavior when using clustered NiFi against secure MapR cluster

2016-11-08 Thread Sumanth Chinthagunta
Andre,
Sorry for the late reply. Yes, we were able to successfully integrate NiFi with 
the secure MapR cluster using the JAAS method. 
Sumo 

Sent from my iPhone
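
(For reference, the JAAS workaround discussed below boils down to pointing
the NiFi JVM at a login config via conf/bootstrap.conf. A minimal sketch;
the jaas.conf path is an assumption, and the arg number just has to be one
that is not already in use:)

# In conf/bootstrap.conf -- the jaas.conf path here is hypothetical:
java.arg.15=-Djava.security.auth.login.config=/opt/nifi/conf/jaas.conf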

> On Oct 27, 2016, at 1:38 AM, Andre  wrote:
> 
> Sumo,
> 
> Have you tried the workaround? Did it work?
> 
> Cheers
> 
>> On Thu, Oct 27, 2016 at 3:41 PM, Sumanth  wrote:
>> 
>> I had the same issue with a secure MapR cluster + NiFi cluster with embedded ZK
>> + PutHDFS setup.
>> Switched back to non-clustered NiFi to avoid the conflict between the NiFi-managed
>> ZK and MapR. Waiting for a stable solution to get the NiFi cluster back.
>> Thanks
>> Sumo
>> 
>> 
>> 
>> 
>> Sent from my iPad
>> 
>>> On Oct 26, 2016, at 7:21 AM, Andre  wrote:
>>> 
>>> Bryan,
>>> 
>>> My apologies as the original email wasn't explicit about this:
>>> 
>>> Your assumption is correct: my flow contains a processor (PutHDFS) with a
>>> core-site.xml configured. The file contains the property you refer to (as
>>> this is a cleaner way to force NiFi to connect to the secure MapR cluster).
>>> 
>>> Funnily enough, when used without ZK the processor works fine. Likewise, ZK
>>> works when correctly configured.
>>> 
>>> However, in order for both to work at the same time, I had to use the JAAS
>>> workaround.
>>> 
>>> 
>>> As a side note, in case you wonder: MapR's JAAS contains both the Server
>>> and Client stanzas required to run a secure ZK; however, they are designed
>>> to use MapR's security mechanism and their packaged version of ZooKeeper.
>>> As a consequence, their stanzas require JARs to be added to the classpath
>>> and all sorts of weird stuff that I preferred not to introduce (since I am
>>> using the ZK embedded within NiFi).
>>> 
>>> Had that not been the case, I could point arg.15 to MapR's default JAAS as
>>> described here: http://doc.mapr.com/display/MapR/mapr.login.conf and here:
>>> http://doc.mapr.com/pages/viewpage.action?pageId=32506648
>>> 
>>> 
>>> Cheers
>>> 
>>> 
>>> 
 On Thu, Oct 27, 2016 at 12:51 AM, Bryan Bende  wrote:
 
 Meant to say the config instance somehow got
 "hadoop.security.authentication"
 set to "kerberos"
 
> On Wed, Oct 26, 2016 at 9:50 AM, Bryan Bende  wrote:
> 
> Andre,
> 
> This definitely seems weird that somehow using embedded ZooKeeper is
> causing this.
> 
> One thing I can say though, is that in order to get into the code in your
> stacktrace, it had to pass through SecurityUtil.isSecurityEnabled(config),
> which does the following:
> 
> public static boolean isSecurityEnabled(final Configuration config) {
>     Validate.notNull(config);
>     return "kerberos".equalsIgnoreCase(config.get("hadoop.security.authentication"));
> }
> 
> The Configuration instance passed in is created using the default
> constructor, Configuration config = new Configuration(); and then any
> files/paths entered into the processor's resource property are added to
> the config.
> 
> So in order for isSecurityEnabled to return true, it means the config
> instance somehow got "hadoop.security.authentication" set to true, which
> usually only happens if you put a core-site.xml on the classpath with
> that value set.
> 
> Is it possible some JAR from the MapR dependencies has a core-site.xml
> embedded in it?
> 
> -Bryan
> 
>> On Wed, Oct 26, 2016 at 6:09 AM, Andre  wrote:
>> 
>> Hi there,
>> 
>> I've noticed an odd behavior when using embedded ZooKeeper on a NiFi cluster
>> with MapR-compatible processors:
>> 
>> I noticed that every time I enable embedded ZooKeeper, NiFi's HDFS
>> processors (e.g. PutHDFS) start complaining about Kerberos identities:
>> 
>> 2016-10-26 20:07:22,376 ERROR [StandardProcessScheduler Thread-2] o.apache.nifi.processors.hadoop.PutHDFS
>> java.io.IOException: Login failure for principal@REALM-NAME-GOES-HERE from keytab /path/to/keytab_file/nifi.keytab
>>   at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1084) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>   at org.apache.nifi.hadoop.SecurityUtil.loginKerberos(SecurityUtil.java:52) ~[nifi-hadoop-utils-1.0.0.jar:1.0.0]
>>   at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.resetHDFSResources(AbstractHadoopProcessor.java:285) ~[nifi-hdfs-processors-1.0.0.jar:1.0.0]
>>   at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.abstractOnScheduled(AbstractHadoopProcessor.java:213) ~[nifi-hdfs-processors-1.0.0.jar:1.0.0]
>>   at org.apache.nifi.processors.hadoop.PutHDFS.onScheduled(PutHDFS.java:181) [nifi-hdfs-processors-1.0.0.jar:1.0.0]
>> 
>> So far so good, these errors are quite familiar to people using NiFi
>> against secure MapR clusters and