Load-balancing web api in cluster

2016-12-19 Thread Hart, Greg
Hi all, What's the recommended way for communicating with the NiFi REST API in a cluster? I see that NiFi uses ZooKeeper so is it possible to get the Cluster Coordinator hostname and API port from ZooKeeper, or should I use something like haproxy? Thanks! -Greg

Re: DetectDuplicate

2016-12-19 Thread Andrew Grande
Juan, no change from how you remember this processor yet. I personally would love to have a more pluggable backend for it, too. Andrew On Mon, Dec 19, 2016, 2:35 PM Juan Sequeiros wrote: > Hello, > > I am wondering if DetectDuplicate still has single dependency on > Distributed Cache Service? >

DetectDuplicate

2016-12-19 Thread Juan Sequeiros
Hello, I am wondering if DetectDuplicate still has a single dependency on the Distributed Cache Service? And if so, can I assume that DetectDuplicate will fail if the Distributed Cache server is down? I want to replace our DetectDuplicate solution "external DB" and use NiFi's but single point reliance on C

Re: merge flowfiles

2016-12-19 Thread Raf Huys
Yeah, I have indeed no clue as to when all flowfiles are landed. Somehow I need to figure out when that attribute changed, and act upon that event. Currently looking at the FlowfileAggregationProcessor. On Mon, Dec 19, 2016 at 6:29 PM, Lee Laim wrote: > Raf, > > You might be able to use PutFile

Re: merge flowfiles

2016-12-19 Thread Lee Laim
Raf, You might be able to use PutFile and 'merge' your flowfiles in a temporary batch directory. Once you are confident that all the flow files have landed, you can pull the contents of the directory. In other words, when a new directory shows up, pull the contents of the older directory back in

Re: merge flowfiles

2016-12-19 Thread Jeff
Hello Raf, MergeContent can merge based on a correlation ID (attribute). However, the merging currently operates in two modes: Defragment or Bin-Packing Algorithm. Defragment is completed by defragmenting based on the correlation ID and a known number of fragments. Bin-Packing Algorithm is comp
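
To make the Defragment idea above concrete, here is a minimal sketch in plain Python. It only models the behaviour described (group by correlation ID, release a group once a known number of fragments has arrived) and is not NiFi's implementation. The attribute names follow the `fragment.*` convention; the `content` key and the `defragment` function are hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def defragment(fragments):
    """Yield (identifier, ordered_contents) once every fragment of a group has arrived.

    Each fragment is modelled as a dict with the keys:
      fragment.identifier  - the correlation ID
      fragment.index       - position of this fragment within its group
      fragment.count       - total number of fragments expected for the group
      content              - hypothetical stand-in for the flowfile payload
    """
    pending = defaultdict(dict)  # identifier -> {index: content}
    for frag in fragments:
        ident = frag["fragment.identifier"]
        group = pending[ident]
        group[int(frag["fragment.index"])] = frag["content"]
        # The group is complete only when the known fragment count is reached.
        if len(group) == int(frag["fragment.count"]):
            yield ident, [group[i] for i in sorted(group)]
            del pending[ident]


if __name__ == "__main__":
    incoming = [
        {"fragment.identifier": "x", "fragment.index": "1", "fragment.count": "2", "content": "B"},
        {"fragment.identifier": "x", "fragment.index": "0", "fragment.count": "2", "content": "A"},
    ]
    for ident, parts in defragment(incoming):
        print(ident, parts)
```

The sketch relies on the fragment count being known up front, which is the constraint that matters when the size of a batch cannot be predicted.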

merge flowfiles

2016-12-19 Thread Raf Huys
I want to batch incoming flowfiles based on an attribute. As soon as this attribute's value changes, the current batch should be transferred downstream and be reset. So basically I'm looking for a tumbling window. Can this be done with the MergeContent processor (which strategy?) or should I write
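
The batching behaviour asked for here can be sketched outside NiFi as follows. This is only an illustration of the tumbling-window logic, assuming flowfiles arrive in order and are modelled as plain dicts of attributes; `tumbling_batches` and the attribute name `k` are hypothetical and not part of any NiFi API.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in: a flowfile represented only by its attributes.
FlowFile = Dict[str, str]


def tumbling_batches(flowfiles: Iterable[FlowFile], attribute: str) -> Iterator[List[FlowFile]]:
    """Group consecutive flowfiles that share the same value for `attribute`.

    A batch is emitted as soon as a flowfile with a different value arrives;
    the new flowfile then starts the next batch.
    """
    batch: List[FlowFile] = []
    current = None
    for flowfile in flowfiles:
        value = flowfile.get(attribute)
        if batch and value != current:
            yield batch
            batch = []
        current = value
        batch.append(flowfile)
    # The final batch is only released when the input ends.
    if batch:
        yield batch


if __name__ == "__main__":
    incoming = [
        {"k": "a", "id": "1"},
        {"k": "a", "id": "2"},
        {"k": "b", "id": "3"},
    ]
    for group in tumbling_batches(incoming, "k"):
        print([f["id"] for f in group])
```

Note that the last batch is held until the input is exhausted; in a continuously running flow some timeout or end-of-stream signal would be needed to release it.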