Jim,

I would offer you a few bits of advice. First, NiFi relies on ZooKeeper to
coordinate which node acts as the Cluster Coordinator and which node is the
Primary Node. NiFi does allow you to start an embedded ZooKeeper, but for
production use, it is recommended that you use an external ZooKeeper
running on different boxes.
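
As a rough sketch, the relevant entries in nifi.properties on each node
might look like the following. The hostnames are made up for illustration,
so substitute your own ZooKeeper servers:

```properties
# Do not start the embedded ZooKeeper on this NiFi node
nifi.state.management.embedded.zookeeper.start=false

# External ZooKeeper ensemble (hypothetical hostnames, default client port)
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
```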

Secondly, the default timeouts used for clustering (the
"nifi.cluster.node.connection.timeout" and
"nifi.cluster.node.read.timeout" properties in nifi.properties) are set to
5 seconds. This is okay for a "getting started" type of cluster. However, if
you start processing large numbers of FlowFiles, the JVM's garbage collection
can sometimes cause some fairly lengthy pauses, and a pause longer than the
timeout could cause requests between nodes to time out. So I would recommend
increasing both values to 10-15 seconds or more.
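
For example, something along these lines in nifi.properties. The 15 second
value is only an illustration from the range above, not a tested
recommendation for your workload:

```properties
# Cluster node timeouts (default is 5 sec); raised to tolerate long GC pauses
nifi.cluster.node.connection.timeout=15 sec
nifi.cluster.node.read.timeout=15 sec
```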

Finally, the "nifi.zookeeper.connect.timeout" and
"nifi.zookeeper.session.timeout" properties default to 3 seconds, as this is
what the ZooKeeper default is. However, I've heard some people indicate that
they saw frequent ZooKeeper timeouts, which caused the Primary Node and
Cluster Coordinator to change frequently. Changing both values to 5 or 10
seconds worked much better for them.
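
Again as a sketch only, with 10 seconds picked from the range above as an
example value:

```properties
# ZooKeeper client timeouts (default is 3 secs)
nifi.zookeeper.connect.timeout=10 secs
nifi.zookeeper.session.timeout=10 secs
```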

Thanks
-Mark

> On Apr 20, 2017, at 7:01 AM, James McMahon <jsmcmah...@gmail.com> wrote:
> 
> Good morning. I have established an initial single-threaded NiFi server 
> instance for my customers. It works well, but I anticipate increasing usage 
> as groups learn more about it. I also want to move beyond our single 
> threaded-ness.
> 
> I would like to take the next step in the evolution of our NiFi capability, 
> implementing a clustered NiFi server configuration to help me address the 
> following requirements:
> 1. increase our fault tolerance
> 2. permit our configuration to scale to peak processing demands during bulk 
> data loads and as more customers begin to leverage our NiFi instance
> 3. permit our configuration to load balance
> 
> I do intend to begin by reading through the clustering sections in the NiFi 
> Sys Admin guide. I am also interested in hearing from our user community, 
> particularly regarding clustering "best practices" and practical insights 
> based on your experiences. Thanks in advance for any insights you are willing 
> to share.  -Jim
