I have been running a standalone instance of NiFi and am preparing to move to a cluster configuration. One aspect I am curious about is how ControlRate is going to operate with n nodes. I am using ControlRate to satisfy rate-limit requirements for external services.
My flow looks something like:

... > ControlRate count 5000/sec > ControlRate data 5MB/sec > PutKinesisFirehose batch 500, buffer 4MB

I am trying to figure out how to throttle when I add a second node, which will be running the same flow.

ControlRate might already run on the primary node only. I noticed in the code that it has the @TriggerSerially annotation, which it shares with ListS3 and ListSFTP, both isolated processors that only run on the primary node. I don't know exactly what defines a processor as isolated, though. If ControlRate is not isolated, one option would be to make it (optionally) so. The description doesn't explicitly say whether it is or not, and I couldn't find anything related to isolated processors in the developer guide. Only the admin guide seems to use that terminology. Does anyone have some insight on this?

Alternatively, I could divide the count and data rates on each ControlRate to rate-limit/node-count. With batching, though, I think the nodes might still be able to exceed the rate limit of a given stream unless I also divided the batch sizes. This option seems not great because I don't want to have to update properties when adding or removing nodes.

https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#clustering
http://docs.aws.amazon.com/firehose/latest/dev/limits.html

Thanks,
Nick
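
P.S. To make the divide-by-node-count option concrete, here is a minimal sketch of the arithmetic in Python. The function name `per_node_limits` and its parameters are hypothetical and not part of NiFi; the defaults are just the numbers from my flow, and I am assuming "5MB" means 5 * 1024 * 1024 bytes. It rounds down so that the per-node values never add up to more than the cluster-wide limit.

```python
def per_node_limits(node_count,
                    count_per_sec=5000,
                    bytes_per_sec=5 * 1024 * 1024,  # assumption: 1 MB = 1024 * 1024 bytes
                    batch_size=500):
    """Split cluster-wide limits evenly across nodes (hypothetical helper).

    Integer division rounds down, so node_count * per-node value
    never exceeds the original cluster-wide limit.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "count_per_sec": count_per_sec // node_count,
        "bytes_per_sec": bytes_per_sec // node_count,
        # Keep at least one record per batch, even with many nodes.
        "batch_size": max(1, batch_size // node_count),
    }


if __name__ == "__main__":
    for nodes in (1, 2, 3):
        print(nodes, per_node_limits(nodes))
```

For two nodes this gives 2500/sec, 2621440 bytes/sec and a batch size of 250 per node, all of which would have to be re-entered by hand whenever the node count changes.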