Re: Clustering Questions

Pierre Villard Wed, 18 Apr 2018 03:13:45 -0700

Hi Jon,

Just a note on your unrelated question:
I opened NIFI-4026 [1] a few months ago but haven't had time to work on it
so far.
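
For illustration only, the "group-by type distribution" asked about in the
quoted message could be sketched as below. This is a generic hash-partitioning
sketch, not how the RPG or Site-to-Site distributes data today. The names
pick_node and distribute, the node labels, and the "customer" key are all
hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Sequence


def pick_node(key: str, nodes: Sequence[str]) -> str:
    """Map a grouping key to one node, stable across runs and processes."""
    if not nodes:
        raise ValueError("at least one node is required")
    # Use a real hash rather than Python's built-in hash(), which is
    # randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def distribute(
    records: Iterable[Any],
    nodes: Sequence[str],
    key_of: Callable[[Any], str],
) -> Dict[str, List[Any]]:
    """Group records so that every record with the same key lands on the same node."""
    assignment: Dict[str, List[Any]] = defaultdict(list)
    for record in records:
        assignment[pick_node(key_of(record), nodes)].append(record)
    return dict(assignment)


if __name__ == "__main__":
    # Hypothetical node names and records, purely for demonstration.
    nodes = ["node-1", "node-2", "node-3"]
    records = [
        {"customer": "a", "n": 1},
        {"customer": "b", "n": 2},
        {"customer": "a", "n": 3},
    ]
    for node, assigned in distribute(records, nodes, lambda r: r["customer"]).items():
        print(node, assigned)
```

Note that a plain modulo over the node count reshuffles most keys whenever a
node joins or leaves; a consistent-hashing scheme would be needed to limit that.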


[1] https://issues.apache.org/jira/browse/NIFI-4026



2018-04-17 20:34 GMT+02:00 Jon Logan <jmlo...@buffalo.edu>:

> Thanks Joe, just a few follow-up questions:
>
> re: durability -- is this something that people have just been accepting as
> a risk and hoping for the best? Or is this something people build their
> applications around -- i.e. handling durability outside of the NiFi system
> boundary by pushing the data into a database, etc.?
>
> re: heterogeneous -- you can join nodes of differing hardware specs, but it
> seems like you will end up causing your lighter-weight nodes to explode, as
> there's no way to configure how many tasks, and how much data is
> "in-flight", on one node differently from the other nodes? i.e. if I know
> my large nodes can handle 3 of a CPU-intensive task, that's going to cause
> issues for smaller nodes. This is an even bigger problem for differing
> memory sizes.
>
> And an unrelated question to the previous -- is there a way to skew or
> influence how an RPG distributes its tasks? Say, you wanted to do a group-by
> type distribution?
>
>
> Thanks again!
> Jon
>
>
> On Fri, Apr 13, 2018 at 2:17 PM, Joe Witt <joe.w...@gmail.com> wrote:
>
>> Jon,
>>
>> Node Failure:
>>  You have to care about two things generally speaking.  First is the
>> flow execution and second is data in-flight.
>>  For flow execution, NiFi clustering will take care of re-assigning the
>> primary node and cluster coordinator as needed.
>>  For data, we do not at present offer distributed data durability.  The
>> current model is predicated on using reliable storage such as RAID,
>> EBS, etc.
>>   There is a very clear and awesome looking K8S based path though that
>> will make this work really nicely with persistent volumes and elastic
>> scaling.  No clear timeline, but I hope to start or participate in
>> discussions/JIRAs/contributions soon.
>>
>> How scalable is the NiFi scaling model:
>>   Usually NiFi clusters are a few nodes to maybe 10-20 or so.  Some
>> have been larger, but generally if you need that much flow
>> management then it often makes more sense to have clusters dedicated
>> along various domains of expertise anyway.  So say 3-10 nodes with
>> each handling 100,000 events per second at around, say, 100 MB per second
>> (conservatively) and you can see why a single fairly small cluster can
>> handle pretty massive volumes.
>>
>> RPGs feeding back:
>> - This caused issues previously, but I believe it has improved
>> significantly in recent releases.
>>
>> UI Actions Causing issues:
>> There have been reports similar to this, especially for some of the
>> really massive flows we've seen in terms of number of components and
>> concurrent users.  These JIRAs, once resolved, will help a lot [1], [2],
>> [3].
>>
>> Heterogeneous cluster nodes:
>> - This should work quite well actually and is a major reason why NiFi
>> and the S2S protocol support/honor backpressure.  Nodes that can
>> take on more work take on more work, and nodes that cannot push back.
>> You also want to ensure you're using good and scalable protocols to
>> source data into the cluster.  If you find you're using a lot of
>> protocols requiring you to make many data sourcing steps run 'primary
>> node only' then that will require that primary node to do more work
>> than others, and I have seen uneven behavior in such cases.  Yes, you
>> can then route using S2S/RPG, which we recommend, but still...try to
>> design away from 'primary node only' when possible.
>>
>>
>> Thanks
>> Joe
>>
>>
>> [1] https://issues.apache.org/jira/browse/NIFI-950
>> [2] https://issues.apache.org/jira/browse/NIFI-5064
>> [3] https://issues.apache.org/jira/browse/NIFI-5066
>>
>> On Fri, Apr 13, 2018 at 5:49 PM, Jon Logan <jmlo...@buffalo.edu> wrote:
>> > All, I had a few general questions regarding Clustering, and was looking
>> > for any sort of advice or best-practices information --
>> >
>> > - documentation discusses failure handling primarily from a NiFi crash
>> > scenario, but I don't recall seeing any information on entire
>> > node-failure scenarios. Is there a way that this is supposed to be
>> > handled?
>> > - at what point should we expect pain in scaling? I am particularly
>> > concerned about the all-to-all relationship that seems to exist if you
>> > connect a cluster RPG to itself, as all nodes need to distribute all
>> > data to all other nodes. We have also been having some issues when
>> > things are not as responsive as NiFi would like -- namely, the UI seems
>> > to get very upset and crash.
>> > - do UI actions (incl. read-only) require delegation to all nodes
>> > underneath? I suspect this is the case, as otherwise you wouldn't be
>> > able to determine queue sizes?
>> > - is there a way to have a cluster with heterogeneous node sizes?
>> >
>> >
>> > Thanks in advance!
>>
>
>
