Jonathan, NAR stands for "NiFi Archive"; it is simply our approach to classloader isolation. In Maven you can build these NARs, and they become bundles of processors, services, or whatever NiFi extension you want, which you can then include in a build à la carte. The framework itself is even a NAR. We did all this first to reduce the amount of stuff (and thus possible conflicts) on the system classloader, and second so that different bundles of capabilities could have different, sometimes conflicting, dependencies without any problem.
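To make the idea concrete, here is a minimal sketch of per-bundle classloader isolation in plain Java. This is not NiFi's actual NarClassLoader, just the general mechanism it builds on: give each bundle its own URLClassLoader so two bundles can ship conflicting versions of the same dependency without clashing.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative sketch only (not NiFi's real NarClassLoader): each bundle
// gets its own loader, so its jars are invisible to every other bundle,
// while shared framework classes still resolve through the common parent.
public class BundleIsolationSketch {
    static ClassLoader loaderFor(URL[] bundleJars, ClassLoader parent) {
        // Parent-first delegation for shared classes; bundle-local jars
        // stay off the system classpath entirely.
        return new URLClassLoader(bundleJars, parent);
    }

    public static void main(String[] args) throws Exception {
        ClassLoader parent = BundleIsolationSketch.class.getClassLoader();
        ClassLoader a = loaderFor(new URL[0], parent);
        ClassLoader b = loaderFor(new URL[0], parent);
        // Distinct loaders, but shared parent classes are the same Class object.
        System.out.println(a != b
                && a.loadClass("java.lang.String") == b.loadClass("java.lang.String"));
        // prints "true"
    }
}
```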
I am with you on the distributed side being where things get more interesting. Processors execute on nodes, and all nodes execute all processors; they only fire if they have data, though. So you either interface with protocols which inherently offer load balancing, or you ensure that NiFi is pulling the data, in which case it will by its nature do load balancing. We then include things like back-pressure and congestion-avoidance functions, which means that even if load balancing isn't working we can still have nodes 'back off' so that other nodes naturally pick up the slack. This helps to address the natural hot-spotting that tends to occur in data flow and data processing.

Fault tolerance: if a node dies, the data on the node at that time is 'as dead as the node'. Any new data will be routed around that node in the case of pushes from producers to NiFi, or other nodes will take over the load of pulling in the case of pull by NiFi from those producers. The cluster of NiFi nodes is managed by a thing called the NiFi Cluster Manager (NCM). As long as it is up you can command and control all the nodes in the cluster. If the NCM is dead, the nodes all keep rolling based on what they knew last.

You then also have to be concerned about network partitioning. Each node is always heartbeating to the NCM. If the NCM doesn't get a heartbeat within a reasonable period of time, it will mark that node as disconnected. When any node in the cluster is disconnected, data flow changes cannot be made. This prevents the flow from getting into a bad state as a result of the partitioning event. If you know that a node is not just isolated but is actually 'dead', 'gone for a while', or whatever, then you can delete it from the cluster, and then you'll be able to make changes to the flow again.

As new nodes join the cluster, the first thing that happens is the typical 'am I authorized to be here' check.
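The heartbeat/disconnect rule above can be sketched in a few lines. The class and method names here are hypothetical, not NiFi's real API; it just shows the policy: a node that misses its heartbeat window is marked disconnected, flow changes are blocked while any known node is disconnected, and deleting the dead node re-enables changes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the NCM's heartbeat bookkeeping described above.
public class HeartbeatSketch {
    // Assumed timeout: e.g. eight missed 5-second heartbeats.
    static final long TIMEOUT_MS = 8 * 5_000L;

    final Map<String, Long> lastHeartbeat = new HashMap<>();

    void onHeartbeat(String nodeId, long nowMs) {
        lastHeartbeat.put(nodeId, nowMs);
    }

    boolean isDisconnected(String nodeId, long nowMs) {
        Long last = lastHeartbeat.get(nodeId);
        return last == null || nowMs - last > TIMEOUT_MS;
    }

    // Flow modifications are only allowed while every known node is connected.
    boolean flowChangesAllowed(long nowMs) {
        return lastHeartbeat.keySet().stream()
                .noneMatch(id -> isDisconnected(id, nowMs));
    }

    // Deleting a node the operator knows is dead removes it from
    // consideration, so changes to the flow become possible again.
    void deleteNode(String nodeId) {
        lastHeartbeat.remove(nodeId);
    }
}
```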
Once through that, the NCM sends a copy of the latest flow configuration to the node, at which point the node loads the flow and begins its journey. I'm glossing over a lot of the fun details of course, but we will obviously dig further here, and if there are any tracks you want to go deeper on sooner then please advise.

Thanks
Joe

On Tue, Dec 16, 2014 at 11:04 PM, Jonathan Natkins <[email protected]> wrote:
> Hey Joe,
>
> This is really helpful. In terms of examples of good architectural
> descriptions, I think the Kafka overview is pretty great (I think a lot of
> it came from the original academic paper). It's very helpful for
> understanding the key concepts and design trade-offs. My personal feeling
> is that diagrams are very helpful: my guess is that the single-node
> processing layer is not all that complex, but where architecture gets
> interesting (and where a lot of my curiosity lies) is once you get into the
> distributed modes. How is fault-tolerance handled, how do I specify which
> processors operate on which nodes, how is cluster membership handled, etc.
>
> Also: what does NAR stand for?
>
> Thanks!
> Natty
>
> Jonathan "Natty" Natkins
> StreamSets | Customer Engagement Engineer
> mobile: 609.577.1600 | linkedin <http://www.linkedin.com/in/nattyice>
>
>
> On Tue, Dec 16, 2014 at 6:08 PM, Joe Witt <[email protected]> wrote:
> >
> > Natty
> >
> > There are very little existing resources as of yet but fully recognize
> > that this is a problem.
> >
> > https://issues.apache.org/jira/browse/NIFI-162
> >
> > If there are specific examples of architectural descriptions that you
> > think are well done I'd love to see them.
> >
> > The very brief version of how execution and scale work:
> >
> > Execution:
> > NiFi runs within the JVM. As data flows through a given NiFi instance
> > there are two primary repositories that we keep which hold key
> > information about the data.
> > One repository is known as the Flowfile repository and its
> > job is to keep information about the data in the flow. The other
> > repository is the content repository and it keeps the actual data. In nifi
> > you're composing directed graphs of processors. Each processor is
> > scheduled to run according to its configured scheduling style and is given
> > time to run by a flow controller/thread-pool. When a given process runs it
> > is given access to the Flowfile Repository and content repository as
> > necessary to be able to access and modify the data in a safe and efficient
> > manner.
> >
> > Out of the box the flow file repo can be all in-memory or run off a
> > write-ahead log based implementation with high reliability and throughput.
> > For the content repo it too supports all in-memory or using one or more
> > disks in parallel yielding again very high throughput with excellent
> > durability.
> >
> > Scale:
> > Vertical: Supports highly concurrent processing and can utilize multiple
> > physical disks in parallel.
> > Horizontal: Supports clustering whereby a cluster manager relays commands
> > to nodes in the cluster and coordinates all their responses. Nodes then
> > operate as they would if they were standalone.
> >
> > Lots more coming here of course but if you have specific questions now
> > please feel free to fire away.
> >
> > Thanks
> > Joe
> >
> > On Tue, Dec 16, 2014 at 7:16 PM, Jonathan Natkins <[email protected]>
> > wrote:
> > >
> > > Hi there,
> > >
> > > I was curious if there exist any resources that would be helpful in
> > > understanding the NiFi architecture. I'm trying to understand how
> > > dataflows are executed, or how I would scale the system. Are there any
> > > architectural docs, or blog posts, or academic papers out there that
> > > would be helpful?
> > >
> > > Alternatively, some pointers into the code base as to where the
> > > execution layer code lives could be helpful.
> > >
> > > Thanks!
> > > Natty
> > >
> > > Jonathan "Natty" Natkins
> > > StreamSets | Customer Engagement Engineer
> > > mobile: 609.577.1600 | linkedin <http://www.linkedin.com/in/nattyice>
