Also of note, the distributed cache service is probably the closest thing
to a cluster-wide framework state management service.  It currently uses
our own persistence backend, but it's conceivable to adapt the distributed
cache to use a database, a JNDI resource, or a true cache engine like
Ehcache.

Adam


On Wed, Jan 14, 2015 at 7:12 AM, Joe Witt <[email protected]> wrote:

> Joe - thanks for bumping this.
>
> Bryan,
>
> "What are the best practices for implementing a processor that needs to
> maintain some kind of state?
>
> I'm thinking of a processor that executes on a timer and pulls data from
> somewhere, but needs to know where it left off for the next execution, and
> I was hoping to not involve an external data store here."
>
> The only managed state the framework provides is through Flow File objects
> and the passing of them between processors.  To keep a persistent record
> of a processor's state outside of that, you need to implement your own
> state persistence mechanism (to a file, to a database, etc.).
>
> One example of a processor that does this is the GetHttp processor.  It
> interacts with web services and in so doing needs to keep track of any
> cache/ETag information it receives, so it can decide whether to pull the
> same resource again depending on whether the server indicates it has
> changed.  The processor does this by saving a file at
> 'conf/.httpCache-<<processor uuid>>'.  Using the processor uuid in the
> name avoids conflicts with other processors of the same type and makes
> the file easy to find on startup.  If the file is there, the processor
> uses it to recover state; if not, it starts a new one.
>
> That said it is clearly desirable for the framework to offer some sort of
> managed state mechanism for such simple cases.  We've talked about this
> many times over the years but just never pulled the trigger because there
> was always some aspect of our design ideas we didn't like.  So for right
> now you'll need to implement state persistence like this outside the
> framework.  But I've also kicked off a Jira for doing something about this
> here: https://issues.apache.org/jira/browse/NIFI-259
>
> What you were seeing in GetKafka and GetJMS processors was management of
> state that involves interaction with their specific resources (Kafka,
> JMS).  In the case of JMS it was a connection pooling type mechanism and in
> the case of Kafka it was part of Kafka's stream iterator.  That is a
> different thing than this managed persistent state you're asking about.
>
> This is an important topic for us to communicate very well on.  Please feel
> free to keep firing away until we've answered it fully.
>
> Thanks
> Joe
>
> On Wed, Jan 14, 2015 at 5:06 AM, Joe Gresock <[email protected]> wrote:
>
> > I'm also interested in the answers to Bryan's questions, if anyone has
> > some input.
> >
> > Thanks,
> > Joe
> >
> > On Fri, Jan 9, 2015 at 3:50 PM, Bryan Bende <[email protected]> wrote:
> >
> > > What are the best practices for implementing a processor that needs to
> > > maintain some kind of state?
> > >
> > > I'm thinking of a processor that executes on a timer and pulls data
> > > from somewhere, but needs to know where it left off for the next
> > > execution, and I was hoping to not involve an external data store here.
> > >
> > > From looking at processors like GetJMS and GetKafka, I noticed the use
> > > of BlockingQueue<> where poll() is called at the beginning of
> > > onTrigger(), and then the object is put back in the queue in a finally
> > > block.
> > >
> > > As far as I could tell it looks like the intent was to only have one
> > > object in the queue, and use the queue as the mechanism for
> > > synchronizing access to the shared object, so that if another thread
> > > called onTrigger it would block on poll() until the previous execution
> > > put the object back in the queue.
> > >
> > > Is that the general approach?
> > >
> > > Thanks,
> > >
> > > Bryan
> > >
> >
> >
> >
> > --
> > I know what it is to be in need, and I know what it is to have plenty.  I
> > have learned the secret of being content in any and every situation,
> > whether well fed or hungry, whether living in plenty or in want.  I can
> > do all this through him who gives me strength.    *-Philippians 4:12-13*
> >
>
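
For readers of this thread, the file-based approach described for GetHttp (a state file named after the processor uuid, recovered on startup if present) might look roughly like the sketch below. This is not the GetHttp source: the class ProcessorStateFile, its methods, and the use of java.util.Properties as the on-disk format are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

/** Hypothetical helper for per-processor state; not part of the NiFi API. */
final class ProcessorStateFile {
    private final Path file;

    /**
     * @param dir         directory that holds the state file (must already exist)
     * @param prefix      file name prefix, for example ".httpCache-"
     * @param processorId unique id of the processor instance
     */
    ProcessorStateFile(Path dir, String prefix, String processorId) {
        // Putting the processor id in the name keeps instances of the
        // same processor type from overwriting each other's state.
        this.file = dir.resolve(prefix + processorId);
    }

    /** Recovers saved state if the file exists, otherwise starts empty. */
    Properties load() throws IOException {
        Properties state = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                state.load(in);
            }
        }
        return state;
    }

    /**
     * Writes to a temporary sibling file first, then moves it over the
     * old file, so a failed write leaves the previous state in place.
     */
    void save(Properties state) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            state.store(out, "processor state");
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A processor built this way would call load() once when it is scheduled and save() whenever its position (an ETag, an offset, a timestamp) changes. The state stays local to one node, which is the limitation the distributed cache suggestion at the top of the thread is aimed at.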

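The single-object BlockingQueue pattern Bryan describes (take the shared object at the start of onTrigger(), return it in a finally block) can be sketched as follows. The names SingleResourceWorker and Resource are hypothetical and this is not the GetJMS or GetKafka code. One detail worth knowing: the no-argument BlockingQueue.poll() returns null immediately when the queue is empty; only take() and poll(timeout, unit) wait, so the sketch uses the timed form and treats null as "busy, try again later".

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a processor; names are illustrative, not NiFi API. */
final class SingleResourceWorker {

    /** Hypothetical shared object, e.g. a connection or stream wrapper. */
    static final class Resource {
        int uses;
    }

    // Capacity 1: the queue holds the resource while nobody is using it.
    private final BlockingQueue<Resource> holder = new LinkedBlockingQueue<>(1);

    SingleResourceWorker() {
        holder.offer(new Resource());
    }

    /** Returns true if work was done, false if the resource stayed busy. */
    boolean onTrigger() throws InterruptedException {
        // Wait briefly for the resource; null means another thread holds it.
        Resource r = holder.poll(100, TimeUnit.MILLISECONDS);
        if (r == null) {
            return false;
        }
        try {
            r.uses++; // placeholder for the real work done with the resource
            return true;
        } finally {
            // Always hand the resource back, even if the work threw.
            holder.offer(r);
        }
    }

    /** Reads the counter while holding the resource, then returns it. */
    int uses() throws InterruptedException {
        Resource r = holder.take();
        try {
            return r.uses;
        } finally {
            holder.offer(r);
        }
    }
}
```

As Joe notes above, this guards access to a live resource between threads; it does not by itself persist anything across restarts.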