Joe W,

The use case that originally got me thinking about this was a processor
highlighter that looks for processors above or below configured thresholds,
possibly by instance or type.  I think that requires the ability to:

   1. enumerate the processor and/or queue collections,
   2. query each processor or queue for stats like those shown in the UI,
   3. highlight the processor somehow, by changing color for instance, and
   4. (possibly) affect the processor by stopping / starting it.
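
As a rough sketch of items 1 and 2, the threshold scan could look something
like the following. ProcStat, GroupStat, and outOfRange are hypothetical
stand-ins for illustration only, not NiFi classes; in a real extension the
numbers would have to come from whatever status objects the framework decides
to expose.

```java
import java.util.ArrayList;
import java.util.List;

public class ThresholdHighlighter {

    // Hypothetical snapshot of one processor's stats (not a NiFi class).
    static final class ProcStat {
        final String id;
        final String type;
        final long flowFilesIn;

        ProcStat(String id, String type, long flowFilesIn) {
            this.id = id;
            this.type = type;
            this.flowFilesIn = flowFilesIn;
        }
    }

    // Hypothetical snapshot of a process group: its processors plus child groups.
    static final class GroupStat {
        final List<ProcStat> processors = new ArrayList<>();
        final List<GroupStat> children = new ArrayList<>();
    }

    // Walk the group tree and return the ids of processors whose input
    // count falls below `low` or above `high`.
    static List<String> outOfRange(GroupStat group, long low, long high) {
        List<String> flagged = new ArrayList<>();
        for (ProcStat p : group.processors) {
            if (p.flowFilesIn < low || p.flowFilesIn > high) {
                flagged.add(p.id);
            }
        }
        for (GroupStat child : group.children) {
            flagged.addAll(outOfRange(child, low, high));
        }
        return flagged;
    }

    public static void main(String[] args) {
        GroupStat root = new GroupStat();
        root.processors.add(new ProcStat("p1", "GetFile", 5));
        root.processors.add(new ProcStat("p2", "PutFile", 5000));

        GroupStat child = new GroupStat();
        child.processors.add(new ProcStat("p3", "UpdateAttribute", 100));
        root.children.add(child);

        // Flag anything outside 10..1000 FlowFiles in.
        System.out.println(outOfRange(root, 10, 1000)); // prints [p1, p2]
    }
}
```

The scan only reads a snapshot, so it covers enumeration and querying;
highlighting and stopping / starting (items 3 and 4) would still need
something from the framework.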

I realize this exposes system internals and threading concerns, but the REST
API already provides the information externally.  Any controller service
using this information must be designed to not have a negative impact on
the system, but that's already true of any custom processor or controller
service since they can overload or lock up the framework if they behave
badly.

Overall, I think the visibility into the data flows, hot/cold spots, larger
than expected ingests, etc. provides value that would far outweigh concerns
about any new risk this capability would create.

Thanks,
Joe S

On Fri, Apr 22, 2016 at 2:25 PM, Joe Witt <joe.w...@gmail.com> wrote:

> Yeah understood.  So let's dig into this more.  We need to avoid over
> exposure of internal state which one might want to crawl through
> because that introduces some multi-threaded challenges and could limit
> our ability to evolve internals.  However, if we understand the
> questions you'd like to be able to ask of certain things better
> perhaps we can better expose those results.
>
> Can you try stating what you're looking for in a bit more specific
> examples.  For instance you said "want to iterate over the processor
> collections...to look for performance thresholds" - What sorts of
> performance threshold questions?
>
> On Fri, Apr 22, 2016 at 2:20 PM, Joe Skora <jsk...@gmail.com> wrote:
> > Joe Witt - Not really, this kind of went sideways from where I was
> > originally headed.
> >
> > I'm looking for a way for a controller service to iterate over the
> > processor and queue collections (maybe others as well) to look for
> > performance thresholds or other issues and then provide feedback somehow or
> > report externally.
> >
> > If it can be done through the REST API, seems like it should be possible
> > from within the framework as well.
> >
> > On Fri, Apr 22, 2016 at 1:32 PM, Joe Witt <joe.w...@gmail.com> wrote:
> >
> >> Joe Skora - does Jeremy's JIRA cover your use case needs?
> >>
> >> On Fri, Apr 22, 2016 at 12:44 PM, Jeremy Dyer <jdy...@gmail.com> wrote:
> >> > Mark,
> >> >
> >> > ok that makes sense. I have created a jira for this improvement
> >> > https://issues.apache.org/jira/browse/NIFI-1805
> >> >
> >> > On Fri, Apr 22, 2016 at 12:27 PM, Mark Payne <marka...@hotmail.com> wrote:
> >> >
> >> >> Jeremy,
> >> >>
> >> >> It should be relatively easy. In FlowController, we would have to update
> >> >> getGroupStatus() to set the values on ConnectionStatus
> >> >> and of course update ConnectionStatus to have getters & setters for the
> >> >> new values. That should be about it, I think.
> >> >>
> >> >> -Mark
> >> >>
> >> >>
> >> >> > On Apr 22, 2016, at 12:17 PM, Jeremy Dyer <jdy...@gmail.com> wrote:
> >> >> >
> >> >> > Mark,
> >> >> >
> >> >> > What would the process look like for doing that? Would that be something
> >> >> > trivial or require some reworking?
> >> >> >
> >> >> > On Fri, Apr 22, 2016 at 10:26 AM, Mark Payne <marka...@hotmail.com> wrote:
> >> >> >
> >> >> >> I definitely don't think we should be exposing the FlowController to a
> >> >> >> Reporting Task.
> >> >> >> However, I think exposing information about whether or not backpressure is
> >> >> >> being applied (or even is configured) is a very reasonable idea.
> >> >> >>
> >> >> >> -Mark
> >> >> >>
> >> >> >>
> >> >> >>> On Apr 22, 2016, at 10:22 AM, Jeremy Dyer <jdy...@gmail.com> wrote:
> >> >> >>>
> >> >> >>> I could see the argument for not making that available. What about some
> >> >> >>> sort of reference that would allow the ReportingTask to determine if
> >> >> >>> backpressure is being applied to a Connection? It currently seems you can
> >> >> >>> see the number of bytes and/or object count queued in a connection but
> >> >> >>> don't have any reference to the values a user has set up for backpressure in
> >> >> >>> the UI. Is there a way to get those values in the scope of the
> >> >> >>> ReportingTask?
> >> >> >>>
> >> >> >>> On Fri, Apr 22, 2016 at 10:03 AM, Bryan Bende <bbe...@gmail.com> wrote:
> >> >> >>>
> >> >> >>>> I think the only way you could do it directly without the REST API is by
> >> >> >>>> having access to the FlowController,
> >> >> >>>> but that is purposely not exposed to extension points... actually
> >> >> >>>> StandardFlowController is what implements the
> >> >> >>>> EventAccess interface which ends up providing the pathway to the status
> >> >> >>>> objects.
> >> >> >>>>
> >> >> >>>> I would have to defer to Joe, Mark, and others about whether we would want
> >> >> >>>> to expose direct access to components
> >> >> >>>> through controller services, or some other extension point.
> >> >> >>>>
> >> >> >>>> On Fri, Apr 22, 2016 at 9:46 AM, Jeremy Dyer <jdy...@gmail.com> wrote:
> >> >> >>>>
> >> >> >>>>> Bryan,
> >> >> >>>>>
> >> >> >>>>> The ReportingTask enumeration makes sense and was helpful for something
> >> >> >>>>> else I am working on as well.
> >> >> >>>>>
> >> >> >>>>> Like Joe, however, I'm looking for a way to not just get the *Status objects
> >> >> >>>>> but rather start and stop processors. Is there a way to do that from the
> >> >> >>>>> ReportingContext scope? I imagine you could pull the Processor "Id" from the
> >> >> >>>>> ProcessorStatus and then use the REST API, but I was looking for something
> >> >> >>>>> more direct than having to use the REST API.
> >> >> >>>>>
> >> >> >>>>>
> >> >> >>>>> On Fri, Apr 22, 2016 at 9:23 AM, Bryan Bende <bbe...@gmail.com> wrote:
> >> >> >>>>>
> >> >> >>>>>> Hi Joe,
> >> >> >>>>>>
> >> >> >>>>>> I'm not sure if a controller service can do this, but a ReportingTask has
> >> >> >>>>>> access to similar information.
> >> >> >>>>>>
> >> >> >>>>>> A ReportingTask gets access to a ReportingContext, which can
> >> access
> >> >> >>>>>> EventAccess which can access ProcessGroupStatus.
> >> >> >>>>>>
> >> >> >>>>>> From ProcessGroupStatus you are at the root process group and can enumerate
> >> >> >>>>>> the flow:
> >> >> >>>>>>
> >> >> >>>>>> private Collection<ConnectionStatus> connectionStatus = new ArrayList<>();
> >> >> >>>>>> private Collection<ProcessorStatus> processorStatus = new ArrayList<>();
> >> >> >>>>>> private Collection<ProcessGroupStatus> processGroupStatus = new ArrayList<>();
> >> >> >>>>>> private Collection<RemoteProcessGroupStatus> remoteProcessGroupStatus = new ArrayList<>();
> >> >> >>>>>> private Collection<PortStatus> inputPortStatus = new ArrayList<>();
> >> >> >>>>>> private Collection<PortStatus> outputPortStatus = new ArrayList<>();
> >> >> >>>>>>
> >> >> >>>>>> Not sure if that is what you were looking for.
> >> >> >>>>>>
> >> >> >>>>>> -Bryan
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>> On Fri, Apr 22, 2016 at 8:25 AM, Joe Skora <jsk...@gmail.com> wrote:
> >> >> >>>>>>
> >> >> >>>>>>> Is it possible, and if so what is the best way, for a controller service to
> >> >> >>>>>>> get the collection of all processors or queues?
> >> >> >>>>>>>
> >> >> >>>>>>> The goal being to iterate over the collection of processors or queues to
> >> >> >>>>>>> gather information or make adjustments to the flow.
> >> >> >>>>>>>
> >> >> >>>>>>
> >> >> >>>>>
> >> >> >>>>
> >> >> >>
> >> >> >>
> >> >>
> >> >>
> >>
>
