Joe,

Mike is right that it was intended to be a more efficient scheduling strategy. With Timer-Driven, the processors used to check constantly whether they had work to do and, if not, would switch contexts and check again. And again. This was pretty expensive, so we added the Event-Driven strategy.
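As a rough illustration of the polling cost described above, and of the pause-when-idle idea behind the "nifi.bored.yield.duration" property mentioned further down, here is a minimal simulation. It is only a sketch under simplifying assumptions (one processor, one queue, a fake millisecond clock, one FlowFile handled per poll). The function name and parameters are hypothetical and this is not NiFi's actual scheduler code.

```python
from collections import deque


def run_timer_driven(arrivals, horizon_ms, bored_yield_ms):
    """Simulate one processor being polled on a fake millisecond clock.

    arrivals       -- dict: tick (ms) -> number of FlowFiles queued at that tick
    horizon_ms     -- how many ticks to simulate
    bored_yield_ms -- hypothetical pause after a poll that finds no work
                      (0 means the processor is re-checked on every tick)
    Returns (polls, processed).
    """
    queue = deque()
    polls = 0
    processed = 0
    next_poll = 0  # earliest tick at which the processor may be polled again

    for now in range(horizon_ms):
        # New FlowFiles land in the queue regardless of scheduling.
        queue.extend(["flowfile"] * arrivals.get(now, 0))

        if now < next_poll:
            continue  # processor is still yielding; do not schedule it

        polls += 1
        if queue:
            queue.popleft()  # stand-in for the processor doing its work
            processed += 1
        else:
            # No work: skip scheduling for the configured yield duration.
            next_poll = now + bored_yield_ms

    return polls, processed


if __name__ == "__main__":
    arrivals = {0: 3, 500: 2}  # a sparse flow: two small bursts in one second
    print("no yield:   ", run_timer_driven(arrivals, 1000, 0))
    print("10 ms yield:", run_timer_driven(arrivals, 1000, 10))
```

In this toy model both runs process the same number of FlowFiles, but the run with a yield polls far less often. The price is that a FlowFile arriving during a yield waits up to the yield duration before it is picked up.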
In principle, implementing the Event-Driven strategy should be fairly simple and straightforward: when a FlowFile lands in a queue, just call the onTrigger method of the queue's destination. However, it got a lot more complicated once we needed to consider backpressure and limits on the number of concurrent tasks. So much more complicated, in fact, that testing showed that the Event-Driven strategy was noticeably slower than Timer-Driven.

To that end, we added the "nifi.bored.yield.duration" property to nifi.properties and updated the framework so that if there is no work for the Processor to do (due to its queues being empty or backpressure being applied) we don't schedule that Processor's thread for the configured amount of time. Implementing this showed a significant drop in CPU usage while still providing great throughput. So, truth be told, we pretty much abandoned using Event-Driven.

I do also remember, several years back, running into an issue where under high load we would occasionally see a Processor "freeze up" using Event-Driven scheduling. I think that was the main reason we marked it experimental. It was unclear what the cause was, but given how well the Timer-Driven scheduling strategy has worked for us, I've just never revisited it.

That being said, I do believe that an Event-Driven approach is a good idea. But given how much more mature NiFi is now than it was when Event-Driven was implemented, I would probably approach the idea entirely differently.

To answer your questions directly:

1. I would never recommend using event-driven over timer-driven processors.
2. Not sure who is using it in production, but I would recommend against it.
3. My vote would be to mark it as deprecated.
4. To be honest, I'm not sure that I fully understand this question, as it is somewhat vague.
Are you referring specifically to scheduling, obtaining the best performance, minimizing resource utilization, or did you intend for this to be vague and are just asking for any general guidance in whatever form?

Thanks
-Mark

> On Sep 12, 2018, at 5:11 PM, Michael Moser <mose...@apache.org> wrote:
>
> Hi Joe,
>
> I'm guessing here, but I think the Event Driven scheduling was intended to
> be more efficient than Timer Driven scheduling, in the way that push
> notifications should be more efficient than polling. In practice, I'm not
> sure anyone has measured the difference.
>
> I have seen folks use Event Driven scheduling to get access to the separate
> thread pool from the Timer Driven pool. For example, if you are running on
> an 8 core system but you want a Timer Driven pool with 50 threads to do
> lots of I/O bound tasks, you might create an Event Driven pool with 4
> threads and assign your CPU heavy processing to that pool. This limit may
> avoid having way more than 8 CPU heavy threads (from the Timer Driven pool)
> bogging down your 8 core system.
>
> Regards,
> -- Mike
>
>
> On Thu, Sep 6, 2018 at 3:11 PM Joe Percivall <jperciv...@apache.org> wrote:
>
>> Hey everyone,
>>
>> The dataflow I'm running has one main flow and a couple other disjoint
>> process groups. Within that main flow, there are sections which aren't used
>> very often. In trying to optimize things, I looked into the guidance we
>> have on the "event-driven" scheduling type. There doesn't appear to be much
>> concrete other than "it's experimental". Which has been the go-to,
>> basically since being open-sourced.
>>
>> So with that, I'm curious about a couple things:
>> 1: With the recent improvements to the controller and timer-based
>> scheduling, what should be our guidance on when to use event-based over
>> timer-based?
>> 2: Is anyone actually using it in production?
>> 3: Given it's been 3+ years of "it's experimental", we should start
>> thinking about either declaring it good to go or deprecating it.
>> 4: Any lessons learned on optimizing disjoint/sparse flows.
>>
>> Cheers,
>> Joe
>> --
>> *Joe Percivall*
>> linkedin.com/in/Percivall
>> e: jperciv...@apache.com
>>