Mark Payne created NIP-26:
-----------------------------
Summary: Automatic Scheduling Strategy for Processors
Key: NIP-26
URL: https://issues.apache.org/jira/browse/NIP-26
Project: NiFi Improvement Proposal
Issue Type: New Feature
Reporter: Mark Payne
Assignee: Mark Payne
h3. Motivation
NiFi currently requires users to manually configure the number of concurrent
tasks for each processor. Choosing the right value is difficult -- too few
tasks leaves throughput on the table, while too many can cause excessive
contention or overwhelm downstream processors. Additionally, what is ideal in
one environment may not be ideal in another environment with different
resources.
We should introduce an "Automatic" scheduling strategy that dynamically adjusts
the number of concurrent tasks for a processor based on runtime conditions such
as queue depth, backpressure, throughput, and CPU load. This removes the need
for manual tuning and allows the system to adapt as workloads change over time.
h3. NiFi API Changes
* Introduce a new {{SchedulingStrategy.AUTO}} enum value. When selected, the
framework manages concurrency automatically; the user does not configure a
concurrent task count, run schedule, or run duration.
* Introduce a new {{@AllowsAutoScheduling}} annotation with a {{boolean
value()}} defaulting to {{{}true{}}}. Processors that allocate a fixed number
of resources per concurrent task at schedule time (e.g., one Kafka consumer per
task, one script engine per task) should use {{@AllowsAutoScheduling(false)}}
to opt out. Processors annotated this way will not show the Automatic option in
the UI.
h3. REST API Changes
* {{ProcessorDTO}} gains a {{supportsAutoScheduling}} boolean field, populated
from the annotation. The REST API validates that {{AUTO}} cannot be set on
processors that do not support it.
* {{ProcessorConfigDTO}} default concurrent task and scheduling period maps
conditionally include or exclude the {{AUTO}} entry based on the processor's
annotation.
h3. Framework Changes
* A new {{VirtualThreadSchedulingAgent}} replaces the thread-pool-based
scheduling agent. All processors (regardless of strategy) run on virtual
threads, using a semaphore to bound the number of tasks that can be run
concurrently (effectively, applying a cap on Virtual Threads in much the same
way that the thread pool currently does). The virtual threads are beneficial
here because it allows for much cleaner addition and removal of tasks/threads
for a given processor. More generally, though, this has been desirable for a
while because it simplifies a lot of the scheduling logic and pins Processors
to specific threads, which makes thread dumps clearer and enables the use of
ThreadLocal that historically we've avoided in Processors.
* For AUTO processors, the Processor starts with a single virtual thread and
periodically evaluates whether to scale up or down using a
{{VirtualThreadScaler}} interface. This provides a clear separation between the
scaling logic and the scheduling logic. The initial implementation would use a
queue-based heuristic (examining inbound queue depth, outbound backpressure
ratio, and throughput to ensure that increasing threads actually helps).
Additionally, it should consider CPU load so that when adding 1 thread helps
but we're already over-budget on CPU we don't keep adding. We'd also introduce
a cap on the number of threads that any given processor can be given - probably
configurable via nifi.properties with a max of 12 threads.
* {{ProcessorContext.getMaxConcurrentTasks()}} should return the max number of
threads available for the Processor in AUTO mode (12 by default or 1 for
@TriggerSerially processors)
* Diagnostic dump that handles the thread dump will need a slight
modifications because the existing MXBean does not include virtual threads;
we'd need to change to the {{{}HotSpotDiagnosticMXBean{}}}.
h3. UI Changes
* The Scheduling Strategy dropdown gains an "Automatic (Experimental)" option.
When selected, the Concurrent Tasks, Run Schedule, and Run Duration fields are
hidden since they are managed by the framework.
* The option is filtered out for processors annotated with
{{{}@AllowsAutoScheduling(false){}}}.
h3. nifi.properties Configuration
* {{nifi.processor.max.concurrent.tasks}} -- upper bound on concurrent tasks
per processor for both manual and automatic scheduling. We could potentially
have a property that applies only to those Processors in Auto mode. However, we
frequently see users configuring values that are drastically too high, in the
hundreds or more, and this leads to performance degradation and other problems.
So it makes sense to just apply this to all processors regardless of their
scheduling strategy instead of going out of our way to apply it only to Auto
mode.
Note that this would be introduced as an "Experimental" feature and would
clearly be labeled as such in the UI.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)