When ControlRate yields, it will yield for the full Yield Duration.  So
your scenario, Alexis, will allow 10 files through in the first
half-second, and when the 11th arrives it will yield for 1 second (if Yield
Duration = 1 sec).  After that yield ends, it will allow the next 10 files
through immediately (assuming at least 10 have been waiting) and yield
again for 1 second.
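
To make the timing concrete, here is a rough Python sketch of the behavior
described above. It is a simplified model for illustration only, not NiFi's
actual ControlRate implementation; the function name and its parameters
(simulate_throttle, max_per_period, period, yield_duration) are hypothetical,
and it assumes the quota counter restarts whenever a yield ends.

```python
def simulate_throttle(arrival_times, max_per_period=10, period=1.0,
                      yield_duration=1.0):
    """Return the release time of each file under a simplified throttle model.

    Assumption (not taken from the NiFi source): once the quota for the
    current period is used up, the next file triggers a yield for the full
    yield_duration, and a fresh quota starts when the yield ends.
    """
    release_times = []
    window_start = None   # start of the current counting period
    count = 0             # files released in the current period
    blocked_until = 0.0   # time at which the current yield ends

    for arrival in sorted(arrival_times):
        # A file cannot be released before it arrives or during a yield.
        now = max(arrival, blocked_until)

        # Start a fresh counting period if the previous one has elapsed.
        if window_start is None or now - window_start >= period:
            window_start, count = now, 0

        # Quota exhausted: yield for the full duration, then reset.
        if count == max_per_period:
            blocked_until = now + yield_duration
            now = blocked_until
            window_start, count = now, 0

        release_times.append(now)
        count += 1

    return release_times


# Alexis's scenario: 20 files spread evenly over one second.
arrivals = [i / 20 for i in range(20)]
releases = simulate_throttle(arrivals)
print(releases[:10])   # first 10 files pass at their arrival times
print(releases[10:])   # files 11-20 are all released at t = 1.5 in this model
```

In this model the 11th file arrives at t = 0.5, triggers a 1-second yield,
and the ten waiting files are all released together at t = 1.5.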

-- Mike

On Mon, Jun 3, 2024 at 9:41 AM Alexis Sarda-Espinosa <
[email protected]> wrote:

> Hi Jim,
>
> Thanks for the prompt response. I think that could work, but now that I
> look at it, the documentation says that accuracy can be increased "by
> decreasing the Yield Duration" - if I want, say, 10 files per second and 20
> arrive evenly spread out within a second, does that mean that after half a
> second the processor would yield? If yes, would it yield for another half a
> second, or for the whole Yield Duration that's configured?
>
> Regards,
> Alexis.
>
> On Mon, Jun 3, 2024 at 3:08 PM Jim Steinebrey <
> [email protected]>:
>
>> Hi Alexis,
>> Yes, what you see in your experiments is the expected behavior.
>> The NiFi documentation for FlowFileConcurrency says:
>> SINGLE_BATCH_PER_NODE
>> <https://javadoc.io/static/org.apache.nifi/nifi-framework-core-api/1.17.0/org/apache/nifi/groups/FlowFileConcurrency.html#SINGLE_BATCH_PER_NODE>
>> When an Input Port is triggered to run, it will pull *all* FlowFiles
>> from its input queues into the Process Group as a single batch of FlowFiles.
>>
>> In addition, back pressure only applies to the processor BEFORE the
>> connection queue, not the input port AFTER the connection queue.
>>
>> Would it work for you to add a ControlRate processor before the input
>> port to separate the flow files into the max batch size you seek?
>>
>> Regards,
>> Jim
>>
>>
>> On Jun 3, 2024, at 5:41 AM, Alexis Sarda-Espinosa <
>> [email protected]> wrote:
>>
>> Hello,
>>
>> I am using NiFi 1.26.0. I created a process group and configured its
>> FlowFile Concurrency as "Single Batch Per Node". The group has a single
>> input port that connects to only one processor. I configured the queue
>> between the input port and the processor to generate back pressure, hoping
>> that if the Back Pressure Object Threshold is X, the maximum batch size per
>> node would be equal to X, but based on my experiments it seems that the
>> input port will consume as much data as it can without considering the back
>> pressure from the queues connected to it. Is this expected?
>>
>> Regards,
>> Alexis.
>>
>>
>>
