[ 
https://issues.apache.org/jira/browse/NIFI-11837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Handermann updated NIFI-11837:
------------------------------------
    Fix Version/s: 1.23.0
                       (was: 1.latest)

> When a queue starts swapping out data, it never stops
> -----------------------------------------------------
>
>                 Key: NIFI-11837
>                 URL: https://issues.apache.org/jira/browse/NIFI-11837
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 2.0.0, 1.23.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a queue reaches the swap threshold (defined in nifi.properties as 
> {{nifi.queue.swap.threshold}} and defaulted to 20,000 FlowFiles), it enters 
> 'swap mode'. However, it never exits swap mode.
> This means that even after the queue is completely emptied, any data that 
> later enters the queue will be swapped out once the queue reaches 10K FlowFiles. 
> Additionally, there is significant overhead under the covers in handling this.
> To replicate, create a simple flow:
>   GenerateFlowFile -> UpdateAttribute.
> Set GenerateFlowFile to run with 6 threads, Run Schedule of "0 secs" and a 
> Run Duration of "100 ms". Auto-terminate the 'success' relationship of 
> UpdateAttribute.
> This will quickly fill the queue beyond 20K FlowFiles.
> Now, stop GenerateFlowFile. Lower it to 4 threads and a Run Duration of "10 ms".
> Start both processors. Watch the logs, which indicate that data is constantly 
> being swapped in and out.
> This can have a very significant impact on performance. In my testing on my 
> laptop, once this flow started swapping, its 5-minute stats dropped from 14.5 
> MM FlowFiles per 5 minutes down to 11 MM FlowFiles (roughly a 24% decline).
> In addition to lower throughput, it causes much higher resource utilization, 
> which affects all flows.
> This defect may affect anyone using a large number of small FlowFiles, 
> especially those whose data may be bursty enough to exceed the 20,000-FlowFile 
> swap threshold, or flows that have a Backpressure Threshold set beyond 10,000.
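
The behaviour described in the issue can be illustrated with a small toy
model. This is only a sketch under stated assumptions, not NiFi's actual
queue code: the ToyQueue class, its fields, and the simulate() helper are
hypothetical names, the queue tracks counts rather than real FlowFiles, and
the "leave swap mode once the queue is empty" variant is assumed here purely
for contrast (it is not taken from the actual patch). The 20,000 threshold
and the 10K figure come from the description above.

```python
SWAP_THRESHOLD = 20_000  # default nifi.queue.swap.threshold, per the issue text
SWAP_BATCH = 10_000      # the "10K" figure in the issue text


class ToyQueue:
    """Counts-only toy model of a swapping queue (hypothetical, not NiFi code)."""

    def __init__(self, leave_swap_mode_when_empty):
        # Assumed remedy, used only for contrast with the reported behaviour.
        self.leave_swap_mode_when_empty = leave_swap_mode_when_empty
        self.active = 0      # FlowFiles in memory, ready to be polled
        self.staging = 0     # FlowFiles held back, waiting to be swapped out
        self.on_disk = 0     # FlowFiles already written to swap files
        self.swap_mode = False
        self.swap_outs = 0   # number of swap-out events so far

    def size(self):
        return self.active + self.staging + self.on_disk

    def put(self):
        # Enter swap mode once the in-memory portion hits the threshold.
        if not self.swap_mode and self.active >= SWAP_THRESHOLD:
            self.swap_mode = True
        if not self.swap_mode:
            self.active += 1
            return
        # In swap mode, new data is staged and written out in batches.
        self.staging += 1
        if self.staging >= SWAP_BATCH:
            self.on_disk += self.staging
            self.staging = 0
            self.swap_outs += 1

    def poll(self):
        # Refill the in-memory portion from disk first, then from staging.
        if self.active == 0:
            if self.on_disk > 0:
                batch = min(SWAP_BATCH, self.on_disk)
                self.on_disk -= batch
                self.active += batch
            else:
                self.active, self.staging = self.staging, 0
        if self.active == 0:
            return False  # queue is empty
        self.active -= 1
        if self.leave_swap_mode_when_empty and self.size() == 0:
            self.swap_mode = False
        return True


def simulate(leave_swap_mode_when_empty):
    """Burst past the threshold, drain fully, then add a modest 10K refill."""
    q = ToyQueue(leave_swap_mode_when_empty)
    for _ in range(35_000):
        q.put()
    while q.poll():
        pass
    for _ in range(SWAP_BATCH):
        q.put()
    return q.swap_mode, q.swap_outs
```

In this model, simulate(False) mirrors the report: the queue is still in swap
mode after being emptied, so a refill of only 10K FlowFiles triggers another
swap-out. simulate(True) keeps the same refill entirely in memory.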



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
