[ https://issues.apache.org/jira/browse/NIFI-13340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuanhao Zhu updated NIFI-13340:
-------------------------------
    Priority: Blocker  (was: Critical)

> Flowfiles stopped to be ingested before a processor group
> ---------------------------------------------------------
>
>                 Key: NIFI-13340
>                 URL: https://issues.apache.org/jira/browse/NIFI-13340
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.25.0
>         Environment: Host OS: Ubuntu 20.04 in WSL
> CPU: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
> RAM: 128GB
> ZFS ARC cache was configured to a maximum of 4 GB
>            Reporter: Yuanhao Zhu
>            Priority: Blocker
>              Labels: data-consistency, statestore
>         Attachments: image-2024-06-03-14-42-37-280.png,
> image-2024-06-03-14-44-15-590.png
>
>
> *Background:* We run NiFi as a standalone instance in a Docker container on
> all our stages. The state management provider used by our instance is the
> local WAL-backed provider.
>
> *Description:* We observed that the FlowFiles queued in front of one of our
> process groups stop being ingested once in a while. This happens on all our
> stages without a noticeable pattern. The FlowFile concurrency policy of the
> process group that stopped ingesting data is set to SINGLE_BATCH_PER_NODE
> and the outbound policy is set to BATCH_OUTPUT. Ingestion should continue,
> since the process group had already sent out every FlowFile in it, but it
> stopped.
>
> The only workaround we have is a crude one: we manually switch the FlowFile
> concurrency to unbounded and then switch it back to make it work again. The
> occurrences of this issue have impacted our data ingestion.
> !image-2024-06-03-14-42-37-280.png!
> !image-2024-06-03-14-44-15-590.png!
> As the screenshots show, the process group 'Delete before Insert' has no
> more FlowFiles to output, yet it still does not ingest the data queued at
> its input port.
>
> In the log file I found the following:
>
> {code:java}
> 2024-06-03 11:34:01,772 TRACE [Timer-Driven Process Thread-15]
> o.apache.nifi.groups.StandardDataValve Will not allow data to flow into
> StandardProcessGroup[identifier=5eb0ad69-e8ed-3ba0-52da-af94fb9836cd,name=Delete
> before Insert] because Outbound Policy is Batch Output and valve is already
> open to allow data to flow out of group{code}
>
> Also in the diagnostics, I found the following for the 'Delete before
> Insert' process group:
>
> {code:java}
> Process Group e6510a87-aa78-3268-1b11-3c310f0ad144, Name = Search and Delete
> existing reports(This is the parent processor group of the Delete before
> Insert)
> Currently Have Data Flowing In: []
> Currently Have Data Flowing Out:
> [StandardProcessGroup[identifier=5eb0ad69-e8ed-3ba0-52da-af94fb9836cd,name=Delete
> before Insert]]
> Reason for Not allowing data to flow in:
> Data Valve is already allowing data to flow out of group:
>
> StandardProcessGroup[identifier=5eb0ad69-e8ed-3ba0-52da-af94fb9836cd,name=Delete
> before Insert]
> Reason for Not allowing data to flow out:
> Output is Allowed:
>
> StandardProcessGroup[identifier=5eb0ad69-e8ed-3ba0-52da-af94fb9836cd,name=Delete
> before Insert] {code}
>
> This is clearly *not correct*, since no data is currently flowing out of
> the 'Delete before Insert' process group.
>
> We dug through the source code of StandardDataValve.java and found that the
> data valve's state is stored every time the valve is opened or closed. A
> potential cause of this issue is therefore that the process group ID was put
> into the state map when data flowed in, but the removal of the entry somehow
> did not succeed. We are aware that if the state map is not stored before
> NiFi restarts, this situation could arise. However, in the recent
> occurrences of this issue, no NiFi restart was recorded or observed at the
> time when the FlowFiles started to queue in front of the process group.
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)