Philipp Freyer created NIFI-15519:
-------------------------------------
Summary: Non-deterministic processor running state on Nifi startup
Key: NIFI-15519
URL: https://issues.apache.org/jira/browse/NIFI-15519
Project: Apache NiFi
Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Philipp Freyer
Attachments: anonymized.nifi-app_2026-01-27_22.0-1.log
h2. Issue:
After a NiFi instance restart, either all processors are started or all are
stopped, when they should always be started.
This behavior is non-deterministic, and the logs give no clear indication of
the cause.
h2. Background:
We have been running an instance of Apache NiFi for years with rather complex
flows on it. These flows worked rock-solid throughout this time, even across
NiFi instance restarts and NiFi upgrades all the way from 1.X to the current
NiFi version.
But since we activated GitHub versioning (GitHubFlowRegistryClient 2.6.0), the
processors on this instance no longer start reliably after NiFi instance
startup. In only about 25% of startups do we see running processors on the
instance. In about 75% of startups, all processors on the instance are
stopped.
We use GitHub versioning extensively on the NiFi instance: the root process
group contains one (versioned) process group that in turn contains all flow
logic. Since the versioned file became too big to restore with the default
NiFi timeout settings, we multiplied these timeouts by 10 (in nifi.properties)
and also introduced versioned sub-process groups to split the files exported
to GitHub.
The log files do not show any log output indicating any issues related to
processor startup (as far as we could see).
What we do see are errors loading the versioned process group info for a
while, when the registry client is not ready yet. These errors occur on every
startup and are resolved as soon as the registry client is properly
initialized:
{code:java}
2026-01-27 22:18:44,248 ERROR [Framework Task Thread-1] o.a.nifi.groups.StandardProcessGroup Failed to synchronize StandardProcessGroup[identifier=6134e32f-f850-3427-76ec-bad51ec7dd01,name=mycompany Group AG] with Flow Registry because could not retrieve version e9fd9e94f7dd2e0e26371ca2e3d26ef2bdfcd07e of flow with identifier MycompanyGroupAg in bucket default
org.apache.nifi.registry.flow.FlowRegistryException: GitHubFlowRegistryClient[id=a0e83df6-019a-1000-47c4-c03b90820d7f] cannot currently be used to synchronize with Flow Registry because it is currently validating
    at org.apache.nifi.groups.StandardProcessGroup.synchronizeWithFlowRegistry(StandardProcessGroup.java:3752)
    at org.apache.nifi.groups.StandardProcessGroup.lambda$setVersionControlInformation$27(StandardProcessGroup.java:3591)
    at org.apache.nifi.controller.scheduling.StandardProcessScheduler.lambda$wrapTask$1(StandardProcessScheduler.java:189)
    at java.base/java.lang.VirtualThread.run(VirtualThread.java:329)
{code}
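To check how long these synchronization errors persist during a given startup, a small script along the following lines could be run against the attached log. This is only a sketch: the function name `sync_error_window` is made up for illustration, and it assumes that each log event starts on its own line with a `yyyy-MM-dd HH:mm:ss,SSS LEVEL` prefix, as in the excerpt above.

```python
import re
from datetime import datetime

# Timestamp and level prefix of a log line, e.g.
# "2026-01-27 22:18:44,248 ERROR [...]" (format assumed from the excerpt).
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+) ")

# Text taken from the error message quoted in this report.
MARKER = "Failed to synchronize"


def sync_error_window(lines):
    """Return (count, first_timestamp, last_timestamp) of registry sync errors.

    Hypothetical helper: only ERROR lines containing MARKER are counted;
    stack-trace lines have no timestamp prefix and are skipped.
    """
    stamps = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2) == "ERROR" and MARKER in line:
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            )
    if not stamps:
        return 0, None, None
    return len(stamps), min(stamps), max(stamps)
```

Calling `sync_error_window(open("anonymized.nifi-app_2026-01-27_22.0-1.log", encoding="utf-8"))` would then give the number of such errors and the time span over which they occurred, which could be compared with the time at which processors are expected to start.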
Attached is a log showing the instance shutdown (done for a backup that copies
the entire NiFi file system) and the subsequent restart, after which the
processors were stopped.
We will gladly discuss details (or join a testing session) in Slack if that
helps resolve this issue.
We will also gladly provide more information where possible.