Hi Arthur,

I'm afraid I can't answer your questions about integrating NiFi with other software, but for the developers who can, it might help to describe the functionality you intend to add by embedding NiFi in an external application.
The original intent of NiFi comes from its NSA roots: handle a very large and unpredictable influx of data and enable analysts to deal with errors right there and then, in the live environment. As you've noted, this makes it an easy tool to pick up and use quickly, but it doesn't fit well in the current everything-as-code-from-a-pipeline world. The NiFi contributors have done a great job of providing good management options, including the registry clients and a full-featured API, but there is still tension between those two ways of working.

What I can tell you is that we experimented with deploying/managing flows from a pipeline and/or client application using the NiFi API (through the NiPyApi Python package for the pipeline side). This worked quite well for deployment, but the external operations tool was too cumbersome in our scenario. Developers much preferred to use NiFi in the originally intended way: investigate and fix problems on the canvas itself. The platform team also preferred not to have to add a newly found exception to our tool every week.

Scaling

NiFi was designed to handle many flows in a single instance/cluster, making optimal use of the shared resources through dynamic scheduling of processors, streaming large data directly to disk and load balancing across servers. Scaling is achieved by adding more servers to the cluster or increasing resources per server. There is a bottleneck with the primary node for certain processes, but that is also being improved.

Deploying multiple small NiFi instances works well too, and I think more users are shifting to that model in a Kubernetes world. Typical reasons to split would be optimizing for workload type (streaming/low latency vs. batch processing), limiting by security scope, or dynamic scaling requirements. You're still running a fairly large instance with a webserver and everything related, though, so NiFi itself is not a good fit for running single flows. Single flows are what MiNiFi is best suited for.
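For what it's worth, the pipeline-driven deployment I mentioned looked roughly like the sketch below. Treat it as illustrative only: the registry client, bucket and flow names are placeholders, and the exact NiPyApi function signatures can differ between versions, so check the NiPyApi docs for the release you use.

```python
"""Hedged sketch: deploying a versioned flow from NiFi Registry onto the
canvas via the NiPyApi package, as one might do from a CI pipeline.
All names here are placeholders, not anything from our actual setup."""


def grid_position(index, per_row=4, x_step=420.0, y_step=220.0):
    """Pure helper (no network): spread deployed process groups across the
    canvas so successive deployments don't stack on top of each other."""
    return (x_step * (index % per_row), y_step * (index // per_row))


def deploy_latest(flow_name, bucket_name, client_name, index=0):
    """Deploy the latest registry version of `flow_name` under the root
    process group and start it. Requires a reachable NiFi and Registry;
    configure nipyapi.config.nifi_config.host (and registry_config.host)
    before calling. Signatures assumed from NiPyApi's versioning module."""
    import nipyapi  # imported lazily so the sketch can be read offline

    client = nipyapi.versioning.get_registry_client(client_name)
    bucket = nipyapi.versioning.get_registry_bucket(bucket_name)
    flow = nipyapi.versioning.get_flow_in_bucket(bucket.identifier, flow_name)

    pg = nipyapi.versioning.deploy_flow_version(
        parent_id=nipyapi.canvas.get_root_pg_id(),
        location=grid_position(index),
        bucket_id=bucket.identifier,
        flow_id=flow.identifier,
        reg_client_id=client.id,
        version=None,  # None should mean "latest version in the registry"
    )
    nipyapi.canvas.schedule_process_group(pg.id, True)  # start all processors
    return pg.id
```

The same calls work from a client application; in our case the deployment half was the part that paid off, while day-to-day operations stayed on the canvas.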
MiNiFi was originally designed to run on edge devices or workload servers to collect data locally, and it provides a solid option for running a remotely managed flow. You get rid of the large overhead of the webserver and clustering and instead start a lean instance (like a Kubernetes pod) that does only the minimum needed and can be stopped again after finishing. This works great for flows, such as IoT data collection, that don't change often. The overhead is in testing and debugging/redeploying while things don't yet work as planned. In our case the tradeoff wasn't worth it at the time, especially since the data collection flows are a sideshow for the developers, and they greatly appreciate the intuitive development and testing on the NiFi canvas without the extra steps MiNiFi requires.

Our biggest deployment feeds a data lake, with many scheduled flows that trigger once per day or a few times at most, load a large amount of data, do some memory-intensive work and then finish again. The cluster nodes have 64 GB RAM, not even that big, and they handle a few hundred flows just fine. Dealing with contention typically means shifting some flow's schedule by a few minutes or telling a team to be less ambitious in their transformation of GB+ files. The "shared everything" model does mean that we've had occasional incidents where developers made a flow that soft-hung the cluster by exhausting heap or threads.

I hope this helps you get some perspective.

Regards,
Isha

From: Derewjankin, Arthur via users <[email protected]>
Sent: Monday, 30 March 2026 09:46
To: [email protected]
Cc: Derewjankin, Arthur <[email protected]>
Subject: Question on embedding/integrating Apache NiFi with Quarkus applications and scalability considerations

Dear Apache NiFi Community,

I hope you are doing well. I am working as a Business Analyst and have recently been exploring Apache NiFi. I really appreciate the way data flows can be modeled and visualized; it's been very intuitive and powerful for our use cases.
We are currently evaluating whether we can integrate NiFi into one of our Quarkus-based applications. We also looked into MiNiFi as a lightweight alternative. However, based on our initial proof of concept, it seems that embedding or tightly integrating NiFi (or MiNiFi) directly into an application might not be the intended usage pattern, as we were not able to make this approach work successfully. In addition, one of our goals was to deploy each flow independently (e.g., one flow per deployment unit) in order to achieve a high degree of scalability and isolation. This also proved challenging in our experiments.

This leads to a few questions we were hoping the community could help clarify:

1. Is embedding NiFi (or MiNiFi) within an application such as a Quarkus service a supported or recommended approach?
2. Is deploying individual flows as separate, independently scalable units aligned with NiFi's design, or does this conflict with its intended usage model?
3. If these approaches are not recommended, could you share the reasoning behind these architectural decisions?
4. Our concern is that NiFi might become too heavyweight over time in our scenario:
   * We expect a growing number of flows
   * Many flows would run at scheduled times each day
   * Data volumes vary significantly, from small messages to files in the GB range
   * Flows would include polling from external systems, importing, and exporting data

Given these characteristics, would NiFi still be an appropriate choice, or should it rather be operated as a standalone service instead of being integrated into an application?

We would greatly appreciate any guidance, best practices, or architectural recommendations you can share based on your experience. Thank you very much for your time and support.

Best regards,
Arthur

P.S. Resending this as I recently subscribed to the mailing list and my previous message may not have gone through.
Arthur Derewjankin
Business Analyst
Lucht Probst Associates GmbH, Große Gallusstraße 9, 60311 Frankfurt am Main, DE
e: [email protected] | w: www.l-p-a.com | p: +4969971485245
