I agree that a native library and wrapper isn't ideal, but I would like to point out RocksDB has been used successfully with NiFi in the past:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#rocksdb-flowfile-repository

On Thu, Sep 25, 2025 at 4:17 PM Matt Burgess <[email protected]> wrote:

> Thanks to all for the due diligence here, it is much appreciated. If we
> consider RocksDB (mentioned earlier) I think we should look at Apache
> KVRocks[1], one of the cons is that it's not Java but as a RocksDB-backed
> Redis-compatible NoSQL store it might do the trick. I would volunteer to
> evaluate but I won't have the cycles for a while, just wanted to bring it
> up.
>
> Regards,
> Matt
>
> [1] https://kvrocks.apache.org/
>
> On Thu, Sep 25, 2025 at 10:52 AM David Handermann <[email protected]> wrote:
>
> > Peter,
> >
> > Thanks for the initial investigation and summary.
> >
> > On review, I agree that going with QuestDB is not the best option.
> > Although QuestDB is a framework dependency, it is packaged in a
> > separate NAR, and the QuestDB JAR packages platform-specific native
> > libraries. More to the point, however, I agree that the Action
> > structure makes the adaptation more challenging. The historical
> > implementation of the Flow Configuration History used multiple tables,
> > which is still an option to consider, depending on the selected
> > solution.
> >
> > Another important factor to consider in this process is that changing
> > the implementation will require maintaining the Xodus-based version
> > for a period of time, in order to support migration.
> >
> > One note regarding NIFI-12468, based on the scenario described, it
> > sounds like the purge action could be evaluated for improvement. The
> > creation of several million records in a short period of time does not
> > align with standard operation, and perhaps other options to support a
> > complete purge of history, versus over a range of time, would be
> > better.
> >
> > Returning to the topic of alternatives, another option to consider is
> > Apache Lucene, which NiFi uses for provenance.
> > I've started evaluating this approach, which would be a bit more
> > involved, but also has the advantage of being a current dependency.
> >
> > Regards,
> > David Handermann
> >
> > On Thu, Sep 25, 2025 at 8:58 AM Peter Gyori <[email protected]> wrote:
> >
> > > Hi David and Matt,
> > >
> > > I've finished my review of QuestDB, and here's what I found.
> > >
> > > I believe QuestDB could be used to store the Flow Configuration
> > > History, and in some ways, it offers benefits over Xodus for this
> > > use case. However, I think the downsides ultimately outweigh the
> > > pros.
> > >
> > > *Pros:*
> > >
> > >    - *Existing Dependency:* QuestDB is already a dependency in NiFi,
> > >      so no new libraries would need to be introduced.
> > >    - *Faster Purging:* The Flow Configuration History purge operation
> > >      would be significantly faster because QuestDB can delete records
> > >      within a timestamp range. (Currently, Xodus deletes each entity
> > >      one by one.)
> > >    - *Faster Entry Retrieval:* finding stored records would likely be
> > >      faster due to QuestDB's record format.
> > >
> > > *Cons:*
> > >
> > >    - QuestDB is a *column-oriented* database. The Flow Configuration
> > >      History currently stores *serialized Java objects*. To move to
> > >      QuestDB, we would need to create a table with a column structure
> > >      that matches our Java objects. Since we don't plan to use
> > >      aggregate functions on these columns, this approach offers no
> > >      benefit. Instead, the *table structure would be tightly coupled
> > >      to our Java object structure*, meaning any changes to the
> > >      objects would require a database migration.
> > >    - Alternatively, we could use a *"hybrid" approach* where we
> > >      flatten out the Action class' simple fields like id,
> > >      userIdentity and timestamp into columns and store more complex
> > >      fields (ComponentDetails and ActionDetails) as serialized byte
> > >      arrays.
> > >      However, *this would require "reassembling" the Action object
> > >      during reads*, which is more complex than the current method of
> > >      simply deserializing the entire object.
> > >
> > > Since this use case doesn't leverage any of QuestDB’s strengths - such
> > > as handling time series, aggregation, or sampling - I don't believe
> > > it's the right choice. Ignoring the fact that QuestDB is already a
> > > NiFi dependency, it's difficult to justify using it for Flow
> > > Configuration History.
> > >
> > > For that reason, I recommend we find a simple object store instead.
> > >
> > > What are your thoughts?
> > >
> > > Regards,
> > > Peter
> > >
> > > On Fri, Sep 19, 2025 at 11:33 PM Peter Gyori <[email protected]> wrote:
> > >
> > > > Thank you, David and Matt.
> > > > I will also evaluate QuestDB to see if it's a good fit.
> > > >
> > > > Regards,
> > > > Peter
> > > >
> > > > On Fri, Sep 19, 2025 at 7:04 PM Matt Burgess <[email protected]> wrote:
> > > >
> > > >> That's where I'm tending towards as well, QuestDB. I think it's a
> > > >> good idea to back whatever appropriate capabilities with the same
> > > >> database library if only just for maintenance purposes. Of course
> > > >> the downside is any vulnerabilities that may arise, such as we had
> > > >> to deal with re: H2 a couple years ago.
> > > >>
> > > >> Regards,
> > > >> Matt
> > > >>
> > > >> On Fri, Sep 19, 2025 at 11:34 AM David Handermann <[email protected]> wrote:
> > > >>
> > > >> > Hi Peter,
> > > >> >
> > > >> > Another option I am evaluating is QuestDB [1]. There is an
> > > >> > optional framework extension that uses QuestDB for persistent
> > > >> > Status History. I would not intend to couple or reuse code from
> > > >> > that module, but building a new implementation of the Audit
> > > >> > Store on QuestDB might be a good solution.
> > > >> > The Flow Configuration History is certainly timestamp-oriented,
> > > >> > so this might be a potential way forward.
> > > >> >
> > > >> > Regards,
> > > >> > David Handermann
> > > >> >
> > > >> > [1] https://questdb.com
> > > >> >
> > > >> > On Fri, Sep 19, 2025 at 9:36 AM Peter Gyori <[email protected]> wrote:
> > > >> >
> > > >> > > Hi David,
> > > >> > >
> > > >> > > Thank you for your reply.
> > > >> > >
> > > >> > > Regarding NIFI-12468: whenever an Xodus transaction exceeds 60
> > > >> > > seconds, the database connection is terminated, and NiFi does
> > > >> > > not recover without a restart. (Interestingly, with NiFi-1.x
> > > >> > > using Java11, recovery is not an issue.)
> > > >> > >
> > > >> > > I also evaluated YouTrackDB, but ultimately decided against it.
> > > >> > > As an object-oriented graph database, YouTrackDB seems to be a
> > > >> > > more high-level and complex solution than the simple key-value
> > > >> > > datastore we are looking for.
> > > >> > >
> > > >> > > Regards,
> > > >> > > Peter
> > > >> > >
> > > >> > > On Fri, Sep 19, 2025 at 3:19 PM David Handermann <[email protected]> wrote:
> > > >> > >
> > > >> > > > Hi Peter,
> > > >> > > >
> > > >> > > > Thanks for initiating this discussion. Despite activity on
> > > >> > > > other branches, I have also observed the lack of recent
> > > >> > > > releases for Xodus. I have not encountered the issues
> > > >> > > > described in NIFI-12468, but I agree that an alternative
> > > >> > > > needs to be considered based on the lack of maintenance
> > > >> > > > activity. It is interesting that Xodus now mentions future
> > > >> > > > work on YouTrackDB, but that repository has not published a
> > > >> > > > release to Maven Central, so it does not appear to be in a
> > > >> > > > helpful position.
> > > >> > > >
> > > >> > > > Anything that requires a native library and wrapper is not a
> > > >> > > > great candidate, like RocksDB as you noted. I looked at MapDB
> > > >> > > > recently as well, but I was also concerned about the
> > > >> > > > maintenance level. I'm not familiar with Chronicle-Map, so I
> > > >> > > > plan to take a closer look. It appears to have a number of
> > > >> > > > dependencies, which is initially concerning. Returning to H2
> > > >> > > > is not a good option, but mentioning it for the sake of
> > > >> > > > background. Apache Derby is another embedded database, but it
> > > >> > > > has had less maintenance in recent years.
> > > >> > > >
> > > >> > > > I plan to evaluate options and follow up, thanks again for
> > > >> > > > raising the topic!
> > > >> > > >
> > > >> > > > Regards,
> > > >> > > > David Handermann
> > > >> > > >
> > > >> > > > On Fri, Sep 19, 2025 at 7:46 AM Peter Gyori <[email protected]> wrote:
> > > >> > > >
> > > >> > > > > Team,
> > > >> > > > >
> > > >> > > > > I am writing to propose we replace Xodus (
> > > >> > > > > https://github.com/JetBrains/xodus ) in NiFi with a more
> > > >> > > > > actively maintained library. This change is necessary due
> > > >> > > > > to two key issues:
> > > >> > > > >
> > > >> > > > >    - The Xodus project is no longer under active
> > > >> > > > >      development.
> > > >> > > > >    - We've encountered issues with Xodus when running NiFi
> > > >> > > > >      on Java 21, as detailed in the comments of
> > > >> > > > >      https://issues.apache.org/jira/browse/NIFI-12468
> > > >> > > > >
> > > >> > > > > I have evaluated some potential replacements and have
> > > >> > > > > summarized my initial findings below.
> > > >> > > > >
> > > >> > > > > Replacement Candidates:
> > > >> > > > >
> > > >> > > > >    - RocksDB https://github.com/facebook/rocksdb
> > > >> > > > >       - Pros: Popular, actively maintained, and
> > > >> > > > >         license-compatible.
> > > >> > > > >       - Con: Written in C++ and relies on JNI.
> > > >> > > > >    - MapDB https://github.com/jankotek/mapdb
> > > >> > > > >       - Pros: Java-based and license-compatible.
> > > >> > > > >       - Con: The last release was in January 2024.
> > > >> > > > >    - Chronicle-Map https://github.com/OpenHFT/Chronicle-Map
> > > >> > > > >       - Pros: Java-based, actively maintained, and
> > > >> > > > >         license-compatible.
> > > >> > > > >
> > > >> > > > > I welcome your input on this proposal, these candidates or
> > > >> > > > > any other alternatives you might suggest.
> > > >> > > > >
> > > >> > > > > Regards,
> > > >> > > > > Peter
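
For anyone following along, here is a rough sketch of the "hybrid" layout and the timestamp-range purge discussed in Peter's QuestDB review. It is only an illustration under several assumptions: it uses Python with the standard-library sqlite3 module as a stand-in for whichever store is chosen (it is not QuestDB, Xodus, or NiFi code), pickle stands in for Java serialization, and the Action dataclass, table name, column names, and helper functions are all hypothetical. The field names id, userIdentity, and timestamp come from the thread.

```python
import pickle
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Action:
    # Hypothetical stand-in for NiFi's Action class.
    id: int
    user_identity: str
    timestamp: int            # epoch milliseconds
    component_details: dict   # stand-in for ComponentDetails
    action_details: dict      # stand-in for ActionDetails


def open_store() -> sqlite3.Connection:
    # Simple fields become columns; complex fields share one BLOB column.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE action ("
        " id INTEGER PRIMARY KEY,"
        " user_identity TEXT NOT NULL,"
        " timestamp INTEGER NOT NULL,"
        " details BLOB NOT NULL)"
    )
    conn.execute("CREATE INDEX action_timestamp ON action (timestamp)")
    return conn


def save(conn: sqlite3.Connection, action: Action) -> None:
    # Serialize only the complex fields into the byte array.
    blob = pickle.dumps((action.component_details, action.action_details))
    conn.execute(
        "INSERT INTO action (id, user_identity, timestamp, details)"
        " VALUES (?, ?, ?, ?)",
        (action.id, action.user_identity, action.timestamp, blob),
    )


def load(conn: sqlite3.Connection, action_id: int) -> Optional[Action]:
    # The "reassembling" step: columns plus the deserialized blob.
    row = conn.execute(
        "SELECT id, user_identity, timestamp, details"
        " FROM action WHERE id = ?",
        (action_id,),
    ).fetchone()
    if row is None:
        return None
    component_details, action_details = pickle.loads(row[3])
    return Action(row[0], row[1], row[2], component_details, action_details)


def purge_range(conn: sqlite3.Connection, start: int, end: int) -> int:
    # One statement removes every record with start <= timestamp < end,
    # instead of deleting entities one by one.
    cursor = conn.execute(
        "DELETE FROM action WHERE timestamp >= ? AND timestamp < ?",
        (start, end),
    )
    return cursor.rowcount
```

The sketch shows the trade-off from the review in miniature: the purge is a single range delete over the timestamp column, but every read has to rebuild the object from columns plus a blob, and the column list is tied to the object's simple fields.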
