Thanks to all for the due diligence here; it is much appreciated. If we consider RocksDB (mentioned earlier), I think we should look at Apache KVRocks [1]. One of the cons is that it's not Java, but as a RocksDB-backed, Redis-compatible NoSQL store it might do the trick. I would volunteer to evaluate it, but I won't have the cycles for a while; I just wanted to bring it up.

Regards,
Matt

[1] https://kvrocks.apache.org/

On Thu, Sep 25, 2025 at 10:52 AM David Handermann <[email protected]> wrote:

> Peter,
>
> Thanks for the initial investigation and summary.
>
> On review, I agree that going with QuestDB is not the best option.
> Although QuestDB is a framework dependency, it is packaged in a
> separate NAR, and the QuestDB JAR packages platform-specific native
> libraries. More to the point, however, I agree that the Action
> structure makes the adaptation more challenging. The historical
> implementation of the Flow Configuration History used multiple tables,
> which is still an option to consider, depending on the selected
> solution.
>
> Another important factor to consider in this process is that changing
> the implementation will require maintaining the Xodus-based version
> for a period of time, in order to support migration.
>
> One note regarding NIFI-12468, based on the scenario described, it
> sounds like the purge action could be evaluated for improvement. The
> creation of several million records in a short period of time does not
> align with standard operation, and perhaps other options to support a
> complete purge of history, versus over a range of time, would be
> better.
>
> Returning to the topic of alternatives, another option to consider is
> Apache Lucene, which NiFi uses for provenance. I've started evaluating
> this approach, which would be a bit more involved, but also has the
> advantage of being a current dependency.
>
> Regards,
> David Handermann
>
> On Thu, Sep 25, 2025 at 8:58 AM Peter Gyori <[email protected]> wrote:
> >
> > Hi David and Matt,
> >
> > I've finished my review of QuestDB, and here's what I found.
> >
> > I believe QuestDB could be used to store the Flow Configuration History,
> > and in some ways, it offers benefits over Xodus for this use case. However,
> > I think the downsides ultimately outweigh the pros.
> >
> > *Pros:*
> >
> > - *Existing Dependency:* QuestDB is already a dependency in NiFi, so no
> >   new libraries would need to be introduced.
> > - *Faster Purging:* The Flow Configuration History purge operation would
> >   be significantly faster because QuestDB can delete records within a
> >   timestamp range. (Currently, Xodus deletes each entity one by one.)
> > - *Faster Entry Retrieval:* finding stored records would likely be faster
> >   due to QuestDB's record format.
> >
> > *Cons:*
> >
> > - QuestDB is a *column-oriented* database. The Flow Configuration
> >   History currently stores *serialized Java objects*. To move to QuestDB,
> >   we would need to create a table with a column structure that matches our
> >   Java objects. Since we don't plan to use aggregate functions on these
> >   columns, this approach offers no benefit. Instead, the *table structure
> >   would be tightly coupled to our Java object structure*, meaning any
> >   changes to the objects would require a database migration.
> > - Alternatively, we could use a *"hybrid" approach* where we flatten out
> >   the Action class' simple fields like id, userIdentity and timestamp into
> >   columns and store more complex fields (ComponentDetails and ActionDetails)
> >   as serialized byte arrays. However, *this would require "reassembling"
> >   the Action object during reads*, which is more complex than the current
> >   method of simply deserializing the entire object.
> >
> > Since this use case doesn't leverage any of QuestDB’s strengths - such as
> > handling time series, aggregation, or sampling - I don't believe it's the
> > right choice. Ignoring the fact that QuestDB is already a NiFi dependency,
> > it's difficult to justify using it for Flow Configuration History.
> >
> > For that reason, I recommend we find a simple object store instead.
> >
> > What are your thoughts?
> >
> > Regards,
> > Peter
> >
> > On Fri, Sep 19, 2025 at 11:33 PM Peter Gyori <[email protected]> wrote:
> >
> > > Thank you, David and Matt.
> > > I will also evaluate QuestDB to see if it's a good fit.
> > >
> > > Regards,
> > > Peter
> > >
> > > On Fri, Sep 19, 2025 at 7:04 PM Matt Burgess <[email protected]> wrote:
> > >
> > >> That's where I'm tending towards as well, QuestDB. I think it's a good idea
> > >> to back whatever appropriate capabilities with the same database library if
> > >> only just for maintenance purposes. Of course the downside is any
> > >> vulnerabilities that may arise, such as we had to deal with re: H2 a couple
> > >> years ago.
> > >>
> > >> Regards,
> > >> Matt
> > >>
> > >> On Fri, Sep 19, 2025 at 11:34 AM David Handermann <[email protected]> wrote:
> > >>
> > >> > Hi Peter,
> > >> >
> > >> > Another option I am evaluating is QuestDB [1]. There is an optional
> > >> > framework extension that uses QuestDB for persistent Status History. I
> > >> > would not intend to couple or reuse code from that module, but
> > >> > building a new implementation of the Audit Store on QuestDB might be a
> > >> > good solution. The Flow Configuration History is certainly
> > >> > timestamp-oriented, so this might be a potential way forward.
> > >> >
> > >> > Regards,
> > >> > David Handermann
> > >> >
> > >> > [1] https://questdb.com
> > >> >
> > >> > On Fri, Sep 19, 2025 at 9:36 AM Peter Gyori <[email protected]> wrote:
> > >> > >
> > >> > > Hi David,
> > >> > >
> > >> > > Thank you for your reply.
> > >> > >
> > >> > > Regarding NIFI-12468: whenever an Xodus transaction exceeds 60 seconds, the
> > >> > > database connection is terminated, and NiFi does not recover without a
> > >> > > restart. (Interestingly, with NiFi-1.x using Java11, recovery is not an
> > >> > > issue.)
> > >> > >
> > >> > > I also evaluated YouTrackDB, but ultimately decided against it.
> > >> > > As an object-oriented graph database, YouTrackDB seems to be a more
> > >> > > high-level and complex solution than the simple key-value datastore we
> > >> > > are looking for.
> > >> > >
> > >> > > Regards,
> > >> > > Peter
> > >> > >
> > >> > > On Fri, Sep 19, 2025 at 3:19 PM David Handermann <[email protected]> wrote:
> > >> > >
> > >> > > > Hi Peter,
> > >> > > >
> > >> > > > Thanks for initiating this discussion. Despite activity on other
> > >> > > > branches, I have also observed the lack of recent releases for Xodus.
> > >> > > > I have not encountered the issues described in NIFI-12468, but I agree
> > >> > > > that an alternative needs to be considered based on the lack of
> > >> > > > maintenance activity. It is interesting that Xodus now mentions future
> > >> > > > work on YouTrackDB, but that repository has not published a release to
> > >> > > > Maven Central, so it does not appear to be in a helpful position.
> > >> > > >
> > >> > > > Anything that requires a native library and wrapper is not a great
> > >> > > > candidate, like RocksDB as you noted. I looked at MapDB recently as
> > >> > > > well, but I was also concerned about the maintenance level. I'm not
> > >> > > > familiar with Chronicle-Map, so I plan to take a closer look. It
> > >> > > > appears to have a number of dependencies, which is initially
> > >> > > > concerning. Returning to H2 is not a good option, but mentioning it
> > >> > > > for the sake of background. Apache Derby is another embedded database,
> > >> > > > but it has had less maintenance in recent years.
> > >> > > >
> > >> > > > I plan to evaluate options and follow up, thanks again for raising the
> > >> > > > topic!
> > >> > > >
> > >> > > > Regards,
> > >> > > > David Handermann
> > >> > > >
> > >> > > > On Fri, Sep 19, 2025 at 7:46 AM Peter Gyori <[email protected]> wrote:
> > >> > > > >
> > >> > > > > Team,
> > >> > > > >
> > >> > > > > I am writing to propose we replace Xodus
> > >> > > > > ( https://github.com/JetBrains/xodus ) in NiFi with a more actively
> > >> > > > > maintained library. This change is necessary due to two key issues:
> > >> > > > >
> > >> > > > > - The Xodus project is no longer under active development.
> > >> > > > > - We've encountered issues with Xodus when running NiFi on Java 21, as
> > >> > > > >   detailed in the comments of
> > >> > > > >   https://issues.apache.org/jira/browse/NIFI-12468
> > >> > > > >
> > >> > > > > I have evaluated some potential replacements and have summarized my
> > >> > > > > initial findings below.
> > >> > > > >
> > >> > > > > Replacement Candidates:
> > >> > > > >
> > >> > > > > - RocksDB https://github.com/facebook/rocksdb
> > >> > > > >   - Pros: Popular, actively maintained, and license-compatible.
> > >> > > > >   - Con: Written in C++ and relies on JNI.
> > >> > > > > - MapDB https://github.com/jankotek/mapdb
> > >> > > > >   - Pros: Java-based and license-compatible.
> > >> > > > >   - Con: The last release was in January 2024.
> > >> > > > > - Chronicle-Map https://github.com/OpenHFT/Chronicle-Map
> > >> > > > >   - Pros: Java-based, actively maintained, and license-compatible.
> > >> > > > >
> > >> > > > > I welcome your input on this proposal, these candidates or any other
> > >> > > > > alternatives you might suggest.
> > >> > > > >
> > >> > > > > Regards,
> > >> > > > > Peter
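
For reference, the "hybrid" approach from the QuestDB review quoted above could look roughly like the sketch below. It is only an illustration under stated assumptions: the class HybridActionSketch and its nested Action, Details and Row types are hypothetical stand-ins, not NiFi's actual classes; plain java.io serialization stands in for whatever serialization the audit store really uses; and no QuestDB API is involved. Only the field names id, userIdentity and timestamp come from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.time.Instant;

// Hypothetical, simplified stand-ins; NiFi's real Action, ComponentDetails and
// ActionDetails types are richer than what is modeled here.
final class HybridActionSketch {

    /** Stand-in for a complex field such as ComponentDetails or ActionDetails. */
    static final class Details implements Serializable {
        private static final long serialVersionUID = 1L;
        final String description;

        Details(String description) {
            this.description = description;
        }
    }

    /** Stand-in for the Action object that is stored whole today. */
    static final class Action implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String userIdentity;
        final Instant timestamp;
        final Details componentDetails;
        final Details actionDetails;

        Action(int id, String userIdentity, Instant timestamp,
               Details componentDetails, Details actionDetails) {
            this.id = id;
            this.userIdentity = userIdentity;
            this.timestamp = timestamp;
            this.componentDetails = componentDetails;
            this.actionDetails = actionDetails;
        }
    }

    /** Hybrid row: simple fields as typed columns, complex fields as byte arrays. */
    static final class Row {
        final int id;
        final String userIdentity;
        final long timestampMillis;
        final byte[] componentDetails;
        final byte[] actionDetails;

        Row(int id, String userIdentity, long timestampMillis,
            byte[] componentDetails, byte[] actionDetails) {
            this.id = id;
            this.userIdentity = userIdentity;
            this.timestampMillis = timestampMillis;
            this.componentDetails = componentDetails;
            this.actionDetails = actionDetails;
        }
    }

    static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Write path: flatten simple fields, serialize the complex ones separately. */
    static Row toRow(Action a) {
        // Note: storing epoch milliseconds drops any sub-millisecond precision.
        return new Row(a.id, a.userIdentity, a.timestamp.toEpochMilli(),
                serialize(a.componentDetails), serialize(a.actionDetails));
    }

    /** Read path: the "reassembling" step the review describes. */
    static Action fromRow(Row r) {
        return new Action(r.id, r.userIdentity, Instant.ofEpochMilli(r.timestampMillis),
                (Details) deserialize(r.componentDetails),
                (Details) deserialize(r.actionDetails));
    }

    public static void main(String[] args) {
        Action action = new Action(1, "user", Instant.ofEpochMilli(1758805080000L),
                new Details("processor"), new Details("configure"));
        // Whole-object storage: one serialize call, one deserialize call.
        Action whole = (Action) deserialize(serialize(action));
        // Hybrid storage: columns plus two blobs, rebuilt on read.
        Action reassembled = fromRow(toRow(action));
        System.out.println(whole.userIdentity + " / " + reassembled.actionDetails.description);
    }
}
```

The read path shows the extra step the review points out: fromRow has to rebuild the Action from columns plus two separately deserialized blobs, whereas the whole-object path is a single deserialize call. Any change to the simple fields would also have to be mirrored in the Row layout.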
