Peter,

Thanks for the initial investigation and summary.

On review, I agree that going with QuestDB is not the best option.
Although QuestDB is a framework dependency, it is packaged in a separate
NAR, and the QuestDB JAR packages platform-specific native libraries.
More to the point, however, I agree that the Action structure makes the
adaptation more challenging.

The historical implementation of the Flow Configuration History used
multiple tables, which is still an option to consider, depending on the
selected solution.

Another important factor to consider in this process is that changing
the implementation will require maintaining the Xodus-based version for
a period of time, in order to support migration.

One note regarding NIFI-12468: based on the scenario described, it
sounds like the purge action could be evaluated for improvement. The
creation of several million records in a short period of time does not
align with standard operation, and other options, such as supporting a
complete purge of history rather than a purge over a range of time,
might be better.

Returning to the topic of alternatives, another option to consider is
Apache Lucene, which NiFi uses for provenance. I've started evaluating
this approach, which would be a bit more involved, but also has the
advantage of being a current dependency.

Regards,
David Handermann

On Thu, Sep 25, 2025 at 8:58 AM Peter Gyori <[email protected]> wrote:
>
> Hi David and Matt,
>
> I've finished my review of QuestDB, and here's what I found.
>
> I believe QuestDB could be used to store the Flow Configuration History,
> and in some ways, it offers benefits over Xodus for this use case. However,
> I think the downsides ultimately outweigh the pros.
>
> *Pros:*
>
> - *Existing Dependency:* QuestDB is already a dependency in NiFi, so no
>   new libraries would need to be introduced.
> - *Faster Purging:* The Flow Configuration History purge operation would
>   be significantly faster because QuestDB can delete records within a
>   timestamp range.
>   (Currently, Xodus deletes each entity one by one.)
> - *Faster Entry Retrieval:* Finding stored records would likely be faster
>   due to QuestDB's record format.
>
> *Cons:*
>
> - QuestDB is a *column-oriented* database. The Flow Configuration
>   History currently stores *serialized Java objects*. To move to QuestDB,
>   we would need to create a table with a column structure that matches our
>   Java objects. Since we don't plan to use aggregate functions on these
>   columns, this approach offers no benefit. Instead, the *table structure
>   would be tightly coupled to our Java object structure*, meaning any
>   changes to the objects would require a database migration.
> - Alternatively, we could use a *"hybrid" approach* where we flatten out
>   the Action class' simple fields like id, userIdentity and timestamp into
>   columns and store more complex fields (ComponentDetails and ActionDetails)
>   as serialized byte arrays. However, *this would require "reassembling"
>   the Action object during reads*, which is more complex than the current
>   method of simply deserializing the entire object.
>
> Since this use case doesn't leverage any of QuestDB’s strengths - such as
> handling time series, aggregation, or sampling - I don't believe it's the
> right choice. Ignoring the fact that QuestDB is already a NiFi dependency,
> it's difficult to justify using it for Flow Configuration History.
>
> For that reason, I recommend we find a simple object store instead.
>
> What are your thoughts?
>
> Regards,
> Peter
>
> On Fri, Sep 19, 2025 at 11:33 PM Peter Gyori <[email protected]> wrote:
>
> > Thank you, David and Matt.
> > I will also evaluate QuestDB to see if it's a good fit.
> >
> > Regards,
> > Peter
> >
> > On Fri, Sep 19, 2025 at 7:04 PM Matt Burgess <[email protected]> wrote:
> >
> >> That's where I'm tending towards as well, QuestDB.
> >> I think it's a good idea to back whatever capabilities are appropriate
> >> with the same database library, if only for maintenance purposes. Of
> >> course, the downside is any vulnerabilities that may arise, such as
> >> those we had to deal with re: H2 a couple of years ago.
> >>
> >> Regards,
> >> Matt
> >>
> >> On Fri, Sep 19, 2025 at 11:34 AM David Handermann <[email protected]> wrote:
> >>
> >> > Hi Peter,
> >> >
> >> > Another option I am evaluating is QuestDB [1]. There is an optional
> >> > framework extension that uses QuestDB for persistent Status History. I
> >> > would not intend to couple or reuse code from that module, but
> >> > building a new implementation of the Audit Store on QuestDB might be a
> >> > good solution. The Flow Configuration History is certainly
> >> > timestamp-oriented, so this might be a potential way forward.
> >> >
> >> > Regards,
> >> > David Handermann
> >> >
> >> > [1] https://questdb.com
> >> >
> >> > On Fri, Sep 19, 2025 at 9:36 AM Peter Gyori <[email protected]> wrote:
> >> > >
> >> > > Hi David,
> >> > >
> >> > > Thank you for your reply.
> >> > >
> >> > > Regarding NIFI-12468: whenever an Xodus transaction exceeds 60 seconds,
> >> > > the database connection is terminated, and NiFi does not recover
> >> > > without a restart. (Interestingly, with NiFi 1.x using Java 11,
> >> > > recovery is not an issue.)
> >> > >
> >> > > I also evaluated YouTrackDB, but ultimately decided against it. As an
> >> > > object-oriented graph database, YouTrackDB seems to be a more
> >> > > high-level and complex solution than the simple key-value datastore we
> >> > > are looking for.
> >> > >
> >> > > Regards,
> >> > > Peter
> >> > >
> >> > > On Fri, Sep 19, 2025 at 3:19 PM David Handermann <[email protected]> wrote:
> >> > >
> >> > > > Hi Peter,
> >> > > >
> >> > > > Thanks for initiating this discussion.
> >> > > > Despite activity on other branches, I have also observed the lack
> >> > > > of recent releases for Xodus. I have not encountered the issues
> >> > > > described in NIFI-12468, but I agree that an alternative needs to
> >> > > > be considered based on the lack of maintenance activity. It is
> >> > > > interesting that Xodus now mentions future work on YouTrackDB, but
> >> > > > that repository has not published a release to Maven Central, so
> >> > > > it does not appear to be in a helpful position.
> >> > > >
> >> > > > Anything that requires a native library and wrapper is not a great
> >> > > > candidate, like RocksDB as you noted. I looked at MapDB recently as
> >> > > > well, but I was also concerned about the maintenance level. I'm not
> >> > > > familiar with Chronicle-Map, so I plan to take a closer look. It
> >> > > > appears to have a number of dependencies, which is initially
> >> > > > concerning. Returning to H2 is not a good option, but I mention it
> >> > > > for the sake of background. Apache Derby is another embedded
> >> > > > database, but it has had less maintenance in recent years.
> >> > > >
> >> > > > I plan to evaluate options and follow up, thanks again for raising
> >> > > > the topic!
> >> > > >
> >> > > > Regards,
> >> > > > David Handermann
> >> > > >
> >> > > > On Fri, Sep 19, 2025 at 7:46 AM Peter Gyori <[email protected]> wrote:
> >> > > > >
> >> > > > > Team,
> >> > > > >
> >> > > > > I am writing to propose we replace Xodus (
> >> > > > > https://github.com/JetBrains/xodus ) in NiFi with a more actively
> >> > > > > maintained library. This change is necessary due to two key
> >> > > > > issues:
> >> > > > >
> >> > > > > - The Xodus project is no longer under active development.
> >> > > > > - We've encountered issues with Xodus when running NiFi on Java > >> > 21, as > >> > > > > detailed in the comments of > >> > > > > https://issues.apache.org/jira/browse/NIFI-12468 > >> > > > > > >> > > > > I have evaluated some potential replacements and have summarized > >> my > >> > > > initial > >> > > > > findings below. > >> > > > > > >> > > > > Replacement Candidates: > >> > > > > > >> > > > > - RocksDB https://github.com/facebook/rocksdb > >> > > > > - Pros: Popular, actively maintained, and > >> license-compatible. > >> > > > > - Con: Written in C++ and relies on JNI. > >> > > > > - MapDB https://github.com/jankotek/mapdb > >> > > > > - Pros: Java-based and license-compatible. > >> > > > > - Con: The last release was in January 2024. > >> > > > > - Chronicle-Map https://github.com/OpenHFT/Chronicle-Map > >> > > > > - Pros: Java-based, actively maintained, and > >> > license-compatible. > >> > > > > > >> > > > > I welcome your input on this proposal, these candidates or any > >> other > >> > > > > alternatives you might suggest. > >> > > > > > >> > > > > Regards, > >> > > > > Peter > >> > > > > >> > > >> > >
