Thanks to all for the due diligence here; it is much appreciated. If we
consider RocksDB (mentioned earlier), I think we should also look at Apache
KVRocks [1]. One of the cons is that it's not Java, but as a RocksDB-backed,
Redis-compatible NoSQL store it might do the trick. I would volunteer to
evaluate it, but I won't have the cycles for a while; I just wanted to bring
it up.

Regards,
Matt

[1] https://kvrocks.apache.org/

On Thu, Sep 25, 2025 at 10:52 AM David Handermann <[email protected]> wrote:

> Peter,
>
> Thanks for the initial investigation and summary.
>
> On review, I agree that going with QuestDB is not the best option.
> Although QuestDB is a framework dependency, it is packaged in a
> separate NAR, and the QuestDB JAR packages platform-specific native
> libraries. More to the point, however, I agree that the Action
> structure makes the adaptation more challenging. The historical
> implementation of the Flow Configuration History used multiple tables,
> which is still an option to consider, depending on the selected
> solution.
>
> Another important factor to consider in this process is that changing
> the implementation will require maintaining the Xodus-based version
> for a period of time, in order to support migration.
>
> One note regarding NIFI-12468: based on the scenario described, it
> sounds like the purge action could be evaluated for improvement. The
> creation of several million records in a short period of time does not
> align with standard operation, and perhaps other options to support a
> complete purge of history, versus a purge over a range of time, would be
> better.
>
> Returning to the topic of alternatives, another option to consider is
> Apache Lucene, which NiFi uses for provenance. I've started evaluating
> this approach, which would be a bit more involved, but also has the
> advantage of being a current dependency.
>
> Regards,
> David Handermann
>
> On Thu, Sep 25, 2025 at 8:58 AM Peter Gyori <[email protected]> wrote:
> >
> > Hi David and Matt,
> >
> > I've finished my review of QuestDB, and here's what I found.
> >
> > I believe QuestDB could be used to store the Flow Configuration History,
> > and in some ways, it offers benefits over Xodus for this use case. However,
> > I think the downsides ultimately outweigh the pros.
> >
> > *Pros:*
> >
> >    - *Existing Dependency:* QuestDB is already a dependency in NiFi, so no
> >    new libraries would need to be introduced.
> >    - *Faster Purging:* The Flow Configuration History purge operation would
> >    be significantly faster because QuestDB can delete records within a
> >    timestamp range. (Currently, Xodus deletes each entity one by one.)
> >    - *Faster Entry Retrieval:* Finding stored records would likely be
> >    faster due to QuestDB's record format.
> >
> > *Cons:*
> >
> >
> >    - QuestDB is a *column-oriented* database. The Flow Configuration
> >    History currently stores *serialized Java objects*. To move to QuestDB,
> >    we would need to create a table with a column structure that matches
> >    our Java objects. Since we don't plan to use aggregate functions on
> >    these columns, this approach offers no benefit. Instead, the *table
> >    structure would be tightly coupled to our Java object structure*,
> >    meaning any changes to the objects would require a database migration.
> >    - Alternatively, we could use a *"hybrid" approach* where we flatten
> >    out the Action class' simple fields like id, userIdentity and timestamp
> >    into columns and store more complex fields (ComponentDetails and
> >    ActionDetails) as serialized byte arrays. However, *this would require
> >    "reassembling" the Action object during reads*, which is more complex
> >    than the current method of simply deserializing the entire object.
> >
> > Since this use case doesn't leverage any of QuestDB’s strengths - such as
> > handling time series, aggregation, or sampling - I don't believe it's the
> > right choice. Ignoring the fact that QuestDB is already a NiFi dependency,
> > it's difficult to justify using it for Flow Configuration History.
> >
> > For that reason, I recommend we find a simple object store instead.
> >
> > What are your thoughts?
> >
> > Regards,
> > Peter
> >
> > On Fri, Sep 19, 2025 at 11:33 PM Peter Gyori <[email protected]> wrote:
> >
> > > Thank you, David and Matt.
> > > I will also evaluate QuestDB to see if it's a good fit.
> > >
> > > Regards,
> > > Peter
> > >
> > > On Fri, Sep 19, 2025 at 7:04 PM Matt Burgess <[email protected]> wrote:
> > >
> > >> That's where I'm tending towards as well, QuestDB. I think it's a good
> > >> idea to back whatever appropriate capabilities with the same database
> > >> library if only just for maintenance purposes. Of course the downside is
> > >> any vulnerabilities that may arise, such as we had to deal with re: H2 a
> > >> couple years ago.
> > >>
> > >> Regards,
> > >> Matt
> > >>
> > >> On Fri, Sep 19, 2025 at 11:34 AM David Handermann <[email protected]> wrote:
> > >>
> > >> > Hi Peter,
> > >> >
> > >> > Another option I am evaluating is QuestDB [1]. There is an optional
> > >> > framework extension that uses QuestDB for persistent Status History. I
> > >> > would not intend to couple or reuse code from that module, but
> > >> > building a new implementation of the Audit Store on QuestDB might be a
> > >> > good solution. The Flow Configuration History is certainly
> > >> > timestamp-oriented, so this might be a potential way forward.
> > >> >
> > >> > Regards,
> > >> > David Handermann
> > >> >
> > >> > [1] https://questdb.com
> > >> >
> > >> > On Fri, Sep 19, 2025 at 9:36 AM Peter Gyori <[email protected]> wrote:
> > >> > >
> > >> > > Hi David,
> > >> > >
> > >> > > Thank you for your reply.
> > >> > >
> > >> > > Regarding NIFI-12468: whenever an Xodus transaction exceeds 60 seconds,
> > >> > > the database connection is terminated, and NiFi does not recover
> > >> > > without a restart. (Interestingly, with NiFi 1.x using Java 11,
> > >> > > recovery is not an issue.)
> > >> > >
> > >> > > I also evaluated YouTrackDB, but ultimately decided against it. As an
> > >> > > object-oriented graph database, YouTrackDB seems to be a more
> > >> > > high-level and complex solution than the simple key-value datastore we
> > >> > > are looking for.
> > >> > >
> > >> > > Regards,
> > >> > > Peter
> > >> > >
> > >> > > On Fri, Sep 19, 2025 at 3:19 PM David Handermann <[email protected]> wrote:
> > >> > >
> > >> > > > Hi Peter,
> > >> > > >
> > >> > > > Thanks for initiating this discussion. Despite activity on other
> > >> > > > branches, I have also observed the lack of recent releases for
> > >> > > > Xodus. I have not encountered the issues described in NIFI-12468,
> > >> > > > but I agree that an alternative needs to be considered based on the
> > >> > > > lack of maintenance activity. It is interesting that Xodus now
> > >> > > > mentions future work on YouTrackDB, but that repository has not
> > >> > > > published a release to Maven Central, so it does not appear to be in
> > >> > > > a helpful position.
> > >> > > >
> > >> > > > Anything that requires a native library and wrapper is not a great
> > >> > > > candidate, like RocksDB as you noted. I looked at MapDB recently as
> > >> > > > well, but I was also concerned about the maintenance level. I'm not
> > >> > > > familiar with Chronicle-Map, so I plan to take a closer look. It
> > >> > > > appears to have a number of dependencies, which is initially
> > >> > > > concerning. Returning to H2 is not a good option, but I am mentioning
> > >> > > > it for the sake of background. Apache Derby is another embedded
> > >> > > > database, but it has had less maintenance in recent years.
> > >> > > >
> > >> > > > I plan to evaluate options and follow up, thanks again for raising
> > >> > > > the topic!
> > >> > > >
> > >> > > > Regards,
> > >> > > > David Handermann
> > >> > > >
> > >> > > > On Fri, Sep 19, 2025 at 7:46 AM Peter Gyori <[email protected]> wrote:
> > >> > > > >
> > >> > > > > Team,
> > >> > > > >
> > >> > > > > I am writing to propose we replace Xodus (
> > >> > > > > https://github.com/JetBrains/xodus ) in NiFi with a more actively
> > >> > > > > maintained library. This change is necessary due to two key issues:
> > >> > > > >
> > >> > > > >    - The Xodus project is no longer under active development.
> > >> > > > >    - We've encountered issues with Xodus when running NiFi on Java 21,
> > >> > > > >    as detailed in the comments of
> > >> > > > >    https://issues.apache.org/jira/browse/NIFI-12468
> > >> > > > >
> > >> > > > > I have evaluated some potential replacements and have summarized my
> > >> > > > > initial findings below.
> > >> > > > >
> > >> > > > > Replacement Candidates:
> > >> > > > >
> > >> > > > >    - RocksDB https://github.com/facebook/rocksdb
> > >> > > > >       - Pros: Popular, actively maintained, and license-compatible.
> > >> > > > >       - Con: Written in C++ and relies on JNI.
> > >> > > > >    - MapDB https://github.com/jankotek/mapdb
> > >> > > > >       - Pros: Java-based and license-compatible.
> > >> > > > >       - Con: The last release was in January 2024.
> > >> > > > >    - Chronicle-Map https://github.com/OpenHFT/Chronicle-Map
> > >> > > > >       - Pros: Java-based, actively maintained, and license-compatible.
> > >> > > > >
> > >> > > > > I welcome your input on this proposal, these candidates or any other
> > >> > > > > alternatives you might suggest.
> > >> > > > >
> > >> > > > > Regards,
> > >> > > > > Peter
> > >> > > >
> > >> >
> > >>
> > >
>
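
For anyone following along, the "hybrid" approach Peter describes in the
quoted thread could look roughly like the sketch below. Everything in it is an
assumption for illustration: Action, Details and Row are simplified,
hypothetical stand-ins rather than NiFi's actual classes, and it uses plain
Java serialization with no QuestDB API involved. It only shows the
"reassembling" step that a read would need.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.time.Instant;

public class HybridActionSketch {

    // Hypothetical stand-in for the complex fields (ComponentDetails, ActionDetails).
    record Details(String description) implements Serializable {}

    // Hypothetical, simplified Action: three simple fields plus two complex ones.
    record Action(int id, String userIdentity, Instant timestamp,
                  Details componentDetails, Details actionDetails) {}

    // One stored row: simple fields become columns, complex fields become byte arrays.
    record Row(int id, String userIdentity, long timestampMillis,
               byte[] componentDetails, byte[] actionDetails) {}

    static byte[] serialize(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Details deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Details) in.readObject();
        }
    }

    // Write path: flatten the simple fields and serialize the complex ones.
    static Row toRow(Action action) throws IOException {
        return new Row(action.id(), action.userIdentity(),
                action.timestamp().toEpochMilli(),
                serialize(action.componentDetails()),
                serialize(action.actionDetails()));
    }

    // Read path: the Action has to be reassembled from columns and byte arrays.
    static Action fromRow(Row row) throws IOException, ClassNotFoundException {
        return new Action(row.id(), row.userIdentity(),
                Instant.ofEpochMilli(row.timestampMillis()),
                deserialize(row.componentDetails()),
                deserialize(row.actionDetails()));
    }

    public static void main(String[] args) throws Exception {
        Action original = new Action(1, "admin", Instant.ofEpochMilli(1_758_800_000_000L),
                new Details("processor"), new Details("configure"));
        Action restored = fromRow(toRow(original));
        System.out.println(original.equals(restored));
    }
}
```

Note that the timestamp is stored at millisecond precision in this sketch, so
a value with finer precision would not round-trip exactly. The read path also
needs one deserialization call per complex field, compared with a single call
when the whole object is stored as one blob.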
