I agree that a native library and wrapper isn't ideal, but I would like to
point out RocksDB has been used successfully with NiFi in the past:

https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#rocksdb-flowfile-repository




On Thu, Sep 25, 2025 at 4:17 PM Matt Burgess <[email protected]> wrote:

> Thanks to all for the due diligence here, it is much appreciated. If we
> consider RocksDB (mentioned earlier), I think we should look at Apache
> Kvrocks [1]. One of the cons is that it's not Java, but as a
> RocksDB-backed, Redis-compatible NoSQL store it might do the trick. I
> would volunteer to evaluate it, but I won't have the cycles for a while;
> I just wanted to bring it up.
>
> Regards,
> Matt
>
> [1] https://kvrocks.apache.org/
>
> On Thu, Sep 25, 2025 at 10:52 AM David Handermann <
> [email protected]> wrote:
>
> > Peter,
> >
> > Thanks for the initial investigation and summary.
> >
> > On review, I agree that going with QuestDB is not the best option.
> > Although QuestDB is a framework dependency, it is packaged in a
> > separate NAR, and the QuestDB JAR packages platform-specific native
> > libraries. More to the point, however, I agree that the Action
> > structure makes the adaptation more challenging. The historical
> > implementation of the Flow Configuration History used multiple tables,
> > which is still an option to consider, depending on the selected
> > solution.
> >
> > Another important factor to consider in this process is that changing
> > the implementation will require maintaining the Xodus-based version
> > for a period of time, in order to support migration.
> >
> > One note regarding NIFI-12468: based on the scenario described, it
> > sounds like the purge action could be evaluated for improvement. The
> > creation of several million records in a short period of time does not
> > align with standard operation, and perhaps other options to support a
> > complete purge of history, versus over a range of time, would be
> > better.
> >
> > Returning to the topic of alternatives, another option to consider is
> > Apache Lucene, which NiFi uses for provenance. I've started evaluating
> > this approach, which would be a bit more involved, but also has the
> > advantage of being a current dependency.
> >
> > Regards,
> > David Handermann
> >
> > On Thu, Sep 25, 2025 at 8:58 AM Peter Gyori <[email protected]> wrote:
> > >
> > > Hi David and Matt,
> > >
> > > I've finished my review of QuestDB, and here's what I found.
> > >
> > > I believe QuestDB could be used to store the Flow Configuration
> > > History, and in some ways, it offers benefits over Xodus for this use
> > > case. However, I think the downsides ultimately outweigh the pros.
> > >
> > > *Pros:*
> > >
> > >    - *Existing Dependency:* QuestDB is already a dependency in NiFi, so
> > >    no new libraries would need to be introduced.
> > >    - *Faster Purging:* The Flow Configuration History purge operation
> > >    would be significantly faster because QuestDB can delete records
> > >    within a timestamp range. (Currently, Xodus deletes each entity one
> > >    by one.)
> > >    - *Faster Entry Retrieval:* Finding stored records would likely be
> > >    faster due to QuestDB's record format.
> > >
> > > *Cons:*
> > >
> > >
> > >    - QuestDB is a *column-oriented* database. The Flow Configuration
> > >    History currently stores *serialized Java objects*. To move to
> > >    QuestDB, we would need to create a table with a column structure
> > >    that matches our Java objects. Since we don't plan to use aggregate
> > >    functions on these columns, this approach offers no benefit.
> > >    Instead, the *table structure would be tightly coupled to our Java
> > >    object structure*, meaning any changes to the objects would require
> > >    a database migration.
> > >    - Alternatively, we could use a *"hybrid" approach* where we flatten
> > >    out the Action class' simple fields like id, userIdentity and
> > >    timestamp into columns and store more complex fields
> > >    (ComponentDetails and ActionDetails) as serialized byte arrays.
> > >    However, *this would require "reassembling" the Action object during
> > >    reads*, which is more complex than the current method of simply
> > >    deserializing the entire object.
> > >
> > > Since this use case doesn't leverage any of QuestDB’s strengths - such
> > > as handling time series, aggregation, or sampling - I don't believe
> > > it's the right choice. Apart from the fact that QuestDB is already a
> > > NiFi dependency, it's difficult to justify using it for Flow
> > > Configuration History.
> > >
> > > For that reason, I recommend we find a simple object store instead.
> > >
> > > What are your thoughts?
> > >
> > > Regards,
> > > Peter
> > >
> > > On Fri, Sep 19, 2025 at 11:33 PM Peter Gyori <[email protected]>
> wrote:
> > >
> > > > Thank you, David and Matt.
> > > > I will also evaluate QuestDB to see if it's a good fit.
> > > >
> > > > Regards,
> > > > Peter
> > > >
> > > > On Fri, Sep 19, 2025 at 7:04 PM Matt Burgess <[email protected]>
> > wrote:
> > > >
> > > > >> That's where I'm tending as well: QuestDB. I think it's a good
> > > > >> idea to back whatever capabilities are appropriate with the same
> > > > >> database library, if only for maintenance purposes. Of course, the
> > > > >> downside is any vulnerabilities that may arise, such as those we
> > > > >> had to deal with re: H2 a couple of years ago.
> > > >>
> > > >> Regards,
> > > >> Matt
> > > >>
> > > >> On Fri, Sep 19, 2025 at 11:34 AM David Handermann <
> > > >> [email protected]> wrote:
> > > >>
> > > >> > Hi Peter,
> > > >> >
> > > > >> > Another option I am evaluating is QuestDB [1]. There is an
> > > > >> > optional framework extension that uses QuestDB for persistent
> > > > >> > Status History. I would not intend to couple or reuse code from
> > > > >> > that module, but building a new implementation of the Audit
> > > > >> > Store on QuestDB might be a good solution. The Flow
> > > > >> > Configuration History is certainly timestamp-oriented, so this
> > > > >> > might be a potential way forward.
> > > >> >
> > > >> > Regards,
> > > >> > David Handermann
> > > >> >
> > > >> > [1] https://questdb.com
> > > >> >
> > > >> > On Fri, Sep 19, 2025 at 9:36 AM Peter Gyori <[email protected]>
> > wrote:
> > > >> > >
> > > >> > > Hi David,
> > > >> > >
> > > >> > > Thank you for your reply.
> > > >> > >
> > > > >> > > Regarding NIFI-12468: whenever an Xodus transaction exceeds 60
> > > > >> > > seconds, the database connection is terminated, and NiFi does
> > > > >> > > not recover without a restart. (Interestingly, with NiFi 1.x
> > > > >> > > using Java 11, recovery is not an issue.)
> > > > >> > >
> > > > >> > > I also evaluated YouTrackDB, but ultimately decided against it.
> > > > >> > > As an object-oriented graph database, YouTrackDB seems to be a
> > > > >> > > more high-level and complex solution than the simple key-value
> > > > >> > > datastore we are looking for.
> > > >> > >
> > > >> > > Regards,
> > > >> > > Peter
> > > >> > >
> > > >> > > On Fri, Sep 19, 2025 at 3:19 PM David Handermann <
> > > >> > > [email protected]> wrote:
> > > >> > >
> > > >> > > > Hi Peter,
> > > >> > > >
> > > > >> > > > Thanks for initiating this discussion. Despite activity on
> > > > >> > > > other branches, I have also observed the lack of recent
> > > > >> > > > releases for Xodus. I have not encountered the issues
> > > > >> > > > described in NIFI-12468, but I agree that an alternative
> > > > >> > > > needs to be considered based on the lack of maintenance
> > > > >> > > > activity. It is interesting that Xodus now mentions future
> > > > >> > > > work on YouTrackDB, but that repository has not published a
> > > > >> > > > release to Maven Central, so it does not appear to be in a
> > > > >> > > > helpful position.
> > > > >> > > >
> > > > >> > > > Anything that requires a native library and wrapper is not a
> > > > >> > > > great candidate, like RocksDB as you noted. I looked at MapDB
> > > > >> > > > recently as well, but I was also concerned about the
> > > > >> > > > maintenance level. I'm not familiar with Chronicle-Map, so I
> > > > >> > > > plan to take a closer look. It appears to have a number of
> > > > >> > > > dependencies, which is initially concerning. Returning to H2
> > > > >> > > > is not a good option, but mentioning it for the sake of
> > > > >> > > > background. Apache Derby is another embedded database, but it
> > > > >> > > > has had less maintenance in recent years.
> > > > >> > > >
> > > > >> > > > I plan to evaluate options and follow up, thanks again for
> > > > >> > > > raising the topic!
> > > >> > > >
> > > >> > > > Regards,
> > > >> > > > David Handermann
> > > >> > > >
> > > >> > > > On Fri, Sep 19, 2025 at 7:46 AM Peter Gyori <
> [email protected]>
> > > >> wrote:
> > > >> > > > >
> > > >> > > > > Team,
> > > >> > > > >
> > > > >> > > > > I am writing to propose we replace Xodus (
> > > > >> > > > > https://github.com/JetBrains/xodus ) in NiFi with a more
> > > > >> > > > > actively maintained library. This change is necessary due
> > > > >> > > > > to two key issues:
> > > > >> > > > >
> > > > >> > > > >    - The Xodus project is no longer under active
> > > > >> > > > >    development.
> > > > >> > > > >    - We've encountered issues with Xodus when running NiFi
> > > > >> > > > >    on Java 21, as detailed in the comments of
> > > > >> > > > >    https://issues.apache.org/jira/browse/NIFI-12468
> > > > >> > > > >
> > > > >> > > > > I have evaluated some potential replacements and have
> > > > >> > > > > summarized my initial findings below.
> > > > >> > > > >
> > > > >> > > > > Replacement Candidates:
> > > > >> > > > >
> > > > >> > > > >    - RocksDB https://github.com/facebook/rocksdb
> > > > >> > > > >       - Pros: Popular, actively maintained, and
> > > > >> > > > >       license-compatible.
> > > > >> > > > >       - Con: Written in C++ and relies on JNI.
> > > > >> > > > >    - MapDB https://github.com/jankotek/mapdb
> > > > >> > > > >       - Pros: Java-based and license-compatible.
> > > > >> > > > >       - Con: The last release was in January 2024.
> > > > >> > > > >    - Chronicle-Map https://github.com/OpenHFT/Chronicle-Map
> > > > >> > > > >       - Pros: Java-based, actively maintained, and
> > > > >> > > > >       license-compatible.
> > > > >> > > > >
> > > > >> > > > > I welcome your input on this proposal, these candidates or
> > > > >> > > > > any other alternatives you might suggest.
> > > >> > > > >
> > > >> > > > > Regards,
> > > >> > > > > Peter
> > > >> > > >
> > > >> >
> > > >>
> > > >
> >
>
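
For reference, the "hybrid" approach Peter describes in the quoted thread (simple Action fields as columns, ComponentDetails and ActionDetails as serialized byte arrays) could look roughly like the Java sketch below. This is only an illustration under stated assumptions: the record types are hypothetical stand-ins and not NiFi's actual Action classes, no QuestDB API is involved, and it assumes Java 16 or later for records. It is meant to show the extra "reassembly" step on read, nothing more.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.time.Instant;

// Hypothetical sketch: none of these types are NiFi's real classes.
final class HybridActionSketch {

    // Stand-ins for the complex fields that would stay serialized.
    record ComponentDetails(String componentType) implements Serializable {}
    record ActionDetails(String description) implements Serializable {}

    // Stand-in for the domain object that callers work with.
    record Action(int id, String userIdentity, Instant timestamp,
                  ComponentDetails componentDetails, ActionDetails actionDetails) {}

    // One stored row: simple fields as columns, complex fields as opaque bytes.
    record ActionRow(int id, String userIdentity, long timestampMillis,
                     byte[] componentDetails, byte[] actionDetails) {}

    // Write path: flatten the simple fields and serialize the complex ones.
    static ActionRow toRow(Action action) {
        return new ActionRow(action.id(), action.userIdentity(),
                action.timestamp().toEpochMilli(),
                serialize(action.componentDetails()),
                serialize(action.actionDetails()));
    }

    // Read path: the "reassembly" step, rebuilding the object from columns and blobs.
    static Action fromRow(ActionRow row) {
        return new Action(row.id(), row.userIdentity(),
                Instant.ofEpochMilli(row.timestampMillis()),
                (ComponentDetails) deserialize(row.componentDetails()),
                (ActionDetails) deserialize(row.actionDetails()));
    }

    private static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize stored field", e);
        }
    }
}
```

Even in this small form, the read path has to know about every column and every blob, which is the coupling concern raised above; a plain object store would only need the single deserialize call.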
