Thanks for the suggestion and additional background, Bence. That is very
helpful in evaluating the default inclusion approach.

I agree with Joe's concern about handling potential corruption. We have
recently reduced dependency on the H2 file-backed database driver so that
it is now limited to Flow Configuration History. Based on experience there,
NiFi can fail to start when the database file is corrupted, which is not
ideal. We should look into improving that behavior, allowing NiFi to start
and saving off the corrupted file instead of failing to start. If we go
forward with QuestDB as the default strategy for status history, we should
build in the resilient approach as a prerequisite to enabling it in the
default configuration.
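
As a rough illustration of the "save off the corrupted file and start
anyway" behavior, here is a minimal Java sketch. It is not taken from the
NiFi code base: the class name CorruptedStoreRecovery, the HealthCheck
interface, and the ".corrupted.<timestamp>" naming are all assumptions made
for the example, and the real check would depend on how the H2 or QuestDB
driver reports corruption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.Optional;

/** Hypothetical helper: quarantine an unusable database directory at startup. */
public class CorruptedStoreRecovery {

    /** Hypothetical probe; a real one would try to open and query the database. */
    public interface HealthCheck {
        boolean isUsable(Path databaseDirectory);
    }

    /**
     * Ensures an empty or usable database directory exists. If the existing
     * directory fails the health check, it is renamed next to the original
     * location and a fresh empty directory is created in its place.
     *
     * @return the location of the saved-off directory, if one was moved
     */
    public static Optional<Path> prepare(Path databaseDirectory, HealthCheck check)
            throws IOException {
        if (Files.exists(databaseDirectory) && !check.isUsable(databaseDirectory)) {
            // Keep the corrupted data for later inspection instead of deleting it.
            String backupName = databaseDirectory.getFileName()
                    + ".corrupted." + Instant.now().toEpochMilli();
            Path backup = databaseDirectory.resolveSibling(backupName);
            Files.move(databaseDirectory, backup);
            Files.createDirectories(databaseDirectory);
            return Optional.of(backup);
        }
        // Nothing to recover: make sure the directory exists and carry on.
        Files.createDirectories(databaseDirectory);
        return Optional.empty();
    }
}
```

The idea is that startup would call something like prepare(...) before
opening the repository, log the returned backup location if present, and
continue with an empty store rather than failing.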

Regards,
David Handermann

On Wed, Jul 19, 2023 at 8:31 AM Simon Bence <simonbence....@gmail.com>
wrote:

> Thanks for the quick feedback!
>
> Joe: your concerns are relevant, let me provide some details:
>
> The database uses some disk space, determined by the number of components
> and the number of covered days. While adding it I was checking the time
> usage, and although I no longer have the numbers, the usage seemed
> reasonable. I can do a bit of testing and bring some numbers to improve
> confidence in it. Additionally, the necessary disk space is limited: we
> have rollover handling capability, which limits the amount of stored data
> to the target number of days plus one. This is due to the limitations of
> QuestDB with partitioning: at the time of development the smallest
> partition strategy was day based, if I remember correctly, so the unit of
> deletion was the partition just shifted out from the threshold. (Now it
> looks to be hour-based partitioning, which might be worth the effort to
> upgrade to.)
>
> The current rollover deletes all the data older than the threshold, but I
> am thinking of adding a new implementation which keeps some aggregated
> information about the components. That of course needs some more space,
> again depending on the number of components and the time.
>
> In case the disk is full, we have no way to push metrics down to the
> database, and currently there is no fallback strategy for it. A possible way
> would be to temporarily keep the data in memory (similar to the
> VolatileComponentStatusRepository in that regard), but I am not convinced
> that, for a node with resources close to its limits, writing data into
> memory instead of to disk would necessarily be a good strategy. This is
> something to consider.
>
> If the database becomes corrupted then we lose the status information.
> This, I think, is true for most persisted storage; however, I would
> think that if the database files are not changed by external tools, there
> is an insignificant chance of this. Fallback strategies might be added
> (for example, if NiFi considers the database corrupted, it might start a
> new one), but even without this I think the QuestDB-based solution has its
> merits compared to the in-memory storage.
>
> Manual intervention should not be needed. Currently, in order to use this
> capability, the configuration must be changed, but if we made this the
> default, it should work without any additional interaction.
>
> Regards,
> Bence
>
> > On 2023. Jul 19., at 14:57, Joe Witt <joe.w...@gmail.com> wrote:
> >
> > Agree functionally
> >
> > How does this handle disk usage?  Any manual intervention needed?  What
> > if the disk is full where it writes?  What if the db somehow becomes
> > corrupted?
> >
> > I'd like to ensure this thing is zero ops as much as possible such that in
> > error conditions it resets and gets going again.
> >
> > Thanks
> >
> > On Wed, Jul 19, 2023 at 8:55 AM Pierre Villard
> > <pierre.villard...@gmail.com> wrote:
> >
> >> I do think this provides great value. The possibility to get access to
> >> status history of the components and at system level across restarts is a
> >> great improvement for NiFi troubleshooting. It also gives the ability to
> >> store this information for a longer period of time. I'm definitely in
> >> favor of making this the default starting with NiFi 2.0.
> >>
> >> On Wed, Jul 19, 2023 at 13:49, Simon Bence <simonbence....@gmail.com>
> >> wrote:
> >>
> >>> Hi Community,
> >>>
> >>> I was wondering whether it would make sense to set QuestDB as the
> >>> default status history backend in 2.0. It has been there for a while and
> >>> I would consider it a step forward, so the new major version might be a
> >>> good time to bring it to a wider audience. It comes with less memory
> >>> usage for bigger flows and the possibility of checking status
> >>> information when the node is not running or has been restarted, so I
> >>> think it is worth consideration. Any insight or improvement point is
> >>> appreciated, thanks!
> >>>
> >>> Regards,
> >>> Bence
> >>
>
>
