Hello

Here are a couple of examples where it seems like we definitely have some work
to do for the QuestDB bits.  Right now it is blocking full clean builds on
Linux machines in my case, though it works on OSX.

See:
https://issues.apache.org/jira/browse/NIFI-11896
https://issues.apache.org/jira/browse/NIFI-11897

Thanks

On Wed, Jul 19, 2023 at 8:15 AM Mark Payne <marka...@hotmail.com> wrote:

> Sounds good. Thanks Bence.
>
> > On Jul 19, 2023, at 11:07 AM, Simon Bence <simonbence....@gmail.com> wrote:
> >
> > Thanks for the feedback from everyone!
> >
> > As I understand it, the intention is supported and, with some preparation
> > (covering the cases mentioned), it can be done. I will raise a PR in the
> > foreseeable future to address these questions.
> >
> > Regards,
> > Bence
> >
> >> On 2023. Jul 19., at 16:01, David Handermann <exceptionfact...@apache.org> wrote:
> >>
> >> Thanks for the suggestion and additional background, Bence; that is very
> >> helpful in evaluating the default inclusion approach.
> >>
> >> I agree with Joe's concern about handling potential corruption. We have
> >> recently reduced dependency on the H2 file-backed database driver so that
> >> it is now limited to Flow Configuration History. Based on experience
> >> there, NiFi can fail to start when the database file is corrupted, which
> >> is not ideal. We should look into improving that behavior, allowing NiFi
> >> to start and saving off the corrupted file instead of failing to start.
> >> If we go forward with QuestDB as the default strategy for status history,
> >> we should build in the resilient approach as a prerequisite to enabling
> >> it in the default configuration.
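
For what it's worth, the "save off the corrupted file" behavior David describes
could look roughly like the sketch below. This is only an illustration of the
idea; StatusRepository, RepositoryCorruptedException, and openRepository are
made-up stand-ins, not NiFi's actual repository API.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.time.Instant;

    public class ResilientRepositoryStartup {

        // Try to open the status repository; if the on-disk database looks
        // corrupted, move it aside for later inspection and start with a
        // fresh one instead of failing NiFi startup.
        static StatusRepository openOrRecover(Path repositoryDir) throws IOException {
            try {
                return openRepository(repositoryDir);
            } catch (RepositoryCorruptedException e) {
                Path quarantine = repositoryDir.resolveSibling(
                        repositoryDir.getFileName() + ".corrupted-" + Instant.now().toEpochMilli());
                Files.move(repositoryDir, quarantine);
                Files.createDirectories(repositoryDir);
                return openRepository(repositoryDir);
            }
        }

        // Hypothetical types standing in for the real repository implementation.
        interface StatusRepository {}
        static class RepositoryCorruptedException extends RuntimeException {}
        static StatusRepository openRepository(Path dir) { return new StatusRepository() {}; }
    }
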
> >>
> >> Regards,
> >> David Handermann
> >>
> >> On Wed, Jul 19, 2023 at 8:31 AM Simon Bence <simonbence....@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the quick feedback!
> >>>
> >>> Joe: your concerns are relevant; let me provide some details:
> >>>
> >>> The database uses some disk space, determined by the number of components
> >>> and the number of covered days. While adding it I checked the usage over
> >>> time and, although I don't have the numbers any more, it seemed
> >>> reasonable. I can do a bit of testing and bring some numbers to improve
> >>> confidence in it. Additionally, the necessary disk space is limited: we
> >>> have a rollover handling capability, which limits the amount of stored
> >>> data to the target number of days plus one. This is due to the
> >>> limitations of QuestDB partitioning: at the time of development the
> >>> smallest partitioning strategy was day based, if I remember correctly, so
> >>> the unit of deletion was the partition that had just shifted out past the
> >>> threshold. (Now hour-based partitioning looks to be available, which
> >>> might be worth the effort to upgrade to.)
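
To make the rollover mechanics Bence describes a bit more concrete, here is a
minimal sketch of the kind of partition-dropping statement involved, assuming a
day-partitioned table. The table and column names are invented for
illustration, and the SQL is handed to a caller-supplied executor rather than
to any specific QuestDB API.

    import java.time.LocalDate;
    import java.util.function.Consumer;

    public class StatusRetention {

        // Drop partitions older than the configured number of days. With
        // day-based partitions the smallest unit of deletion is a whole day,
        // which is why roughly "target days + 1" worth of data can remain on
        // disk between rollovers.
        static void rollOver(int daysToKeep, Consumer<String> sqlExecutor) {
            LocalDate cutoff = LocalDate.now().minusDays(daysToKeep);
            // Placeholder table/column names, not NiFi's actual schema.
            String sql = "ALTER TABLE componentStatus DROP PARTITION WHERE capturedAt < '"
                    + cutoff + "'";
            sqlExecutor.accept(sql);
        }

        public static void main(String[] args) {
            // Example: keep three days of history; just print the statement here.
            rollOver(3, System.out::println);
        }
    }
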
> >>>
> >>> The current rollover deletes all the data older than the threshold, but I
> >>> am thinking of adding a new implementation which keeps some aggregated
> >>> information about the components. That of course needs some more space,
> >>> again depending on the number of components and the retention time.
> >>>
> >>> In case the disk is full, we have no way to push metrics down to the
> >>> database, and currently there is no fallback strategy for it. A possible
> >>> way would be to temporarily keep the data in memory (similar to the
> >>> VolatileComponentStatusRepository in that regard), but on a node whose
> >>> resources are close to their limits I am not convinced that writing the
> >>> data into memory instead of to disk is necessarily a good strategy. This
> >>> is something to consider.
> >>>
> >>> If the database becomes corrupted then we lose the status information. I
> >>> think this is true for most persisted storage; however, as long as the
> >>> database files are not changed by external tools, I would think there is
> >>> an insignificant chance of this happening. Fallback strategies might be
> >>> added (for example, if NiFi considers the database corrupted, it might
> >>> start a new one), but even without this I think the QuestDB-based
> >>> solution has its merits compared to the in-memory storage.
> >>>
> >>> Manual intervention should not be needed. Currently, in order to use this
> >>> capability the configuration must be changed, but if we made this the
> >>> default it should work without any additional interaction.
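
For reference, enabling it today is essentially a one-property change in
nifi.properties along the lines below. The property names and defaults are
quoted from memory, so please double-check them against the Admin Guide before
relying on them.

    # Switch the status history repository from the in-memory default to QuestDB
    nifi.components.status.repository.implementation=org.apache.nifi.controller.status.history.EmbeddedQuestDbStatusHistoryRepository

    # QuestDB-backed repository settings (retention and on-disk location)
    nifi.status.repository.questdb.persist.node.days=14
    nifi.status.repository.questdb.persist.component.days=3
    nifi.status.repository.questdb.persist.location=./status_repository
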
> >>>
> >>> Regards,
> >>> Bence
> >>>
> >>>> On 2023. Jul 19., at 14:57, Joe Witt <joe.w...@gmail.com> wrote:
> >>>>
> >>>> Agree functionally
> >>>>
> >>>> How does this handle disk usage? Any manual intervention needed? What if
> >>>> the disk is full where it writes? What if the db somehow becomes
> >>>> corrupted?
> >>>>
> >>>> I'd like to ensure this thing is zero-ops as much as possible, such that
> >>>> in error conditions it resets and gets going again.
> >>>>
> >>>> Thanks
> >>>>
> >>>> On Wed, Jul 19, 2023 at 8:55 AM Pierre Villard <pierre.villard...@gmail.com> wrote:
> >>>>
> >>>>> I do think this provides great value. The possibility to get access to
> >>>>> the status history of components, and at the system level, across
> >>>>> restarts is a great improvement for NiFi troubleshooting. It also gives
> >>>>> the ability to store this information for a longer period of time. I'm
> >>>>> definitely in favor of making this the default starting with NiFi 2.0.
> >>>>>
> >>>>> On Wed, Jul 19, 2023 at 13:49, Simon Bence <simonbence....@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Community,
> >>>>>>
> >>>>>> I was wondering whether it would make sense to set QuestDB as the
> >>>>>> default status history backend in 2.0. It has been there for a while
> >>>>>> and I would consider it a step forward, so the new major version might
> >>>>>> be a good time to bring it to the wider audience. It comes with lower
> >>>>>> memory usage for bigger flows and the possibility of checking status
> >>>>>> information when the node is not running or has been restarted, so I
> >>>>>> think it is worth consideration. Any insight or improvement point is
> >>>>>> appreciated, thanks!
> >>>>>>
> >>>>>> Regards,
> >>>>>> Bence
> >>>>>
> >>>
> >>>
> >
>
>
