Alexey,

>> If a CDC agent is restarted, it will have to start from scratch
>> If a CDC reader does not keep up with the WAL write rate (e.g. there is a short-term write burst and WAL archive is small), the Ignite node will delete WAL segments while the consumer is still reading it.
I think these cases can be resolved with the following approach: PostgreSQL can be configured to execute a shell command after a WAL segment is archived. We can do the same thing for Ignite. The command can create a hardlink to such a WAL segment in a specified directory, so the segment is not lost when Ignite deletes it, and notify the CDC (or another kind of process) about the segment. That gives us a filesystem queue: after a restart the CDC only needs to process the segments located in this directory, so there is no need to start from scratch. Once a WAL segment has been processed by the CDC, its hardlink is removed from the queue directory. A rough Java sketch of this queue is at the bottom of this message, below the quoted thread.

Fri, Oct 16, 2020 at 13:42, Alexey Goncharuk <[email protected]>:

> Hello Nikolay,
>
> Thanks for the suggestion, it definitely may be a good feature, however, I do not see any significant value that it currently adds to the already existing WAL Iterator. I think the following issues should be addressed, otherwise, no regular user will be able to use the CDC reliably:
>
>    - The interface exposes WALRecord which is a private API
>    - There is no way to start capturing changes from a certain point (a watermark for already processed data). Users can configure a large size for the WAL archive to sustain long node downtime for historical rebalance. If a CDC agent is restarted, it will have to start from scratch. I see that it is present in the IEP as a design choice, but I think this is a major usability issue
>    - If a CDC reader does not keep up with the WAL write rate (e.g. there is a short-term write burst and the WAL archive is small), the Ignite node will delete WAL segments while the consumer is still reading them. Since the consumer is running out-of-process, we need to specify some sort of synchronization protocol between the node and the consumer
>    - If an Ignite node crashes, gets restarted and initiates full rebalance, the consumer will lose some updates
>    - Usually, it makes sense for the CDC consumer to read updates only on primary nodes (otherwise, multiple agents will be doing duplicate work). In the current design, the consumer will not be able to differentiate primary/backup updates. Moreover, even if we wrote such flags to the WAL, the consumer would need to process backup records anyway because it is unknown whether the primary consumer is alive. In other words, how would an end user organize the CDC failover minimizing the duplicate work?
>
>
> Wed, Oct 14, 2020 at 14:21, Nikolay Izhikov <[email protected]>:
>
> > Hello, Igniters.
> >
> > I want to start a discussion of the new feature [1]
> >
> > CDC - Capture Data Change. The feature allows the consumer to receive online notifications about data record changes.
> >
> > It can be used in the following scenarios:
> >    * Export data into some warehouse, full-text search, or distributed log system.
> >    * Online statistics and analytics.
> >    * Wait and respond to some specific events or data changes.
> >
> > I propose to implement the new IgniteCDC application as follows:
> >    * Runs on the server node host.
> >    * Watches for the appearance of WAL archive segments.
> >    * Iterates them using the existing WALIterator and notifies the consumer of each record from the segment.
> >
> > IgniteCDC features:
> >    * Independence from the server node process (JVM) - issues and failures of the consumer will not lead to server node instability.
> >    * Notification guarantees and failover - i.e. the CDC tracks and saves the pointer to the last consumed record. Notification continues from this pointer in case of restart.
> >    * Resilience for the consumer - it's not an issue when a consumer temporarily consumes slower than data appears.
> >
> > WDYT?
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-59+CDC+-+Capture+Data+Change
>
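
To make this a bit more concrete, here is a rough Java sketch of such a filesystem queue. It is only an illustration of the approach, not an existing Ignite API: the CdcQueueSketch class, the method names and the directory layout are all invented, and the notification step is assumed to be a simple watch on the queue directory.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Rough sketch of the proposed filesystem queue. All class, method and
 * directory names here are made up for illustration; this is not an
 * existing Ignite API.
 */
public class CdcQueueSketch {
    /** Directory that acts as the CDC queue. */
    private final Path queueDir;

    public CdcQueueSketch(Path queueDir) throws IOException {
        this.queueDir = Files.createDirectories(queueDir);
    }

    /**
     * Invoked right after a WAL segment is archived (the Ignite analogue of
     * PostgreSQL's archive_command hook). The hardlink keeps the segment's
     * data on disk even after Ignite removes the archived file.
     */
    public void onSegmentArchived(Path archivedSegment) throws IOException {
        Path link = queueDir.resolve(archivedSegment.getFileName());

        if (!Files.exists(link))
            Files.createLink(link, archivedSegment);

        // Notification of the CDC process can be as simple as the CDC
        // watching queueDir for new files.
    }

    /**
     * Invoked by the CDC process after the segment has been fully handed to
     * the consumer; deleting the hardlink is the acknowledgement.
     */
    public void onSegmentProcessed(Path queuedSegment) throws IOException {
        Files.deleteIfExists(queuedSegment);
    }

    public static void main(String[] args) throws IOException {
        // Purely illustrative paths and segment name.
        Path archiveDir = Files.createDirectories(Paths.get("work/wal/archive"));
        Path archived = archiveDir.resolve("0000000000000042.wal");

        if (!Files.exists(archived))
            Files.write(archived, new byte[0]); // stand-in for a real WAL segment

        CdcQueueSketch queue = new CdcQueueSketch(Paths.get("work/cdc-queue"));

        // Node side: enqueue the freshly archived segment.
        queue.onSegmentArchived(archived);

        // CDC side: iterate the segment (e.g. with WALIterator), push records
        // to the consumer, then acknowledge by removing the hardlink.
        queue.onSegmentProcessed(Paths.get("work/cdc-queue").resolve(archived.getFileName()));
    }
}

The key property is that the hardlink keeps the segment's bytes reachable even after Ignite unlinks the archived file (both directories must be on the same filesystem), and removing the hardlink doubles as the acknowledgement, so the queue directory itself is the only state the CDC needs across restarts.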
