Alexey,

>> If a CDC agent is restarted, it will have to start from scratch
>> If a CDC reader does not keep up with the WAL write rate (e.g. there is a short-term write burst and WAL archive is small), the Ignite node will delete WAL segments while the consumer is still reading it.
I think these cases can be resolved with the following approach: PostgreSQL can be configured to execute a shell command after a WAL segment is archived. We can do the same thing for Ignite. The command can create a hardlink to such a WAL segment in a specified directory, so the segment is not lost when Ignite deletes it, and notify the CDC (or another kind of process) about the segment. That gives us a filesystem queue: after a restart the CDC only needs to process the segments located in this directory, so there is no need to start from scratch. Once a WAL segment has been processed by the CDC, its hardlink is removed from the queue directory. A rough Java sketch of this queue is at the bottom of this message, below the quoted thread.

Fri, Oct 16, 2020 at 13:42, Alexey Goncharuk <[email protected]>:

> Hello Nikolay,
>
> Thanks for the suggestion, it definitely may be a good feature, however, I do not see any significant value that it currently adds to the already existing WAL Iterator. I think the following issues should be addressed, otherwise, no regular user will be able to use the CDC reliably:
>
>    - The interface exposes WALRecord which is a private API
>    - There is no way to start capturing changes from a certain point (a watermark for already processed data). Users can configure a large size for the WAL archive to sustain long node downtime for historical rebalance. If a CDC agent is restarted, it will have to start from scratch. I see that it is present in the IEP as a design choice, but I think this is a major usability issue
>    - If a CDC reader does not keep up with the WAL write rate (e.g. there is a short-term write burst and the WAL archive is small), the Ignite node will delete WAL segments while the consumer is still reading them. Since the consumer is running out-of-process, we need to specify some sort of synchronization protocol between the node and the consumer
>    - If an Ignite node crashes, gets restarted and initiates full rebalance, the consumer will lose some updates
>    - Usually, it makes sense for the CDC consumer to read updates only on primary nodes (otherwise, multiple agents will be doing duplicate work). In the current design, the consumer will not be able to differentiate primary/backup updates. Moreover, even if we wrote such flags to the WAL, the consumer would need to process backup records anyway because it is unknown whether the primary consumer is alive. In other words, how would an end user organize the CDC failover minimizing the duplicate work?
>
>
> Wed, Oct 14, 2020 at 14:21, Nikolay Izhikov <[email protected]>:
>
> > Hello, Igniters.
> >
> > I want to start a discussion of the new feature [1]
> >
> > CDC - Capture Data Change. The feature allows the consumer to receive online notifications about data record changes.
> >
> > It can be used in the following scenarios:
> >    * Export data into some warehouse, full-text search, or distributed log system.
> >    * Online statistics and analytics.
> >    * Wait and respond to some specific events or data changes.
> >
> > I propose to implement the new IgniteCDC application as follows:
> >    * Runs on the server node host.
> >    * Watches for the appearance of WAL archive segments.
> >    * Iterates them using the existing WALIterator and notifies the consumer of each record from the segment.
> >
> > IgniteCDC features:
> >    * Independence from the server node process (JVM) - issues and failures of the consumer will not lead to server node instability.
> >    * Notification guarantees and failover - i.e. the CDC tracks and saves the pointer to the last consumed record. Notification continues from this pointer in case of restart.
> >    * Resilience for the consumer - it's not an issue when a consumer temporarily consumes slower than data appears.
> >
> > WDYT?
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-59+CDC+-+Capture+Data+Change
>
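
To make this a bit more concrete, here is a rough Java sketch of such a filesystem queue. It is only an illustration of the approach, not an existing Ignite API: the CdcQueueSketch class, the method names and the directory layout are all invented, and the notification step is assumed to be a simple watch on the queue directory.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Rough sketch of the proposed filesystem queue. All class, method and
 * directory names here are made up for illustration; this is not an
 * existing Ignite API.
 */
public class CdcQueueSketch {
    /** Directory that acts as the CDC queue. */
    private final Path queueDir;

    public CdcQueueSketch(Path queueDir) throws IOException {
        this.queueDir = Files.createDirectories(queueDir);
    }

    /**
     * Invoked right after a WAL segment is archived (the Ignite analogue of
     * PostgreSQL's archive_command hook). The hardlink keeps the segment's
     * data on disk even after Ignite removes the archived file.
     */
    public void onSegmentArchived(Path archivedSegment) throws IOException {
        Path link = queueDir.resolve(archivedSegment.getFileName());

        if (!Files.exists(link))
            Files.createLink(link, archivedSegment);

        // Notification of the CDC process can be as simple as the CDC
        // watching queueDir for new files.
    }

    /**
     * Invoked by the CDC process after the segment has been fully handed to
     * the consumer; deleting the hardlink is the acknowledgement.
     */
    public void onSegmentProcessed(Path queuedSegment) throws IOException {
        Files.deleteIfExists(queuedSegment);
    }

    public static void main(String[] args) throws IOException {
        // Purely illustrative paths and segment name.
        Path archiveDir = Files.createDirectories(Paths.get("work/wal/archive"));
        Path archived = archiveDir.resolve("0000000000000042.wal");

        if (!Files.exists(archived))
            Files.write(archived, new byte[0]); // stand-in for a real WAL segment

        CdcQueueSketch queue = new CdcQueueSketch(Paths.get("work/cdc-queue"));

        // Node side: enqueue the freshly archived segment.
        queue.onSegmentArchived(archived);

        // CDC side: iterate the segment (e.g. with WALIterator), push records
        // to the consumer, then acknowledge by removing the hardlink.
        queue.onSegmentProcessed(Paths.get("work/cdc-queue").resolve(archived.getFileName()));
    }
}

The key property is that the hardlink keeps the segment's bytes reachable even after Ignite unlinks the archived file (both directories must be on the same filesystem), and removing the hardlink doubles as the acknowledgement, so the queue directory itself is the only state the CDC needs across restarts.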
