Thanks Jack for the great write-up. Good summary of the current landscape
of CDC too. Left a few comments to discuss.

On Wed, Apr 26, 2023 at 11:38 AM Anton Okolnychyi
<aokolnyc...@apple.com.invalid> wrote:

> Thanks for starting a thread, Jack! I have yet to go through the proposal.
>
> I recently came across a similar idea in BigQuery, which relies on a
> staleness threshold:
>
> https://cloud.google.com/blog/products/data-analytics/bigquery-gains-change-data-capture-functionality/
>
> It would also be nice to check if there are any applicable ideas in Paimon:
> https://github.com/apache/incubator-paimon/
>
> - Anton
>
> On Apr 26, 2023, at 11:32 AM, Jack Ye <yezhao...@gmail.com> wrote:
>
> Hi everyone,
>
> As we discussed in the community sync, it looks like we have some general
> interest in improving the CDC streaming process. Dan mentioned that Ryan
> has a proposal for an alternative CDC approach in which an accumulated
> changelog is periodically synced to a target table.
>
> I have been working on a very similar design doc for quite some time. It
> describes a set of improvements we could make to the Iceberg CDC use case,
> and it contains a very similar improvement (see improvement 3).
>
> I would appreciate feedback from the community about this doc, and I can
> organize some meetings to discuss our thoughts about this topic afterwards.
>
> Doc link:
> https://docs.google.com/document/d/1kyyJp4masbd1FrIKUHF1ED_z1hTARL8bNoKCgb7fhSQ/edit#
>
> Best,
> Jack Ye
