Thanks for starting a thread, Jack! I have yet to go through the proposal.

I recently came across a similar idea in BigQuery, which relies on a staleness 
threshold:
https://cloud.google.com/blog/products/data-analytics/bigquery-gains-change-data-capture-functionality/

It would also be nice to check if there are any applicable ideas in Paimon:
https://github.com/apache/incubator-paimon/

- Anton

> On Apr 26, 2023, at 11:32 AM, Jack Ye <yezhao...@gmail.com> wrote:
> 
> Hi everyone,
> 
> As we discussed in the community sync, it looks like we have some general 
> interest in improving the CDC streaming process. Dan mentioned that Ryan has 
> a proposal about an alternative CDC approach that has an accumulated 
> changelog that is periodically synced to a target table.
> 
> I have been working for quite some time on a similar design doc that 
> describes a set of improvements we could make to the Iceberg CDC use case, 
> and it contains a closely related improvement (see improvement 3).
> 
> I would appreciate feedback from the community about this doc, and I can 
> organize some meetings to discuss our thoughts about this topic afterwards.
> 
> Doc link: 
> https://docs.google.com/document/d/1kyyJp4masbd1FrIKUHF1ED_z1hTARL8bNoKCgb7fhSQ/edit#
> 
> Best,
> Jack Ye
