It's a good question. Let me ping @Leonard to share more thoughts.


vtygoss <> wrote on Fri, May 20, 2022 at 16:04:

> Hi community!
> I'm migrating from a full-data pipeline (with Spark) to an
> incremental-data pipeline (with Flink CDC), and I've run into a problem
> validating accuracy between the Flink-based and Spark-based pipelines.
> For bounded data, it's simple to check whether the two result sets are
> consistent.
> But for unbounded data and event-driven applications, how can I make sure
> the data stream produced is correct, especially when there are retract
> functions with high impact, e.g. row_number?
> Is there any documentation on this problem? Thanks for any suggestions
> or replies.
> Best Regards!
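
One possible way to approach the question above, offered as a rough sketch rather than an answer confirmed in this thread: fold the retract/changelog stream into the set of rows that are still live, then compare that snapshot with the batch result computed over the same input cutoff. The change kinds below follow Flink's RowKind short strings (+I, -U, +U, -D). The function name `materialize` and the sample rows are hypothetical, and the sketch assumes both pipelines can be stopped or snapshotted at the same source position.

```python
from collections import Counter

def materialize(changelog):
    """Fold (kind, row) change events into the multiset of live rows."""
    state = Counter()
    for kind, row in changelog:
        if kind in ("+I", "+U"):      # insert / update-after: row becomes live
            state[row] += 1
        elif kind in ("-U", "-D"):    # update-before / delete: row is retracted
            state[row] -= 1
        else:
            raise ValueError(f"unknown change kind: {kind}")
    # A negative count means a row was retracted without ever being emitted.
    over_retracted = [row for row, count in state.items() if count < 0]
    if over_retracted:
        raise ValueError(f"retraction without matching insert: {over_retracted}")
    return Counter({row: count for row, count in state.items() if count > 0})

# Hypothetical top-1-per-user output (user, row_number, amount) from the stream.
changelog = [
    ("+I", ("alice", 1, 10)),
    ("-U", ("alice", 1, 10)),   # alice's top row is retracted...
    ("+U", ("alice", 1, 30)),   # ...and replaced by a larger amount
    ("+I", ("bob", 1, 10)),
]

# Hypothetical batch result over the same input.
batch_rows = [("alice", 1, 30), ("bob", 1, 10)]

stream_snapshot = materialize(changelog)
is_consistent = stream_snapshot == Counter(batch_rows)
print("consistent" if is_consistent else "mismatch")
```

Comparing multisets rather than plain sets keeps duplicate rows visible, and raising on an unmatched retraction surfaces one class of streaming bug that a final-state comparison alone would hide.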
