It's a good question. Let me ping @Leonard to share more thoughts.

Best,
Shengkai

vtygoss <vtyg...@126.com> wrote on Fri, May 20, 2022 at 16:04:

> Hi community!
>
> I'm working on migrating from a full-data pipeline (with Spark) to an
> incremental data pipeline (with Flink CDC), and I've run into a problem
> validating accuracy between the Flink-based and Spark-based pipelines.
>
> For bounded data, it's simple to validate whether the two result sets are
> consistent.
>
> But for unbounded data and event-driven applications, how can I make sure
> the data stream produced is correct, especially when there are retract
> functions with high impact, e.g. row_number?
>
> Is there any documentation on this problem? Thanks for any suggestions
> or replies.
>
> Best Regards!