Hi community!
I'm migrating from a full-data pipeline (with Spark) to an incremental data pipeline (with Flink CDC), and I've run into a problem validating that the Flink-based pipeline produces the same results as the Spark-based one. For bounded data, it is simple to check whether the two result sets are consistent. But for unbounded data and event-driven applications, how can I make sure the produced data stream is correct, especially when the job uses functions that emit retractions and have a high impact on the output, e.g. row_number? Is there any documentation on this problem? Thanks for any suggestions or replies. Best regards!
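
To make the question concrete, here is a minimal sketch of one possible way to frame the comparison. It is an assumption for illustration, not an established Flink or Spark method. It assumes the Flink job's changelog output can be captured up to some cut-off point as (kind, row) pairs, using the row-kind short strings +I, -U, +U and -D, and that the Spark job is run over the same input up to that cut-off. The function names `materialize` and `diff` and the sample rows are hypothetical.

```python
from collections import Counter

# Row kinds, written as the short strings Flink prints for changelog rows.
INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE = "+I", "-U", "+U", "-D"


def materialize(changelog):
    """Fold (kind, row) events into a multiset of the rows currently valid."""
    state = Counter()
    for kind, row in changelog:
        if kind in (INSERT, UPDATE_AFTER):
            state[row] += 1
        elif kind in (UPDATE_BEFORE, DELETE):
            state[row] -= 1
            if state[row] < 0:
                # A retraction for a row that was never emitted is itself a bug.
                raise ValueError(f"retraction without prior insert: {row}")
            if state[row] == 0:
                del state[row]
        else:
            raise ValueError(f"unknown row kind: {kind}")
    return state


def diff(stream_state, batch_rows):
    """Return (rows missing from the stream, rows unexpected in the stream)."""
    batch_state = Counter(batch_rows)
    return batch_state - stream_state, stream_state - batch_state


# Hypothetical top-1-per-user output (user, amount, row_number).
changelog = [
    ("+I", ("u1", 10, 1)),
    ("-U", ("u1", 10, 1)),  # old top row for u1 is retracted
    ("+U", ("u1", 20, 1)),  # and replaced by the new top row
    ("+I", ("u2", 5, 1)),
]
batch_rows = [("u1", 20, 1), ("u2", 5, 1)]

missing, unexpected = diff(materialize(changelog), batch_rows)
print("missing:", dict(missing), "unexpected:", dict(unexpected))
```

This only checks the final state at the cut-off. It says nothing about whether the intermediate retractions were emitted in a sensible order or with acceptable latency, which is the part I am unsure how to validate.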