Hi community,

CDC (change data capture) is an important CarbonData feature, because merging source data changes into a target table is a common use case in data analytics. With the current CDC design, performance is poor when both the target table and the input source data are large. This mail is to discuss improving this in phases.
In an offline community discussion, we decided to go with min-max based pruning in phase 1. Please find the design doc at the link below, and share your inputs/suggestions.

https://docs.google.com/document/d/1Qa4yEbKYsYo7LUnnKjKHzF-DMtRUSfjkh9arWawPLSc/edit?usp=sharing

Thanks and Regards,
Akash R