Hi Li,

Welcome. Both the delta streamer and the data source support an option to de-duplicate data before inserting. How are you planning on writing the Hudi dataset? I can point you in the right direction accordingly.
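To make "de-duplicate before inserting" concrete, here is a minimal, library-free sketch of the two ideas usually involved: collapsing duplicates within an incoming batch, and dropping incoming records whose key is already in the dataset. This is not Hudi's implementation. The field names `id` and `ts` and both function names are made up for illustration. To my knowledge the corresponding switches are `hoodie.datasource.write.insert.drop.duplicates` for the Spark data source and `--filter-dupes` for the delta streamer, but please verify them against the documentation for your version.

```python
from typing import Any, Dict, Iterable, List, Set

Record = Dict[str, Any]


def dedupe_batch(records: Iterable[Record],
                 key_field: str = "id",
                 ordering_field: str = "ts") -> List[Record]:
    """Keep one record per key, preferring the largest ordering value.

    Hypothetical helper: on a tie, the record seen first is kept.
    """
    latest: Dict[Any, Record] = {}
    for rec in records:
        key = rec[key_field]
        current = latest.get(key)
        if current is None or rec[ordering_field] > current[ordering_field]:
            latest[key] = rec
    return list(latest.values())


def drop_existing(records: Iterable[Record],
                  existing_keys: Set[Any],
                  key_field: str = "id") -> List[Record]:
    """Drop incoming records whose key is already present in the dataset."""
    return [rec for rec in records if rec[key_field] not in existing_keys]


# A small batch, as it might arrive from Kafka with a redelivered key.
batch = [
    {"id": "a", "ts": 1, "value": "old"},
    {"id": "b", "ts": 1, "value": "only"},
    {"id": "a", "ts": 2, "value": "new"},
]

deduped = dedupe_batch(batch)                           # one record per key
to_insert = drop_existing(deduped, existing_keys={"b"})  # "b" already stored
```

With this batch, `deduped` holds the `ts=2` record for key `a` plus the single record for `b`, and `to_insert` holds only the record for `a`.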
Thanks,
Vinoth

On Tue, Apr 23, 2019 at 4:12 PM Li Gao <[email protected]> wrote:

> Hi Hudi community,
>
> I am fairly new to the hudi community and trying to evaluate whether hudi's
> incremental compaction support record deduplications from landing data
> coming off kafka. If yes I want to understand how it works currently.
>
> Thank you,
> Li
