TJX2014 commented on PR #6595: URL: https://github.com/apache/hudi/pull/6595#issuecomment-1241461060
> > But on the Flink side, I think deduplication should also be enabled by default for MOR tables. When duplicates are written to the log files, they are very hard for compaction to read, and they also make the MOR table unstable because the duplicate records are read into memory twice.
>
> Do you mean that there are two clients writing to the same partition at the same time?

Not exactly. If we deduplicate the records in memory and then write them to the log, that is an elegant approach for MOR, because the result is the same. As @danny0405 says, in the CDC case we need to retain the original records rather than compacting them in memory first, which is acceptable.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org