Sagar Sumit created HUDI-2496:
---------------------------------

             Summary: Inserts are precombined even with dedup disabled
                 Key: HUDI-2496
                 URL: https://issues.apache.org/jira/browse/HUDI-2496
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Sagar Sumit

Test case by [~xushiyan]: https://github.com/apache/hudi/pull/3723/files

RCA by [~shivnarayan]:
Within HoodieMergeHandle, incoming records are stored in a hashmap keyed by record key. As a result, duplicates in the first batch remain intact, but for the second batch only one record per record key is kept, and those unique records are then concatenated with the first batch.

https://github.com/apache/hudi/blob/36be28712196ff4427c41b0aa885c7fcd7356d7f/hudi-[…]-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
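
To make the RCA easier to follow, here is a minimal sketch of the behavior it describes. It is a simplified model and not Hudi code: the class InsertDedupSketch, the Rec type, and both method names are hypothetical, and the sketch assumes that the first batch is written without passing through the keyed map. It shows only why staging incoming records in a map keyed by record key collapses duplicates in a later batch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the behavior in the RCA. Hypothetical names, not Hudi code.
final class InsertDedupSketch {

    // Hypothetical stand-in for an incoming record: a record key plus a payload.
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // First batch (assumption: no keyed map involved), so duplicate keys survive.
    static List<Rec> writeFirstBatch(List<Rec> incoming) {
        return new ArrayList<>(incoming);
    }

    // Later batch: incoming records are staged in a map keyed by record key,
    // then appended to the records that already exist.
    static List<Rec> mergeLaterBatch(List<Rec> existing, List<Rec> incoming) {
        Map<String, Rec> staged = new LinkedHashMap<>();
        for (Rec r : incoming) {
            staged.put(r.key, r); // a later duplicate overwrites the earlier one
        }
        List<Rec> merged = new ArrayList<>(existing);
        merged.addAll(staged.values());
        return merged;
    }

    public static void main(String[] args) {
        List<Rec> batch = List.of(new Rec("k1", "a"), new Rec("k1", "b"));
        List<Rec> first = writeFirstBatch(batch);
        List<Rec> merged = mergeLaterBatch(first, batch);
        System.out.println("after first batch: " + first.size());   // 2
        System.out.println("after second batch: " + merged.size()); // 3, not 4
    }
}
```

In this model the same two-record batch contributes two records the first time and only one the second time, which matches the symptom in the summary: inserts are effectively deduplicated even though deduplication was not requested.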