nsivabalan commented on issue #2656: URL: https://github.com/apache/hudi/issues/2656#issuecomment-821757914
I think I understand what's happening. In COW, when Hudi creates a new data file, it reads the existing data and merges it with the incoming data. From the merging standpoint, (partition path, record key) pairs are treated as unique, so even if we insert the same batch again, the new data file will not contain duplicated records. One option is to generate a unique key for every record, as suggested by @pengzhiwei2018.
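To make the dedup behavior concrete, here is a minimal sketch (plain Python, not Hudi code) of merge-by-key semantics: records collapse on (partition path, record key), so re-upserting the same batch adds nothing, while assigning a fresh unique key per record (the suggested workaround) makes every insert land as a new row. The field names and `uuid4` usage are illustrative assumptions, not Hudi's internals.

```python
import uuid

def upsert(table, batch):
    """Merge a batch into the table; records are deduplicated on
    (partition_path, record_key), and later writes win per key.
    This mimics COW merge semantics; it is not Hudi code."""
    for rec in batch:
        key = (rec["partition_path"], rec["record_key"])
        table[key] = rec
    return table

batch = [
    {"partition_path": "2021/03", "record_key": "id-1", "val": 10},
    {"partition_path": "2021/03", "record_key": "id-2", "val": 20},
]

table = {}
upsert(table, batch)
upsert(table, batch)  # same batch again: merged away, no duplicates
assert len(table) == 2

# Workaround discussed above: give every record a unique key
# (uuid4 here is illustrative), so each insert creates new rows.
unique_batch = [{**r, "record_key": str(uuid.uuid4())} for r in batch]
upsert(table, unique_batch)
assert len(table) == 4
```

The trade-off, of course, is that with synthetic unique keys you lose the ability to update an existing record by its natural key.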