[ https://issues.apache.org/jira/browse/HUDI-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
kazdy updated HUDI-5848:
------------------------
    Summary: No PreCombineField mode - make COMBINE_BEFORE_UPSERT=false automatically  (was: If no precombine field is provided make COMBINE_BEFORE_UPSERT=false automatically)

> No PreCombineField mode - make COMBINE_BEFORE_UPSERT=false automatically
> ------------------------------------------------------------------------
>
>                 Key: HUDI-5848
>                 URL: https://issues.apache.org/jira/browse/HUDI-5848
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: dev-experience
>            Reporter: kazdy
>            Assignee: kazdy
>            Priority: Minor
>             Fix For: 0.13.1
>
>
> Starting from 0.13, the precombine field is optional in Spark.
> Previously this was only available in Flink, but in Flink COMBINE_BEFORE_UPSERT
> is set to false by default, so if no precombine field is provided, upserts can
> be done without any configuration changes.
> In Hudi + Spark, on the other hand, users must first explicitly set the
> COMBINE_BEFORE_UPSERT option to false in order to do upserts in the absence
> of a precombine field.
> As a Hudi user, if no precombine field is provided, I would like Hudi to
> automatically set COMBINE_BEFORE_UPSERT to the appropriate value, to provide
> a seamless experience.
> I assume the precombine field can be optional only if the table type is CoW; for
> MoR, precombine is required for it to work properly, so it's OK to throw an
> error in the absence of precombine when the operation is upsert.
> Therefore this should work only for CoW.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
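
The behaviour requested in the description can be sketched as a small option-resolution step. This is only an illustration under stated assumptions, not Hudi's implementation: `resolve_write_options` is a hypothetical helper, the options are treated as a plain dict of strings, the table type and operation are assumed to default to `COPY_ON_WRITE` and `upsert` when unset, and an explicit user value for COMBINE_BEFORE_UPSERT is assumed to take precedence over the automatic one.

```python
# Hypothetical helper, NOT part of Hudi: a sketch of the behaviour
# this ticket asks for, written against a plain dict of write options.

PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field"
COMBINE_BEFORE_UPSERT_KEY = "hoodie.combine.before.upsert"
TABLE_TYPE_KEY = "hoodie.datasource.write.table.type"
OPERATION_KEY = "hoodie.datasource.write.operation"


def resolve_write_options(options):
    """Return a copy of `options` with COMBINE_BEFORE_UPSERT filled in
    when no precombine field is configured (CoW only)."""
    resolved = dict(options)  # never mutate the caller's dict

    # A precombine field is present: nothing to adjust.
    if resolved.get(PRECOMBINE_KEY):
        return resolved

    # Assumed defaults when the user did not set these options.
    table_type = resolved.get(TABLE_TYPE_KEY, "COPY_ON_WRITE")
    operation = resolved.get(OPERATION_KEY, "upsert")

    # Per the ticket: MoR upserts need a precombine field, so fail loudly.
    if table_type == "MERGE_ON_READ" and operation == "upsert":
        raise ValueError(
            "A precombine field is required for upserts into MERGE_ON_READ tables"
        )

    # CoW without precombine: skip the pre-upsert combine step unless the
    # user explicitly chose a value (assumption: explicit settings win).
    if table_type == "COPY_ON_WRITE":
        resolved.setdefault(COMBINE_BEFORE_UPSERT_KEY, "false")

    return resolved
```

With this sketch, a CoW write that omits the precombine field would no longer need the user to pass `hoodie.combine.before.upsert=false` by hand, while a MoR upsert without a precombine field would be rejected up front.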