[ https://issues.apache.org/jira/browse/HUDI-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinoth Chandar updated HUDI-1376:
---------------------------------
    Fix Version/s: 0.7.0

> Drop Hudi metadata columns before Spark datasource writing
> -----------------------------------------------------------
>
>                 Key: HUDI-1376
>                 URL: https://issues.apache.org/jira/browse/HUDI-1376
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Wenning Ding
>            Assignee: Wenning Ding
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.7.0
>
>
> When a Hudi table is updated through the Spark datasource, the writer uses
> the schema of the input DataFrame as the schema stored in the commit files.
> As a result, when the upserted rows contain Hudi metadata columns, the commit
> file also stores the schema of those metadata columns, which is unnecessary
> in common cases. It also causes an issue for bootstrap tables.
> Since the metadata columns are not used during the Spark datasource write
> process, we can drop them at the beginning of the write.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
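
The idea in the issue description can be sketched as follows. This is a minimal illustration in Python, not the change made in Hudi itself (the actual fix lives in Hudi's Spark writer and may differ). The helper names `non_meta_columns` and `drop_hudi_meta_columns` are hypothetical; the column names are Hudi's five standard `_hoodie_*` metadata fields. The second helper assumes a DataFrame-like object exposing `.columns` and `.drop(*names)`, as a PySpark DataFrame does.

```python
# Hudi's five standard record-level metadata columns.
HOODIE_META_COLUMNS = (
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
)


def non_meta_columns(columns):
    """Return the column names that are not Hudi metadata columns, in order."""
    meta = set(HOODIE_META_COLUMNS)
    return [c for c in columns if c not in meta]


def drop_hudi_meta_columns(df):
    """Drop Hudi metadata columns from a DataFrame-like object.

    Assumption: `df` exposes `.columns` and `.drop(*names)` like a PySpark
    DataFrame. Metadata columns that are absent are skipped, so the helper
    is safe to call on input that never came from a Hudi table.
    """
    present = [c for c in HOODIE_META_COLUMNS if c in df.columns]
    return df.drop(*present) if present else df
```

With PySpark, one would call `drop_hudi_meta_columns(df)` on a DataFrame that was read back from a Hudi table before passing it to the Hudi writer, so that the schema recorded in the commit file contains only the user's columns.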