[
https://issues.apache.org/jira/browse/HUDI-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Davis Zhang reassigned HUDI-8629:
---------------------------------
Assignee: Y Ethan Guo
> MergeInto w/ Partial updates pulls in fields from source not in assignment
> clause
> ---------------------------------------------------------------------------------
>
> Key: HUDI-8629
> URL: https://issues.apache.org/jira/browse/HUDI-8629
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: sivabalan narayanan
> Assignee: Y Ethan Guo
> Priority: Blocker
> Fix For: 1.0.1
>
> Attachments: image-2024-12-02-04-07-54-483.png
>
>
> TestPartialUpdateForMergeInto.Test partial update with MOR and Avro log
> format w/ some slight changes.
>
> spark.sql(s"set
> ${HoodieWriteConfig.MERGE_SMALL_FILE_GROUP_CANDIDATES_LIMIT.key} = 0")
> spark.sql(s"set
> ${DataSourceWriteOptions.ENABLE_MERGE_INTO_PARTIAL_UPDATES.key} = true")
> spark.sql(s"set ${HoodieStorageConfig.LOGFILE_DATA_BLOCK_FORMAT.key} =
> $logDataBlockFormat")
> spark.sql(s"set ${HoodieReaderConfig.FILE_GROUP_READER_ENABLED.key} = false")
> // Create a table with five data fields
> spark.sql(
> s"""
> |create table $tableName (
> | id int,
> | name string,
> | price long,
> | _ts long,
> | description string
> |) using hudi
> |tblproperties(
> | type ='$tableType',
> | primaryKey = 'id',
> | preCombineField = '_ts'
> |)
> |location '$basePath'
> """.stripMargin)
> spark.sql(s"insert into $tableName values (1, 'a1', 10, 1000, 'a1: desc1')," +
> "(2, 'a2', 20, 1200, 'a2: desc2'), (3, 'a3', 30.0, 1250, 'a3: desc3')")
>
>
> spark.sql(
> s"""
> |merge into $tableName t0
> |using ( select 1 as id, 'a1' as name, 12 as price, 1001 as ts
> |union select 3 as id, 'a3' as name, 25 as price, 1260 as ts) s0
> |on t0.id = s0.id
> |when matched then update set price = s0.price, _ts = s0.ts
> |""".stripMargin)
>
>
> While executing this MergeInto statement, we modify the schema to be as
> follows.
> !image-2024-12-02-04-07-54-483.png!
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)