[ 
https://issues.apache.org/jira/browse/HUDI-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

董可伦 reassigned HUDI-4001:
-------------------------

    Assignee: 董可伦

> "hoodie.datasource.write.operation" from table config should not be used as 
> write operation
> -------------------------------------------------------------------------------------------
>
>                 Key: HUDI-4001
>                 URL: https://issues.apache.org/jira/browse/HUDI-4001
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: spark-sql
>            Reporter: Ethan Guo
>            Assignee: 董可伦
>            Priority: Critical
>             Fix For: 0.11.1
>
>
> [https://github.com/apache/hudi/issues/5248]
> When I create a table with Spark SQL and set 
> *hoodie.datasource.write.operation*=upsert, a DELETE statement (like PR 
> [#5215|https://github.com/apache/hudi/pull/5215]), an INSERT OVERWRITE 
> statement, etc. will still use *hoodie.datasource.write.operation* and 
> upsert the records instead of running delete, insert_overwrite, etc.
> For example: create a table with hoodie.datasource.write.operation set to 
> upsert. When I then run a DELETE statement, the operation key the command 
> sets, *OPERATION.key -> DataSourceWriteOptions.DELETE_OPERATION_OPT_VAL*, 
> takes no effect: it is overwritten to *upsert* by 
> hoodie.datasource.write.operation from the table config or the environment:
> withSparkConf(sparkSession, hoodieCatalogTable.catalogProperties) {
>   Map(
>     "path" -> path,
>     RECORDKEY_FIELD.key -> hoodieCatalogTable.primaryKeys.mkString(","),
>     TBL_NAME.key -> tableConfig.getTableName,
>     HIVE_STYLE_PARTITIONING.key -> tableConfig.getHiveStylePartitioningEnable,
>     URL_ENCODE_PARTITIONING.key -> tableConfig.getUrlEncodePartitioning,
>     KEYGENERATOR_CLASS_NAME.key -> classOf[SqlKeyGenerator].getCanonicalName,
>     SqlKeyGenerator.ORIGIN_KEYGEN_CLASS_NAME -> tableConfig.getKeyGeneratorClassName,
>     OPERATION.key -> DataSourceWriteOptions.DELETE_OPERATION_OPT_VAL,
>     PARTITIONPATH_FIELD.key -> tableConfig.getPartitionFieldProp,
>     HiveSyncConfig.HIVE_SYNC_MODE.key -> HiveSyncMode.HMS.name(),
>     HiveSyncConfig.HIVE_SUPPORT_TIMESTAMP_TYPE.key -> "true",
>     HoodieWriteConfig.DELETE_PARALLELISM_VALUE.key -> "200",
>     SqlKeyGenerator.PARTITION_SCHEMA -> partitionSchema.toDDL
>   )
> }
> So, for the SQL path, how about not writing hoodie.datasource.write.operation 
> into hoodie.properties at all, and restricting it in the SQL checks, since 
> each command generates its own write operation at runtime?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
