onlywangyh commented on code in PR #5434:
URL: https://github.com/apache/hudi/pull/5434#discussion_r858555712
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/FilePathUtils.java:
##########
@@ -420,9 +420,9 @@ public static org.apache.flink.core.fs.Path toFlinkPath(Path path) {
    * @return array of the partition fields
    */
   public static String[] extractPartitionKeys(org.apache.flink.configuration.Configuration conf) {
-    if (FlinkOptions.isDefaultValueDefined(conf, FlinkOptions.PARTITION_PATH_FIELD)) {
+    if (FlinkOptions.isDefaultValueDefined(conf, FlinkOptions.HIVE_SYNC_PARTITION_FIELDS)) {
       return new String[0];
     }
-    return conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
+    return conf.getString(FlinkOptions.HIVE_SYNC_PARTITION_FIELDS).split(",");
   }

Review Comment:
In HiveSyncContext, this `PARTITION_PATH_FIELD` value is assigned to the Hive sync partition fields. I think the two params `PARTITION_PATH_FIELD` and `HIVE_SYNC_PARTITION_FIELDS` have different meanings in Hudi.

`PARTITION_PATH_FIELD` is used by the Hudi KeyGenerator to build a partitionPath.
`HIVE_SYNC_PARTITION_FIELDS` is used by Hive to set a partition field.

The function _extractPartitionKeys_ should return the Hive partition field keys rather than the Hudi partition path field. Confusing the values of the two can cause errors.

In this case we use TimestampBasedAvroKeyGenerator and set the Hudi partition path field to the same name as the Hive partition field. This causes problems, see:

`PARTITION_PATH_FIELD=datetime`
`HIVE_SYNC_PARTITION_FIELDS=datetime`

**In Hudi:** we get the value _1596074902000L_, which is converted to a string Hudi partition path like _2020-07-30_.

**In Hive:** we get a table like:

```
CREATE EXTERNAL TABLE `testTable`(
  `_hoodie_commit_time` string COMMENT '',
  `_hoodie_commit_seqno` string COMMENT '',
  `_hoodie_record_key` string COMMENT '',
  `_hoodie_partition_path` string COMMENT '',
  `_hoodie_file_name` string COMMENT '',
  `id` int COMMENT '',
  `datetime` bigint COMMENT ''
)
PARTITIONED BY (`datetime` string COMMENT '')...
```

The partition value _datetime=2020-07-30_ will also be added to Hive. The problem is that we can no longer get the `datetime` value from Hive, and the partition is also broken.
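
To make the mismatch concrete, here is a minimal sketch of the conversion described above. It does not call Hudi or TimestampBasedAvroKeyGenerator; the class name `PartitionFieldDemo`, the `yyyy-MM-dd` output format, and the UTC time zone are assumptions chosen only for illustration. It shows how one `datetime` field ends up as a `bigint` in the data and as a string in the partition path.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class; it uses only java.time and no Hudi API.
class PartitionFieldDemo {

    // Assumed output format and time zone for the partition path.
    private static final DateTimeFormatter DAY_FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    // Converts an epoch-millisecond field value into a day-level partition path.
    static String toPartitionPath(long epochMillis) {
        return DAY_FORMAT.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // Value stored in the `datetime` bigint data column.
        long datetime = 1596074902000L;
        // Value that lands in the `datetime` string partition column.
        String partition = toPartitionPath(datetime);

        System.out.println("column datetime    = " + datetime);
        System.out.println("partition datetime = " + partition); // 2020-07-30
    }
}
```

Both values are named `datetime`, but they have different types and contents, which is why the data column and the partition column collide in the Hive table.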