onlywangyh commented on code in PR #5434:
URL: https://github.com/apache/hudi/pull/5434#discussion_r858555712


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/FilePathUtils.java:
##########
@@ -420,9 +420,9 @@ public static org.apache.flink.core.fs.Path 
toFlinkPath(Path path) {
    * @return array of the partition fields
    */
   public static String[] 
extractPartitionKeys(org.apache.flink.configuration.Configuration conf) {
-    if (FlinkOptions.isDefaultValueDefined(conf, 
FlinkOptions.PARTITION_PATH_FIELD)) {
+    if (FlinkOptions.isDefaultValueDefined(conf, 
FlinkOptions.HIVE_SYNC_PARTITION_FIELDS)) {
       return new String[0];
     }
-    return conf.getString(FlinkOptions.PARTITION_PATH_FIELD).split(",");
+    return conf.getString(FlinkOptions.HIVE_SYNC_PARTITION_FIELDS).split(",");
   }

Review Comment:
   In HiveSyncContext, `PARTITION_PATH_FIELD` is assigned to the Hive sync 
partition fields. I think the two params `PARTITION_PATH_FIELD` and 
`HIVE_SYNC_PARTITION_FIELDS` have different meanings in Hudi:
   `PARTITION_PATH_FIELD` is used by the Hudi KeyGenerator to build the partition path.
   `HIVE_SYNC_PARTITION_FIELDS` is used by Hive sync to set the partition fields.
   
   The function _extractPartitionKeys_ should return the Hive partition field 
keys rather than the Hudi partition path field. Confusing the values of the 
two can cause errors.
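
A minimal sketch of the distinction, using a plain `Map` as a stand-in for the Flink `Configuration`. The class name, the `extractKeys` helper and the key strings are hypothetical and only illustrate that the two options are read independently; treating an unset option as "no partition keys" is a simplification of `isDefaultValueDefined`:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class PartitionKeysSketch {
  // Hypothetical stand-in keys; not guaranteed to match the real FlinkOptions keys.
  public static final String PARTITION_PATH_FIELD = "partition.path.field";
  public static final String HIVE_SYNC_PARTITION_FIELDS = "hive.sync.partition.fields";

  // Returns the comma-separated fields configured under the given option,
  // or an empty array when the option is unset or empty.
  public static String[] extractKeys(Map<String, String> conf, String option) {
    String value = conf.get(option);
    if (value == null || value.isEmpty()) {
      return new String[0];
    }
    return value.split(",");
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(PARTITION_PATH_FIELD, "datetime");     // input to the key generator
    conf.put(HIVE_SYNC_PARTITION_FIELDS, "dt,hh");  // partition columns synced to Hive

    System.out.println(Arrays.toString(extractKeys(conf, PARTITION_PATH_FIELD)));
    System.out.println(Arrays.toString(extractKeys(conf, HIVE_SYNC_PARTITION_FIELDS)));
  }
}
```

With values like these, the two options yield different key lists, so which one `extractPartitionKeys` reads changes the partition columns seen downstream.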
   
   
   For example, suppose we use TimestampBasedAvroKeyGenerator and set the Hudi 
partition path field to the same value as the Hive partition fields. This 
causes problems, see:
   `
   PARTITION_PATH_FIELD=datetime
   HIVE_SYNC_PARTITION_FIELDS=datetime
   `
   
   **In Hudi:** we read the value _1596074902000L_ and convert it to a 
string partition path like _2020-07-30_.  
   **In Hive:** we get a table like:
   ```
    CREATE EXTERNAL TABLE `testTable`(
      `_hoodie_commit_time` string COMMENT '',       
      `_hoodie_commit_seqno` string COMMENT '',          
      `_hoodie_record_key` string COMMENT '',
      `_hoodie_partition_path` string COMMENT '',
      `_hoodie_file_name` string COMMENT '',               
      `id` int COMMENT '',
      `datetime` bigint COMMENT ''        
      )
    PARTITIONED BY (`datetime` string COMMENT '')...
   ```
   The partition value _datetime=2020-07-30_ is also added to Hive. The 
problem is that `datetime` is now declared both as a bigint data column and 
as a string partition column, so we can't get the datetime from Hive, and 
the partition is also broken.
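
To make the mismatch concrete, here is a minimal sketch in plain Java. It is not Hudi's TimestampBasedAvroKeyGenerator; the class and method names are hypothetical, and the `yyyy-MM-dd` output format and UTC time zone are assumptions. It shows the same field ending up with two different values and types:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class PartitionValueMismatchSketch {
  // Formats epoch milliseconds as a yyyy-MM-dd partition path (UTC assumed).
  public static String toPartitionPath(long epochMillis) {
    return DateTimeFormatter.ofPattern("yyyy-MM-dd")
        .withZone(ZoneOffset.UTC)
        .format(Instant.ofEpochMilli(epochMillis));
  }

  public static void main(String[] args) {
    long datetime = 1596074902000L;                    // value stored in the data column (bigint)
    String partitionPath = toPartitionPath(datetime);  // value used for the partition (string)

    System.out.println("data column:      datetime=" + datetime);
    System.out.println("partition column: datetime=" + partitionPath);
  }
}
```

Under these assumptions the data column holds `1596074902000` while the partition column of the same name holds `2020-07-30`.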
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

