hudi-bot opened a new issue, #14539:
URL: https://github.com/apache/hudi/issues/14539

   The quick-start guide (https://hudi.apache.org/docs/quick-start-guide.html) 
example data has a column `partitionpath`, which holds values like 
`americas/brazil/sao_paulo`. Using the docker environment's spark-shell, you 
can change the `basePath` from the quickstart to 
`hdfs://user/hive/warehouse/hudi_trips_cow` and write the table. The folder 
then appears in the HDFS browser, similar to the `stock_ticks_cow` folder 
created in the docker demo.
   
   However, if you try to use run_sync_tool.sh to sync the table to Hive, you 
get the error: "java.lang.IllegalArgumentException: Partition key parts 
[partitionpath] does not match with partition values [americas, brazil, 
sao_paulo]. Check partition strategy. "
   
   > `/var/hoodie/ws/hudi-hive/run_sync_tool.sh --jdbc-url jdbc:hive2://hiveserver:10000 --user hive --pass hive --partitioned-by partitionpath --partition-value-extractor org.apache.hudi.hive.MultiPartKeysValueExtractor -MultiPartKeysValueExtractor -base-path /user/hive/warehouse/hudi_trips_cow --database default --table hudi_trips_cow`
   
   This error is thrown in `HoodieHiveClient.getPartitionClause`, which uses 
`extractPartitionValuesInPath` to get the list of partition values. The problem 
is that it compares the number of partition values to the number of partition 
fields. In this example there is only one partition field, `partitionpath`, 
whose value is split into three partition values, so the check fails and throws 
the exception.
   
   See 
[https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L182]
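
The mismatch described above can be reproduced outside Hudi with a minimal sketch. This is not the code in `HoodieHiveClient`; the class and method names below (`PartitionCheckSketch`, `splitOnSlash`, `checkPartition`) are hypothetical, and the extractor is assumed to simply split the partition path on "/".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the check described in this issue; not Hudi's source.
final class PartitionCheckSketch {

    // Assumption: the extractor splits the partition path on "/".
    static List<String> splitOnSlash(String partitionPath) {
        return Arrays.asList(partitionPath.split("/"));
    }

    // Throws when the number of configured partition fields differs from
    // the number of values extracted from the partition path.
    static void checkPartition(List<String> partitionFields, String partitionPath) {
        List<String> partitionValues = splitOnSlash(partitionPath);
        if (partitionFields.size() != partitionValues.size()) {
            throw new IllegalArgumentException("Partition key parts " + partitionFields
                + " does not match with partition values " + partitionValues
                + ". Check partition strategy. ");
        }
    }

    public static void main(String[] args) {
        String path = "americas/brazil/sao_paulo";

        // Three fields for three path segments: the check passes.
        checkPartition(Arrays.asList("continent", "country", "city"), path);

        try {
            // One field for three path segments: the check fails.
            checkPartition(Arrays.asList("partitionpath"), path);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With one field and a three-segment path the sizes are 1 and 3, so the second call throws, which matches the behaviour reported here.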
   
    
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-628
   - Type: Bug
   - Attachment(s):
     - 21/Feb/20 23:06;popart;stack_trace.txt;https://issues.apache.org/jira/secure/attachment/12994147/stack_trace.txt
   
   
   ---
   
   
   ## Comments
   
   22/Feb/20 00:25;vbalaji;@Andrew Wong, this is expected if you use 
`MultiPartKeysValueExtractor`, as it splits the path on "/". You might want to 
give three partition fields (continent, country, city) for 
`americas/brazil/sao_paulo`. If you want to treat them as one field, you can 
add a new implementation of `PartitionValueExtractor` and plug it in.
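
A rough sketch of the second suggestion, treating the whole path as one value, might look like the following. The interface here is a local stand-in so the example compiles on its own. It assumes Hudi's `PartitionValueExtractor` exposes an `extractPartitionValuesInPath(String)` method returning a list of values, as the issue text implies. `PartitionValueExtractorSketch` and `SinglePartPathValueExtractor` are hypothetical names, and whether Hive accepts a partition value containing "/" has not been verified here.

```java
import java.util.Collections;
import java.util.List;

// Local stand-in for the extractor interface (assumption: the real Hudi
// interface has a method of this shape).
interface PartitionValueExtractorSketch {
    List<String> extractPartitionValuesInPath(String partitionPath);
}

// Hypothetical extractor that returns the full partition path as a single
// value, so one partition field lines up with one partition value.
final class SinglePartPathValueExtractor implements PartitionValueExtractorSketch {
    @Override
    public List<String> extractPartitionValuesInPath(String partitionPath) {
        return Collections.singletonList(partitionPath);
    }
}
```

In a real setup the class would implement the Hudi interface instead of the stand-in, and its fully qualified name would be passed to `--partition-value-extractor` in place of `MultiPartKeysValueExtractor`.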


