hudi-bot opened a new issue, #14539: URL: https://github.com/apache/hudi/issues/14539
The [https://hudi.apache.org/docs/quick-start-guide.html] example data has a column `partitionpath` which holds values like `americas/brazil/sao_paulo`. Using the docker environment's spark-shell, you can change the basePath from the quickstart to save to hdfs://user/hive/warehouse/hudi_trips_cow and write the table. Then you can see the folder in the HDFS browser, similar to the stock_ticks_cow folder created in the docker demo.

However, if you try to use run_sync_tool.sh to sync the table to Hive, you get the error: "java.lang.IllegalArgumentException: Partition key parts [partitionpath] does not match with partition values [americas, brazil, sao_paulo]. Check partition strategy."

> `/var/hoodie/ws/hudi-hive/run_sync_tool.sh --jdbc-url jdbc:hive2://hiveserver:10000 --user hive --pass hive --partitioned-by partitionpath --partition-value-extractor org.apache.hudi.hive.MultiPartKeysValueExtractor --base-path /user/hive/warehouse/hudi_trips_cow --database default --table hudi_trips_cow`

This error is thrown in `HoodieHiveClient.getPartitionClause`, which uses `extractPartitionValuesInPath` to get a list of partitionValues. The problem is that it compares the length of the partitionValues to the length of the partitionField. In this example, there is only 1 partitionField, "partitionpath", which is split into 3 partitionValues. Thus the check fails and throws the exception. See [https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L182]

## JIRA info

- Link: https://issues.apache.org/jira/browse/HUDI-628
- Type: Bug
- Attachment(s):
  - 21/Feb/20 23:06;popart;stack_trace.txt;https://issues.apache.org/jira/secure/attachment/12994147/stack_trace.txt

---

## Comments

22/Feb/20 00:25;vbalaji;@Andrew Wong, This is expected if you use MultiPartKeysValueExtractor as it splits by "/".
You might want to give 3 fields as partition fields (continent, country, city) for "americas/brazil/sao_paulo". If you want to treat them as one field, you can simply add a new implementation for PartitionValueExtractor and plug it in.
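
To make the mismatch and the suggested workaround concrete, here is a minimal, self-contained Java sketch. It does not use the Hudi classes: `ValueExtractor` is a local stand-in assumed to mirror the `extractPartitionValuesInPath` method mentioned in this issue, `checkPartitionValues` is a simplified reconstruction of the length check described above, and `SlashSplittingExtractor` and `WholePathExtractor` are hypothetical names introduced for illustration only. Whether Hive accepts a single partition value that contains slashes is not covered here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PartitionExtractorSketch {

    // Local stand-in (assumption) for Hudi's PartitionValueExtractor,
    // so that this sketch compiles without the Hudi dependency.
    interface ValueExtractor {
        List<String> extractPartitionValuesInPath(String partitionPath);
    }

    // Mimics the behavior described in the comment above: split the path on "/".
    static class SlashSplittingExtractor implements ValueExtractor {
        @Override
        public List<String> extractPartitionValuesInPath(String partitionPath) {
            return Arrays.asList(partitionPath.split("/"));
        }
    }

    // Hypothetical alternative: treat the whole path as one partition value.
    static class WholePathExtractor implements ValueExtractor {
        @Override
        public List<String> extractPartitionValuesInPath(String partitionPath) {
            return Collections.singletonList(partitionPath);
        }
    }

    // Simplified version of the length check described in this issue:
    // the number of partition fields must equal the number of extracted values.
    static void checkPartitionValues(List<String> partitionFields, List<String> partitionValues) {
        if (partitionFields.size() != partitionValues.size()) {
            throw new IllegalArgumentException("Partition key parts " + partitionFields
                + " does not match with partition values " + partitionValues
                + ". Check partition strategy. ");
        }
    }

    public static void main(String[] args) {
        String path = "americas/brazil/sao_paulo";
        List<String> oneField = Collections.singletonList("partitionpath");
        List<String> threeFields = Arrays.asList("continent", "country", "city");

        // One field, three values: the check fails as reported.
        try {
            checkPartitionValues(oneField, new SlashSplittingExtractor().extractPartitionValuesInPath(path));
        } catch (IllegalArgumentException e) {
            System.out.println("Failed: " + e.getMessage());
        }

        // Option 1: declare three partition fields to match the three values.
        checkPartitionValues(threeFields, new SlashSplittingExtractor().extractPartitionValuesInPath(path));

        // Option 2: keep one field and return the whole path as a single value.
        checkPartitionValues(oneField, new WholePathExtractor().extractPartitionValuesInPath(path));
        System.out.println("Both workarounds pass the length check.");
    }
}
```

In a real setup, the second option would mean implementing Hudi's own `PartitionValueExtractor` interface and passing that class name via `--partition-value-extractor`, as the comment suggests; the classes above only model the idea.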
