zhedoubushishi opened a new pull request #1953: URL: https://github.com/apache/hudi/pull/1953
## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

When a ```fixed_len_byte_array``` decimal column is used as the Hudi record key, Hudi does not display the decimal value correctly; it displays the underlying byte array instead.

During the write phase, Hudi saves the Parquet source data into an Avro Generic Record. For example, suppose the source Parquet data has a column of decimal type:
```
optional fixed_len_byte_array(16) LN_LQDN_OBJ_ID (DECIMAL(38,0));
```

Hudi then converts it into the following Avro decimal type:
```
{
  "name" : "LN_LQDN_OBJ_ID",
  "type" : [ {
    "type" : "fixed",
    "name" : "fixed",
    "namespace" : "hoodie.hudi_ln_lqdn.hudi_ln_lqdn_record.LN_LQDN_OBJ_ID",
    "size" : 16,
    "logicalType" : "decimal",
    "precision" : 38,
    "scale" : 0
  }, "null" ]
}
```
This decimal field is stored as a fixed-length byte array. In the read phase, Hudi converts this byte array back to a readable decimal value through this [converter](https://github.com/apache/hudi/blob/master/hudi-spark/src/main/scala/org/apache/hudi/AvroConversionHelper.scala#L58).

The problem is that when a decimal column is set as the record key, Hudi reads the value from the Avro Generic Record and converts it directly into a ```String``` (see [here](https://github.com/apache/hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/DataSourceUtils.java#L76)). As a result, the ```_hoodie_record_key``` field shows something like:
```LN_LQDN_OBJ_ID:[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 40, 95, -71]```.

So we need to handle this special case and convert the byte array back to a decimal before converting it to a ```String```.
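To illustrate the conversion described above, here is a minimal sketch using only `java.math`. It is not the code in this pull request: the class `DecimalKeySketch` and its methods are hypothetical names, and it assumes the fixed bytes hold the unscaled value as a big-endian two's-complement integer (the encoding the Avro specification defines for the decimal logical type) and that the scale is passed in by the caller rather than read from the schema.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of Hudi.
final class DecimalKeySketch {
    private DecimalKeySketch() {}

    // Interprets the bytes as a big-endian two's-complement unscaled value
    // and applies the given scale (assumed to come from the Avro schema).
    static BigDecimal fixedBytesToDecimal(byte[] unscaledBytes, int scale) {
        return new BigDecimal(new BigInteger(unscaledBytes), scale);
    }

    // Builds a "field:value" key fragment from the decoded decimal
    // instead of from the raw byte array.
    static String toRecordKeyPart(String fieldName, byte[] unscaledBytes, int scale) {
        return fieldName + ":" + fixedBytesToDecimal(unscaledBytes, scale).toPlainString();
    }
}
```

With the byte array from the example above and scale 0, `DecimalKeySketch.toRecordKeyPart("LN_LQDN_OBJ_ID", bytes, 0)` yields `LN_LQDN_OBJ_ID:422076345` rather than the printed byte array. In Hudi itself the precision and scale would presumably be taken from the field's Avro decimal logical type instead of being hard-coded.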
## Brief change log

Similar to what we did for Date type columns (https://github.com/apache/hudi/commit/2d040145810b8b14c59c5882f9115698351039d1#diff-21f77fb372831d468dab018505592e12), I added logic to handle decimal type columns.

## Verify this pull request

This change added tests and can be verified as follows:
- *Added a decimal test case in TestDataSourceUtils.java to verify the change.*

## Committer checklist
- [x] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org