cdmikechen opened a new pull request, #3391: URL: https://github.com/apache/hudi/pull/3391
### Change Logs

This pull request lets Hive read timestamp-type column data correctly. The problem was originally reported in JIRA [HUDI-83](https://issues.apache.org/jira/browse/HUDI-83) and in issue https://github.com/apache/hudi/issues/2544.

- Change `HoodieParquetInputFormat` to use a custom `ParquetInputFormat` named `HudiAvroParquetInputFormat`.
- In `HudiAvroParquetInputFormat`, use a custom `RecordReader` named `HudiAvroParquetReader`. This class uses `AvroReadSupport` so that Hive reads Parquet data as Avro `GenericRecord`s.
- Use `org.apache.hudi.hadoop.utils.HoodieRealtimeRecordReaderUtils.avroToArrayWritable` to transform a `GenericRecord` into an `ArrayWritable`. Timestamp/date handling covering the differing behaviors of Hive 2 and Hive 3 is also added to this method.
- Change the default value of `hoodie.datasource.hive_sync.support_timestamp` from false to true.
- Add a `supportAvroRead` flag for compatibility with how some older Hudi versions handled Hive 3 timestamp/date types.

### Impact

- hudi-hadoop-mr
- spark

### Risk level

Low.

### Documentation Update

The Javadoc has been updated; the website documentation will follow in a separate PR.

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
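As background on the conversion step described in the Change Logs: Avro's `timestamp-micros` logical type stores microseconds since the epoch in a `long`, while Hive's timestamp columns expect a `java.sql.Timestamp`-style value. The following JDK-only sketch illustrates that conversion; it is an assumption about the kind of logic `avroToArrayWritable` performs, not the actual Hudi implementation (the class name `AvroTimestampDemo` is invented for the example).

```java
import java.sql.Timestamp;
import java.time.Instant;

public class AvroTimestampDemo {

    // Avro timestamp-micros is a long count of microseconds since the epoch.
    // Split it into whole seconds and the nanosecond remainder so sub-second
    // precision survives the conversion (floorDiv/floorMod also handle
    // pre-1970 negative values correctly).
    static Timestamp microsToTimestamp(long micros) {
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
        return Timestamp.from(Instant.ofEpochSecond(seconds, nanos));
    }

    public static void main(String[] args) {
        long micros = 1_627_776_000_123_456L; // 2021-08-01T00:00:00.123456Z
        Timestamp ts = microsToTimestamp(micros);
        System.out.println(ts.toInstant());
    }
}
```

A naive `new Timestamp(micros / 1000)` would silently truncate microsecond precision, which is why the seconds/nanos split matters when bridging Avro and Hive types.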