[ 
https://issues.apache.org/jira/browse/HUDI-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207838#comment-17207838
 ] 

cdmikechen edited comment on HUDI-83 at 10/5/20, 6:42 AM:
----------------------------------------------------------

Some of this code may overlap with HUDI-1302. I will submit it after the PR 
for HUDI-1302 is completed.

[~uditme]
Some further changes are needed so that Hive2 recognizes the Hudi timestamp 
type in parquet-avro files.
1. Create a new class *HudiWritableTimestampObjectInspector* that extends 
WritableTimestampObjectInspector. The spark-avro package converts a timestamp 
to a long with the timestamp-micros logical type (long * 1000), so we need to 
convert the LongWritable value to a TimestampWritable.
2. Create a new class *HudiArrayWritableObjectInspector* that extends 
SettableStructObjectInspector. In this class, if the column type is timestamp, 
Hive can use *HudiWritableTimestampObjectInspector* to cast the long to a 
timestamp.
3. Create a new custom Hive SerDe class *HudiParquetAvroSerDe*. In this class, 
Hive obtains its ObjectInspector from *HudiArrayWritableObjectInspector*.
4. Change bigint to timestamp in *HiveSyncTool*, for which HUDI-1302 has an 
open PR to merge. Meanwhile, change *HiveSchemaUtil.generateCreateDDL()* to 
set *ROW FORMAT SERDE HudiParquetAvroSerDe*.
5. Add *HudiParquetAvroSerDe* to *hive.serdes.using.metastore.for.schema* in 
hive-site.xml.
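
As a rough illustration of the conversion in step 1, here is a minimal sketch 
of turning a timestamp-micros long into a java.sql.Timestamp. It assumes the 
long holds microseconds since the Unix epoch, as the Avro timestamp-micros 
logical type defines. The class and method names are hypothetical and are not 
part of Hudi or Hive; the real inspector would wrap the result in a 
TimestampWritable rather than return it directly.

```java
import java.sql.Timestamp;

// Hypothetical helper for illustration only; not an existing Hudi or Hive class.
class MicrosToTimestampSketch {

    // Converts microseconds since the Unix epoch to a java.sql.Timestamp,
    // keeping microsecond precision in the nanos field.
    static Timestamp microsToTimestamp(long micros) {
        // floorDiv/floorMod keep the sub-second part non-negative
        // for values before the epoch.
        long seconds = Math.floorDiv(micros, 1_000_000L);
        int microsOfSecond = (int) Math.floorMod(micros, 1_000_000L);

        // Build the timestamp from whole seconds, then set the fraction.
        Timestamp ts = new Timestamp(seconds * 1000L);
        ts.setNanos(microsOfSecond * 1000);
        return ts;
    }

    public static void main(String[] args) {
        Timestamp ts = microsToTimestamp(1_601_880_120_123_456L);
        // Print timezone-independent fields instead of ts.toString().
        System.out.println("epoch millis = " + ts.getTime());
        System.out.println("nanos        = " + ts.getNanos());
    }
}
```

Setting the nanos field separately avoids the precision loss that would occur 
if the value were simply divided by 1000 and passed to the millisecond 
constructor.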







> Map Timestamp type in spark to corresponding Timestamp type in Hive during 
> Hive sync
> ------------------------------------------------------------------------------------
>
>                 Key: HUDI-83
>                 URL: https://issues.apache.org/jira/browse/HUDI-83
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Hive Integration, Usability
>            Reporter: Vinoth Chandar
>            Assignee: cdmikechen
>            Priority: Major
>              Labels: bug-bash-0.6.0
>             Fix For: 0.7.0
>
>
> [https://github.com/apache/incubator-hudi/issues/543] & related issues



