[ 
https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302837#comment-14302837
 ] 

Yang Yang commented on HIVE-6394:
---------------------------------

The Parquet spec on logical types, and on timestamps specifically, seems to say 
(https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md):
"TIMESTAMP_MILLIS is used for a combined logical date and time type. It must 
annotate an int64 that stores the number of milliseconds from the Unix epoch, 
00:00:00.000 on 1 January 1970, UTC."


I.e., it says that the type is precise only to the millisecond and that it 
counts from 1970.


But if you look at the hive-parquet code in 
https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java#L142
https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTime.java#L54
it seems that Hive's encoding of timestamps in Parquet follows a different spec: 
it is precise to the nanosecond and counts from "Monday, January 1, 4713 " 
(the Julian day epoch, 4713 BC, as defined in jodd.datetime.JDateTime).


So is Hive's Parquet timestamp storage completely different from the above spec?
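
To make the difference concrete, here is a rough sketch of how the two 
encodings relate. It is not Hive's actual code: the class and method names are 
hypothetical, it assumes the Hive-side value is a (Julian day number, 
nanoseconds within that day) pair as NanoTime.java suggests, it assumes the day 
is counted from midnight UTC, and it ignores any time zone adjustment the real 
converter may apply. The one constant it relies on is that 1 January 1970 has 
Julian day number 2440588.

```java
public final class ParquetTimestampSketch {
    // Julian day number of 1970-01-01, the Unix epoch.
    static final long JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588L;
    static final long MILLIS_PER_DAY = 86_400_000L;
    static final long NANOS_PER_MILLI = 1_000_000L;

    // (Julian day, nanos of day) -> milliseconds since the Unix epoch.
    // Sub-millisecond precision is dropped, which is exactly what a
    // TIMESTAMP_MILLIS int64 cannot represent.
    static long toEpochMillis(int julianDay, long nanosOfDay) {
        long daysSinceEpoch = julianDay - JULIAN_DAY_OF_UNIX_EPOCH;
        return daysSinceEpoch * MILLIS_PER_DAY + nanosOfDay / NANOS_PER_MILLI;
    }

    // Milliseconds since the Unix epoch -> Julian day number.
    // floorDiv keeps instants before 1970 on the correct (earlier) day.
    static int julianDayOf(long epochMillis) {
        return (int) (Math.floorDiv(epochMillis, MILLIS_PER_DAY)
                + JULIAN_DAY_OF_UNIX_EPOCH);
    }

    // Milliseconds since the Unix epoch -> nanoseconds within that day.
    static long nanosOfDayOf(long epochMillis) {
        return Math.floorMod(epochMillis, MILLIS_PER_DAY) * NANOS_PER_MILLI;
    }

    public static void main(String[] args) {
        long millis = System.currentTimeMillis();
        int day = julianDayOf(millis);
        long nanos = nanosOfDayOf(millis);
        System.out.println("epoch millis  = " + millis);
        System.out.println("julian day    = " + day);
        System.out.println("nanos of day  = " + nanos);
        System.out.println("round trip ok = " + (toEpochMillis(day, nanos) == millis));
    }
}
```

Under these assumptions the two representations can be converted into each 
other, but only by giving up the sub-millisecond part, and a reader expecting 
an int64 annotated with TIMESTAMP_MILLIS would not be able to interpret the 
Hive-written value directly.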




> Implement Timestmap in ParquetSerde
> -----------------------------------
>
>                 Key: HIVE-6394
>                 URL: https://issues.apache.org/jira/browse/HIVE-6394
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Serializers/Deserializers
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Szehon Ho
>              Labels: Parquet
>             Fix For: 0.14.0
>
>         Attachments: HIVE-6394.2.patch, HIVE-6394.3.patch, HIVE-6394.4.patch, 
> HIVE-6394.5.patch, HIVE-6394.6.patch, HIVE-6394.6.patch, HIVE-6394.7.patch, 
> HIVE-6394.patch
>
>
> This JIRA is to implement timestamp support in Parquet SerDe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
