[
https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302837#comment-14302837
]
Yang Yang commented on HIVE-6394:
---------------------------------
The Parquet spec on logical types, and on Timestamp specifically, seems to say
https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md
"TIMESTAMP_MILLIS is used for a combined logical date and time type. It must
annotate an int64 that stores the number of milliseconds from the Unix epoch,
00:00:00.000 on 1 January 1970, UTC.
"
That is, the type is precise only to the millisecond, and it counts from the
Unix epoch in 1970.
But if you look at the hive-parquet code in
https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java#L142
https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTime.java#L54
it seems that Hive's encoding of timestamps in Parquet follows a different
spec: it is precise to the nanosecond and counts from "Monday, January 1,
4713" BC (the Julian day epoch, as defined in jodd.datetime.JDateTime).
So is Hive's Parquet timestamp storage completely different from the above
spec?
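To make the difference concrete, here is a rough sketch of how the two
representations relate. It assumes the Hive side boils down to a Julian day
number plus nanoseconds within that day (my reading of NanoTime.java; the class
and method names below are made up for illustration and are not Hive or Parquet
APIs). The only hard fact used is that 1 January 1970 is Julian day 2440588.

```java
// Illustrative sketch only: TimestampEncodings and its methods are hypothetical
// names, not part of Hive or Parquet.
class TimestampEncodings {
    // Julian day number of 1 January 1970, the Unix epoch.
    static final int JULIAN_DAY_OF_UNIX_EPOCH = 2440588;
    static final long MILLIS_PER_DAY = 86_400_000L;
    static final long NANOS_PER_MILLI = 1_000_000L;

    // (Julian day, nanos of day) -> TIMESTAMP_MILLIS-style value.
    // Sub-millisecond precision is dropped, which is the loss discussed above.
    static long toEpochMillis(int julianDay, long nanosOfDay) {
        long daysSinceEpoch = (long) julianDay - JULIAN_DAY_OF_UNIX_EPOCH;
        return daysSinceEpoch * MILLIS_PER_DAY + nanosOfDay / NANOS_PER_MILLI;
    }

    // Epoch millis -> Julian day. floorDiv keeps pre-1970 values on the right day.
    static int julianDayOf(long epochMillis) {
        return (int) (Math.floorDiv(epochMillis, MILLIS_PER_DAY) + JULIAN_DAY_OF_UNIX_EPOCH);
    }

    // Epoch millis -> nanoseconds elapsed within that day (always non-negative).
    static long nanosOfDayOf(long epochMillis) {
        return Math.floorMod(epochMillis, MILLIS_PER_DAY) * NANOS_PER_MILLI;
    }

    public static void main(String[] args) {
        // One day and 1.5 ms after the Unix epoch, expressed Julian-day style.
        System.out.println(toEpochMillis(2440589, 1_500_000L));
        System.out.println(julianDayOf(-1L) + " " + nanosOfDayOf(-1L));
    }
}
```

Going from the nanosecond form to milliseconds loses the sub-millisecond part,
so a round trip through a TIMESTAMP_MILLIS value cannot be exact in general.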
> Implement Timestmap in ParquetSerde
> -----------------------------------
>
> Key: HIVE-6394
> URL: https://issues.apache.org/jira/browse/HIVE-6394
> Project: Hive
> Issue Type: Sub-task
> Components: Serializers/Deserializers
> Reporter: Jarek Jarcec Cecho
> Assignee: Szehon Ho
> Labels: Parquet
> Fix For: 0.14.0
>
> Attachments: HIVE-6394.2.patch, HIVE-6394.3.patch, HIVE-6394.4.patch,
> HIVE-6394.5.patch, HIVE-6394.6.patch, HIVE-6394.6.patch, HIVE-6394.7.patch,
> HIVE-6394.patch
>
>
> This JIRA is to implement timestamp support in Parquet SerDe.