Jesus Camacho Rodriguez created HIVE-20007:
----------------------------------------------
Summary: Hive should carry out timestamp computations in UTC
Key: HIVE-20007
URL: https://issues.apache.org/jira/browse/HIVE-20007
Project: Hive
Issue Type: Sub-task
Components: Hive
Reporter: Ryan Blue
Assignee: Jesus Camacho Rodriguez
Fix For: 3.1.0
Hive currently uses the "local" time of a java.sql.Timestamp to represent the
SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use
{{Timestamp#getYear()}} and similar methods to implement SQL functions like
{{year}}.
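When the SQL session's time zone observes DST, such as America/Los_Angeles,
which alternates between PST and PDT, some local times cannot be represented
because the zone skips them at the spring-forward transition. For example,
02:10 on 2015-03-08 does not exist in America/Los_Angeles, so the literal is
silently shifted forward by one hour: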
{code}
hive> select TIMESTAMP '2015-03-08 02:10:00.101';
2015-03-08 03:10:00.101
{code}
Using UTC instead of the SQL session time zone as the underlying zone for a
java.sql.Timestamp avoids this bug, while still returning correct values for
{{getYear}} etc. Using UTC as the convenience representation (timestamp without
time zone has no real zone) would make timestamp calculations more consistent
and avoid similar problems in the future.
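The behavior described above can be reproduced outside Hive with plain
java.sql.Timestamp. The following is a minimal sketch, not Hive code: the class
{{DstGapDemo}} and the helper {{roundTrip}} are hypothetical names, and the JVM
default zone is used as a stand-in for the SQL session zone.

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class DstGapDemo {
    // Hypothetical helper: parse a timestamp literal while the JVM default
    // zone is set to zoneId, then render it back to a string.
    static String roundTrip(String zoneId, String literal) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            // Timestamp.valueOf interprets the fields in the default zone.
            return Timestamp.valueOf(literal).toString();
        } finally {
            // Restore the original default so other code is unaffected.
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        String literal = "2015-03-08 02:10:00.101";
        // 02:10 falls in the spring-forward gap, so it comes back as 03:10.
        System.out.println(roundTrip("America/Los_Angeles", literal));
        // UTC has no DST transitions, so the literal survives unchanged.
        System.out.println(roundTrip("UTC", literal));
    }
}
```

With UTC as the underlying zone every wall-clock value maps to a distinct
instant, which is why field accessors such as {{getYear}} keep working while
the gap disappears.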
Notably, this would break the {{unix_timestamp}} UDF, which is documented to
return its result with respect to ["the default timezone and default
locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
That function would need to be updated to explicitly use the zone given by
{{System.getProperty("user.timezone")}}.
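To illustrate why {{unix_timestamp}} needs an explicit zone once timestamps are
held in UTC, here is a rough sketch using java.time. It is an assumption about
how the conversion could look, not the actual UDF: {{UnixTimestampSketch}} and
{{toEpochSeconds}} are hypothetical names, and {{ZoneId.systemDefault()}}
stands in for the {{user.timezone}} zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

public class UnixTimestampSketch {
    // Hypothetical helper: interpret a zoneless wall-clock value in the given
    // zone and return seconds since the Unix epoch.
    static long toEpochSeconds(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toEpochSecond();
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2015, 3, 8, 12, 0, 0);
        // The same wall-clock value yields different epoch seconds per zone.
        System.out.println(toEpochSeconds(ts, ZoneId.of("UTC")));
        System.out.println(toEpochSeconds(ts, ZoneId.of("America/Los_Angeles")));
        // Assumed stand-in for the "default timezone" in the UDF's contract.
        System.out.println(toEpochSeconds(ts, ZoneId.systemDefault()));
    }
}
```

The point of the sketch is that the zone becomes an explicit argument of the
conversion instead of an implicit property of the stored timestamp.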