[ 
https://issues.apache.org/jira/browse/HIVE-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459981#comment-16459981
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-19354:
---------------------------------------------------------

I made a fix for this issue and uploading the patch to run precommit tests.

The issue here is that when we pass a UTC string with timezone, in the line 

[https://github.com/apache/hive/blob/41de95318d80df282fbed17ede6b3a05f649cce9/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java#L1264]

the Timestamp.from() method ignores the timezone as timestamp does not really 
have a concept of timezone.

Hence, the change in my patch is to change the method to accept a ZoneId, and 
if ZoneId is present, convert the Instant to a LocalDateTime which will 
represent the date and time according to the ZoneId. Now, when we convert it to 
TimeStamp, it will be in accordance with the timezone we need.

> from_utc_timestamp returns incorrect results for datetime values with timezone
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-19354
>                 URL: https://issues.apache.org/jira/browse/HIVE-19354
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.0
>            Reporter: Bruce Robbins
>            Assignee: Bharathkrishna Guruvayoor Murali
>            Priority: Major
>         Attachments: HIVE-19354.01.patch
>
>
> On the master branch, from_utc_timestamp returns incorrect results for 
> datetime strings that containĀ a timezone:
> {noformat}
> hive> select from_utc_timestamp('2000-10-10 00:00:00+00:00', 
> 'America/Los_Angeles');
> OK
> 2000-10-09 10:00:00
> Time taken: 0.294 seconds, Fetched: 1 row(s)
> hive> select from_utc_timestamp('2000-10-10 00:00:00', 'America/Los_Angeles');
> OK
> 2000-10-09 17:00:00
> Time taken: 0.121 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}
> Both inputs are 2000-10-10 00:00:00 in UTC time, but I got two different 
> results.
> In version 2.3.3, from_utc_timestamp doesn't accept timezones in its input 
> strings, so it does not have this bug:
> {noformat}
> hive> select from_utc_timestamp('2000-10-10 00:00:00+00:00', 
> 'America/Los_Angeles');
> OK
> NULL
> Time taken: 5.152 seconds, Fetched: 1 row(s)
> hive> select from_utc_timestamp('2000-10-10 00:00:00', 'America/Los_Angeles');
> OK
> 2000-10-09 17:00:00
> Time taken: 0.069 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}
> Since the function is expecting a UTC datetime value, it probably should 
> continue to reject input that contains a timezone component.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to