[ https://issues.apache.org/jira/browse/HIVE-25268?focusedWorklogId=613121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613121 ]
ASF GitHub Bot logged work on HIVE-25268:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/21 07:50
            Start Date: 22/Jun/21 07:50
    Worklog Time Spent: 10m
      Work Description: guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655134815


##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));
+    Instant instant = Instant.ofEpochSecond(ts2.toEpochSecond(), ts2.getNanos());
+    ZonedDateTime zonedDateTime = ZonedDateTime.ofInstant(instant, ZoneOffset.UTC);
+    String res = formatter.format(zonedDateTime);

Review comment:
   Without this conversion, the time zone abbreviation is changed for some specific locales:
   PDT -> PT
   CST -> CT


##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
   All the existing tests written against the SimpleDateFormat formatter pass, except for the milliseconds change that I mentioned in my comment.


##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
   Done.


##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
   Once this patch is merged, I will update the Hive wiki as well.


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 613121)
    Time Spent: 4h 10m  (was: 4h)

> date_format udf doesn't work for dates prior to 1900 if the timezone is different from UTC
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25268
>                 URL: https://issues.apache.org/jira/browse/HIVE-25268
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 3.1.0, 3.1.1, 3.1.2, 4.0.0
>            Reporter: Nikhil Gupta
>            Assignee: Nikhil Gupta
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> *Hive 1.2.1*:
> {code:java}
> select date_format('1400-01-14 01:00:00', 'yyyy-MM-dd HH:mm:ss z');
> +--------------------------+--+
> |           _c0            |
> +--------------------------+--+
> | 1400-01-14 01:00:00 ICT  |
> +--------------------------+--+
> select date_format('1800-01-14 01:00:00', 'yyyy-MM-dd HH:mm:ss z');
> +--------------------------+--+
> |           _c0            |
> +--------------------------+--+
> | 1800-01-14 01:00:00 ICT  |
> +--------------------------+--+
> {code}
> *Hive 3.1, Hive 4.0:*
> {code:java}
> select date_format('1400-01-14 01:00:00', 'yyyy-MM-dd HH:mm:ss z');
> +--------------------------+
> |           _c0            |
> +--------------------------+
> | 1400-01-06 01:17:56 ICT  |
> +--------------------------+
> select date_format('1800-01-14 01:00:00', 'yyyy-MM-dd HH:mm:ss z');
> +--------------------------+
> |           _c0            |
> +--------------------------+
> | 1800-01-14 01:17:56 ICT  |
> +--------------------------+
> {code}
> VM timezone is set to 'Asia/Bangkok'



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
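
Note on the reported output: the 17 minute 56 second shift in the Hive 3.1/4.0 result for 1800 equals the gap between the historical (local mean time) offset that java.time reports for Asia/Bangkok and the zone's current +07:00 offset. The message does not state the root cause, so the following is only a minimal sketch for checking that arithmetic. The class and method names are hypothetical and are not part of Hive; only standard java.time calls are used.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.zone.ZoneRules;

// Hypothetical helper for illustration only; not Hive code.
class LmtOffsetSketch {

  // Returns (offset on 2021-01-01) minus (offset at the given local
  // date-time), in seconds, according to the JDK's bundled tz database.
  static long shiftSeconds(String zoneId, String isoLocalDateTime) {
    ZoneRules rules = ZoneId.of(zoneId).getRules();
    LocalDateTime historical = LocalDateTime.parse(isoLocalDateTime);
    // Assumption: an arbitrary recent reference date stands in for "now".
    LocalDateTime reference = LocalDateTime.of(2021, 1, 1, 0, 0);
    ZoneOffset historicalOffset = rules.getOffset(historical);
    ZoneOffset referenceOffset = rules.getOffset(reference);
    return referenceOffset.getTotalSeconds() - historicalOffset.getTotalSeconds();
  }

  public static void main(String[] args) {
    // Same zone and timestamp as in the issue description.
    System.out.println(shiftSeconds("Asia/Bangkok", "1800-01-14T01:00:00"));
  }
}
```

With tz data that records +06:42:04 for Asia/Bangkok before 1920, the printed value is 1076 seconds, that is 17 minutes 56 seconds, which matches the 01:17:56 seen instead of 01:00:00.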