[ 
https://issues.apache.org/jira/browse/HIVE-25576?focusedWorklogId=664847&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664847
 ]

ASF GitHub Bot logged work on HIVE-25576:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Oct/21 18:20
            Start Date: 13/Oct/21 18:20
    Worklog Time Spent: 10m 
      Work Description: zabetak commented on a change in pull request #2690:
URL: https://github.com/apache/hive/pull/2690#discussion_r728036705



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -3711,6 +3711,11 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
         "1800s", new TimeValidator(TimeUnit.SECONDS),
         "Interval to synchronize privileges from external authorizer periodically in HS2"),
 
+    HIVE_LEGACY_TIMEPARSER_POLICY("hive.legacy.timeparser.policy", false,

Review comment:
       How about renaming to `hive.datetime.formatter.legacy.enabled`? That way 
we create a `hive.datetime` namespace which we can use for other props in the 
future.

##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -3711,6 +3711,11 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
         "1800s", new TimeValidator(TimeUnit.SECONDS),
         "Interval to synchronize privileges from external authorizer periodically in HS2"),
 
+    HIVE_LEGACY_TIMEPARSER_POLICY("hive.legacy.timeparser.policy", false,

Review comment:
       One thing I am a bit skeptical about: when I search for 
`SimpleDateFormat` or `DateTimeFormatter` in the repo, I find occurrences in 
various places (including other UDFs). I am wondering what we should do with 
the other parts that use the new/old formatters. The current description of the 
property makes me believe it has a global effect, but in this PR we 
are only touching the Unix timestamp UDFs.

##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -3711,6 +3711,11 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
         "1800s", new TimeValidator(TimeUnit.SECONDS),
         "Interval to synchronize privileges from external authorizer periodically in HS2"),
 
+    HIVE_LEGACY_TIMEPARSER_POLICY("hive.legacy.timeparser.policy", false,
+        "When true, java.text.SimpleDateFormat is used for formatting and parsing\n"
+            + "dates/timestamps in a locale-sensitive manner, which is the approach before Hive 3.x.\n"
+            + "When set to false, classes from java.time.* packages are used for the same purpose.\n"
+            + "The default value is false, RuntimeException is thrown when we will get different results."),

Review comment:
       Do we throw an exception? I don't think so.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 664847)
    Time Spent: 1h 20m  (was: 1h 10m)

> Add config to parse date with older date format
> -----------------------------------------------
>
>                 Key: HIVE-25576
>                 URL: https://issues.apache.org/jira/browse/HIVE-25576
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2, 4.0.0
>            Reporter: Ashish Sharma
>            Assignee: Ashish Sharma
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation mentions that "Unfortunately, the API for these 
> functions was not amenable to internationalization" and that "The corresponding 
> methods in Date are deprecated". Because of this, the result is wrong.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
>         .parseCaseInsensitive()
>         .appendPattern(pattern)
>         .toFormatter();
>     ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
>     Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes 
> issues when migrating to the new version, because older data written 
> with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> 1. *True* - use *SimpleDateFormat* 
> 2. *False* - use *DateTimeFormatter*
> Note: Apache Spark also faces the same issue 
> https://issues.apache.org/jira/browse/SPARK-30668
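
For illustration only, here is a minimal standalone sketch of the switch proposed in the issue description above. It is not the code from the pull request: the class name `LegacyTimeParserSketch`, the method `roundTrip` and its `legacy` flag are hypothetical stand-ins for the proposed boolean property, and the time zone is passed explicitly instead of being read from `hive.local.time.zone` or the VM default. The two branches mirror the "Implementation details" snippets quoted in the issue.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class LegacyTimeParserSketch {

    private static final String OUTPUT_PATTERN = "yyyy-MM-dd HH:mm:ss";

    /**
     * Parses {@code text} with {@code pattern} and renders the instant in {@code zone}.
     * The {@code legacy} flag is a hypothetical stand-in for the proposed config value:
     * true selects java.text.SimpleDateFormat, false selects java.time.
     */
    static String roundTrip(String text, String pattern, String zone, boolean legacy) {
        if (legacy) {
            try {
                // Legacy path: java.text / java.util classes, as in the Hive 1.2 snippet.
                SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
                long epochSeconds = parser.parse(text).getTime() / 1000;
                SimpleDateFormat printer = new SimpleDateFormat(OUTPUT_PATTERN, Locale.US);
                printer.setTimeZone(TimeZone.getTimeZone(zone));
                return printer.format(new Date(epochSeconds * 1000L));
            } catch (ParseException e) {
                throw new IllegalArgumentException("Cannot parse: " + text, e);
            }
        }
        // New path: java.time classes, as in the master-branch snippet.
        DateTimeFormatter parser = new DateTimeFormatterBuilder()
                .parseCaseInsensitive()
                .appendPattern(pattern)
                .toFormatter(Locale.US);
        ZonedDateTime zoned = ZonedDateTime.parse(text, parser)
                .withZoneSameInstant(ZoneId.of(zone));
        return zoned.format(DateTimeFormatter.ofPattern(OUTPUT_PATTERN, Locale.US));
    }

    public static void main(String[] args) {
        String text = "1800-01-01 00:00:00 UTC";
        String pattern = "yyyy-MM-dd HH:mm:ss z";
        String zone = "Asia/Bangkok";
        System.out.println("legacy    = " + roundTrip(text, pattern, zone, true));
        System.out.println("java.time = " + roundTrip(text, pattern, zone, false));
    }
}
```

Running `main` prints both renderings side by side, so the difference reported in the issue (07:00:00 on Hive 1.2 versus 06:42:04 on master) can be checked against the local JDK and its time zone data.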



