Hi @Stamatis Zampetakis<mailto:zabe...@gmail.com>, @David<mailto:dam6...@gmail.com>,
Our current implementation using DateTimeFormatter is not backward compatible and it leads to migration issues. One of our customer who have this use-case where we don't have a better options to migrate. Hive 1.2/Spark 2.4 (Shared metastore): Set VM time zone to Asia/Bangkok. INSERT values ("1400-01-01 00:00:00") into parquet_table; // Here, parquet writer converts the data into UTC (- 07:00:00) and stored it. Migrate to Hive 3.x/Spark 3.x (Shared metastore):: Set VM time zone to Asia/Bangkok. SELECT ts from parquet_table; // Hive returns different value whereas Spark (spark.sql.legacy.timeParserPolicy=LEGACY) returns 1400-01-01 00:00:00 It is not easy to change thousands of Hive scripts to handle this difference and it adds to migration cost. I think, it is necessary to enable backward compatibility for smooth migration. Pls share your thoughts. Thanks, Sankar From: Ashish Sharma <ashishkumarsharm...@gmail.com> Sent: 29 September 2021 19:11 To: dev@hive.apache.org; u...@hive.apache.org Cc: sank...@apache.org Subject: [EXTERNAL] Raise exception instead of silent change for new DateTimeformatter History Hive 1.2 - VM time zone set to Asia/Bangkok Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 UTC','yyyy-MM-dd HH:mm:ss z')); Result - 1800-01-01 07:00:00 Implementation details - SimpleDateFormat formatter = new SimpleDateFormat(pattern); Long unixtime = formatter.parse(textval).getTime() / 1000; Date date = new Date(unixtime * 1000L); https://docs.oracle.com/javase/8/docs/api/java/util/Date.html<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.oracle.com%2Fjavase%2F8%2Fdocs%2Fapi%2Fjava%2Futil%2FDate.html&data=04%7C01%7CSankar.Hariappan%40microsoft.com%7C013a8535c2af4647fb1308d9834ede18%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637685197136779324%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=xxOBj5zDm29DTpPYC6rlgz639Dhn7vpHxALYHdn9VO0%3D&reserved=0> . In official documentation they have mentioned that "Unfortunately, the API for these functions was not amenable to internationalization and The corresponding methods in Date are deprecated" . Due to that this is producing wrong result latest hive - set hive.local.time.zone=Asia/Bangkok; Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 UTC','yyyy-MM-dd HH:mm:ss z')); Result - 1800-01-01 06:42:04 Implementation details - DateTimeFormatter dtformatter = new DateTimeFormatterBuilder() .parseCaseInsensitive() .appendPattern(pattern) .toFormatter(); ZonedDateTime zonedDateTime = ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone)); Long dttime = zonedDateTime.toInstant().getEpochSecond(); Problem- Now SimpleDateFormat has been replaced with DateTimeFormatter which is not backward compatible. Causing issues at times for migration to the new version. Because the older data written using Hive 1.x or 2.x is not compatible with DateTimeFormatter. Solution - Introduce an config "hive.legacy.timeParserPolicy" with following values - 1. EXCEPTION - compare value of both SimpleDateFormat & DateTimeFormatter raise exception if doesn't match 2. LEGACY - use SimpleDateFormat 3. CORRECTED - use DateTimeFormatter This will help hive user in the following manner - 1. Migrate to new version using LEGACY 2. Find values which are not compatible with the new version - EXCEPTION 3. Use latest date apis - CORRECTED Note: apache spark also face the same issue https://issues.apache.org/jira/browse/SPARK-30668<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FSPARK-30668&data=04%7C01%7CSankar.Hariappan%40microsoft.com%7C013a8535c2af4647fb1308d9834ede18%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637685197136779324%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=yYKfiVbSW%2FbfD5V1leqB8cH349Qb6FzYtSn5ClcZrqc%3D&reserved=0> Hive jira - https://issues.apache.org/jira/browse/HIVE-25576<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FHIVE-25576&data=04%7C01%7CSankar.Hariappan%40microsoft.com%7C013a8535c2af4647fb1308d9834ede18%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637685197136789283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=W9ZZoPgtBeA69eF%2FonPtdXdp15PG4%2F1M6rc99G%2BErcc%3D&reserved=0> Thanks Ashish Sharma