Janos Kovacs created HIVE-27742:
-----------------------------------

             Summary: LOCAL timezone value is treated as UTC instead of 
system's timezone which causes data consistency issues
                 Key: HIVE-27742
                 URL: https://issues.apache.org/jira/browse/HIVE-27742
             Project: Hive
          Issue Type: Sub-task
    Affects Versions: 4.0.0-beta-1
            Reporter: Janos Kovacs
            Assignee: Zoltán Rátkai


The Hive configuration states:
{noformat}
    HIVE_LOCAL_TIME_ZONE("hive.local.time.zone", "LOCAL",
        "Sets the time-zone for displaying and interpreting time stamps. If 
this property value is set to\n" +
        "LOCAL, it is not specified, or it is not a correct time-zone, the 
system default time-zone will be\n " +
        "used instead. Time-zone IDs can be specified as region-based zone IDs 
(based on IANA time-zone data),\n" +
        "abbreviated zone IDs, or offset IDs."),
{noformat}

But seems like in hive4 (-beta) it always treated as UTC - as any other invalid 
timezone value (see HIVE-27741).

Repro code:
{noformat}
docker rm -f hive4

export HIVE_VERSION=4.0.0-beta-2-SNAPSHOT
export HS2_ENV_TZ="Europe/Budapest"
export HS2_USER_TZ=${HS2_ENV_TZ}
export HIVE_LOCAL_TZ=${HS2_ENV_TZ}

export HS2_OPTS="-Duser.timezone=$HS2_USER_TZ 
-Dhive.local.time.zone=$HIVE_LOCAL_TZ"
export HS2_OPTS="$HS2_OPTS  
-Dhive.server2.tez.initialize.default.sessions=false"
docker run -d -p 10000:10000 -p 10001:10001 -p 10002:10002 --env 
TZ=${HS2_ENV_TZ} --env SERVICE_OPTS=${HS2_OPTS} --env SERVICE_NAME=hiveserver2 
--name hive4 apache/hive:${HIVE_VERSION}

docker exec -it hive4 beeline -u 'jdbc:hive2://localhost:10000/' -e "
SELECT '\${env:TZ}' as \`env:TZ\`,
       '\${system:user.timezone}' as \`system:user.timezone\`,
       '\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`;

DROP TABLE IF EXISTS timestamptest;
CREATE TABLE timestamptest (
  ts timestamp,
  tz timestamp with local time zone
) STORED AS TEXTFILE;
INSERT INTO timestamptest select TIMESTAMP'2016-01-03 
12:26:34',TIMESTAMPLOCALTZ'2016-01-03 12:26:34 America/Los_Angeles';

SET hive.query.results.cache.enabled=false;

SET hive.local.time.zone=LOCAL;
SELECT '\${env:TZ}' as \`env:TZ\`,
       '\${system:user.timezone}' as \`system:user.timezone\`,
       '\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`;
SELECT
  'LOCAL'                                                             as tzset,
  tz                                                                  as orig,
  to_utc_timestamp(tz, 'LOCAL')                                       as 
utc_local,
  to_utc_timestamp(tz, 'Europe/Budapest')                             as utc_tz,
  from_utc_timestamp(to_utc_timestamp(tz,'LOCAL'),'Europe/Budapest')  as to_bp
FROM timestamptest;

SET hive.local.time.zone=Europe/Budapest;
SELECT '\${env:TZ}' as \`env:TZ\`,
       '\${system:user.timezone}' as \`system:user.timezone\`,
       '\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`;
SELECT
  'Europe/Budapest'                                                             
as tzset,
  tz                                                                            
as orig,
  to_utc_timestamp(tz, 'LOCAL')                                                 
as utc_local,
  to_utc_timestamp(tz, 'Europe/Budapest')                                       
as utc_tz,
  from_utc_timestamp(to_utc_timestamp(tz,'Europe/Budapest'),'Europe/Budapest')  
as to_bp
FROM timestamptest;
"
{noformat}

The results are:
{noformat}
+------------------+-----------------------+--------------------------------+
|      env:tz      | system:user.timezone  | hiveconf:hive.local.time.zone  |
+------------------+-----------------------+--------------------------------+
| Europe/Budapest  | Europe/Budapest       | LOCAL                          |
+------------------+-----------------------+--------------------------------+

+--------+----------------------------------------+------------------------+------------------------+------------------------+
| tzset  |                  orig                  |       utc_local        |    
     utc_tz         |         to_bp          |
+--------+----------------------------------------+------------------------+------------------------+------------------------+
| LOCAL  | 2016-01-03 21:26:34.0 Europe/Budapest  | 2016-01-03 21:26:34.0  | 
2016-01-03 20:26:34.0  | 2016-01-03 22:26:34.0  |
+--------+----------------------------------------+------------------------+------------------------+------------------------+


+------------------+-----------------------+--------------------------------+
|      env:tz      | system:user.timezone  | hiveconf:hive.local.time.zone  |
+------------------+-----------------------+--------------------------------+
| Europe/Budapest  | Europe/Budapest       | Europe/Budapest                |
+------------------+-----------------------+--------------------------------+

+------------------+----------------------------------------+------------------------+------------------------+------------------------+
|      tzset       |                  orig                  |       utc_local   
     |         utc_tz         |         to_bp          |
+------------------+----------------------------------------+------------------------+------------------------+------------------------+
| Europe/Budapest  | 2016-01-03 21:26:34.0 Europe/Budapest  | 2016-01-03 
21:26:34.0  | 2016-01-03 20:26:34.0  | 2016-01-03 21:26:34.0  |
+------------------+----------------------------------------+------------------------+------------------------+------------------------+
{noformat}

The *to_bp* values show a 1h offset because LOCAL treated as UTC instead of 
system's timezone and converting from it offsets the calculated value.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to