Janos Kovacs created HIVE-27742: ----------------------------------- Summary: LOCAL timezone value is treated as UTC instead of system's timezone which causes data consistency issues Key: HIVE-27742 URL: https://issues.apache.org/jira/browse/HIVE-27742 Project: Hive Issue Type: Sub-task Affects Versions: 4.0.0-beta-1 Reporter: Janos Kovacs Assignee: Zoltán Rátkai
The Hive configuration states: {noformat} HIVE_LOCAL_TIME_ZONE("hive.local.time.zone", "LOCAL", "Sets the time-zone for displaying and interpreting time stamps. If this property value is set to\n" + "LOCAL, it is not specified, or it is not a correct time-zone, the system default time-zone will be\n " + "used instead. Time-zone IDs can be specified as region-based zone IDs (based on IANA time-zone data),\n" + "abbreviated zone IDs, or offset IDs."), {noformat} But seems like in hive4 (-beta) it always treated as UTC - as any other invalid timezone value (see HIVE-27741). Repro code: {noformat} docker rm -f hive4 export HIVE_VERSION=4.0.0-beta-2-SNAPSHOT export HS2_ENV_TZ="Europe/Budapest" export HS2_USER_TZ=${HS2_ENV_TZ} export HIVE_LOCAL_TZ=${HS2_ENV_TZ} export HS2_OPTS="-Duser.timezone=$HS2_USER_TZ -Dhive.local.time.zone=$HIVE_LOCAL_TZ" export HS2_OPTS="$HS2_OPTS -Dhive.server2.tez.initialize.default.sessions=false" docker run -d -p 10000:10000 -p 10001:10001 -p 10002:10002 --env TZ=${HS2_ENV_TZ} --env SERVICE_OPTS=${HS2_OPTS} --env SERVICE_NAME=hiveserver2 --name hive4 apache/hive:${HIVE_VERSION} docker exec -it hive4 beeline -u 'jdbc:hive2://localhost:10000/' -e " SELECT '\${env:TZ}' as \`env:TZ\`, '\${system:user.timezone}' as \`system:user.timezone\`, '\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`; DROP TABLE IF EXISTS timestamptest; CREATE TABLE timestamptest ( ts timestamp, tz timestamp with local time zone ) STORED AS TEXTFILE; INSERT INTO timestamptest select TIMESTAMP'2016-01-03 12:26:34',TIMESTAMPLOCALTZ'2016-01-03 12:26:34 America/Los_Angeles'; SET hive.query.results.cache.enabled=false; SET hive.local.time.zone=LOCAL; SELECT '\${env:TZ}' as \`env:TZ\`, '\${system:user.timezone}' as \`system:user.timezone\`, '\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`; SELECT 'LOCAL' as tzset, tz as orig, to_utc_timestamp(tz, 'LOCAL') as utc_local, to_utc_timestamp(tz, 'Europe/Budapest') as utc_tz, from_utc_timestamp(to_utc_timestamp(tz,'LOCAL'),'Europe/Budapest') as to_bp FROM timestamptest; SET hive.local.time.zone=Europe/Budapest; SELECT '\${env:TZ}' as \`env:TZ\`, '\${system:user.timezone}' as \`system:user.timezone\`, '\${hiveconf:hive.local.time.zone}' as \`hiveconf:hive.local.time.zone\`; SELECT 'Europe/Budapest' as tzset, tz as orig, to_utc_timestamp(tz, 'LOCAL') as utc_local, to_utc_timestamp(tz, 'Europe/Budapest') as utc_tz, from_utc_timestamp(to_utc_timestamp(tz,'Europe/Budapest'),'Europe/Budapest') as to_bp FROM timestamptest; " {noformat} The results are: {noformat} +------------------+-----------------------+--------------------------------+ | env:tz | system:user.timezone | hiveconf:hive.local.time.zone | +------------------+-----------------------+--------------------------------+ | Europe/Budapest | Europe/Budapest | LOCAL | +------------------+-----------------------+--------------------------------+ +--------+----------------------------------------+------------------------+------------------------+------------------------+ | tzset | orig | utc_local | utc_tz | to_bp | +--------+----------------------------------------+------------------------+------------------------+------------------------+ | LOCAL | 2016-01-03 21:26:34.0 Europe/Budapest | 2016-01-03 21:26:34.0 | 2016-01-03 20:26:34.0 | 2016-01-03 22:26:34.0 | +--------+----------------------------------------+------------------------+------------------------+------------------------+ +------------------+-----------------------+--------------------------------+ | env:tz | system:user.timezone | hiveconf:hive.local.time.zone | +------------------+-----------------------+--------------------------------+ | Europe/Budapest | Europe/Budapest | Europe/Budapest | +------------------+-----------------------+--------------------------------+ +------------------+----------------------------------------+------------------------+------------------------+------------------------+ | tzset | orig | utc_local | utc_tz | to_bp | +------------------+----------------------------------------+------------------------+------------------------+------------------------+ | Europe/Budapest | 2016-01-03 21:26:34.0 Europe/Budapest | 2016-01-03 21:26:34.0 | 2016-01-03 20:26:34.0 | 2016-01-03 21:26:34.0 | +------------------+----------------------------------------+------------------------+------------------------+------------------------+ {noformat} The *to_bp* values show a 1h offset because LOCAL treated as UTC instead of system's timezone and converting from it offsets the calculated value. -- This message was sent by Atlassian Jira (v8.20.10#820010)