[ https://issues.apache.org/jira/browse/HIVE-27741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denys Kuzmenko updated HIVE-27741: ---------------------------------- Affects Version/s: 4.0.0 > Invalid timezone value in to_utc_timestamp() is treated as UTC which can lead > to data consistency issues > -------------------------------------------------------------------------------------------------------- > > Key: HIVE-27741 > URL: https://issues.apache.org/jira/browse/HIVE-27741 > Project: Hive > Issue Type: Sub-task > Affects Versions: 4.0.0, 4.0.0-beta-1 > Reporter: Janos Kovacs > Assignee: Zoltán Rátkai > Priority: Major > Labels: pull-request-available > Fix For: 4.1.0 > > > When the timezone specified in the *to_utc_timestamp()* function is not > valid, it still treated as UTC instead of throwing an error. If the user > accidentally made a typo - e.g. America/Los{color:#ff0000}*t*{color}_Angeles, > the query runs successfully returning an invalid converted value which can > lead to data consistency issues. > Repro code: > {noformat} > docker rm -f hive4 > export HIVE_VERSION=4.0.0-beta-2-SNAPSHOT > export HS2_ENV_TZ="Europe/Budapest" > export HS2_USER_TZ=${HS2_ENV_TZ} > export HIVE_LOCAL_TZ="America/Los_Angeles" > export HS2_OPTS="-Duser.timezone=$HS2_USER_TZ > -Dhive.local.time.zone=$HIVE_LOCAL_TZ" > export HS2_OPTS="$HS2_OPTS > -Dhive.server2.tez.initialize.default.sessions=false" > docker run -d -p 10000:10000 -p 10001:10001 -p 10002:10002 --env > TZ=${HS2_ENV_TZ} --env SERVICE_OPTS=${HS2_OPTS} --env > SERVICE_NAME=hiveserver2 --name hive4 apache/hive:${HIVE_VERSION} > docker exec -it hive4 beeline -u 'jdbc:hive2://localhost:10000/' -e " > SELECT '\${env:TZ}' as \`env:TZ\`, > '\${system:user.timezone}' as \`system:user.timezone\`, > '\${hiveconf:hive.local.time.zone}' as > \`hiveconf:hive.local.time.zone\`; > DROP TABLE IF EXISTS timestamptest; > CREATE TABLE timestamptest ( > ts timestamp, > tz timestamp with local time zone > ) STORED AS TEXTFILE; > INSERT INTO timestamptest select TIMESTAMP'2016-01-03 > 12:26:34',TIMESTAMPLOCALTZ'2016-01-03 12:26:34 America/Los_Angeles'; > SELECT > tz as orig, > to_utc_timestamp(tz, 'America/Los_Angeles') as utc_correct_tz, > to_utc_timestamp(tz, 'Europe/HereIsATypo') as utc_incorrect_tz, > to_utc_timestamp(tz, 'LOCAL') as > utc_local_aslo_incorrect_tz, > to_utc_timestamp(tz, 'UTC') as utc_tz > FROM timestamptest; > " > {noformat} > > The results are: > {noformat} > +------------------+-----------------------+--------------------------------+ > | env:tz | system:user.timezone | hiveconf:hive.local.time.zone | > +------------------+-----------------------+--------------------------------+ > | Europe/Budapest | Europe/Budapest | America/Los_Angeles | > +------------------+-----------------------+--------------------------------+ > +--------------------------------------------+------------------------+------------------------+------------------------------+------------------------+ > | orig | utc_correct_tz | > utc_incorrect_tz | utc_local_aslo_incorrect_tz | utc_tz | > +--------------------------------------------+------------------------+------------------------+------------------------------+------------------------+ > | 2016-01-03 12:26:34.0 America/Los_Angeles | 2016-01-03 20:26:34.0 | > 2016-01-03 12:26:34.0 | 2016-01-03 12:26:34.0 | 2016-01-03 12:26:34.0 > | > +--------------------------------------------+------------------------+------------------------+------------------------------+------------------------+ > {noformat} > Note: > * the invalid timezone - utc_incorrect_tz - is treated as UTC > * also note that LOCAL is also treated as UTC which in fact should be > treated as system's timezone, but as LOCAL is also an invalid timezone value > in hive4, ut becomes UTC just like any other invalid and/or typo timezone > values (see HIVE-27742) > > Hive should throw an Exception in that case to let the user know that the > provided timezone is wrong - at least this should be configurable, e.g. via > something like {*}hive.strict.time.zone.check{*}. -- This message was sent by Atlassian Jira (v8.20.10#820010)