[jira] [Updated] (HIVE-13948) Incorrect timezone handling in Writable results in wrong dates in queries
[ https://issues.apache.org/jira/browse/HIVE-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-13948:
    Resolution: Fixed
    Fix Version/s: 2.0.2
                   2.1.0
                   1.2.2
                   1.3.0
    Status: Resolved (was: Patch Available)

Committed to 1 branches. [~jcamachorodriguez] fyi another one went into 2.1. Please let me know if the RC is out (doesn't look like it), I can change to 2.1.1

> Incorrect timezone handling in Writable results in wrong dates in queries
>
> Key: HIVE-13948
> URL: https://issues.apache.org/jira/browse/HIVE-13948
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Blocker
> Fix For: 1.3.0, 1.2.2, 2.1.0, 2.0.2
>
> Attachments: HIVE-13948.patch, HIVE-13948.patch
>
>
> Modifying TestDateWritable to cover 200 years, adding all timezones to the set, and making it accumulate errors, results in the following set (I bet many are duplicates via different names, but there's enough).
> This ONLY logs errors where YMD date mismatches. There are many more where YMD is the same but the time mismatches, omitted for brevity.
> Queries as simple as "select date(...);" reproduce the error (if Java tz is set to a problematic tz)
> I was investigating some case for a specific date and it seems like the conversion from dates to ms, namely offset calculation that takes the offset at UTC midnight and the offset at arbitrary time derived from that, is completely bogus and it's not clear why it would work.
> I think we either need to derive date from UTC and then create local date from YMD if needed (for many cases e.g. toString for sinks, it would not be needed at all), and/or add a lookup table for timezone used (for popular dates, e.g. 1900-present, it would be 40k-odd entries, although the price of building it is another question).
> Format: tz-expected-actual
> {noformat}
> 2016-06-04T18:33:57,499 ERROR [main[]]: io.TestDateWritable (TestDateWritable.java:testDaylightSavingsTime(234)) - DATE MISMATCH:
> Africa/Abidjan: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Accra: 1918-01-01 00:00:52 != 1918-12-31 23:59:08
> Africa/Bamako: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Banjul: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Bissau: 1912-01-01 00:02:20 != 1912-12-31 23:57:40
> Africa/Bissau: 1975-01-01 01:00:00 != 1975-12-31 23:00:00
> Africa/Casablanca: 1913-10-26 00:30:20 != 1913-10-25 23:29:40
> Africa/Ceuta: 1901-01-01 00:21:16 != 1901-12-31 23:38:44
> Africa/Conakry: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Dakar: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/El_Aaiun: 1976-04-14 01:00:00 != 1976-04-13 23:00:00
> Africa/Freetown: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Lome: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Monrovia: 1972-05-01 00:44:30 != 1972-04-30 23:15:30
> Africa/Nouakchott: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Ouagadougou: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Sao_Tome: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Timbuktu: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> America/Anguilla: 1912-03-02 00:06:04 != 1912-03-01 23:53:56
> America/Antigua: 1951-01-01 01:00:00 != 1951-12-31 23:00:00
> America/Araguaina: 1914-01-01 00:12:48 != 1914-12-31 23:47:12
> America/Araguaina: 1932-10-03 01:00:00 != 1932-10-02 23:00:00
> America/Araguaina: 1949-12-01 01:00:00 != 1949-11-30 23:00:00
> America/Argentina/Buenos_Aires: 1920-05-01 00:16:48 != 1920-04-30 23:43:12
> America/Argentina/Buenos_Aires: 1930-12-01 01:00:00 != 1930-11-30 23:00:00
> America/Argentina/Buenos_Aires: 1931-10-15 01:00:00 != 1931-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1932-11-01 01:00:00 != 1932-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1933-11-01 01:00:00 != 1933-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1934-11-01 01:00:00 != 1934-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1935-11-01 01:00:00 != 1935-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1936-11-01 01:00:00 != 1936-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1937-11-01 01:00:00 != 1937-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1938-11-01 01:00:00 != 1938-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1939-11-01 01:00:00 != 1939-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1940-07-01 01:00:00 != 1940-06-30 23:00:00
> America/Argentina/Buenos_Aires: 1941-10-15 01:00:00 != 1941-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1943-10-15 01:00:00 != 1943-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1946-10-01 01:00:00 != 1946-09-30 23:00:00
> America/Argentina/Buenos_Aires: 1963-12-15 01:00:00 != 196
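The description's first suggestion (derive the date from UTC, then build the local date from YMD) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the committed patch: the class `DateDaysSketch` and all of its method names are hypothetical, `daysToMillisTwoOffsets` is one reading of the "offset at UTC midnight, then offset at the time derived from that" calculation the description criticizes, and `java.time` is used for brevity even though the Hive code of that era may not have relied on it.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.TimeZone;

// Hypothetical sketch; names are illustrative and not taken from Hive.
class DateDaysSketch {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // One reading of the calculation the issue describes: take the zone offset
    // at UTC midnight, derive a tentative instant from it, then apply the
    // offset found at that tentative instant.
    static long daysToMillisTwoOffsets(int days, TimeZone tz) {
        long utcMidnight = days * MILLIS_PER_DAY;
        long tentative = utcMidnight - tz.getOffset(utcMidnight);
        return utcMidnight - tz.getOffset(tentative);
    }

    // Suggested direction: treat the day count as a plain calendar date (YMD)
    // and ask the zone for the first valid instant of that date.
    static long daysToMillis(int days, ZoneId zone) {
        return LocalDate.ofEpochDay(days).atStartOfDay(zone).toInstant().toEpochMilli();
    }

    // Inverse: read the instant's calendar date in the zone and count days
    // since 1970-01-01 from that date alone.
    static int millisToDays(long millis, ZoneId zone) {
        return (int) Instant.ofEpochMilli(millis).atZone(zone).toLocalDate().toEpochDay();
    }

    public static void main(String[] args) {
        // One of the zone/date pairs reported in the log above.
        ZoneId zone = ZoneId.of("America/Argentina/Buenos_Aires");
        int days = (int) LocalDate.of(1934, 11, 1).toEpochDay();

        long viaYmd = daysToMillis(days, zone);
        long viaOffsets = daysToMillisTwoOffsets(days, TimeZone.getTimeZone(zone));

        System.out.println("via YMD:     " + Instant.ofEpochMilli(viaYmd).atZone(zone).toLocalDateTime());
        System.out.println("via offsets: " + Instant.ofEpochMilli(viaOffsets).atZone(zone).toLocalDateTime());
        System.out.println("round trip:  " + (millisToDays(viaYmd, zone) == days));
    }
}
```

The printed values depend on the timezone data shipped with the JVM, so no output is shown here. The YMD-based pair round-trips because both directions go through the calendar date rather than through offset arithmetic; a date that a zone skips entirely would still need separate handling.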
[jira] [Updated] (HIVE-13948) Incorrect timezone handling in Writable results in wrong dates in queries
[ https://issues.apache.org/jira/browse/HIVE-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-13948:
    Attachment: HIVE-13948.patch

Minor fixes, mostly to comments. The patch seems to work end-to-end to fix problematic queries. q files need to be run in specific timezones to reproduce original issue (I was setting it via JAVA_TOOL_OPTIONS="-Duser.timezone=Canada/Eastern ..."), so no q files are added.
[jira] [Updated] (HIVE-13948) Incorrect timezone handling in Writable results in wrong dates in queries
[ https://issues.apache.org/jira/browse/HIVE-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-13948:
    Attachment: (was: HIVE-13948.patch)
[jira] [Updated] (HIVE-13948) Incorrect timezone handling in Writable results in wrong dates in queries
[ https://issues.apache.org/jira/browse/HIVE-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-13948:
    Attachment: HIVE-13948.patch
[jira] [Updated] (HIVE-13948) Incorrect timezone handling in Writable results in wrong dates in queries
[ https://issues.apache.org/jira/browse/HIVE-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-13948:
    Status: Patch Available (was: Open)

[~jdere] [~gopalv] [~ashutoshc] can someone take a look?
[jira] [Updated] (HIVE-13948) Incorrect timezone handling in Writable results in wrong dates in queries
[ https://issues.apache.org/jira/browse/HIVE-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-13948:
    Attachment: HIVE-13948.patch

A patch.
[jira] [Updated] (HIVE-13948) Incorrect timezone handling in Writable results in wrong dates in queries
[ https://issues.apache.org/jira/browse/HIVE-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-13948:
    Summary: Incorrect timezone handling in Writable results in wrong dates in queries (was: Incorrect timezone handling in Writable results in wrong dates)

> Incorrect timezone handling in Writable results in wrong dates in queries
>
> Key: HIVE-13948
> URL: https://issues.apache.org/jira/browse/HIVE-13948
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Blocker
>
> Modifying TestDateWritable to cover 200 years, adding all timezones to the set, and making it accumulate errors, results in the following set (I bet many are duplicates via different names, but there's enough).
> This ONLY logs errors where YMD date mismatches. There are many more where YMD is the same but the time mismatches, omitted for brevity.
> I was investigating some case for a specific date and it seems like the conversion from dates to ms, namely offset calculation that takes the offset at UTC midnight and the offset at arbitrary time derived from that, is completely bogus and it's not clear why it would work.
> I think we either need to derive date from UTC and then create local date YMD if needed, and/or add a lookup table for timezone used (for popular dates, e.g. 1900-present, it would be 40k-odd entries, although the price of building it is another question).
> Format: tz-expected-actual
> {noformat}
> 2016-06-04T18:33:57,499 ERROR [main[]]: io.TestDateWritable (TestDateWritable.java:testDaylightSavingsTime(234)) - DATE MISMATCH:
> Africa/Abidjan: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Accra: 1918-01-01 00:00:52 != 1918-12-31 23:59:08
> Africa/Bamako: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Banjul: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Bissau: 1912-01-01 00:02:20 != 1912-12-31 23:57:40
> Africa/Bissau: 1975-01-01 01:00:00 != 1975-12-31 23:00:00
> Africa/Casablanca: 1913-10-26 00:30:20 != 1913-10-25 23:29:40
> Africa/Ceuta: 1901-01-01 00:21:16 != 1901-12-31 23:38:44
> Africa/Conakry: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Dakar: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/El_Aaiun: 1976-04-14 01:00:00 != 1976-04-13 23:00:00
> Africa/Freetown: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Lome: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Monrovia: 1972-05-01 00:44:30 != 1972-04-30 23:15:30
> Africa/Nouakchott: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Ouagadougou: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Sao_Tome: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> Africa/Timbuktu: 1912-01-01 00:16:08 != 1912-12-31 23:43:52
> America/Anguilla: 1912-03-02 00:06:04 != 1912-03-01 23:53:56
> America/Antigua: 1951-01-01 01:00:00 != 1951-12-31 23:00:00
> America/Araguaina: 1914-01-01 00:12:48 != 1914-12-31 23:47:12
> America/Araguaina: 1932-10-03 01:00:00 != 1932-10-02 23:00:00
> America/Araguaina: 1949-12-01 01:00:00 != 1949-11-30 23:00:00
> America/Argentina/Buenos_Aires: 1920-05-01 00:16:48 != 1920-04-30 23:43:12
> America/Argentina/Buenos_Aires: 1930-12-01 01:00:00 != 1930-11-30 23:00:00
> America/Argentina/Buenos_Aires: 1931-10-15 01:00:00 != 1931-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1932-11-01 01:00:00 != 1932-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1933-11-01 01:00:00 != 1933-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1934-11-01 01:00:00 != 1934-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1935-11-01 01:00:00 != 1935-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1936-11-01 01:00:00 != 1936-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1937-11-01 01:00:00 != 1937-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1938-11-01 01:00:00 != 1938-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1939-11-01 01:00:00 != 1939-10-31 23:00:00
> America/Argentina/Buenos_Aires: 1940-07-01 01:00:00 != 1940-06-30 23:00:00
> America/Argentina/Buenos_Aires: 1941-10-15 01:00:00 != 1941-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1943-10-15 01:00:00 != 1943-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1946-10-01 01:00:00 != 1946-09-30 23:00:00
> America/Argentina/Buenos_Aires: 1963-12-15 01:00:00 != 1963-12-14 23:00:00
> America/Argentina/Buenos_Aires: 1964-10-15 01:00:00 != 1964-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1965-10-15 01:00:00 != 1965-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1966-10-15 01:00:00 != 1966-10-14 23:00:00
> America/Argentina/Buenos_Aires: 1967-10-01 01:00:00 != 1967-09-30 23:00:00
> America/Argentina/Buenos_Aires: 1968-10-06 01:00:00 != 1968-10-05 23:00:00
> America/Argentina/Buenos_Aires: 1969-10-05 01:00:00 != 1969-10-04 23:00:00
> America/Argentina/Buenos_Aires: 1974-01-