slim bouguerra created HIVE-19155: ------------------------------------- Summary: Day time saving cause Druid inserts to fail with org.apache.hive.druid.io.druid.java.util.common.UOE: Cannot add overlapping segments Key: HIVE-19155 URL: https://issues.apache.org/jira/browse/HIVE-19155 Project: Hive Issue Type: Bug Components: Druid integration Reporter: slim bouguerra Assignee: slim bouguerra
If you try to insert data around the daylight saving time hour the query fails with following exception {code} 2018-04-10T11:24:58,836 ERROR [065fdaa2-85f9-4e49-adaf-3dc14d51be90 main] exec.DDLTask: Failed org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hive.druid.io.druid.java.util.common.UOE: Cannot add overlapping segments [2015-03-08T05:00:00.000Z/2015-03-09T05:00:00.000Z and 2015-03-09T04:00:00.000Z/2015-03-10T04:00:00.000Z] with the same version [2018-04-10T11:24:48.388-07:00] at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:914) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:919) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4831) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:394) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2443) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2114) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1797) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1538) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1532) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:204) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) [hive-cli-3.1.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) [hive-cli-3.1.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) [hive-cli-3.1.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) [hive-cli-3.1.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1455) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1429) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:177) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] at org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:59) [test-classes/:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_92] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_92] {code} You can reproduce this using the following DDL {code} create database druid_test; use druid_test; create table test_table(`timecolumn` timestamp, `userid` string, `num_l` float); insert into test_table values ('2015-03-08 00:00:00', 'i1-start', 4); insert into test_table values ('2015-03-08 23:59:59', 'i1-end', 1); insert into test_table values ('2015-03-09 00:00:00', 'i2-start', 4); insert into test_table values ('2015-03-09 23:59:59', 'i2-end', 1); insert into test_table values ('2015-03-10 00:00:00', 'i3-start', 2); insert into test_table values ('2015-03-10 23:59:59', 'i3-end', 2); CREATE TABLE druid_table STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' TBLPROPERTIES ("druid.segment.granularity" = "DAY") AS select cast(`timecolumn` as timestamp with local time zone) as `__time`, `userid`, `num_l` FROM test_table; {code} The fix is to always adjust the Druid segments identifiers to UTC. -- This message was sent by Atlassian JIRA (v7.6.3#76005)