slim bouguerra created HIVE-19155:
-------------------------------------

             Summary: Daylight saving time causes Druid inserts to fail with 
org.apache.hive.druid.io.druid.java.util.common.UOE: Cannot add overlapping 
segments
                 Key: HIVE-19155
                 URL: https://issues.apache.org/jira/browse/HIVE-19155
             Project: Hive
          Issue Type: Bug
          Components: Druid integration
            Reporter: slim bouguerra
            Assignee: slim bouguerra


If you insert data with timestamps around a daylight saving time transition, 
the query fails with the following exception:
{code}
2018-04-10T11:24:58,836 ERROR [065fdaa2-85f9-4e49-adaf-3dc14d51be90 main] exec.DDLTask: Failed
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hive.druid.io.druid.java.util.common.UOE: Cannot add overlapping segments [2015-03-08T05:00:00.000Z/2015-03-09T05:00:00.000Z and 2015-03-09T04:00:00.000Z/2015-03-10T04:00:00.000Z] with the same version [2018-04-10T11:24:48.388-07:00]
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:914) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:919) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4831) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:394) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2443) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2114) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1797) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1538) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1532) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:204) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) [hive-cli-3.1.0-SNAPSHOT.jar:?]
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) [hive-cli-3.1.0-SNAPSHOT.jar:?]
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) [hive-cli-3.1.0-SNAPSHOT.jar:?]
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) [hive-cli-3.1.0-SNAPSHOT.jar:?]
        at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1455) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1429) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:177) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:59) [test-classes/:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_92]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_92]
{code}
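
The overlap reported in the exception can be checked by hand. The following is a minimal Java sketch, not Hive or Druid code. It intersects the two intervals copied from the message. It also assumes a zone that moved from UTC-5 to UTC-4 on 2015-03-08 and shows that the local day is then shorter than 24 hours. America/New_York is used purely as an illustration, since the report does not name the zone. The class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper class, for illustration only.
final class OverlapSketch {

    // Length of the intersection of two half-open intervals [s1, e1) and [s2, e2).
    static Duration overlap(Instant s1, Instant e1, Instant s2, Instant e2) {
        Instant start = s1.isAfter(s2) ? s1 : s2;
        Instant end = e1.isBefore(e2) ? e1 : e2;
        return end.isAfter(start) ? Duration.between(start, end) : Duration.ZERO;
    }

    // Elapsed time between local midnight of `day` and local midnight of the next day.
    static Duration localDayLength(LocalDate day, ZoneId zone) {
        ZonedDateTime start = day.atStartOfDay(zone);
        ZonedDateTime next = day.plusDays(1).atStartOfDay(zone);
        return Duration.between(start, next);
    }

    public static void main(String[] args) {
        // The two segment intervals exactly as they appear in the exception message.
        Duration shared = overlap(
                Instant.parse("2015-03-08T05:00:00.000Z"),
                Instant.parse("2015-03-09T05:00:00.000Z"),
                Instant.parse("2015-03-09T04:00:00.000Z"),
                Instant.parse("2015-03-10T04:00:00.000Z"));
        System.out.println("overlap between segments: " + shared);

        // ASSUMPTION: the zone is not stated in the report; this one is illustrative.
        ZoneId assumedZone = ZoneId.of("America/New_York");
        System.out.println("local length of 2015-03-08: "
                + localDayLength(LocalDate.of(2015, 3, 8), assumedZone));
    }
}
```

Under those assumptions the overlap is one hour and the local day is 23 hours long, so a fixed 24-hour segment that starts at local midnight on 2015-03-08 runs one hour into the next local day.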

You can reproduce this with the following statements:
{code}
create database druid_test;
use druid_test;

create table test_table(`timecolumn` timestamp, `userid` string, `num_l` float);

insert into test_table values ('2015-03-08 00:00:00', 'i1-start', 4);
insert into test_table values ('2015-03-08 23:59:59', 'i1-end', 1);

insert into test_table values ('2015-03-09 00:00:00', 'i2-start', 4);
insert into test_table values ('2015-03-09 23:59:59', 'i2-end', 1);

insert into test_table values ('2015-03-10 00:00:00', 'i3-start', 2);
insert into test_table values ('2015-03-10 23:59:59', 'i3-end', 2);

CREATE TABLE druid_table
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.segment.granularity" = "DAY")
AS
select cast(`timecolumn` as timestamp with local time zone) as `__time`, 
`userid`, `num_l` FROM test_table;
{code}

The fix is to always adjust the Druid segment identifiers to UTC.
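
One way to picture the UTC adjustment is the sketch below. It is an illustration under assumptions, not the actual patch. `utcDayInterval` is a hypothetical helper that truncates a timestamp to the UTC day, so every DAY bucket is exactly 24 hours long and adjacent buckets share only a boundary.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper class, for illustration only; not taken from Hive or Druid.
final class UtcDayBucketSketch {

    // Returns {start, end} of the half-open UTC day [start, end) that contains ts.
    static Instant[] utcDayInterval(Instant ts) {
        Instant start = ts.truncatedTo(ChronoUnit.DAYS); // midnight UTC on the same day
        Instant end = start.plus(1, ChronoUnit.DAYS);    // always exactly 24 hours later
        return new Instant[] {start, end};
    }

    public static void main(String[] args) {
        // The two segment start instants taken from the exception message.
        Instant[] first = utcDayInterval(Instant.parse("2015-03-08T05:00:00.000Z"));
        Instant[] second = utcDayInterval(Instant.parse("2015-03-09T04:00:00.000Z"));

        System.out.println(first[0] + "/" + first[1]);
        System.out.println(second[0] + "/" + second[1]);

        // Adjacent UTC buckets meet at a boundary instead of overlapping.
        System.out.println("first ends where second starts: " + first[1].equals(second[0]));
    }
}
```

With UTC boundaries the two instants that produced overlapping segments in the exception fall into consecutive buckets, regardless of any daylight saving transition in the session time zone.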




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)