[ https://issues.apache.org/jira/browse/HIVE-19155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438008#comment-16438008 ]

slim bouguerra commented on HIVE-19155:
---------------------------------------

[~ashutoshc] the granularity column is used by the dynamic partition operator 
stage to sort rows and route them to the underlying partition. We could use 
either column, but to keep the code simple I am using only one, which is the 
actual timestamp field.
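
For illustration, a minimal sketch in plain java.time (not the actual Hive 
operator code; everything below is a hypothetical stand-in) of why sorting on 
the raw timestamp is enough: rows that share a DAY granularity bucket are 
already contiguous in timestamp order.
{code}
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch of the dynamic-partition sort: ordering rows by
// timestamp groups all rows of the same target segment together before
// they are written out.
public class PartitionSort {
  public static void main(String[] args) {
    Instant[] rows = {
        Instant.parse("2015-03-09T00:00:00Z"),
        Instant.parse("2015-03-08T23:59:59Z"),
        Instant.parse("2015-03-08T00:00:00Z")
    };
    // Sorting on the timestamp itself already clusters rows by their
    // DAY bucket, which is why a single timestamp column suffices.
    Arrays.sort(rows, Comparator.naturalOrder());
    for (Instant r : rows) {
      // left: UTC day bucket, right: the row's timestamp
      System.out.println(r.truncatedTo(ChronoUnit.DAYS) + " <- " + r);
    }
  }
}
{code}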
 

> Daylight saving time causes Druid inserts to fail with 
> org.apache.hive.druid.io.druid.java.util.common.UOE: Cannot add overlapping 
> segments
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19155
>                 URL: https://issues.apache.org/jira/browse/HIVE-19155
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>            Priority: Major
>         Attachments: HIVE-19155.patch
>
>
> If you try to insert data around the daylight saving time changeover, the 
> query fails with the following exception:
> {code}
> 2018-04-10T11:24:58,836 ERROR [065fdaa2-85f9-4e49-adaf-3dc14d51be90 main] exec.DDLTask: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hive.druid.io.druid.java.util.common.UOE: Cannot add overlapping segments [2015-03-08T05:00:00.000Z/2015-03-09T05:00:00.000Z and 2015-03-09T04:00:00.000Z/2015-03-10T04:00:00.000Z] with the same version [2018-04-10T11:24:48.388-07:00]
>         at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:914) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:919) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4831) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:394) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2443) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2114) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1797) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1538) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1532) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:204) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) [hive-cli-3.1.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) [hive-cli-3.1.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) [hive-cli-3.1.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) [hive-cli-3.1.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1455) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1429) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:177) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) [hive-it-util-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>         at org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:59) [test-classes/:?]
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_92]
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_92]
> {code}
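> The one-hour overlap in the error message is the 2015 US spring-forward 
> transition itself. A minimal, self-contained java.time check (assuming the 
> session time zone was US Eastern, which matches the 05:00Z/04:00Z offsets 
> above) shows why day boundaries taken in local time plus a fixed 24-hour 
> period collide:
> {code}
> import java.time.Duration;
> import java.time.Instant;
> import java.time.LocalDate;
> import java.time.ZoneId;
> 
> public class DstOverlap {
>   public static void main(String[] args) {
>     ZoneId zone = ZoneId.of("America/New_York"); // assumed session zone
>     Instant day1Start = LocalDate.of(2015, 3, 8).atStartOfDay(zone).toInstant();
>     Instant day2Start = LocalDate.of(2015, 3, 9).atStartOfDay(zone).toInstant();
>     System.out.println(day1Start); // 2015-03-08T05:00:00Z (EST, UTC-5)
>     System.out.println(day2Start); // 2015-03-09T04:00:00Z (EDT, UTC-4)
>     // A DAY segment built as [start, start + 24h) spills one hour into
>     // the next local day, because 2015-03-08 is only 23 hours long there:
>     Instant day1End = day1Start.plus(Duration.ofHours(24));
>     System.out.println(day1End.isAfter(day2Start)); // true -> overlap
>   }
> }
> {code}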
> You can reproduce this with the following DDL:
> {code}
> create database druid_test;
> use druid_test;
> create table test_table(`timecolumn` timestamp, `userid` string, `num_l` float);
> insert into test_table values ('2015-03-08 00:00:00', 'i1-start', 4);
> insert into test_table values ('2015-03-08 23:59:59', 'i1-end', 1);
> insert into test_table values ('2015-03-09 00:00:00', 'i2-start', 4);
> insert into test_table values ('2015-03-09 23:59:59', 'i2-end', 1);
> insert into test_table values ('2015-03-10 00:00:00', 'i3-start', 2);
> insert into test_table values ('2015-03-10 23:59:59', 'i3-end', 2);
> CREATE TABLE druid_table
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.segment.granularity" = "DAY")
> AS
> select cast(`timecolumn` as timestamp with local time zone) as `__time`, `userid`, `num_l` FROM test_table;
> {code}
> The fix is to always compute the Druid segment identifiers in UTC.
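> As a sketch of that adjustment (plain java.time, not the actual Hive/Druid 
> API; the names below are illustrative only): deriving every DAY interval from 
> the UTC-truncated timestamp makes consecutive intervals exactly 24 hours 
> apart and therefore disjoint, regardless of the session time zone.
> {code}
> import java.time.Instant;
> import java.time.temporal.ChronoUnit;
> 
> // Illustrative only: segment interval bounds computed in UTC.
> public class UtcSegmentInterval {
>   static Instant[] dayInterval(Instant eventTime) {
>     Instant start = eventTime.truncatedTo(ChronoUnit.DAYS); // UTC day floor
>     return new Instant[] { start, start.plus(1, ChronoUnit.DAYS) };
>   }
> 
>   public static void main(String[] args) {
>     Instant[] a = dayInterval(Instant.parse("2015-03-08T05:00:00.000Z"));
>     Instant[] b = dayInterval(Instant.parse("2015-03-09T04:00:00.000Z"));
>     System.out.println(a[0] + "/" + a[1]); // 2015-03-08T00:00:00Z/2015-03-09T00:00:00Z
>     System.out.println(b[0] + "/" + b[1]); // 2015-03-09T00:00:00Z/2015-03-10T00:00:00Z
>   }
> }
> {code}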



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
