Dileep Kumar Chiguruvada created HIVE-17485:
-----------------------------------------------
Summary: Hive-Druid table on indexing for few segments- DruidRecordWriter.pushSegments throws ArrayIndexOutOfBoundsException
Key: HIVE-17485
URL: https://issues.apache.org/jira/browse/HIVE-17485
Project: Hive
Issue Type: Bug
Components: Druid integration
Affects Versions: 2.1.0
Reporter: Dileep Kumar Chiguruvada


While indexing a Hive-Druid table, DruidRecordWriter.pushSegments throws ArrayIndexOutOfBoundsException for a few segments.

The error is:
{code}
ERROR : Vertex failed, vertexName=Reducer 2, vertexId=vertex_1502725432788_0017_2_01, diagnostics=[Task failed, taskId=task_1502725432788_0017_2_01_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1502725432788_0017_2_01_000002_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 1) Column vector types: 1:TIMESTAMP, 2:LONG, 3:BYTES, 4:LONG, 5:LONG, 6:LONG, 7:LONG, 8:LONG, 9:LONG, 10:LONG, 11:LONG, 12:LONG, 13:LONG, 14:LONG, 15:BYTES, 16:BYTES, 17:BYTES, 18:BYTES, 19:BYTES, 20:LONG, 21:LONG, 22:LONG, 23:LONG, 24:BYTES, 25:BYTES, 26:BYTES, 27:BYTES, 28:BYTES, 0:TIMESTAMP [1900-01-18 00:00:00.0, 2415038, "AAAAAAAAOLJNECAA", 0, 3, 1, 1900, 3, 1, 18, 1, 1900, 1, 3, "Wednesday", "1900Q1", "N", "N", "N", 2415021, 2415020, 2414673, 2414946, "N", "N", "N", "N", "N", 1900-01-18 00:00:00.0]
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:218)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 1) Column vector types: 1:TIMESTAMP, 2:LONG, 3:BYTES, 4:LONG, 5:LONG, 6:LONG, 7:LONG, 8:LONG, 9:LONG, 10:LONG, 11:LONG, 12:LONG, 13:LONG, 14:LONG, 15:BYTES, 16:BYTES, 17:BYTES, 18:BYTES, 19:BYTES, 20:LONG, 21:LONG, 22:LONG, 23:LONG, 24:BYTES, 25:BYTES, 26:BYTES, 27:BYTES, 28:BYTES, 0:TIMESTAMP [1900-01-18 00:00:00.0, 2415038, "AAAAAAAAOLJNECAA", 0, 3, 1, 1900, 3, 1, 18, 1, 1900, 1, 3, "Wednesday", "1900Q1", "N", "N", "N", 2415021, 2415020, 2414673, 2414946, "N", "N", "N", "N", "N", 1900-01-18 00:00:00.0]
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:406)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:248)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:319)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:189)
    ... 15 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 1) Column vector types: 1:TIMESTAMP, 2:LONG, 3:BYTES, 4:LONG, 5:LONG, 6:LONG, 7:LONG, 8:LONG, 9:LONG, 10:LONG, 11:LONG, 12:LONG, 13:LONG, 14:LONG, 15:BYTES, 16:BYTES, 17:BYTES, 18:BYTES, 19:BYTES, 20:LONG, 21:LONG, 22:LONG, 23:LONG, 24:BYTES, 25:BYTES, 26:BYTES, 27:BYTES, 28:BYTES, 0:TIMESTAMP [1900-01-18 00:00:00.0, 2415038, "AAAAAAAAOLJNECAA", 0, 3, 1, 1900, 3, 1, 18, 1, 1900, 1, 3, "Wednesday", "1900Q1", "N", "N", "N", 2415021, 2415020, 2414673, 2414946, "N", "N", "N", "N", "N", 1900-01-18 00:00:00.0]
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:489)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:397)
    ... 18 more
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 128
    at org.apache.hive.druid.com.google.common.base.Throwables.propagate(Throwables.java:160)
    at org.apache.hadoop.hive.druid.io.DruidRecordWriter.pushSegments(DruidRecordWriter.java:218)
    at org.apache.hadoop.hive.druid.io.DruidRecordWriter.getSegmentIdentifierAndMaybePush(DruidRecordWriter.java:156)
    at org.apache.hadoop.hive.druid.io.DruidRecordWriter.write(DruidRecordWriter.java:239)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:752)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:101)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:955)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:903)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:145)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:478)
    ... 19 more
Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 128
    at org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
    at org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
    at org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at org.apache.hadoop.hive.druid.io.DruidRecordWriter.pushSegments(DruidRecordWriter.java:207)
    ... 27 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 128
    at org.apache.hive.druid.com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer.addName(ByteQuadsCanonicalizer.java:870)
    at org.apache.hive.druid.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.addName(UTF8StreamJsonParser.java:2340)
    at org.apache.hive.druid.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.findName(UTF8StreamJsonParser.java:2224)
{code}

The table was created with the Hive DruidStorageHandler as follows:
{code}
0: jdbc:hive2://ctr-e134-1499953498516-98952-> CREATE TABLE date_dim_drd
0: jdbc:hive2://ctr-e134-1499953498516-98952-> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
0: jdbc:hive2://ctr-e134-1499953498516-98952-> TBLPROPERTIES ("druid.datasource" = "date_dim_drd") AS
0: jdbc:hive2://ctr-e134-1499953498516-98952-> SELECT CAST(d_date AS TIMESTAMP) AS `__time`,
0: jdbc:hive2://ctr-e134-1499953498516-98952-> d_date_sk, d_date_id, d_month_seq, d_week_seq, d_quarter_seq, d_year, d_dow, d_moy, d_dom, d_qoy, d_fy_year, d_fy_quarter_seq, d_fy_week_seq, d_day_name, d_quarter_name, d_holiday, d_weekend, d_following_holiday, d_first_dom, d_last_dom, d_same_day_ly, d_same_day_lq, d_current_day, d_current_week, d_current_month, d_current_quarter, d_current_year FROM date_dim;
.......
VERTICES: 01/02 [=====>>---------------------] 20% ELAPSED TIME: 6.87 s
--------------------------------------------------------------------------------
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Reducer 2, vertexId=vertex_1502725432788_0017_2_01, diagnostics=[Task failed, taskId=task_1502725432788_0017_2_01_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1502725432788_0017_2_01_000002_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 1) Column vector types: 1:TIMESTAMP, 2:LONG, 3:BYTES, 4:LONG, 5:LONG, 6:LONG, 7:LONG, 8:LONG, 9:LONG, 10:LONG, 11:LONG, 12:LONG, 13:LONG, 14:LONG, 15:BYTES, 16:BYTES, 17:BYTES, 18:BYTES, 19:BYTES, 20:LONG, 21:LONG, 22:LONG, 23:LONG, 24:BYTES, 25:BYTES, 26:BYTES, 27:BYTES, 28:BYTES, 0:TIMESTAMP [1900-01-18 00:00:00.0, 2415038, "AAAAAAAAOLJNECAA", 0, 3, 1, 1900, 3, 1, 18, 1, 1900, 1, 3, "Wednesday", "1900Q1", "N", "N", "N", 2415021, 2415020, 2414673, 2414946, "N", "N", "N", "N", "N", 1900-01-18 00:00:00.0]
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:218)
.....
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 1) Column vector types: 1:TIMESTAMP, 2:LONG, 3:BYTES, 4:LONG, 5:LONG, 6:LONG, 7:LONG, 8:LONG, 9:LONG, 10:LONG, 11:LONG, 12:LONG, 13:LONG, 14:LONG, 15:BYTES, 16:BYTES, 17:BYTES, 18:BYTES, 19:BYTES, 20:LONG, 21:LONG, 22:LONG, 23:LONG, 24:BYTES, 25:BYTES, 26:BYTES, 27:BYTES, 28:BYTES, 0:TIMESTAMP [1900-01-18 00:00:00.0, 2415038, "AAAAAAAAOLJNECAA", 0, 3, 1, 1900, 3, 1, 18, 1, 1900, 1, 3, "Wednesday", "1900Q1", "N", "N", "N", 2415021, 2415020, 2414673, 2414946, "N", "N", "N", "N", "N", 1900-01-18 00:00:00.0]
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:489)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:397)
    ... 18 more
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 128
.....
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)