[jira] [Created] (HIVE-11269) Intermittent bug with kryo serializers
Soundararajan Velu created HIVE-11269:
-----------------------------------------

             Summary: Intermittent bug with kryo serializers
                 Key: HIVE-11269
                 URL: https://issues.apache.org/jira/browse/HIVE-11269
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.3.0
            Reporter: Soundararajan Velu


I get a Kryo serialization exception when running large queries. The same queries work fine on Hive 0.14, and they also work on 1.3.0 when the following flags are set:

set hive.plan.serialization.format=kryo;
set hive.exec.parallel=false;
set hive.limit.optimize.enable=false;
set hive.optimize.metadataonly=false;
set hive.optimize.reducededuplication=false;
set hive.optimize.sort.dynamic.partition=false;
set hive.stats.fetch.partition.stats=false;
set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
set hive.cbo.enable=false;
set hive.compute.query.using.stats=false;
set hive.multigroupby.singlereducer=false;
set hive.optimize.ppd=false;
set hive.optimize.skewjoin.compiletime=false;
set hive.optimize.skewjoin=false;
set hive.optimize.union.remove=false;
set hive.mapred.mode=nonstrict;
set hive.auto.convert.join.noconditionaltask=false;
set hive.optimize.sort.dynamic.partition=false;
set hive.rpc.query.plan=true;

Stack trace:

colExprMap (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.JoinOperator)
reducer (org.apache.hadoop.hive.ql.plan.ReduceWork)
    at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:462)
    at org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:309)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:112)
    ... 14 more
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException
Serialization trace:
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
colExprMap (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.JoinOperator)
reducer (org.apache.hadoop.hive.ql.plan.ReduceWork)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106
[jira] [Created] (HIVE-11268) java.io.IOException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increa
Soundararajan Velu created HIVE-11268:
-----------------------------------------

             Summary: java.io.IOException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
                 Key: HIVE-11268
                 URL: https://issues.apache.org/jira/browse/HIVE-11268
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.0
         Environment: Hive 1.2
            Reporter: Soundararajan Velu
            Priority: Critical


I get the exception below when using an ORC-format table. The table has 250 files: one file is 14 GB and the remaining files are 11 MB each.

Table properties:

TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='false',
  'numFiles'='250',
  'numRows'='-1',
  'orc.compress'='SNAPPY',
  'orc.compress.size'='262144',
  'orc.create.index'='true',
  'orc.row.index.stride'='1',
  'orc.stripe.size'='67108864',
  'rawDataSize'='-1',
  'totalSize'='16950715052',
  'transient_lastDdlTime'='1436932029')

Stack trace:

2015-07-15 21:28:29,435 ERROR [main]: CliDriver (SessionState.java:printError(979)) - Failed with exception java.io.IOException:com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
java.io.IOException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1674)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
    at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
    at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
    at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769)
    at com.google.protobuf.CodedInputStream.readRawLittleEndian64(CodedInputStream.java:493)
    at com.google.protobuf.CodedInputStream.readDouble(CodedInputStream.java:178)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$DoubleStatistics.<init>(OrcProto.java:755)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$DoubleStatistics.<init>(OrcProto.java:705)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$DoubleStatistics$1.parsePartialFrom(OrcProto.java:798)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$DoubleStatistics$1.parsePartialFrom(OrcProto.java:793)
    at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.<init>(OrcProto.java:4884)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.<init>(OrcProto.java:4813)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:5005)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:5000)
    at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics.<init>(OrcProto.java:14334)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics.<init>(OrcProto.java:14281)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics$1.parsePartialFrom(OrcProto.java:14370)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics$1.parsePartialFrom(OrcProto.java:14365)
    at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
    at org.apache.hadoop.
[jira] [Created] (HIVE-11176) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to [Ljava.lang.Object;
Soundararajan Velu created HIVE-11176:
-----------------------------------------

             Summary: Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to [Ljava.lang.Object;
                 Key: HIVE-11176
                 URL: https://issues.apache.org/jira/browse/HIVE-11176
             Project: Hive
          Issue Type: Bug
          Components: Hive, Tez
    Affects Versions: 1.2.0, 1.0.0
         Environment: Hive 1.2 and Tez 0.7
            Reporter: Soundararajan Velu
            Priority: Critical


Unreachable code in hive/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardStructObjectInspector.java (excerpt):

    // With Data
    @Override
    @SuppressWarnings("unchecked")
    public Object getStructFieldData(Object data, StructField fieldRef) {
      if (data == null) {
        return null;
      }
      // We support both List and Object[]
      // so we have to do differently.
      boolean isArray = ! (data instanceof List);
      if (!isArray && !(data instanceof List)) {
        return data;
      }

The if condition above can never be true. Because isArray is defined as !(data instanceof List), the condition reduces to (data instanceof List) && !(data instanceof List); for any non-List input it evaluates as if (!true && true). The return inside it therefore cannot be reached. This causes a lot of class cast exceptions when using Tez and ORC file formats.

I changed the code to:

    boolean isArray = data.getClass().isArray();
    if (!isArray && !(data instanceof List)) {
      return data;
    }

Even then, lazy structs get passed as fields, causing downstream cast exceptions such as "lazystruct cannot be cast to Text". So I changed the method to something like this:

    // With Data
    @Override
    @SuppressWarnings("unchecked")
    public Object getStructFieldData(Object data, StructField fieldRef) {
      if (data == null) {
        return null;
      }
      if (data instanceof LazyBinaryStruct) {
        data = ((LazyBinaryStruct) data).getFieldsAsList();
      }
      // We support both List and Object[]
      // so we have to do differently.
      boolean isArray = data.getClass().isArray();
      if (!isArray && !(data instanceof List)) {
        return data;
      }

This now causes ArrayIndexOutOfBoundsException and other type cast exceptions in the object inspectors. Please help.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
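The guard logic discussed in HIVE-11176 can be checked in isolation. The following is a minimal sketch, not Hive code: the class name StructGuardDemo and the helper methods originalGuard and revisedGuard are hypothetical, and each helper only reproduces the boolean condition quoted in the report so that both variants can be compared on an array, a List, and a value that is neither.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone demo; it does not depend on Hive classes.
public class StructGuardDemo {

    // Condition as quoted in the report: isArray is derived from the List check,
    // so the two operands of && contradict each other.
    static boolean originalGuard(Object data) {
        boolean isArray = !(data instanceof List);
        return !isArray && !(data instanceof List);
    }

    // Condition after the reporter's first change: isArray is computed
    // independently of the List check.
    static boolean revisedGuard(Object data) {
        boolean isArray = data.getClass().isArray();
        return !isArray && !(data instanceof List);
    }

    public static void main(String[] args) {
        Object[] samples = {
            new Object[] {1, 2},   // an Object[]
            Arrays.asList(1, 2),   // a List
            "neither"              // neither an array nor a List
        };
        for (Object sample : samples) {
            System.out.println(sample.getClass().getSimpleName()
                + " original=" + originalGuard(sample)
                + " revised=" + revisedGuard(sample));
        }
    }
}
```

In this sketch originalGuard returns false for every input, which matches the "unreachable" observation, while revisedGuard returns true only for values that are neither arrays nor Lists. It does not model the LazyBinaryStruct handling from the report.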
[jira] [Created] (HIVE-11046) Filesystem Closed Exception
Soundararajan Velu created HIVE-11046:
-----------------------------------------

             Summary: Filesystem Closed Exception
                 Key: HIVE-11046
                 URL: https://issues.apache.org/jira/browse/HIVE-11046
             Project: Hive
          Issue Type: Bug
          Components: Hive, Tez
    Affects Versions: 1.2.0, 0.7.0
         Environment: Hive 1.2.0, Tez 0.7.0, HDP 2.2, Hadoop 2.6
            Reporter: Soundararajan Velu


TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:345)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:290)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
    ... 14 more
Caused by: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:795)
    at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:629)
    at java.io.FilterInputStream.close(FilterInputStream.java:181)
    at org.apache.hadoop.io.compress.DecompressorStream.close(DecompressorStream.java:205)
    at org.apache.hadoop.util.LineReader.close(LineReader.java:150)
    at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:282)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doClose(HiveRecordReader.java:50)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.close(HiveContextAwareRecordReader.java:104)
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:170)
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:138)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
    ... 16 more
[jira] [Created] (HIVE-11045) ArrayIndexOutOfBoundsException with Hive 1.2.0 and Tez 0.7.0
Soundararajan Velu created HIVE-11045:
-----------------------------------------

             Summary: ArrayIndexOutOfBoundsException with Hive 1.2.0 and Tez 0.7.0
                 Key: HIVE-11045
                 URL: https://issues.apache.org/jira/browse/HIVE-11045
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.0
         Environment: Hive 1.2.0, HDP 2.2, Hadoop 2.6, Tez 0.7.0
            Reporter: Soundararajan Velu


TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":4457890},"value":{"_col0":null,"_col1":null,"_col2":null,"_col3":null,"_col4":null,"_col5":null,"_col6":null,"_col7":null,"_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":null,"_col13":null,"_col14":null,"_col15":null,"_col16":null,"_col17":"fkl_shipping_b2c","_col18":null,"_col19":null,"_col20":null,"_col21":null}}
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:345)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":4457890},"value":{"_col0":null,"_col1":null,"_col2":null,"_col3":null,"_col4":null,"_col5":null,"_col6":null,"_col7":null,"_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":null,"_col13":null,"_col14":null,"_col15":null,"_col16":null,"_col17":"fkl_shipping_b2c","_col18":null,"_col19":null,"_col20":null,"_col21":null}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":4457890},"value":{"_col0":null,"_col1":null,"_col2":null,"_col3":null,"_col4":null,"_col5":null,"_col6":null,"_col7":null,"_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":null,"_col13":null,"_col14":null,"_col15":null,"_col16":null,"_col17":"fkl_shipping_b2c","_col18":null,"_col19":null,"_col20":null,"_col21":null}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292)
    ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":6417306,"_col1":{0:{"_col0":"2014-08-01 02:14:02"}}},"value":{"_col0":"2014-08-01 02:14:02","_col1":20140801,"_col2":"sc_jarvis_b2c","_col3":"action_override","_col4":"WITHIN_GRACE_PERIOD","_col5":"policy_override"}}
    at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:413)
    at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:381)
    at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:206)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:821)
    at org.apache.hadoop.hive.ql.exec.Gr