[jira] [Commented] (HIVE-18107) CBO Multi Table Insert Query with JOIN operator and GROUPING SETS throws SemanticException Invalid table alias or column reference 'GROUPING__ID'
[ https://issues.apache.org/jira/browse/HIVE-18107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259312#comment-16259312 ]

Sergey Zadoroshnyak commented on HIVE-18107:

[~jcamachorodriguez] [~ashutoshc] [~pxiong]
This issue is reproducible only if hive.cbo.enable=true. If hive.cbo.enable=false, the multi-table insert query compiles and executes successfully.

> CBO Multi Table Insert Query with JOIN operator and GROUPING SETS throws
> SemanticException Invalid table alias or column reference 'GROUPING__ID'
> ---
>
>                 Key: HIVE-18107
>                 URL: https://issues.apache.org/jira/browse/HIVE-18107
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 2.3.0
>            Reporter: Sergey Zadoroshnyak
>            Assignee: Jesus Camacho Rodriguez
>             Fix For: 3.0.0
>
> hive 2.3.0
> set hive.execution.engine=tez;
> set hive.multigroupby.singlereducer=false;
> *set hive.cbo.enable=true;*
>
> Multi Table Insert Query. *Template:*
>
> FROM (SELECT * FROM tableA) AS alias_a
> JOIN (SELECT * FROM tableB) AS alias_b
>   ON (alias_a.column_1 = alias_b.column_1 AND alias_a.column_2 = alias_b.column_2)
>
> INSERT OVERWRITE TABLE tableC PARTITION (partition1='first_fragment')
> SELECT
>   GROUPING__ID,
>   alias_a.column4,
>   alias_a.column5,
>   alias_a.column6,
>   alias_a.column7,
>   count(1) AS rownum
> WHERE alias_b.column_3 = 1
> GROUP BY
>   alias_a.column4,
>   alias_a.column5,
>   alias_a.column6,
>   alias_a.column7
> GROUPING SETS
> (
>   (alias_a.column4),
>   (alias_a.column4, alias_a.column5),
>   (alias_a.column4, alias_a.column5, alias_a.column6, alias_a.column7)
> )
>
> INSERT OVERWRITE TABLE tableC PARTITION (partition1='second_fragment')
> SELECT
>   GROUPING__ID,
>   alias_a.column4,
>   alias_a.column5,
>   alias_a.column6,
>   alias_a.column7,
>   count(1) AS rownum
> WHERE alias_b.column_3 = 2
> GROUP BY
>   alias_a.column4,
>   alias_a.column5,
>   alias_a.column6,
>   alias_a.column7
> GROUPING SETS
> (
>   (alias_a.column4),
>   (alias_a.column4, alias_a.column5),
>   (alias_a.column4, alias_a.column5, alias_a.column6, alias_a.column7)
> )
>
> 16:39:17,822 ERROR CalcitePlanner:423 - CBO failed, skipping CBO.
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:537 Invalid table alias or column reference 'GROUPING__ID': (possible column names are:..
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11600)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11548)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSelectLogicalPlan(CalcitePlanner.java:3706)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3999)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1315)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1261)
>   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
>   at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:997)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1069)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1085)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:364)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
>   at org.apache.hadoop.hive.ql.Driver.
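The query template above selects GROUPING__ID, the virtual column Hive derives from the GROUP BY keys to identify which grouping set produced a row. Its behavior can be modeled roughly as a bitmask over the GROUP BY keys. The sketch below is illustrative only: the bit order is an assumption (Hive's actual encoding has varied between releases), and the column names are taken from the template.

```python
# Toy model of a GROUPING__ID-style bitmask over the GROUP BY keys of the
# query template above. Illustrative, not Hive's implementation; the bit
# ordering here is an assumption.
GROUP_BY_KEYS = ["column4", "column5", "column6", "column7"]

def grouping_id(active_keys):
    """Set bit i when GROUP_BY_KEYS[i] is aggregated away (absent from the set)."""
    gid = 0
    for i, key in enumerate(GROUP_BY_KEYS):
        if key not in active_keys:
            gid |= 1 << i
    return gid

# The three grouping sets from the query template:
grouping_sets = [("column4",),
                 ("column4", "column5"),
                 ("column4", "column5", "column6", "column7")]
ids = [grouping_id(s) for s in grouping_sets]  # one distinct id per set
```

Under this toy encoding the three sets get ids 14, 12, and 0, so each output row can be traced back to the grouping set that produced it; the bug is that CBO loses track of this virtual column when the grouping-sets aggregate sits inside a multi-table insert.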
[jira] [Assigned] (HIVE-18107) CBO Multi Table Insert Query with JOIN operator and GROUPING SETS throws SemanticException Invalid table alias or column reference 'GROUPING__ID'
[ https://issues.apache.org/jira/browse/HIVE-18107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Zadoroshnyak reassigned HIVE-18107:
------------------------------------------

> CBO Multi Table Insert Query with JOIN operator and GROUPING SETS throws
> SemanticException Invalid table alias or column reference 'GROUPING__ID'
[jira] [Assigned] (HIVE-14530) Union All query returns incorrect results.
[ https://issues.apache.org/jira/browse/HIVE-14530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Zadoroshnyak reassigned HIVE-14530:
------------------------------------------
Assignee: Jesus Camacho Rodriguez

https://issues.apache.org/jira/browse/HIVE-13639

> Union All query returns incorrect results.
> --
>
>                 Key: HIVE-14530
>                 URL: https://issues.apache.org/jira/browse/HIVE-14530
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.1.0
>         Environment: Hadoop 2.6
>                      Hive 2.1
>            Reporter: wenhe li
>            Assignee: Jesus Camacho Rodriguez
>
> create table dw_tmp.l_test1 (id bigint, val string, trans_date string)
>   row format delimited fields terminated by ' ';
> create table dw_tmp.l_test2 (id bigint, val string, trans_date string)
>   row format delimited fields terminated by ' ';
>
> select * from dw_tmp.l_test1;
> 1 table_1 2016-08-11
> select * from dw_tmp.l_test2;
> 2 table_2 2016-08-11
>
> -- right like this
> select id, 'table_1', trans_date from dw_tmp.l_test1
> union all
> select id, val, trans_date from dw_tmp.l_test2;
> 1 table_1 2016-08-11
> 2 table_2 2016-08-11
>
> -- incorrect
> select id, 999, 'table_1', trans_date from dw_tmp.l_test1
> union all
> select id, 999, val, trans_date from dw_tmp.l_test2;
> 1 999 table_1 2016-08-11
> 2 999 table_1 2016-08-11   <-- here is wrong
>
> -- incorrect
> select id, 999, 666, 'table_1', trans_date from dw_tmp.l_test1
> union all
> select id, 999, 666, val, trans_date from dw_tmp.l_test2;
> 1 999 666 table_1 2016-08-11
> 2 999 666 table_1 2016-08-11   <-- here is wrong
>
> -- right
> select id, 999, 'table_1', trans_date, '2016-11-11' from dw_tmp.l_test1
> union all
> select id, 999, val, trans_date, trans_date from dw_tmp.l_test2;
> 1 999 table_1 2016-08-11 2016-11-11
> 2 999 table_2 2016-08-11 2016-08-11

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-14530) Union All query returns incorrect results.
[ https://issues.apache.org/jira/browse/HIVE-14530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Zadoroshnyak updated HIVE-14530:
---------------------------------------
Comment: was deleted

(was: https://issues.apache.org/jira/browse/HIVE-13639)

> Union All query returns incorrect results.
[jira] [Commented] (HIVE-14530) Union All query returns incorrect results.
[ https://issues.apache.org/jira/browse/HIVE-14530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445362#comment-15445362 ]

Sergey Zadoroshnyak commented on HIVE-14530:

[~liwenhe] [~jcamachorodriguez]
In my opinion, this issue was introduced by https://issues.apache.org/jira/browse/HIVE-13639.
Cost-based optimization in Hive, which uses the Calcite framework, is enabled by default (set hive.cbo.enable=true). [~jcamachorodriguez] introduced a new rule, HiveUnionPullUpConstantsRule, which CalcitePlanner adds to relOptRules.
Please take a look at the review request: https://reviews.apache.org/r/46974/diff/1#2
If we set hive.cbo.enable=false, the issue is not reproducible.
[~liwenhe] Please update the component to CBO.
[~jcamachorodriguez] Could you please take a look?

> Union All query returns incorrect results.
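The failure mode reported above, where a constant pulled up through UNION ALL clobbers a column that only one branch computes, can be sketched with a toy planner pass. Everything below is illustrative: it models a deliberately broken position-matching heuristic to reproduce the symptom, not the actual HiveUnionPullUpConstantsRule code.

```python
# Toy model of a "pull constants above UNION ALL" rewrite. The buggy variant
# pulls up every position that is a literal in the FIRST branch and applies
# it to all branches, dropping whatever the other branches computed there.
# Illustrative only; not the real HiveUnionPullUpConstantsRule logic.

def project(exprs, row):
    """exprs: list of ('lit', value) or ('col', index) pairs."""
    return tuple(v if kind == 'lit' else row[v] for kind, v in exprs)

def correct_union(branches):
    """branches: list of (exprs, rows). Each branch projects with its OWN exprs."""
    return [project(exprs, r) for exprs, rows in branches for r in rows]

def buggy_union(branches):
    """Pull the first branch's literals above the union and apply them everywhere."""
    pulled = {i: v for i, (kind, v) in enumerate(branches[0][0]) if kind == 'lit'}
    out = []
    for exprs, rows in branches:
        # Below the union, each branch keeps only the non-pulled expressions...
        kept = [e for i, e in enumerate(exprs) if i not in pulled]
        for r in rows:
            row_out = list(project(kept, r))
            # ...and the pulled literals are re-inserted above the union.
            for i in sorted(pulled):
                row_out.insert(i, pulled[i])
            out.append(tuple(row_out))
    return out

# The second "incorrect" case from the report: both branches share the
# literal 999, but only the first has the literal 'table_1'.
rows1 = [(1, 'table_1', '2016-08-11')]
rows2 = [(2, 'table_2', '2016-08-11')]
b1 = ([('col', 0), ('lit', 999), ('lit', 'table_1'), ('col', 2)], rows1)
b2 = ([('col', 0), ('lit', 999), ('col', 1), ('col', 2)], rows2)
# correct_union([b1, b2])[1] == (2, 999, 'table_2', '2016-08-11')
# buggy_union([b1, b2])[1]   == (2, 999, 'table_1', '2016-08-11')  <- the reported wrong row
```

The buggy pass reproduces exactly the wrong row from the report: the second branch's `val` column is replaced by the first branch's constant 'table_1', which is why disabling CBO (and with it the rule) makes the query return correct results.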
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422326#comment-15422326 ]

Sergey Zadoroshnyak commented on HIVE-14483:

[~sershe] Thank you very much.

> java.lang.ArrayIndexOutOfBoundsException
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
> --
>
>                 Key: HIVE-14483
>                 URL: https://issues.apache.org/jira/browse/HIVE-14483
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.1.0
>            Reporter: Sergey Zadoroshnyak
>            Assignee: Sergey Zadoroshnyak
>            Priority: Critical
>             Fix For: 1.3.0, 2.2.0, 2.1.1, 2.0.2
>
>         Attachments: HIVE-14483.01.patch
>
> Error message:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
>   at org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
>   at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
>   at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
>   at org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
>   at org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
>   at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
>   at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
>   at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
>   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
>   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
>   at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
>   at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
>   ... 22 more
>
> How to reproduce?
> Configure a StringTreeReader that contains a StringDirectTreeReader as its TreeReader (DIRECT or DIRECT_V2 column encoding), with batchSize = 1026.
> Invoke the method nextVector(ColumnVector previousVector, boolean[] isNull, final int batchSize).
> scratchlcv is a LongColumnVector whose long[] vector has length 1024. The reader executes BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, scratchlcv, result, batchSize); as a result, commonReadByteArrays(stream, lengths, scratchlcv, result, (int) batchSize) throws ArrayIndexOutOfBoundsException.
> If we use StringDictionaryTreeReader, there is no exception, because scratchlcv.ensureSize((int) batchSize, false) is called before reader.nextVector(scratchlcv, scratchlcv.vector, batchSize);
> These changes were made for Hive 2.1.0 by commit https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467 for https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley.
>
> How to fix?
> Add a single line:
> scratchlcv.ensureSize((int) batchSize, false);
> in org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream stream, IntegerReader lengths, LongColumnVector scratchlcv, BytesColumnVector result, final int batchSize) before the invocation of lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
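The one-line fix described in the report is a grow-before-bulk-write guard. A minimal model of the pattern follows, with Python standing in for the Java LongColumnVector; the class, method names, and sizes are illustrative stand-ins, and only the grow-if-needed semantics of ensureSize mirror the report.

```python
# Minimal model of the ensureSize guard from the report. A scratch vector
# allocated at the default batch size (1024) must be grown before a bulk
# read of a larger batch (1026), otherwise writes past index 1023 fail --
# the Java analogue is ArrayIndexOutOfBoundsException: 1024.
# Illustrative Python stand-in, not the ORC LongColumnVector API.

DEFAULT_SIZE = 1024

class ScratchVector:
    def __init__(self, size=DEFAULT_SIZE):
        self.vector = [0] * size

    def ensure_size(self, size, preserve_data):
        """Grow the backing array if it is too small (no-op otherwise)."""
        if size > len(self.vector):
            grown = [0] * size
            if preserve_data:
                grown[:len(self.vector)] = self.vector
            self.vector = grown

def read_lengths(vec, batch_size):
    """Bulk-fill vec.vector[0:batch_size]; raises IndexError when too small,
    like lengths.nextVector writing past the end of scratchlcv.vector."""
    for i in range(batch_size):
        vec.vector[i] = i  # stand-in for decoded run lengths

scratch = ScratchVector()
scratch.ensure_size(1026, False)   # the guard the fix adds before the bulk read
read_lengths(scratch, 1026)        # succeeds only because the guard ran
```

The dictionary reader already followed this pattern, which is why only the DIRECT/DIRECT_V2 path failed; the fix simply applies the same guard before the bulk read in commonReadByteArrays.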
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420716#comment-15420716 ]

Sergey Zadoroshnyak commented on HIVE-14483:

[~sershe] The patch looks good and there are no test failures. Who is responsible for pushing it to master?

> java.lang.ArrayIndexOutOfBoundsException
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416889#comment-15416889 ]

Sergey Zadoroshnyak commented on HIVE-14483:

[~sershe] Should we ignore these test failures?

> java.lang.ArrayIndexOutOfBoundsException
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[jira] [Updated] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Zadoroshnyak updated HIVE-14483:
---------------------------------------
Attachment: (was: 0001-HIVE-14483.patch)

> java.lang.ArrayIndexOutOfBoundsException
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416083#comment-15416083 ]

Sergey Zadoroshnyak commented on HIVE-14483:

After upgrading to Hive 2.1.0, we only found the exception for StringDirectTreeReader. But I think you should ask [~owen.omalley]; he is responsible for https://issues.apache.org/jira/browse/HIVE-12159 and has not responded so far.

> java.lang.ArrayIndexOutOfBoundsException
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415807#comment-15415807 ] Sergey Zadoroshnyak commented on HIVE-14483: [~sershe] Do you know who is responsible for Hive ORC module?
[jira] [Comment Edited] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415069#comment-15415069 ] Sergey Zadoroshnyak edited comment on HIVE-14483 at 8/10/16 10:21 AM: -- please ignore this comment was (Author: spring): please ingore this comment
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415080#comment-15415080 ] Sergey Zadoroshnyak commented on HIVE-14483: [~owen.omalley] [~prasanth_j] Could you please review pull request?
[jira] [Comment Edited] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415067#comment-15415067 ] Sergey Zadoroshnyak edited comment on HIVE-14483 at 8/10/16 10:21 AM: -- please ignore this comment was (Author: spring): please ingore this comment
[jira] [Updated] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zadoroshnyak updated HIVE-14483: --- Attachment: 0001-HIVE-14483.patch Fix java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[jira] [Issue Comment Deleted] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zadoroshnyak updated HIVE-14483: --- Comment: was deleted (was: Fix java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays)
[jira] [Updated] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zadoroshnyak updated HIVE-14483: --- Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415068#comment-15415068 ] Sergey Zadoroshnyak commented on HIVE-14483: please ingore this comment
[jira] [Issue Comment Deleted] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zadoroshnyak updated HIVE-14483: --- Comment: was deleted (was: please ingore this comment) > java.lang.ArrayIndexOutOfBoundsException > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays > -- > > Key: HIVE-14483 > URL: https://issues.apache.org/jira/browse/HIVE-14483 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Sergey Zadoroshnyak >Assignee: Owen O'Malley >Priority: Critical > Fix For: 2.2.0 > > > Error message: > Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024 > at > org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369) > at > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231) > at > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268) > at > org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368) > at > org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212) > at > org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902) > at > org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737) > at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350) > 
... 22 more > How to reproduce? > Configure a StringTreeReader that contains a StringDirectTreeReader as the TreeReader (DIRECT or DIRECT_V2 column encoding), set batchSize = 1026, and invoke nextVector(ColumnVector previousVector, boolean[] isNull, final int batchSize). > scratchlcv is a LongColumnVector whose long[] vector has length 1024; it is passed into BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, scratchlcv, result, batchSize); > as a result, commonReadByteArrays(stream, lengths, scratchlcv, result, (int) batchSize) throws an ArrayIndexOutOfBoundsException. > If we use StringDictionaryTreeReader instead, there is no exception, because scratchlcv.ensureSize((int) batchSize, false) is called before reader.nextVector(scratchlcv, scratchlcv.vector, batchSize); > These changes were made for Hive 2.1.0 by commit https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467 for task https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley. > How to fix? > Add a single line: > scratchlcv.ensureSize((int) batchSize, false); > in method org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream stream, IntegerReader lengths, LongColumnVector scratchlcv, BytesColumnVector result, final int batchSize), before the invocation of lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
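The reproduction and one-line fix described in the HIVE-14483 report above can be sketched without the ORC classes. The names below (ScratchVectorDemo, ensureSize, nextVector) are simplified stand-ins for LongColumnVector and IntegerReader.nextVector, not the real ORC API: a scratch vector sized for the default 1024-row batch overflows when batchSize exceeds it, unless an ensureSize-style guard grows it first.

```java
// Simplified sketch of the HIVE-14483 failure mode (hypothetical stand-in
// classes, not the real ORC API).
public class ScratchVectorDemo {
    public static final int DEFAULT_SIZE = 1024;

    // Stand-in for LongColumnVector.ensureSize: grow the array if too small.
    public static long[] ensureSize(long[] vector, int batchSize) {
        if (vector.length < batchSize) {
            return java.util.Arrays.copyOf(vector, batchSize);  // grow, preserving contents
        }
        return vector;
    }

    // Stand-in for lengths.nextVector(...): writes batchSize entries into the
    // scratch vector, overflowing if it is too small.
    public static void nextVector(long[] vector, int batchSize) {
        for (int i = 0; i < batchSize; i++) {
            vector[i] = i;
        }
    }

    public static boolean readWithGuard(int batchSize) {
        long[] scratch = new long[DEFAULT_SIZE];
        scratch = ensureSize(scratch, batchSize);  // the one-line fix from the report
        nextVector(scratch, batchSize);
        return true;
    }

    public static boolean readWithoutGuard(int batchSize) {
        long[] scratch = new long[DEFAULT_SIZE];
        try {
            nextVector(scratch, batchSize);        // overflows for batchSize > 1024
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;                          // the reported exception
        }
    }
}
```

With batchSize = 1026 (the value from the report), the unguarded path overflows at index 1024 while the guarded path succeeds, mirroring why StringDictionaryTreeReader (which calls ensureSize) is unaffected.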
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415069#comment-15415069 ] Sergey Zadoroshnyak commented on HIVE-14483: please ignore this comment > java.lang.ArrayIndexOutOfBoundsException > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415067#comment-15415067 ] Sergey Zadoroshnyak commented on HIVE-14483: please ignore this comment > java.lang.ArrayIndexOutOfBoundsException > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413659#comment-15413659 ] Sergey Zadoroshnyak commented on HIVE-14483: [~owen.omalley] Could you please take a look? > java.lang.ArrayIndexOutOfBoundsException > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[jira] [Commented] (HIVE-12159) Create vectorized readers for the complex types
[ https://issues.apache.org/jira/browse/HIVE-12159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413607#comment-15413607 ] Sergey Zadoroshnyak commented on HIVE-12159: This patch causes java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays. Tracking at https://issues.apache.org/jira/browse/HIVE-14483 > Create vectorized readers for the complex types > --- > > Key: HIVE-12159 > URL: https://issues.apache.org/jira/browse/HIVE-12159 > Project: Hive > Issue Type: Sub-task >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.1.0 > > Attachments: HIVE-12159.patch, HIVE-12159.patch, HIVE-12159.patch, > HIVE-12159.patch, HIVE-12159.patch, HIVE-12159.patch, HIVE-12159.patch, > HIVE-12159.patch, HIVE-12159.patch > > > We need vectorized readers for the complex types. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13951) GenericUDFArray should constant fold at compile time
[ https://issues.apache.org/jira/browse/HIVE-13951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zadoroshnyak updated HIVE-13951: --- Assignee: Gopal V > GenericUDFArray should constant fold at compile time > > > Key: HIVE-13951 > URL: https://issues.apache.org/jira/browse/HIVE-13951 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.3.0, 2.1.0 >Reporter: Sergey Zadoroshnyak >Assignee: Gopal V >Priority: Critical > > 1. Hive constant propagation optimizer is enabled. > hive.optimize.constant.propagation=true; > 2. Hive query: > select array('Total','Total') from some_table; > ERROR: org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory > (ConstantPropagateProcFactory.java:evaluateFunction(939)) - Unable to > evaluate org.apache.hadoop.hive.ql.udf.generic.GenericUDFArray@3d26c423. > Return value unrecoginizable. > Details: > During compilation of query, hive checks if any subexpression of a specified > expression can be evaluated to be constant and replaces such subexpression > with the constant. > If the expression is a deterministic UDF and all the subexpressions are > constants, the value will be calculated immediately during compilation time > (not runtime) > So array is a deterministic UDF, 'Total' is string constant. So Hive tries > to replace result of evaluation UDF with the constant. > But looks like, that Hive only supports primitives and struct objects. > So, array is not supported yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
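The folding rule the HIVE-13951 description sketches — a deterministic UDF whose subexpressions are all constants is evaluated once at compile time and replaced by a constant — can be illustrated with a toy expression tree. The classes below are hypothetical stand-ins for illustration only, not Hive's ConstantPropagateProcFactory API:

```java
// Hypothetical miniature of compile-time constant folding; illustrative
// stand-ins, not Hive's actual optimizer classes.
public class ConstantFoldDemo {
    public interface Expr {
        Object eval();
        boolean isConstant();
    }

    public static class Const implements Expr {
        final Object value;
        public Const(Object value) { this.value = value; }
        public Object eval() { return value; }
        public boolean isConstant() { return true; }
    }

    public static class Call implements Expr {
        final boolean deterministic;
        final java.util.function.Function<Object[], Object> fn;
        final Expr[] args;
        public Call(boolean deterministic,
                    java.util.function.Function<Object[], Object> fn, Expr... args) {
            this.deterministic = deterministic;
            this.fn = fn;
            this.args = args;
        }
        public boolean isConstant() { return false; }
        public Object eval() {
            Object[] vals = new Object[args.length];
            for (int i = 0; i < args.length; i++) vals[i] = args[i].eval();
            return fn.apply(vals);
        }
    }

    // Compile-time rule: a deterministic call whose arguments are all
    // constants is evaluated once and replaced by a constant node.
    public static Expr fold(Expr e) {
        if (!(e instanceof Call)) return e;
        Call c = (Call) e;
        if (!c.deterministic) return c;                  // e.g. rand(): must stay a runtime call
        for (Expr a : c.args) if (!a.isConstant()) return c;
        return new Const(c.eval());                      // evaluated during "compilation"
    }
}
```

In HIVE-13951's terms, array('Total','Total') is exactly such a call; the fold step fails not on this rule but on the last line, because Hive at the reported versions could represent only primitive and struct constants, not a list-typed one.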
[jira] [Updated] (HIVE-13951) GenericUDFArray should constant fold at compile time
[ https://issues.apache.org/jira/browse/HIVE-13951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zadoroshnyak updated HIVE-13951: --- Priority: Critical (was: Major) > GenericUDFArray should constant fold at compile time
[jira] [Commented] (HIVE-13951) GenericUDFArray should constant fold at compile time
[ https://issues.apache.org/jira/browse/HIVE-13951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15316492#comment-15316492 ] Sergey Zadoroshnyak commented on HIVE-13951: [~xuefuz] [~gopalv] [~hsubramaniyan] Could you please take a look? > GenericUDFArray should constant fold at compile time
[jira] [Updated] (HIVE-13021) GenericUDAFEvaluator.isEstimable(agg) always returns false
[ https://issues.apache.org/jira/browse/HIVE-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Zadoroshnyak updated HIVE-13021: --- Priority: Critical (was: Major) > GenericUDAFEvaluator.isEstimable(agg) always returns false > -- > > Key: HIVE-13021 > URL: https://issues.apache.org/jira/browse/HIVE-13021 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.2.1 >Reporter: Sergey Zadoroshnyak >Assignee: Gopal V >Priority: Critical > Labels: Performance > > GenericUDAFEvaluator.isEstimable(agg) always returns false, because > annotation AggregationType has default RetentionPolicy.CLASS and cannot be > retained by the VM at run time. > As result estimate method will never be executed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
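The cause described in HIVE-13021 is plain Java annotation behavior: an annotation type with no explicit @Retention defaults to CLASS retention, which is kept in the class file but not loaded by the VM, so reflection never sees it. A minimal, self-contained demonstration (the annotation names here are hypothetical, not Hive's AggregationType):

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Without @Retention(RetentionPolicy.RUNTIME), an annotation defaults to CLASS
// retention and is invisible to reflection, so an isAnnotationPresent-style
// check (like isEstimable's) always returns false.
public class RetentionDemo {
    public @interface ClassRetained {}          // no @Retention: defaults to CLASS

    @Retention(RetentionPolicy.RUNTIME)
    public @interface RuntimeRetained {}

    @ClassRetained
    @RuntimeRetained
    public static class Annotated {}

    // Analogue of the annotation check inside GenericUDAFEvaluator.isEstimable(agg).
    public static boolean visible(Class<? extends java.lang.annotation.Annotation> a) {
        return Annotated.class.isAnnotationPresent(a);
    }
}
```

Here visible(ClassRetained.class) is false even though Annotated is annotated with it, which is exactly why isEstimable always returned false; adding @Retention(RetentionPolicy.RUNTIME) to the annotation type is the fix direction the report implies.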
[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop
[ https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134050#comment-15134050 ] Sergey Zadoroshnyak commented on HIVE-8188: --- [~hagleitn] [~prasanth_j] [~thejas] GenericUDAFEvaluator.isEstimable(agg) always returns false, because annotation AggregationType has default RetentionPolicy.CLASS and cannot be retained by the VM at run time. As result estimate method will never be executed. I am going to open new jira issue, if no objections > ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight > loop > - > > Key: HIVE-8188 > URL: https://issues.apache.org/jira/browse/HIVE-8188 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 0.14.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: Performance > Fix For: 0.14.0 > > Attachments: HIVE-8188.1.patch, HIVE-8188.2.patch, > udf-deterministic.png > > > When running a near-constant UDF, most of the CPU is burnt within the VM > trying to read the class annotations for every row. > !udf-deterministic.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
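A common remedy for the hot-loop reflection cost HIVE-8188 describes — and, judging by the issue title, the direction its patch took — is to resolve the class annotation once, at construction time, and read a cached boolean inside the per-row loop. The sketch below uses illustrative names under that assumption, not Hive's actual ExprNodeGenericFuncEvaluator classes:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Illustrative sketch of the fix pattern for per-row annotation lookups:
// do the reflective lookup once in the constructor, keep the tight row loop free
// of reflection. Names are hypothetical, not Hive's API.
public class CachedAnnotationDemo {
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Deterministic {}

    @Deterministic
    public static class UpperUdf {
        public String evaluate(String s) { return s.toUpperCase(); }
    }

    public static class Evaluator {
        private final UpperUdf udf;
        private final boolean deterministic;   // cached once, never re-read per row

        public Evaluator(UpperUdf udf) {
            this.udf = udf;
            // The only reflective call: outside the row loop.
            this.deterministic = udf.getClass().isAnnotationPresent(Deterministic.class);
        }

        public boolean isDeterministic() { return deterministic; }

        public String[] evaluateAll(String[] rows) {
            String[] out = new String[rows.length];
            for (int i = 0; i < rows.length; i++) {
                out[i] = udf.evaluate(rows[i]);   // no annotation loading in the hot loop
            }
            return out;
        }
    }
}
```

This also shows why HIVE-13021 matters in practice: if Deterministic lacked RUNTIME retention, the cached flag would silently be false no matter how the UDF is annotated.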