[ https://issues.apache.org/jira/browse/HIVE-27059?focusedWorklogId=844242&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-844242 ]

ASF GitHub Bot logged work on HIVE-27059:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Feb/23 07:12
            Start Date: 08/Feb/23 07:12
    Worklog Time Spent: 10m
      Work Description: uncleGen opened a new pull request, #4042:
URL: https://github.com/apache/hive/pull/4042

   ### What changes were proposed in this pull request?

   Queries fail when collect_list (or collect_set) is used with map-side aggregation disabled:

   ```
   Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryMap cannot be cast to java.util.Map
       at org.apache.hadoop.hive.serde2.objectinspector.StandardMapObjectInspector.getMap(StandardMapObjectInspector.java:85)
       at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:437)
       at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:362)
       at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.putIntoCollection(GenericUDAFMkCollectionEvaluator.java:154)
       at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.iterate(GenericUDAFMkCollectionEvaluator.java:120)
       at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:192)
       at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:638)
       at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:877)
       at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:721)
       at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
   ```

   To reproduce this issue:

   ```
   create table tb1 (a int, b string, c string);
   insert into tb1 values (1, "100", "101");
   insert into tb1 values (1, "102", "103");
   insert into tb1 values (2, "200", "201");

   set hive.map.aggr=false;

   select a, collect_list(map("b",b,"c",c)) as col1 from tb1 group by a;
   select a, collect_set(array(b, c)) as col1 from tb1 group by a;
   ```

   ### Why are the changes needed?

   The wrong object inspector is created when collect_list is used with map-side aggregation disabled.

   ### Does this PR introduce _any_ user-facing change?

   No

   ### How was this patch tested?

   Unit tests and manual testing.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 844242)
    Remaining Estimate: 0h
            Time Spent: 10m

> Wrong object inspector will be created when use collect_list and disable map-side aggregation
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27059
>                 URL: https://issues.apache.org/jira/browse/HIVE-27059
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.3.8, 3.1.3, 4.0.0-alpha-2
>         Environment:
>            Reporter: Genmao Yu
>            Assignee: Genmao Yu
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Queries fail when collect_list (or collect_set) is used with map-side aggregation disabled:
> {code:java}
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryMap cannot be cast to java.util.Map
>     at org.apache.hadoop.hive.serde2.objectinspector.StandardMapObjectInspector.getMap(StandardMapObjectInspector.java:85)
>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:437)
>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:362)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.putIntoCollection(GenericUDAFMkCollectionEvaluator.java:154)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.iterate(GenericUDAFMkCollectionEvaluator.java:120)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:192)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:638)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:877)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:721)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
> {code}
> To reproduce this issue:
> {code:sql}
> create table tb1 (a int, b string, c string);
> insert into tb1 values (1, "100", "101");
> insert into tb1 values (1, "102", "103");
> insert into tb1 values (2, "200", "201");
> set hive.map.aggr=false;
> select a, collect_list(map("b",b,"c",c)) as col1 from tb1 group by a;
> select a, collect_set(array(b, c)) as col1 from tb1 group by a;
> {code}
> To work around this issue:
> {code:sql}
> set hive.map.aggr=true;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)