[ 
https://issues.apache.org/jira/browse/HIVE-27059?focusedWorklogId=844242&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-844242
 ]

ASF GitHub Bot logged work on HIVE-27059:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Feb/23 07:12
            Start Date: 08/Feb/23 07:12
    Worklog Time Spent: 10m 
      Work Description: uncleGen opened a new pull request, #4042:
URL: https://github.com/apache/hive/pull/4042

   ### What changes were proposed in this pull request?
   Queries fail when collect_list (or collect_set) is used with map-side 
aggregation disabled:
   ```
   Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryMap cannot be cast to java.util.Map
           at org.apache.hadoop.hive.serde2.objectinspector.StandardMapObjectInspector.getMap(StandardMapObjectInspector.java:85)
           at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:437)
           at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:362)
           at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.putIntoCollection(GenericUDAFMkCollectionEvaluator.java:154)
           at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.iterate(GenericUDAFMkCollectionEvaluator.java:120)
           at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:192)
           at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:638)
           at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:877)
           at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:721)
           at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
   ```
   
   To reproduce this issue:
   ```
   create table tb1 (a int, b string, c string);
   insert into tb1 values (1, "100", "101");
   insert into tb1 values (1, "102", "103");
   insert into tb1 values (2, "200", "201");
   set hive.map.aggr=false;
   select a, collect_list(map("b",b,"c",c)) as col1 from tb1 group by a;
   select a, collect_set(array(b, c)) as col1 from tb1 group by a;
   ```
   
   
   ### Why are the changes needed?
   
   The wrong object inspector is created when collect_list (or collect_set) 
is used with map-side aggregation disabled.
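
The stack trace suggests the failure mode: a standard map inspector casts its input to java.util.Map, but the row holds a lazily deserialized object that is not a Map. The following is a minimal sketch of that mismatch only. All classes in it (LazyMapStandIn, StandardInspectorStandIn, LazyInspectorStandIn) are hypothetical stand-ins written for illustration; they are not Hive's classes and do not reflect the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

public class InspectorMismatchSketch {
    // Hypothetical stand-in for a lazily deserialized map value.
    // Like the object in the stack trace, it does NOT implement java.util.Map.
    static class LazyMapStandIn {
        private final Map<String, String> decoded = new HashMap<>();
        LazyMapStandIn(String key, String value) { decoded.put(key, value); }
        Map<String, String> getMap() { return decoded; }
    }

    // Hypothetical inspector interface: knows how to read a map out of a raw object.
    interface MapInspector {
        Map<?, ?> getMap(Object data);
    }

    // Assumes the data already is a java.util.Map and casts it directly.
    static class StandardInspectorStandIn implements MapInspector {
        public Map<?, ?> getMap(Object data) { return (Map<?, ?>) data; }
    }

    // Matches the lazy representation and unwraps it instead of casting.
    static class LazyInspectorStandIn implements MapInspector {
        public Map<?, ?> getMap(Object data) { return ((LazyMapStandIn) data).getMap(); }
    }

    // Reads the data through the given inspector and reports what happened.
    static String tryRead(MapInspector inspector, Object data) {
        try {
            return "ok: " + inspector.getMap(data);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        Object row = new LazyMapStandIn("b", "100");
        // Inspector does not match the data representation: the cast fails.
        System.out.println(tryRead(new StandardInspectorStandIn(), row));
        // Inspector matches the data representation: the read succeeds.
        System.out.println(tryRead(new LazyInspectorStandIn(), row));
    }
}
```

In this sketch the first read reports a ClassCastException and the second succeeds, which is the general reason an inspector has to match the representation of the data it is handed.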
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   
   ### How was this patch tested?
   
   Unit tests and manual testing.




Issue Time Tracking
-------------------

            Worklog Id:     (was: 844242)
    Remaining Estimate: 0h
            Time Spent: 10m

> Wrong object inspector will be created when use collect_list and disable 
> map-side aggregation
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27059
>                 URL: https://issues.apache.org/jira/browse/HIVE-27059
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.3.8, 3.1.3, 4.0.0-alpha-2
>         Environment: 
>            Reporter: Genmao Yu
>            Assignee: Genmao Yu
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Queries fail when collect_list (or collect_set) is used with map-side 
> aggregation disabled:
> {code:java}
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryMap cannot be cast to java.util.Map
>         at org.apache.hadoop.hive.serde2.objectinspector.StandardMapObjectInspector.getMap(StandardMapObjectInspector.java:85)
>         at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:437)
>         at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:362)
>         at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.putIntoCollection(GenericUDAFMkCollectionEvaluator.java:154)
>         at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.iterate(GenericUDAFMkCollectionEvaluator.java:120)
>         at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:192)
>         at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:638)
>         at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:877)
>         at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:721)
>         at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
> {code}
> To reproduce this issue:
> {code:sql}
> create table tb1 (a int, b string, c string);
> insert into tb1 values (1, "100", "101");
> insert into tb1 values (1, "102", "103");
> insert into tb1 values (2, "200", "201");
> set hive.map.aggr=false;
> select a, collect_list(map("b",b,"c",c)) as col1 from tb1 group by a;
> select a, collect_set(array(b, c)) as col1 from tb1 group by a;
> {code}
> To work around this issue:
> {code:sql}
> set hive.map.aggr=true;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
