[ https://issues.apache.org/jira/browse/HIVE-21111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797787#comment-16797787 ]

zhuwei commented on HIVE-21111:
-------------------------------

[~lirui] Since it depends on table data size, it's not easy to reproduce from scratch. The root cause is that a child task of a conditional task can itself be a conditional task. Please take a look at the code that I pasted in the description; I think the bug is obvious there. The SQL that triggered this bug in our production environment is like this:

set hive.auto.convert.join=true;
set hive.optimize.skewjoin = true;
explain
insert overwrite table dw.dwd_tc_order_old_d_orign
select
  a.order_no,
  a.kdt_id,
  a.store_id,
  a.order_type,
  a.features,
  a.state,
  a.close_state,
  a.pay_state,
  b.origin_price,
  a.buy_way,
  b.goods_num,
  b.goods_pay,
  a.express_type,
  case when ((a.state >=6 and a.state <> 99) or a.express_time <> 0) then 1 else 0 end as express_state,
  case when ((a.state >=6 and a.state <> 99) or a.express_time <> 0) then 'a' else 'b' end as express_state_name,
  if((a.order_type=6 and a.pay_state>0),1,a.stock_state) as stock_state,
  a.customer_id,
  a.customer_type,
  a.customer_name,
  a.buyer_id,
  a.buyer_phone,
  if(a.book_time=0 or a.book_time is null,'0',udf.format_unixtime(a.book_time)) as book_time,
  if(a.pay_time=0 or a.pay_time is null,'0',udf.format_unixtime(a.pay_time)) as pay_time,
  if(a.express_time=0 or a.express_time is null,'0',udf.format_unixtime(a.express_time)) as express_time,
  if(a.success_time=0 or a.success_time is null,'0',udf.format_unixtime(a.success_time)) as success_time,
  if(a.close_time=0 or a.close_time is null,0,udf.format_unixtime(a.close_time)) as close_time,
  if(a.feedback_time=0 or a.feedback_time is null,'0',udf.format_unixtime(a.feedback_time)) as feedback_time
FROM (
  select order_no, kdt_id,store_id,features,state,close_state,pay_state,order_type,
    buy_way,express_type,activity_type,
    express_state,feedback,refund_state,stock_state,customer_id,customer_type,customer_name,buyer_id,buyer_phone,
    book_time,pay_time,
    express_time,success_time,close_time,feedback_time
  FROM ods.tc_seller_order
  where kdt_id<>0
    and (length(order_no)<> 24 OR substr(order_no,1,1) <> 'E' OR substr(order_no,-5,1) <> '0')
) a
join (
  select order_no,
    cast(sum(price * num)as bigint) as origin_price ,
    sum(num) AS goods_num,
    cast(sum(pay_price*num) AS bigint) AS goods_pay
  from ods.tc_order_item
  where (length(order_no)<> 24 OR substr(order_no,1,1) <> 'E' OR substr(order_no,-5,1) <> '0')
  group by order_no
) b
on a.order_no = b.order_no;

> ConditionalTask cannot be cast to MapRedTask
> --------------------------------------------
>
>                 Key: HIVE-21111
>                 URL: https://issues.apache.org/jira/browse/HIVE-21111
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 2.1.1, 3.1.1, 2.3.4
>            Reporter: zhuwei
>            Assignee: zhuwei
>            Priority: Major
>         Attachments: HIVE-21111.1.patch
>
>
> We hit an error like this in our production environment:
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.ConditionalTask cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>     at org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:173)
>
> There is a bug in function org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch:
> if (tsk.isMapRedTask()) {
>   Task<? extends Serializable> newTask = this.processCurrentTask((MapRedTask) tsk,
>       ((ConditionalTask) currTask), physicalContext.getContext());
>   walkerCtx.addToDispatchList(newTask);
> }
> In the above code, when tsk is an instance of ConditionalTask, tsk.isMapRedTask() can still be true, but tsk cannot be cast to MapRedTask.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
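
For readers without a dataset that triggers the plan, the cast failure described in the issue can be illustrated in isolation. The following is only a minimal sketch: Task, MapRedTask, ConditionalTask, dispatchUnchecked, dispatchGuarded and nestedPlan are simplified hypothetical stand-ins, not Hive's real classes or methods. It assumes that a conditional task reports isMapRedTask() as true whenever one of its children does. The instanceof guard is one possible way to avoid the bad cast and is not necessarily what HIVE-21111.1.patch does.

```java
import java.util.ArrayList;
import java.util.List;

public class ConditionalCastSketch {

    // Hypothetical stand-in for Hive's Task base class.
    static abstract class Task {
        abstract boolean isMapRedTask();
    }

    // Hypothetical stand-in for MapRedTask.
    static class MapRedTask extends Task {
        @Override
        boolean isMapRedTask() {
            return true;
        }
    }

    // Hypothetical stand-in for ConditionalTask.
    static class ConditionalTask extends Task {
        final List<Task> children = new ArrayList<>();

        // Assumption: true as soon as any child reports true.
        @Override
        boolean isMapRedTask() {
            for (Task child : children) {
                if (child.isMapRedTask()) {
                    return true;
                }
            }
            return false;
        }
    }

    // Mirrors the shape of the reported code: test isMapRedTask(), then cast.
    static int dispatchUnchecked(ConditionalTask parent) {
        int processed = 0;
        for (Task tsk : parent.children) {
            if (tsk.isMapRedTask()) {
                // Throws ClassCastException when tsk is a nested ConditionalTask.
                MapRedTask mr = (MapRedTask) tsk;
                processed++;
            }
        }
        return processed;
    }

    // One possible guard: check the concrete type before treating tsk as a MapRedTask.
    static int dispatchGuarded(ConditionalTask parent) {
        int processed = 0;
        for (Task tsk : parent.children) {
            if (tsk instanceof MapRedTask) {
                processed++;
            }
        }
        return processed;
    }

    // Builds a conditional task whose children are a MapRedTask and another conditional task.
    static ConditionalTask nestedPlan() {
        ConditionalTask inner = new ConditionalTask();
        inner.children.add(new MapRedTask());
        ConditionalTask outer = new ConditionalTask();
        outer.children.add(new MapRedTask());
        outer.children.add(inner);
        return outer;
    }

    public static void main(String[] args) {
        ConditionalTask plan = nestedPlan();
        try {
            dispatchUnchecked(plan);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getMessage());
        }
        System.out.println("guarded dispatch processed " + dispatchGuarded(plan) + " task(s)");
    }
}
```

In this sketch the nested conditional child passes the isMapRedTask() check but fails the cast, which matches the situation reported at AbstractJoinTaskDispatcher.java:173.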