[ https://issues.apache.org/jira/browse/HIVE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated HIVE-8920:
------------------------------
    Comment: was deleted

(was: With latest Spark branch, I got the following plan
{code}
hive> explain from (select * from dec2 union all select * from dec3) s
    > insert overwrite table dec4 select name, avg(value) group by name
    > insert overwrite table dec5 select name, sum(value) group by name;
OK
STAGE DEPENDENCIES:
  Stage-2 is a root stage
  Stage-0 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-0
  Stage-1 depends on stages: Stage-2
  Stage-4 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-2
    Spark
      Edges:
        Union 2 <- Map 1 (NONE, 0), Map 4 (NONE, 0)
        Reducer 3 <- Union 2 (SORT, 1)
{code}
The plan seems correct to me. [~csun], what's the real issue here?)

> SplitSparkWorkResolver doesn't work with UnionWork [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-8920
>                 URL: https://issues.apache.org/jira/browse/HIVE-8920
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Chao
>
> The following query will not work:
> {code}
> from (select * from table0 union all select * from table1) s
> insert overwrite table table3 select s.x, count(1) group by s.x
> insert overwrite table table4 select s.y, count(1) group by s.y;
> {code}
> Currently, the plan for this query, before SplitSparkWorkResolver, looks like
> below:
> {noformat}
>    M1    M2
>      \  /  \
>       U3    R5
>       |
>       R4
> {noformat}
> In {{SplitSparkWorkResolver#splitBaseWork}}, it assumes that the
> {{childWork}} is a ReduceWork, but for this case, you can see that for M2 the
> childWork could be UnionWork U3. Thus, the code will fail.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)