[
https://issues.apache.org/jira/browse/HIVE-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243562#comment-14243562
]
Xuefu Zhang commented on HIVE-9041:
-----------------------------------
So the problem is that a single (static) instance is used to cache IOContext
(in the case of RDD caching), while there could be more than one input being
processed. Maybe we should use something more sophisticated than a single
static variable.
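
One possible direction, as a minimal sketch only: keep one context per input
instead of one static context shared by all inputs. The names below
(IOContextSketch, IOContextRegistry, the input-name key) are hypothetical and
are not Hive's actual classes. The sketch assumes each input can be identified
by a name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for an IO context; it only tracks the current input path. */
class IOContextSketch {
    private String inputPath;

    String getInputPath() {
        return inputPath;
    }

    void setInputPath(String inputPath) {
        this.inputPath = inputPath;
    }
}

/**
 * Hypothetical registry that keeps one context per named input,
 * instead of a single static context shared by every input.
 */
final class IOContextRegistry {
    private static final ConcurrentMap<String, IOContextSketch> CONTEXTS =
        new ConcurrentHashMap<>();

    private IOContextRegistry() {
    }

    /** Returns the context for the given input, creating it on first use. */
    static IOContextSketch get(String inputName) {
        return CONTEXTS.computeIfAbsent(inputName, k -> new IOContextSketch());
    }

    /** Drops all cached contexts, e.g. when a task finishes. */
    static void clear() {
        CONTEXTS.clear();
    }
}
```

With this shape, two inputs processed in the same JVM (say "table0" and
"table1") would each get their own context, so updating one no longer
overwrites the state of the other.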
> Generate better plan for queries containing both union and multi-insert
> [Spark Branch]
> --------------------------------------------------------------------------------------
>
> Key: HIVE-9041
> URL: https://issues.apache.org/jira/browse/HIVE-9041
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: spark-branch
> Reporter: Chao
> Assignee: Chao
>
> This is a follow-up for HIVE-8920. For queries like:
> {code}
> from (select * from table0 union all select * from table1) s
> insert overwrite table table3 select s.x, count(1) group by s.x
> insert overwrite table table4 select s.y, count(1) group by s.y;
> {code}
> Currently we generate the following plan:
> {noformat}
> M1    M2
>   \  /  \
>    U3    R5
>    |
>    R4
> {noformat}
> It's better, however, to have the following plan:
> {noformat}
> M1  M2
> |\  /|
> | \/ |
> | /\ |
> R4  R5
> {noformat}
> Also, we can do some research in this JIRA to see if it's possible
> to remove UnionWork once and for all.