[
https://issues.apache.org/jira/browse/HIVE-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243556#comment-14243556
]
Chao commented on HIVE-9041:
----------------------------
I just found another bug involving IOContext that shows up when caching is turned on.
Taking the sample query from the issue description as an example, I currently get this plan:
{noformat}
MW 1 (table0)   MW 2 (table1)   MW 3 (table0)   MW 4 (table1)
      \              /                \              /
       \            /                  \            /
        \          /                    \          /
         \        /                      \        /
            RW 1                            RW 2
{noformat}
Suppose the MapWorks are executed from left to right, and suppose we are
running with a single thread.
Then the following happens:
1. executing MW 1: since this is the first time we access table0, initialize
IOContext and make the input path point to table0;
2. executing MW 2: since this is the first time we access table1, initialize
IOContext and make the input path point to table1;
3. executing MW 3: since this is the second time we access table0, *do not*
initialize IOContext, and reuse the copy saved in step 2, *which points to table1*.
Step 3 then fails, because MW 3 reads table0 while IOContext still refers to table1.
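The sequence above can be reproduced with a small stand-alone model. This is only a sketch of the behaviour as described in this comment, not Hive code: the `IOContext` class, `run_map_works`, and the `seen_tables` set are hypothetical names, and the model assumes one shared IOContext per thread that is initialized only on the first access to each table.

```python
# Hypothetical model of the behaviour described above; none of these names
# are Hive's real classes or methods.

class IOContext:
    """Shared per-thread state holding the current input path."""

    def __init__(self):
        self.input_path = None


def run_map_works(map_works):
    """Run (name, table) pairs in order on one thread with caching enabled.

    Returns a list of (name, expected_table, observed_input_path).
    """
    io_context = IOContext()  # a single copy, shared by every MapWork
    seen_tables = set()       # tables that have already been accessed once
    observed = []
    for name, table in map_works:
        if table not in seen_tables:
            # First access: initialize IOContext and point it at this table.
            io_context.input_path = table
            seen_tables.add(table)
        # Otherwise initialization is skipped and the saved copy is reused.
        observed.append((name, table, io_context.input_path))
    return observed


plan = [("MW 1", "table0"), ("MW 2", "table1"),
        ("MW 3", "table0"), ("MW 4", "table1")]

for name, expected, actual in run_map_works(plan):
    status = "ok" if expected == actual else "MISMATCH"
    print(f"{name}: expects {expected}, IOContext points to {actual} ({status})")
```

In this model MW 3 is the first MapWork that sees a stale input path. MW 4 only looks correct because MW 2 happened to be the last one to initialize IOContext.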
> Generate better plan for queries containing both union and multi-insert
> [Spark Branch]
> --------------------------------------------------------------------------------------
>
> Key: HIVE-9041
> URL: https://issues.apache.org/jira/browse/HIVE-9041
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: spark-branch
> Reporter: Chao
> Assignee: Chao
>
> This is a follow-up for HIVE-8920. For queries like:
> {code}
> from (select * from table0 union all select * from table1) s
> insert overwrite table table3 select s.x, count(1) group by s.x
> insert overwrite table table4 select s.y, count(1) group by s.y;
> {code}
> Currently we generate the following plan:
> {noformat}
>    M1    M2
>      \  / \
>       U3   R5
>       |
>       R4
> {noformat}
> It's better, however, to have the following plan:
> {noformat}
> M1  M2
> |\  /|
> | \/ |
> | /\ |
> R4  R5
> {noformat}
> Also, we can do some research in this JIRA to see if it's possible
> to remove UnionWork once and for all.