Chao created HIVE-8215:
--------------------------
Summary: Multi-table insertion optimization #3: use 1+1 tasks
instead of 1+N tasks [Spark Branch]
Key: HIVE-8215
URL: https://issues.apache.org/jira/browse/HIVE-8215
Project: Hive
Issue Type: Improvement
Components: Spark
Reporter: Chao
Currently, a multi-table insertion generates 1+N tasks: "1" is the task that
generates the input, and "N" are the insert queries that each read from that
input and write to a separate output table.
To make these N tasks run in parallel, we rely on {{hive.exec.parallel}} being
set to {{true}}. In this patch, we propose an alternative approach: combine
these N tasks into a single task containing N separate operator trees, which at
execution time produce N result RDDs. We may then be able to execute these N
RDDs in parallel inside Spark, without needing {{hive.exec.parallel}}.
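
To illustrate the proposed 1+1 shape, here is a minimal sketch in plain Python.
It is only a model of the idea, not Hive or Spark code: the function names, the
table names, and the use of a thread pool to stand in for parallel RDD
execution are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_input():
    """Hypothetical stand-in for the "1" task that produces the shared input."""
    return [("a", 1), ("b", 2), ("c", 3), ("d", 4)]


# Hypothetical stand-ins for the N operator trees. Each one reads the same
# input and produces the rows destined for one output table.
def insert_even(rows):
    return [row for row in rows if row[1] % 2 == 0]


def insert_odd(rows):
    return [row for row in rows if row[1] % 2 == 1]


def insert_keys(rows):
    return [key for key, _ in rows]


def run_combined_task(operator_trees):
    """Model of the single combined task.

    The input is generated once, then every operator tree is submitted to a
    thread pool so the N results are computed concurrently. The thread pool
    is an assumption standing in for running N RDDs in parallel.
    """
    rows = generate_input()
    with ThreadPoolExecutor(max_workers=len(operator_trees)) as pool:
        futures = {
            table: pool.submit(tree, rows)
            for table, tree in operator_trees.items()
        }
        # Collect one result per output table.
        return {table: future.result() for table, future in futures.items()}


results = run_combined_task(
    {
        "even_table": insert_even,
        "odd_table": insert_odd,
        "keys_table": insert_keys,
    }
)
```

In this model the parallelism lives inside the one combined task, so no outer
scheduler setting (the role {{hive.exec.parallel}} plays for the 1+N plan) is
needed to overlap the N inserts.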