[ https://issues.apache.org/jira/browse/HIVE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782314#comment-13782314 ]
Johannes Alkjær commented on HIVE-4598:
---------------------------------------
Adding an extra SELECT block around the subquery fixes the execution plan, though:
{code}
EXPLAIN
FROM (
  SELECT * FROM (
    FROM ( SELECT * FROM sample ) mapout
    REDUCE * USING 'cat' AS x,y
  ) reduced
) zz
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/a' SELECT * WHERE x='a' OR x='b'
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/b' SELECT * WHERE x='c' OR x='d';
{code}
{code}
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_SUBQUERY
(TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_TABREF
(TOK_TABNAME sample))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE))
(TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)))) mapout)) (TOK_INSERT
(TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR
(TOK_TRANSFORM (TOK_EXPLIST TOK_ALLCOLREF) TOK_SERDE TOK_RECORDWRITER 'cat'
TOK_SERDE TOK_RECORDREADER (TOK_ALIASLIST x y)))))) reduced)) (TOK_INSERT
(TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR
TOK_ALLCOLREF)))) zz)) (TOK_INSERT (TOK_DESTINATION (TOK_LOCAL_DIR '/tmp/a'))
(TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or (= (TOK_TABLE_OR_COL x)
'a') (= (TOK_TABLE_OR_COL x) 'b')))) (TOK_INSERT (TOK_DESTINATION
(TOK_LOCAL_DIR '/tmp/b')) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE
(or (= (TOK_TABLE_OR_COL x) 'c') (= (TOK_TABLE_OR_COL x) 'd')))))
STAGE DEPENDENCIES:
  Stage-2 is a root stage
  Stage-0 depends on stages: Stage-2
  Stage-1 depends on stages: Stage-2
STAGE PLANS:
  Stage: Stage-2
    Map Reduce
      Alias -> Map Operator Tree:
        zz:reduced:mapout:sample
          TableScan
            alias: sample
            Select Operator
              expressions:
                    expr: key
                    type: string
                    expr: val
                    type: string
              outputColumnNames: _col0, _col1
              Transform Operator
                command: cat
                output info:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                Select Operator
                  expressions:
                        expr: _col0
                        type: string
                        expr: _col1
                        type: string
                  outputColumnNames: _col0, _col1
                  Filter Operator
                    predicate:
                        expr: ((_col0 = 'a') or (_col0 = 'b'))
                        type: boolean
                    Select Operator
                      expressions:
                            expr: _col0
                            type: string
                            expr: _col1
                            type: string
                      outputColumnNames: _col0, _col1
                      File Output Operator
                        compressed: false
                        GlobalTableId: 1
                        table:
                            input format: org.apache.hadoop.mapred.TextInputFormat
                            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  Filter Operator
                    predicate:
                        expr: ((_col0 = 'c') or (_col0 = 'd'))
                        type: boolean
                    Select Operator
                      expressions:
                            expr: _col0
                            type: string
                            expr: _col1
                            type: string
                      outputColumnNames: _col0, _col1
                      File Output Operator
                        compressed: false
                        GlobalTableId: 2
                        table:
                            input format: org.apache.hadoop.mapred.TextInputFormat
                            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
  Stage: Stage-0
    Move Operator
      files:
          hdfs directory: false
          destination: /tmp/a
  Stage: Stage-1
    Move Operator
      files:
          hdfs directory: false
          destination: /tmp/b
{code}
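For the query shape in the original report, the same workaround might look like the sketch below. This is untested and only an illustration: the table name src and the columns id and payload are hypothetical placeholders (the report does not name them), the innermost SELECT stands in for whatever subquery is used as <x>, and I have only checked the plan for the REDUCE case above, so whether the extra wrapper also restores the filters here is an assumption.
{code}
-- Sketch only: src, id and payload are hypothetical names, not from the report.
FROM (
  SELECT * FROM (
    -- the original subquery used as <x> goes here
    SELECT id, payload, type FROM src
  ) inner_q
) outer_q   -- extra SELECT block, as in the workaround above
INSERT INTO TABLE t PARTITION (type='x')
  SELECT id, payload WHERE type='x'
INSERT INTO TABLE t PARTITION (type='y')
  SELECT id, payload WHERE type='y';
{code}
Running EXPLAIN on the rewritten statement and checking that each File Output Operator sits under its own Filter Operator, as in the plan above, is a quick way to confirm the WHERE clauses are being applied.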
> Incorrect results when using subquery in multi table insert
> -----------------------------------------------------------
>
> Key: HIVE-4598
> URL: https://issues.apache.org/jira/browse/HIVE-4598
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.10.0, 0.11.0
> Reporter: Sebastian
>
> I'm using a multi table insert like this:
> FROM <x>
> INSERT INTO TABLE t PARTITION (type='x')
> SELECT * WHERE type='x'
> INSERT INTO TABLE t PARTITION (type='y')
> SELECT * WHERE type='y';
> When <x> is the name of a table, everything works as expected.
> However, if I use a subquery as <x>, the query runs but inserts all results
> from the subquery into each partition, as if there were no "WHERE" clauses in
> the selects.