[ https://issues.apache.org/jira/browse/HIVE-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vineet Garg updated HIVE-20266:
-------------------------------
    Description:
{code:sql}
CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
{code}

{code:sql}
explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, value, key as p1, 3 as p2 from src limit 10;
{code}

*Without CBO*
{noformat}
Map 1
    Map Operator Tree:
        TableScan
          alias: src
          Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: key (type: string), value (type: string), value (type: string), key (type: string), 3 (type: int)
            outputColumnNames: _col0, _col1, _col2, _col3, _col4
            Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
            Limit
              Number of rows: 10
              Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                sort order:
                Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: int)
{noformat}

*With CBO*
{noformat}
Map 1
    Map Operator Tree:
        TableScan
          alias: src
          Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: key (type: string), value (type: string), value (type: string), key (type: string)
            outputColumnNames: _col0, _col1, _col2, _col3
            Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
            Limit
              Number of rows: 10
              Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                sort order:
                Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string)
{noformat}

CBO has 4 columns being shuffled as compared to 3 in non-cbo

  was:
{code:sql}
CREATE TABLE
tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
{code}

{code:sql}
explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, value, key as p1, 3 as p2 from src limit 10;
{code}

> Extra column is being shuffled in cbo as compared to non-cbo
> ------------------------------------------------------------
>
>                 Key: HIVE-20266
>                 URL: https://issues.apache.org/jira/browse/HIVE-20266
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Planning
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>            Priority: Major
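
A possible way to reproduce the two plans from the description, sketched under assumptions the report does not state: that the plans were obtained by toggling hive.cbo.enable, that src is the usual Hive test table with string columns key and value, and that dynamic partitioning must be in nonstrict mode because both p1 and p2 are supplied dynamically.

{code:sql}
-- Assumption: no static partition is given, so nonstrict mode is required.
set hive.exec.dynamic.partition.mode=nonstrict;

-- Plan without CBO (assumed: produced with the optimizer disabled).
set hive.cbo.enable=false;
explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, value, key as p1, 3 as p2 from src limit 10;

-- Plan with CBO.
set hive.cbo.enable=true;
explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, value, key as p1, 3 as p2 from src limit 10;
{code}

Comparing the "value expressions" line of the Reduce Output Operator in each plan shows which columns are shuffled in each mode.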