Sergey Shelukhin created HIVE-8462:
--------------------------------------

             Summary: CBO duplicates columns
                 Key: HIVE-8462
                 URL: https://issues.apache.org/jira/browse/HIVE-8462
             Project: Hive
          Issue Type: Bug
            Reporter: Sergey Shelukhin
            Assignee: Sergey Shelukhin
            Priority: Critical

{noformat}
select *, rank() over(partition by key order by value) as rr from src1
{noformat}

Original plan appears to be incorrect:
{noformat}
HiveProjectRel(key=[$0], value=[$1], (tok_function rank (tok_windowspec (tok_partitioningspec (tok_distributeby (tok_table_or_col key)) (tok_orderby (tok_tabsortcolnameasc (tok_table_or_col value))))))=[$5], rr=[$5])
  HiveProjectRel(key=[$0], value=[$1], block__offset__inside__file=[$2], input__file__name=[$3], row__id=[$4], (tok_function rank (tok_windowspec (tok_partitioningspec (tok_distributeby (tok_table_or_col key)) (tok_orderby (tok_tabsortcolnameasc (tok_table_or_col value))))))=[rank() OVER (PARTITION BY $0 ORDER BY $1 ROWS BETWEEN 2147483647 FOLLOWING AND 2147483647 PRECEDING)])
    HiveTableScanRel(table=[[default.src1]])
{noformat}

and final AST has
{noformat}
TOK_SELEXPR
   .
      TOK_TABLE_OR_COL
         $hdt$_0
      (tok_function rank (tok_windowspec (tok_partitioningspec (tok_distributeby (tok_table_or_col key)) (tok_orderby (tok_tabsortcolnameasc (tok_table_or_col value))))))
   (tok_function rank (tok_windowspec (tok_partitioningspec (tok_distributeby (tok_table_or_col key)) (tok_orderby (tok_tabsortcolnameasc (tok_table_or_col value))))))
TOK_SELEXPR
   .
      TOK_TABLE_OR_COL
         $hdt$_0
      (tok_function rank (tok_windowspec (tok_partitioningspec (tok_distributeby (tok_table_or_col key)) (tok_orderby (tok_tabsortcolnameasc (tok_table_or_col value))))))
   rr
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)