[
https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144025#comment-14144025
]
Sergey Shelukhin commented on HIVE-8225:
----------------------------------------
Minimum query to reproduce the issue:
{noformat}
select unionsrc.key FROM (select 'tst1' as key, count(1) as value from src s1)
unionsrc;
{noformat} - returns 500 rows of tst1 whereas it should return just 1.
If you add value to select list -
{noformat}
select unionsrc.key,unionsrc.value FROM (select 'tst1' as key, count(1) as
value from src s1) unionsrc;
{noformat} the problem disappears.
ASTs for both queries, both before and after CBO differ only in addition of the
last select expression; however, as is obvious from the below, in CBO the
absence of said expression causes the count to be completely gone.
Initial AST for the 2nd query (correct result):
{noformat}
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
src
s1
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
'tst1'
key
TOK_SELEXPR
TOK_FUNCTION
count
1
value
unionsrc
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
.
TOK_TABLE_OR_COL
unionsrc
key
TOK_SELEXPR
.
TOK_TABLE_OR_COL
unionsrc
value
{noformat}
Post-CBO
{noformat}
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
default
src
s1
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
0
DUMMY
$hdt$_2
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
1
$f0
$hdt$_3
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
'tst1'
unionsrc.key
TOK_SELEXPR
TOK_FUNCTION
count
.
TOK_TABLE_OR_COL
$hdt$_3
$f0
unionsrc.value
{noformat}
Note where count is... in key-only query, where this SELEXPER is gone, there's
no count, so result changes.
> CBO trunk merge: union11 test fails due to incorrect plan
> ---------------------------------------------------------
>
> Key: HIVE-8225
> URL: https://issues.apache.org/jira/browse/HIVE-8225
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
>
> The result changes to as if the union didn't have count() inside. The issue
> can be fixed by using srcunion.value outside the subquery in count (replace
> count(1) with count(srcunion.value)). Otherwise, it looks like count(1) node
> from union-ed queries is not present in AST at all, which might cause this
> result.
> Interestingly, adding group by to each query in a union produces completely
> weird result (count(1) is 309 for each key, whereas it should be 1 and the
> "logical" incorrect value if internal count is lost is 500)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)