[ 
https://issues.apache.org/jira/browse/HIVE-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach reopened HIVE-1723:
----------------------------------


> The result of left semi join is not correct
> -------------------------------------------
>
>                 Key: HIVE-1723
>                 URL: https://issues.apache.org/jira/browse/HIVE-1723
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>
> In the test case semijoin.q, there is a query:
> select /*+ mapjoin(b) */ a.key from t3 a left semi join t1 b on a.key = b.key 
> sort by a.key;
> I think this query will return a wrong result if table t1 is larger than 
> 25000 different keys
> To be simple, I tried a very similar query:
> select /*+ mapjoin(b) */ a.key from test_semijoin a left semi join 
> test_semijoin b on a.key = b.key sort by a.key;
> The table of test_semijoin is like
> 0     0
> 1     1
> 2     2
> 3     3
> 4     4
> 5     5
> ...    ...
> ...          ....
> 25000   25000
> 25001   25001
> ...          ....
> ...          ....
> 25999   25999
> 26000   26000
> So we can easily estimate the correct result of this query should be the same 
> keys from table test_semijoin itsel.
> Actually, the result is only part of that: only from 0 to 24544.
> 0
> 1
> 2
> ..
> ..
> 24543
> 24544

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to