Xia Zhao created HIVE-21545:
-------------------------------
Summary: Left Semi Join: right table can be referenced in select?
Key: HIVE-21545
URL: https://issues.apache.org/jira/browse/HIVE-21545
Project: Hive
Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Xia Zhao
In the below left semi join queries:
create table movies (id int, title string);
insert into movies values (1, "Toy Story"), (2, "Star Wars"), (3, "Harry
Potter");
create table ratings (user_id int, movie_id int, rating decimal(2,1));
insert into ratings values (1, 1, 5.0), (2, 1, 4.5), (1, 2, 5.0), (3, 3, 5.0);
Q1: without reference to table alias, the right hand table column user_id is
referenced in SELECT without prompting errors and results available. This is
NOT expected left semi join behaviour. "The restrictions of using LEFT SEMI
JOIN are that the right-hand-side table should only be referenced in the join
condition (ON-clause), but not in WHERE- or SELECT-clauses etc."
SELECT user_id, count(id) from movies
left SEMI JOIN ratings
on id=movie_id and rating = 5
group by user_id;
Q2: with reference to table alias, errors prompts when referencing right hand
table column user_id in SELECT. Expected left semi join behaviour.
SELECT r.user_id, count(m.id) from movies m
left SEMI JOIN ratings r
on m.id=r.movie_id and r.rating = 5
group by r.user_id;
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)