[ https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088261#comment-16088261 ]
Vineet Garg commented on HIVE-15758:
------------------------------------

Your example looks semantically correct to me. The current rewrite in Hive for queries such as
{code:sql}
select * from part where p_size <> (select count(p_size) from part pp where part.p_type = pp.p_type)
{code}
requires only one join (a left outer join), whereas the example you provided would need two joins. The current rewrite is similar to your example, but instead of row_id we use the correlated columns both to group by and to join.

> Allow correlated scalar subqueries with aggregates which has non-equi join predicates
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-15758
>                 URL: https://issues.apache.org/jira/browse/HIVE-15758
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Logical Optimizer
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>              Labels: sub-query
>         Attachments: HIVE-15758.1.patch, HIVE-15758.2.patch
>
>
> Queries such as
> {code}
> select * from part where p_size <> (select count(p_size) from part pp where part.p_type <> pp.p_type);
> {code}
> are currently not allowed since HIVE doesn't know how to rewrite such queries to preserve the correctness for cases when there is zero row

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
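
To make the one-join rewrite described in the comment concrete, here is a minimal sketch that runs the equi-join query from the comment next to a hand-written left outer join version. It uses SQLite through Python rather than Hive, and the table contents, the `cnt` alias, the `sq` alias, the `keys` helper and the COALESCE handling are illustrative assumptions, not the plan Hive actually generates. The rows with a NULL p_type are the zero-row case: the correlated subquery matches nothing for them and count() returns 0.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the ticket.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (p_partkey INTEGER, p_type TEXT, p_size INTEGER)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(1, "A", 2), (2, "A", 5), (3, "B", 1), (4, None, 0), (5, None, 3)],
)

# The query from the comment, with the equality correlation predicate.
CORRELATED = """
SELECT p_partkey FROM part
WHERE p_size <> (SELECT count(p_size) FROM part pp WHERE part.p_type = pp.p_type)
ORDER BY p_partkey
"""

# Sketch of a one-join rewrite: group by the correlated column and
# left outer join on it. COALESCE restores count() = 0 for outer rows
# that find no matching group (assumed handling, for illustration).
REWRITTEN = """
SELECT part.p_partkey FROM part
LEFT OUTER JOIN (SELECT p_type, count(p_size) AS cnt FROM part GROUP BY p_type) sq
  ON part.p_type = sq.p_type
WHERE part.p_size <> COALESCE(sq.cnt, 0)
ORDER BY part.p_partkey
"""

# Same join without COALESCE: unmatched rows compare against NULL and are dropped.
NAIVE = REWRITTEN.replace("COALESCE(sq.cnt, 0)", "sq.cnt")


def keys(sql):
    """Run a query and return the p_partkey values as a list."""
    return [row[0] for row in conn.execute(sql)]


correlated = keys(CORRELATED)
rewritten = keys(REWRITTEN)
naive = keys(NAIVE)
print(correlated, rewritten, naive)
```

On this sample data the correlated query and the COALESCE version agree, while the version without COALESCE loses the row whose p_type is NULL. That lost row is the kind of zero-row discrepancy the issue description refers to.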