[ 
https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088261#comment-16088261
 ] 

Vineet Garg commented on HIVE-15758:
------------------------------------

Your example looks semantically correct to me. The current rewrite in Hive for 
queries such as {code:sql} select * from part where p_size <> (select 
count(p_size) from part pp where part.p_type = pp.p_type) {code} requires only 
one join (a left outer join), but the example you provided would need two 
joins.
The current rewrite is similar to your example, but instead of row_id we use the 
correlated columns to group by and to join on.
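
To make the one-join shape concrete, here is a rough sketch of what such a rewrite could look like for the equality-correlated query above. It runs against SQLite through Python's sqlite3 module, purely so it can be tried without a Hive cluster. The table contents, the alias sq, the column cnt, and the use of coalesce for the zero-row case are my assumptions for illustration, not the plan Hive actually generates.

```python
import sqlite3

# Hypothetical miniature of the `part` table; columns and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (p_partkey INTEGER, p_type TEXT, p_size INTEGER)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [
        (1, "BRASS", 2),  # two BRASS rows -> count = 2, so 2 <> 2 is false
        (2, "BRASS", 5),  # 5 <> 2 -> kept
        (3, "STEEL", 1),  # one STEEL row -> count = 1, so 1 <> 1 is false
        (4, None, 7),     # NULL p_type matches nothing -> count = 0, 7 <> 0 kept
    ],
)

# The correlated scalar subquery from the comment (projecting only the key).
ORIGINAL = """
SELECT p_partkey FROM part
WHERE p_size <> (SELECT count(p_size) FROM part pp WHERE part.p_type = pp.p_type)
ORDER BY p_partkey
"""

# Assumed decorrelated form: aggregate once per correlated column value, then
# a single left outer join back on that column. coalesce() restores count = 0
# for outer rows that found no matching group.
REWRITE = """
SELECT part.p_partkey
FROM part
LEFT OUTER JOIN (SELECT p_type, count(p_size) AS cnt FROM part GROUP BY p_type) sq
  ON part.p_type = sq.p_type
WHERE part.p_size <> coalesce(sq.cnt, 0)
ORDER BY part.p_partkey
"""

original_rows = conn.execute(ORIGINAL).fetchall()
rewrite_rows = conn.execute(REWRITE).fetchall()

# For contrast: with an inner join, the row whose subquery sees zero rows is lost.
inner_rows = conn.execute(REWRITE.replace("LEFT OUTER JOIN", "JOIN")).fetchall()

print(original_rows, rewrite_rows, inner_rows)
```

The row with a NULL p_type is the interesting one: its subquery sees zero rows, so count returns 0 and the row must survive the filter. The left outer join keeps it, while the inner-join variant silently drops it.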

> Allow correlated scalar subqueries with aggregates which has non-equi join 
> predicates
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-15758
>                 URL: https://issues.apache.org/jira/browse/HIVE-15758
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Logical Optimizer
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>              Labels: sub-query
>         Attachments: HIVE-15758.1.patch, HIVE-15758.2.patch
>
>
> Queries such as 
> {code} select * from part where p_size <> (select count(p_size) from part pp 
> where part.p_type <> pp.p_type); {code} are currently not allowed since Hive 
> doesn't know how to rewrite such queries while preserving correctness for 
> cases where the subquery matches zero rows.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
