[jira] [Commented] (HIVE-4697) Subqueries with IN and NOT IN

2013-06-10 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679872#comment-13679872
 ] 

Brock Noland commented on HIVE-4697:


Matt,

Isn't this a duplicate of HIVE-1799 and HIVE-784?

Brock

 Subqueries with IN and NOT IN
 -

 Key: HIVE-4697
 URL: https://issues.apache.org/jira/browse/HIVE-4697
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Matthew Weaver
Assignee: Matthew Weaver
   Original Estimate: 840h
  Remaining Estimate: 840h

 h5. Functional Requirements
 * Support {{WHERE x IN (column subquery);}}
 ** {{column subquery}} returns one column, any number of rows.
 * Support {{WHERE x NOT IN (column subquery)}};
 * Support same types of subqueries in {{HAVING}}.
 ** E.g.
 {code:sql}
 SELECT key FROM t1 GROUP BY key 
 HAVING COUNT(value) IN (SELECT p FROM t2);  {code}
 * Correlated subqueries not supported, for now at least
 ** But still need to check for correlation, and bail if it occurs.
 ** Correlated subquery:
 *** A subquery that references a table that appears in a containing query 
 ([MySQL|http://dev.mysql.com/doc/refman/5.7/en/correlated-subqueries.html]), 
 thus requiring subquery evaluation to look outside its scope.
 *** The subquery depends on the outer query for its values, so the subquery 
 must be executed once for each row of the outer query.  Also known as 
 _repeating Subqueries_.
 h5. Tasks
 * Rewrite {{IN (column-subquery)}} as a {{LEFT SEMI JOIN}}.
 ** Not ready for public consumption.  In particular, no check for correlated 
 terms.
 ** With test queries.
 * Add check for correlated terms, return informative error message.
 * Rewrite {{WHERE NOT IN (column-subquery)}} as a {{LEFT OUTER JOIN}}.
 ** Return rows that don't match the right side
 * Rewrite subqueries in {{HAVING}}, using {{LEFT SEMI JOIN}} and {{LEFT OUTER 
 JOIN}} as above.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4697) Subqueries with IN and NOT IN

2013-06-10 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679883#comment-13679883
 ] 

Brock Noland commented on HIVE-4697:


FWIW, I don't care what JIRA we take this issue forward with but let's link to 
any related issues and resolve any duplicates.

 Subqueries with IN and NOT IN
 -

 Key: HIVE-4697
 URL: https://issues.apache.org/jira/browse/HIVE-4697
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Matthew Weaver
Assignee: Matthew Weaver
   Original Estimate: 840h
  Remaining Estimate: 840h

 h5. Functional Requirements
 * Support {{WHERE x IN (column subquery);}}
 ** {{column subquery}} returns one column, any number of rows.
 * Support {{WHERE x NOT IN (column subquery)}};
 * Support same types of subqueries in {{HAVING}}.
 ** E.g.
 {code:sql}
 SELECT key FROM t1 GROUP BY key 
 HAVING COUNT(value) IN (SELECT p FROM t2);  {code}
 * Correlated subqueries not supported, for now at least
 ** But still need to check for correlation, and bail if it occurs.
 ** Correlated subquery:
 *** A subquery that references a table that appears in a containing query 
 ([MySQL|http://dev.mysql.com/doc/refman/5.7/en/correlated-subqueries.html]), 
 thus requiring subquery evaluation to look outside its scope.
 *** The subquery depends on the outer query for its values, so the subquery 
 must be executed once for each row of the outer query.  Also known as 
 _repeating Subqueries_.
 h5. Tasks
 * Rewrite {{IN (column-subquery)}} as a {{LEFT SEMI JOIN}}.
 ** Not ready for public consumption.  In particular, no check for correlated 
 terms.
 ** With test queries.
 * Add check for correlated terms, return informative error message.
 * Rewrite {{WHERE NOT IN (column-subquery)}} as a {{LEFT OUTER JOIN}}.
 ** Return rows that don't match the right side
 * Rewrite subqueries in {{HAVING}}, using {{LEFT SEMI JOIN}} and {{LEFT OUTER 
 JOIN}} as above.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira