[jira] [Commented] (HIVE-4697) Subqueries with IN and NOT IN
[ https://issues.apache.org/jira/browse/HIVE-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679872#comment-13679872 ] Brock Noland commented on HIVE-4697: Matt, Isn't this a duplicate of HIVE-1799 and HIVE-784? Brock Subqueries with IN and NOT IN - Key: HIVE-4697 URL: https://issues.apache.org/jira/browse/HIVE-4697 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Matthew Weaver Assignee: Matthew Weaver Original Estimate: 840h Remaining Estimate: 840h h5. Functional Requirements * Support {{WHERE x IN (column subquery);}} ** {{column subquery}} returns one column, any number of rows. * Support {{WHERE x NOT IN (column subquery)}}; * Support same types of subqueries in {{HAVING}}. ** E.g. {code:sql} SELECT key FROM t1 GROUP BY key HAVING COUNT(value) IN (SELECT p FROM t2); {code} * Correlated subqueries not supported, for now at least ** But still need to check for correlation, and bail if it occurs. ** Correlated subquery: *** A subquery that references a table that appears in a containing query ([MySQL|http://dev.mysql.com/doc/refman/5.7/en/correlated-subqueries.html]), thus requiring subquery evaluation to look outside its scope. *** The subquery depends on the outer query for its values, so the subquery must be executed once for each row of the outer query. Also known as _repeating Subqueries_. h5. Tasks * Rewrite {{IN (column-subquery)}} as a {{LEFT SEMI JOIN}}. ** Not ready for public consumption. In particular, no check for correlated terms. ** With test queries. * Add check for correlated terms, return informative error message. * Rewrite {{WHERE NOT IN (column-subquery)}} as a {{LEFT OUTER JOIN}}. ** Return rows that don't match the right side * Rewrite subqueries in {{HAVING}}, using {{LEFT SEMI JOIN}} and {{LEFT OUTER JOIN}} as above. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4697) Subqueries with IN and NOT IN
[ https://issues.apache.org/jira/browse/HIVE-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679883#comment-13679883 ] Brock Noland commented on HIVE-4697: FWIW, I don't care what JIRA we take this issue forward with but let's link to any related issues and resolve any duplicates. Subqueries with IN and NOT IN - Key: HIVE-4697 URL: https://issues.apache.org/jira/browse/HIVE-4697 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Matthew Weaver Assignee: Matthew Weaver Original Estimate: 840h Remaining Estimate: 840h h5. Functional Requirements * Support {{WHERE x IN (column subquery);}} ** {{column subquery}} returns one column, any number of rows. * Support {{WHERE x NOT IN (column subquery)}}; * Support same types of subqueries in {{HAVING}}. ** E.g. {code:sql} SELECT key FROM t1 GROUP BY key HAVING COUNT(value) IN (SELECT p FROM t2); {code} * Correlated subqueries not supported, for now at least ** But still need to check for correlation, and bail if it occurs. ** Correlated subquery: *** A subquery that references a table that appears in a containing query ([MySQL|http://dev.mysql.com/doc/refman/5.7/en/correlated-subqueries.html]), thus requiring subquery evaluation to look outside its scope. *** The subquery depends on the outer query for its values, so the subquery must be executed once for each row of the outer query. Also known as _repeating Subqueries_. h5. Tasks * Rewrite {{IN (column-subquery)}} as a {{LEFT SEMI JOIN}}. ** Not ready for public consumption. In particular, no check for correlated terms. ** With test queries. * Add check for correlated terms, return informative error message. * Rewrite {{WHERE NOT IN (column-subquery)}} as a {{LEFT OUTER JOIN}}. ** Return rows that don't match the right side * Rewrite subqueries in {{HAVING}}, using {{LEFT SEMI JOIN}} and {{LEFT OUTER JOIN}} as above. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira