spark git commit: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Joins + CTE)
Repository: spark Updated Branches: refs/heads/master 5ad10c531 -> a8a139820 [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Joins + CTE) ## What changes were proposed in this pull request? This PR adds the third and final set of tests for EXISTS subquery. File name| Brief description | - exists-cte.sql |Tests Exist subqueries referencing CTE exists-joins-and-set-ops.sql|Tests Exists subquery used in Joins (Both when joins occurs in outer and suquery blocks) DB2 results are attached here as reference : [exists-cte-db2.txt](https://github.com/apache/spark/files/752091/exists-cte-db2.txt) [exists-joins-and-set-ops-db2.txt](https://github.com/apache/spark/files/753283/exists-joins-and-set-ops-db2.txt) (updated) ## How was this patch tested? The test result is compared with the result run from another SQL engine (in this case is IBM DB2). If the result are equivalent, we assume the result is correct. Author: Dilip BiswalCloses #16802 from dilipbiswal/exists-pr3. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a8a13982 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a8a13982 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a8a13982 Branch: refs/heads/master Commit: a8a139820c4a77a0b017b621bec6273cc09c8476 Parents: 5ad10c5 Author: Dilip Biswal Authored: Wed Feb 15 17:34:05 2017 +0100 Committer: Herman van Hovell Committed: Wed Feb 15 17:34:05 2017 +0100 -- .../subquery/exists-subquery/exists-cte.sql | 142 .../exists-joins-and-set-ops.sql| 228 .../subquery/exists-subquery/exists-cte.sql.out | 200 ++ .../exists-joins-and-set-ops.sql.out| 363 +++ 4 files changed, 933 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/a8a13982/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-cte.sql -- diff --git a/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-cte.sql b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-cte.sql new file mode 100644 index 000..c678483 --- /dev/null +++ b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-cte.sql @@ -0,0 +1,142 @@ +-- Tests EXISTS subquery used along with +-- Common Table Expressions(CTE) + +CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (200, "emp 2", date "2003-01-01", 200.00D, 10), + (300, "emp 3", date "2002-01-01", 300.00D, 20), + (400, "emp 4", date "2005-01-01", 400.00D, 30), + (500, "emp 5", date "2001-01-01", 400.00D, NULL), + (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100), + (700, "emp 7", date "2010-01-01", 400.00D, 100), + (800, "emp 8", date "2016-01-01", 150.00D, 70) +AS EMP(id, emp_name, hiredate, salary, dept_id); + +CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES + (10, "dept 1", "CA"), + (20, "dept 2", "NY"), + (30, "dept 3", "TX"), + (40, "dept 4 - unassigned", "OR"), + (50, "dept 5 - unassigned", "NJ"), + (70, "dept 7", "FL") +AS DEPT(dept_id, dept_name, state); + +CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES + ("emp 1", 10.00D), + ("emp 1", 20.00D), + ("emp 2", 300.00D), + ("emp 2", 100.00D), + ("emp 3", 300.00D), + ("emp 4", 100.00D), + ("emp 5", 1000.00D), + ("emp 6 - no dept", 500.00D) +AS BONUS(emp_name, bonus_amt); + +-- CTE used inside subquery with correlated condition +-- TC.01.01 +WITH bonus_cte + AS (SELECT * + FROM bonus + WHERE EXISTS (SELECT dept.dept_id, + emp.emp_name, + Max(salary), + Count(*) + FROM emp + JOIN dept + ON dept.dept_id = emp.dept_id + WHERE bonus.emp_name = emp.emp_name + GROUP BY dept.dept_id, +emp.emp_name + ORDER BY emp.emp_name)) +SELECT * +FROM bonus a +WHERE a.bonus_amt > 30 + AND EXISTS (SELECT 1 + FROM bonus_cte b + WHERE a.emp_name = b.emp_name); + +-- Inner join between two CTEs with correlated condition +-- TC.01.02 +WITH emp_cte + AS (SELECT * + FROM emp + WHERE id >= 100 +AND id <= 300), + dept_cte + AS (SELECT * + FROM dept + WHERE
spark git commit: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Aggregate, Having, Orderby, Limit)
Repository: spark Updated Branches: refs/heads/master c618ccdbe -> 64cae22f7 [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Aggregate, Having, Orderby, Limit) ## What changes were proposed in this pull request? This PR adds the second set of tests for EXISTS subquery. File name| Brief description | - exists-aggregate.sql |Tests aggregate expressions in outer query and EXISTS subquery. exists-having.sql|Tests HAVING clause in subquery. exists-orderby-limit.sql|Tests EXISTS subquery support with ORDER BY and LIMIT clauses. DB2 results are attached here as reference : [exists-aggregate-db2.txt](https://github.com/apache/spark/files/743287/exists-aggregate-db2.txt) [exists-having-db2.txt](https://github.com/apache/spark/files/743286/exists-having-db2.txt) [exists-orderby-limit-db2.txt](https://github.com/apache/spark/files/743288/exists-orderby-limit-db2.txt) ## How the patch was tested. The test result is compared with the result run from another SQL engine (in this case is IBM DB2). If the result are equivalent, we assume the result is correct. Author: Dilip BiswalCloses #16760 from dilipbiswal/exists-pr2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/64cae22f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/64cae22f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/64cae22f Branch: refs/heads/master Commit: 64cae22f7cbba793e32d2c8ccb4b7981208070fd Parents: c618ccd Author: Dilip Biswal Authored: Thu Feb 9 00:31:51 2017 -0500 Committer: gatorsmile Committed: Thu Feb 9 00:31:51 2017 -0500 -- .../exists-subquery/exists-aggregate.sql| 115 ++ .../subquery/exists-subquery/exists-having.sql | 94 .../exists-subquery/exists-orderby-limit.sql| 118 ++ .../exists-subquery/exists-aggregate.sql.out| 183 +++ .../exists-subquery/exists-having.sql.out | 153 + .../exists-orderby-limit.sql.out| 222 +++ 6 files changed, 885 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/64cae22f/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-aggregate.sql -- diff --git a/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-aggregate.sql b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-aggregate.sql new file mode 100644 index 000..b5f458f --- /dev/null +++ b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-aggregate.sql @@ -0,0 +1,115 @@ +-- Tests aggregate expressions in outer query and EXISTS subquery. + +CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (200, "emp 2", date "2003-01-01", 200.00D, 10), + (300, "emp 3", date "2002-01-01", 300.00D, 20), + (400, "emp 4", date "2005-01-01", 400.00D, 30), + (500, "emp 5", date "2001-01-01", 400.00D, NULL), + (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100), + (700, "emp 7", date "2010-01-01", 400.00D, 100), + (800, "emp 8", date "2016-01-01", 150.00D, 70) +AS EMP(id, emp_name, hiredate, salary, dept_id); + +CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES + (10, "dept 1", "CA"), + (20, "dept 2", "NY"), + (30, "dept 3", "TX"), + (40, "dept 4 - unassigned", "OR"), + (50, "dept 5 - unassigned", "NJ"), + (70, "dept 7", "FL") +AS DEPT(dept_id, dept_name, state); + +CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES + ("emp 1", 10.00D), + ("emp 1", 20.00D), + ("emp 2", 300.00D), + ("emp 2", 100.00D), + ("emp 3", 300.00D), + ("emp 4", 100.00D), + ("emp 5", 1000.00D), + ("emp 6 - no dept", 500.00D) +AS BONUS(emp_name, bonus_amt); + +-- Aggregate in outer query block. +-- TC.01.01 +SELECT emp.dept_id, + avg(salary), + sum(salary) +FROM emp +WHERE EXISTS (SELECT state + FROM dept + WHERE dept.dept_id = emp.dept_id) +GROUP BY dept_id; + +-- Aggregate in inner/subquery block +-- TC.01.02 +SELECT emp_name +FROM emp +WHERE EXISTS (SELECT max(dept.dept_id) a + FROM dept + WHERE dept.dept_id = emp.dept_id + GROUP BY dept.dept_id); + +-- Aggregate expression in both outer and inner query block. +-- TC.01.03 +SELECT count(*) +FROM emp +WHERE EXISTS (SELECT max(dept.dept_id) a + FROM dept + WHERE dept.dept_id = emp.dept_id + GROUP BY dept.dept_id); + +-- Nested exists with aggregate expression in inner most query
spark git commit: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery
Repository: spark Updated Branches: refs/heads/master f7c07db85 -> e2e7b12ce [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery ## What changes were proposed in this pull request? This PR adds the first set of tests for EXISTS subquery. File name| Brief description | - exists-basic.sql |Tests EXISTS and NOT EXISTS subqueries with both correlated and local predicates. exists-within-and-or.sql|Tests EXISTS and NOT EXISTS subqueries embedded in AND or OR expression. DB2 results are attached here as reference : [exists-basic-db2.txt](https://github.com/apache/spark/files/733031/exists-basic-db2.txt) [exists-and-or-db2.txt](https://github.com/apache/spark/files/733030/exists-and-or-db2.txt) ## How was this patch tested? This patch is adding tests. Author: Dilip BiswalCloses #16710 from dilipbiswal/exist-basic. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e2e7b12c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e2e7b12c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e2e7b12c Branch: refs/heads/master Commit: e2e7b12ce8fdf9d0bf0b7fce9283018c7d805988 Parents: f7c07db Author: Dilip Biswal Authored: Sun Jan 29 12:51:59 2017 -0800 Committer: gatorsmile Committed: Sun Jan 29 12:51:59 2017 -0800 -- .../subquery/exists-subquery/exists-basic.sql | 123 +++ .../exists-subquery/exists-within-and-or.sql| 96 + .../exists-subquery/exists-basic.sql.out| 214 +++ .../exists-within-and-or.sql.out| 156 ++ 4 files changed, 589 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/e2e7b12c/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-basic.sql -- diff --git a/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-basic.sql b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-basic.sql new file mode 100644 index 000..332e858 --- /dev/null +++ b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-basic.sql @@ -0,0 +1,123 @@ +-- Tests EXISTS subquery support. Tests basic form +-- of EXISTS subquery (both EXISTS and NOT EXISTS) + +CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (200, "emp 2", date "2003-01-01", 200.00D, 10), + (300, "emp 3", date "2002-01-01", 300.00D, 20), + (400, "emp 4", date "2005-01-01", 400.00D, 30), + (500, "emp 5", date "2001-01-01", 400.00D, NULL), + (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100), + (700, "emp 7", date "2010-01-01", 400.00D, 100), + (800, "emp 8", date "2016-01-01", 150.00D, 70) +AS EMP(id, emp_name, hiredate, salary, dept_id); + +CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES + (10, "dept 1", "CA"), + (20, "dept 2", "NY"), + (30, "dept 3", "TX"), + (40, "dept 4 - unassigned", "OR"), + (50, "dept 5 - unassigned", "NJ"), + (70, "dept 7", "FL") +AS DEPT(dept_id, dept_name, state); + +CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES + ("emp 1", 10.00D), + ("emp 1", 20.00D), + ("emp 2", 300.00D), + ("emp 2", 100.00D), + ("emp 3", 300.00D), + ("emp 4", 100.00D), + ("emp 5", 1000.00D), + ("emp 6 - no dept", 500.00D) +AS BONUS(emp_name, bonus_amt); + +-- uncorrelated exist query +-- TC.01.01 +SELECT * +FROM emp +WHERE EXISTS (SELECT 1 + FROM dept + WHERE dept.dept_id > 10 + AND dept.dept_id < 30); + +-- simple correlated predicate in exist subquery +-- TC.01.02 +SELECT * +FROM emp +WHERE EXISTS (SELECT dept.dept_name + FROM dept + WHERE emp.dept_id = dept.dept_id); + +-- correlated outer isnull predicate +-- TC.01.03 +SELECT * +FROM emp +WHERE EXISTS (SELECT dept.dept_name + FROM dept + WHERE emp.dept_id = dept.dept_id + OR emp.dept_id IS NULL); + +-- Simple correlation with a local predicate in outer query +-- TC.01.04 +SELECT * +FROM emp +WHERE EXISTS (SELECT dept.dept_name + FROM dept + WHERE emp.dept_id = dept.dept_id) + AND emp.id > 200; + +-- Outer references (emp.id) should not be pruned from outer plan +-- TC.01.05 +SELECT emp.emp_name +FROM emp +WHERE EXISTS (SELECT dept.state + FROM dept + WHERE emp.dept_id = dept.dept_id) + AND emp.id > 200; + +-- not exists with correlated predicate +-- TC.01.06 +SELECT * +FROM dept +WHERE NOT EXISTS