[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16760 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16760#discussion_r98794587 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-aggregate.sql.out --- @@ -0,0 +1,183 @@ +-- Automatically generated by SQLQueryTestSuite +-- Number of queries: 11 + + +-- !query 0 +CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (200, "emp 2", date "2003-01-01", 200.00D, 10), + (300, "emp 3", date "2002-01-01", 300.00D, 20), + (400, "emp 4", date "2005-01-01", 400.00D, 30), + (500, "emp 5", date "2001-01-01", 400.00D, NULL), + (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100), + (700, "emp 7", date "2010-01-01", 400.00D, 100), + (800, "emp 8", date "2016-01-01", 150.00D, 70) +AS EMP(id, emp_name, hiredate, salary, dept_id) +-- !query 0 schema +struct<> +-- !query 0 output + + + +-- !query 1 +CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES + (10, "dept 1", "CA"), + (20, "dept 2", "NY"), + (30, "dept 3", "TX"), + (40, "dept 4 - unassigned", "OR"), + (50, "dept 5 - unassigned", "NJ"), + (70, "dept 7", "FL") +AS DEPT(dept_id, dept_name, state) +-- !query 1 schema +struct<> +-- !query 1 output + + + +-- !query 2 +CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES + ("emp 1", 10.00D), + ("emp 1", 20.00D), + ("emp 2", 300.00D), + ("emp 2", 100.00D), + ("emp 3", 300.00D), + ("emp 4", 100.00D), + ("emp 5", 1000.00D), + ("emp 6 - no dept", 500.00D) +AS BONUS(emp_name, bonus_amt) +-- !query 2 schema +struct<> +-- !query 2 output + + + +-- !query 3 +SELECT emp.dept_id, + avg(salary), + sum(salary) +FROM emp +WHERE EXISTS (SELECT state + FROM dept + WHERE dept.dept_id = emp.dept_id) +GROUP BY dept_id +-- !query 3 schema +struct +-- !query 3 output +10 133.34 400.0 +20 300.0 300.0 +30 400.0 400.0 +70 150.0 150.0 + + +-- !query 4 +SELECT emp_name +FROM emp +WHERE EXISTS (SELECT max(dept.dept_id) a + FROM dept + WHERE dept.dept_id = emp.dept_id + GROUP BY dept.dept_id) +-- !query 4 schema +struct +-- !query 4 output +emp 1 +emp 1 +emp 2 +emp 3 +emp 4 +emp 8 + + +-- !query 5 +SELECT count(*) +FROM emp +WHERE EXISTS (SELECT max(dept.dept_id) a + FROM dept + WHERE dept.dept_id = emp.dept_id + GROUP BY dept.dept_id) +-- !query 5 schema +struct +-- !query 5 output +6 + + +-- !query 6 +SELECT * +FROM bonus +WHERE EXISTS (SELECT 1 + FROM emp + WHERE emp.emp_name = bonus.emp_name + AND EXISTS (SELECT max(dept.dept_id) + FROM dept + WHERE emp.dept_id = dept.dept_id + GROUP BY dept.dept_id)) +-- !query 6 schema +struct +-- !query 6 output +emp 1 10.0 +emp 1 20.0 +emp 2 100.0 +emp 2 300.0 +emp 3 300.0 +emp 4 100.0 + + +-- !query 7 +SELECT emp.dept_id, + Avg(salary), + Sum(salary) +FROM emp +WHERE NOT EXISTS (SELECT state + FROM dept + WHERE dept.dept_id = emp.dept_id) +GROUP BY dept_id +-- !query 7 schema +struct +-- !query 7 output +100400.0 800.0 +NULL 400.0 400.0 + + +-- !query 8 +SELECT emp_name +FROM emp +WHERE NOT EXISTS (SELECT max(dept.dept_id) a + FROM dept + WHERE dept.dept_id = emp.dept_id + GROUP BY dept.dept_id) +-- !query 8 schema +struct +-- !query 8 output +emp 5 +emp 6 - no dept +emp 7 + + +-- !query 9 +SELECT count(*) +FROM emp +WHERE NOT EXISTS (SELECT max(dept.dept_id) a + FROM dept + WHERE dept.dept_id = emp.dept_id + GROUP BY dept.dept_id) +-- !query 9 schema +struct +-- !query 9 output +3 + + +-- !query 10 +SELECT * +FROM bonus +WHERE NOT EXISTS (SELECT 1 + FROM emp + WHERE emp.emp_name = bonus.emp_name + AND EXISTS (SELECT Max(dept.dept_id) +
[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16760#discussion_r98794624 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-having.sql.out --- @@ -0,0 +1,153 @@ +-- Automatically generated by SQLQueryTestSuite +-- Number of queries: 8 + + +-- !query 0 +CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (200, "emp 2", date "2003-01-01", 200.00D, 10), + (300, "emp 3", date "2002-01-01", 300.00D, 20), + (400, "emp 4", date "2005-01-01", 400.00D, 30), + (500, "emp 5", date "2001-01-01", 400.00D, NULL), + (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100), + (700, "emp 7", date "2010-01-01", 400.00D, 100), + (800, "emp 8", date "2016-01-01", 150.00D, 70) +AS EMP(id, emp_name, hiredate, salary, dept_id) +-- !query 0 schema +struct<> +-- !query 0 output + + + +-- !query 1 +CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES + (10, "dept 1", "CA"), + (20, "dept 2", "NY"), + (30, "dept 3", "TX"), + (40, "dept 4 - unassigned", "OR"), + (50, "dept 5 - unassigned", "NJ"), + (70, "dept 7", "FL") +AS DEPT(dept_id, dept_name, state) +-- !query 1 schema +struct<> +-- !query 1 output + + + +-- !query 2 +CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES + ("emp 1", 10.00D), + ("emp 1", 20.00D), + ("emp 2", 300.00D), + ("emp 2", 100.00D), + ("emp 3", 300.00D), + ("emp 4", 100.00D), + ("emp 5", 1000.00D), + ("emp 6 - no dept", 500.00D) +AS BONUS(emp_name, bonus_amt) +-- !query 2 schema +struct<> +-- !query 2 output + + + +-- !query 3 +SELECT dept_id, count(*) +FROM emp +GROUP BY dept_id +HAVING EXISTS (SELECT 1 + FROM bonus + WHERE bonus_amt < min(emp.salary)) +-- !query 3 schema +struct +-- !query 3 output +10 3 +1002 +20 1 +30 1 +70 1 +NULL 1 + + +-- !query 4 +SELECT * +FROM dept +WHERE EXISTS (SELECT dept_id, + Count(*) + FROM emp + GROUP BY dept_id + HAVING EXISTS (SELECT 1 + FROM bonus + WHERE bonus_amt < Min(emp.salary))) +-- !query 4 schema +struct +-- !query 4 output +10 dept 1 CA +20 dept 2 NY +30 dept 3 TX +40 dept 4 - unassigned OR +50 dept 5 - unassigned NJ +70 dept 7 FL + + +-- !query 5 +SELECT dept_id, + Max(salary) +FROM emp gp +WHERE EXISTS (SELECT dept_id, + Count(*) + FROM emp p + GROUP BY dept_id + HAVING EXISTS (SELECT 1 + FROM bonus + WHERE bonus_amt < Min(p.salary))) +GROUP BY gp.dept_id +-- !query 5 schema +struct +-- !query 5 output +10 200.0 +100400.0 +20 300.0 +30 400.0 +70 150.0 +NULL 400.0 + + +-- !query 6 +SELECT * +FROM dept +WHERE EXISTS (SELECT dept_id, +Count(*) + FROM emp + GROUP BY dept_id + HAVING EXISTS (SELECT 1 +FROM bonus +WHERE bonus_amt > Min(emp.salary))) +-- !query 6 schema +struct +-- !query 6 output +10 dept 1 CA +20 dept 2 NY +30 dept 3 TX +40 dept 4 - unassigned OR +50 dept 5 - unassigned NJ +70 dept 7 FL + + +-- !query 7 +SELECT * +FROM dept +WHERE EXISTS (SELECT dept_id, + count(emp.dept_id) + FROM emp + WHERE dept.dept_id = dept_id + GROUP BY dept_id + HAVING EXISTS (SELECT 1 + FROM bonus + WHERE ( bonus_amt > min(emp.salary) + AND count(emp.dept_id) > 1 ))) +-- !query 7 schema +struct +-- !query 7 output +10 dept 1 CA --- End diff -- I have compared the result set matched with the result from DB2. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16760#discussion_r98794661 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-orderby-limit.sql.out --- @@ -0,0 +1,222 @@ +-- Automatically generated by SQLQueryTestSuite +-- Number of queries: 12 + + +-- !query 0 +CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (100, "emp 1", date "2005-01-01", 100.00D, 10), + (200, "emp 2", date "2003-01-01", 200.00D, 10), + (300, "emp 3", date "2002-01-01", 300.00D, 20), + (400, "emp 4", date "2005-01-01", 400.00D, 30), + (500, "emp 5", date "2001-01-01", 400.00D, NULL), + (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100), + (700, "emp 7", date "2010-01-01", 400.00D, 100), + (800, "emp 8", date "2016-01-01", 150.00D, 70) +AS EMP(id, emp_name, hiredate, salary, dept_id) +-- !query 0 schema +struct<> +-- !query 0 output + + + +-- !query 1 +CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES + (10, "dept 1", "CA"), + (20, "dept 2", "NY"), + (30, "dept 3", "TX"), + (40, "dept 4 - unassigned", "OR"), + (50, "dept 5 - unassigned", "NJ"), + (70, "dept 7", "FL") +AS DEPT(dept_id, dept_name, state) +-- !query 1 schema +struct<> +-- !query 1 output + + + +-- !query 2 +CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES + ("emp 1", 10.00D), + ("emp 1", 20.00D), + ("emp 2", 300.00D), + ("emp 2", 100.00D), + ("emp 3", 300.00D), + ("emp 4", 100.00D), + ("emp 5", 1000.00D), + ("emp 6 - no dept", 500.00D) +AS BONUS(emp_name, bonus_amt) +-- !query 2 schema +struct<> +-- !query 2 output + + + +-- !query 3 +SELECT * +FROM emp +WHERE EXISTS (SELECT dept.dept_id + FROM dept + WHERE emp.dept_id = dept.dept_id + ORDER BY state) +ORDER BY hiredate +-- !query 3 schema +struct +-- !query 3 output +300emp 3 2002-01-01 300.0 20 +200emp 2 2003-01-01 200.0 10 +100emp 1 2005-01-01 100.0 10 +100emp 1 2005-01-01 100.0 10 +400emp 4 2005-01-01 400.0 30 +800emp 8 2016-01-01 150.0 70 + + +-- !query 4 +SELECT id, + hiredate +FROM emp +WHERE EXISTS (SELECT dept.dept_id + FROM dept + WHERE emp.dept_id = dept.dept_id + ORDER BY state) +ORDER BY hiredate DESC +-- !query 4 schema +struct +-- !query 4 output +8002016-01-01 +1002005-01-01 +1002005-01-01 +4002005-01-01 +2002003-01-01 +3002002-01-01 + + +-- !query 5 +SELECT * +FROM emp +WHERE NOT EXISTS (SELECT dept.dept_id + FROM dept + WHERE emp.dept_id = dept.dept_id + ORDER BY state) +ORDER BY hiredate +-- !query 5 schema +struct +-- !query 5 output +500emp 5 2001-01-01 400.0 NULL +600emp 6 - no dept 2001-01-01 400.0 100 +700emp 7 2010-01-01 400.0 100 + + +-- !query 6 +SELECT emp_name +FROM emp +WHERE NOT EXISTS (SELECT max(dept.dept_id) a + FROM dept + WHERE dept.dept_id = emp.dept_id + GROUP BY state + ORDER BY state) +-- !query 6 schema +struct +-- !query 6 output +emp 5 +emp 6 - no dept +emp 7 + + +-- !query 7 +SELECT count(*) +FROM emp +WHERE NOT EXISTS (SELECT max(dept.dept_id) a + FROM dept + WHERE dept.dept_id = emp.dept_id + GROUP BY dept_id + ORDER BY dept_id) +-- !query 7 schema +struct +-- !query 7 output +3 + + +-- !query 8 +SELECT * +FROM emp +WHERE EXISTS (SELECT dept.dept_name + FROM dept + WHERE dept.dept_id > 10 + LIMIT 1) +-- !query 8 schema +struct +-- !query 8 output +100emp 1 2005-01-01 100.0 10 +100emp 1 2005-01-01 100.0 10 +200emp 2 2003-01-01 200.0 10 +300emp 3 2002-01-01 300.0 20 +400emp 4 2005-01-01 400.0 30 +500emp 5 2001-01-01 400.0 NULL +600emp 6 - no dept 2001-01-01 400.0 100 +700emp 7 2010-01-01 400.0 100 +800emp 8 2016-01-01 150.0
[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...
GitHub user dilipbiswal opened a pull request: https://github.com/apache/spark/pull/16760 [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Aggregate, Having, Orderby, Limit) ## What changes were proposed in this pull request? This PR adds the second set of tests for EXISTS subquery. File name| Brief description | - exists-aggregate.sql |Tests aggregate expressions in outer query and EXISTS subquery. exists-having.sql|Tests HAVING clause in subquery. exists-orderby-limit.sql|Tests EXISTS subquery support with ORDER BY and LIMIT clauses. DB2 results are attached here as reference : [exists-aggregate-db2.txt](https://github.com/apache/spark/files/743287/exists-aggregate-db2.txt) [exists-having-db2.txt](https://github.com/apache/spark/files/743286/exists-having-db2.txt) [exists-orderby-limit-db2.txt](https://github.com/apache/spark/files/743288/exists-orderby-limit-db2.txt) (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dilipbiswal/spark exists-pr2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/16760.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16760 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org