[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-02-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16760


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-31 Thread nsyca
Github user nsyca commented on a diff in the pull request:

https://github.com/apache/spark/pull/16760#discussion_r98794587
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-aggregate.sql.out
 ---
@@ -0,0 +1,183 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 11
+
+
+-- !query 0
+CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (200, "emp 2", date "2003-01-01", 200.00D, 10),
+  (300, "emp 3", date "2002-01-01", 300.00D, 20),
+  (400, "emp 4", date "2005-01-01", 400.00D, 30),
+  (500, "emp 5", date "2001-01-01", 400.00D, NULL),
+  (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100),
+  (700, "emp 7", date "2010-01-01", 400.00D, 100),
+  (800, "emp 8", date "2016-01-01", 150.00D, 70)
+AS EMP(id, emp_name, hiredate, salary, dept_id)
+-- !query 0 schema
+struct<>
+-- !query 0 output
+
+
+
+-- !query 1
+CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES
+  (10, "dept 1", "CA"),
+  (20, "dept 2", "NY"),
+  (30, "dept 3", "TX"),
+  (40, "dept 4 - unassigned", "OR"),
+  (50, "dept 5 - unassigned", "NJ"),
+  (70, "dept 7", "FL")
+AS DEPT(dept_id, dept_name, state)
+-- !query 1 schema
+struct<>
+-- !query 1 output
+
+
+
+-- !query 2
+CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES
+  ("emp 1", 10.00D),
+  ("emp 1", 20.00D),
+  ("emp 2", 300.00D),
+  ("emp 2", 100.00D),
+  ("emp 3", 300.00D),
+  ("emp 4", 100.00D),
+  ("emp 5", 1000.00D),
+  ("emp 6 - no dept", 500.00D)
+AS BONUS(emp_name, bonus_amt)
+-- !query 2 schema
+struct<>
+-- !query 2 output
+
+
+
+-- !query 3
+SELECT emp.dept_id, 
+   avg(salary),
+   sum(salary)
+FROM   emp 
+WHERE  EXISTS (SELECT state 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id) 
+GROUP  BY dept_id
+-- !query 3 schema
+struct
+-- !query 3 output
+10 133.34  400.0
+20 300.0   300.0
+30 400.0   400.0
+70 150.0   150.0
+
+
+-- !query 4
+SELECT emp_name 
+FROM   emp 
+WHERE  EXISTS (SELECT max(dept.dept_id) a 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id 
+   GROUP  BY dept.dept_id)
+-- !query 4 schema
+struct
+-- !query 4 output
+emp 1
+emp 1
+emp 2
+emp 3
+emp 4
+emp 8
+
+
+-- !query 5
+SELECT count(*) 
+FROM   emp 
+WHERE  EXISTS (SELECT max(dept.dept_id) a 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id 
+   GROUP  BY dept.dept_id)
+-- !query 5 schema
+struct
+-- !query 5 output
+6
+
+
+-- !query 6
+SELECT * 
+FROM   bonus 
+WHERE  EXISTS (SELECT 1 
+   FROM   emp 
+   WHERE  emp.emp_name = bonus.emp_name 
+  AND EXISTS (SELECT max(dept.dept_id) 
+  FROM   dept 
+  WHERE  emp.dept_id = dept.dept_id 
+  GROUP  BY dept.dept_id))
+-- !query 6 schema
+struct
+-- !query 6 output
+emp 1  10.0
+emp 1  20.0
+emp 2  100.0
+emp 2  300.0
+emp 3  300.0
+emp 4  100.0
+
+
+-- !query 7
+SELECT emp.dept_id, 
+   Avg(salary), 
+   Sum(salary) 
+FROM   emp 
+WHERE  NOT EXISTS (SELECT state 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id) 
+GROUP  BY dept_id
+-- !query 7 schema
+struct
+-- !query 7 output
+100400.0   800.0
+NULL   400.0   400.0
+
+
+-- !query 8
+SELECT emp_name 
+FROM   emp 
+WHERE  NOT EXISTS (SELECT max(dept.dept_id) a 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id 
+   GROUP  BY dept.dept_id)
+-- !query 8 schema
+struct
+-- !query 8 output
+emp 5
+emp 6 - no dept
+emp 7
+
+
+-- !query 9
+SELECT count(*) 
+FROM   emp 
+WHERE  NOT EXISTS (SELECT max(dept.dept_id) a 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id 
+   GROUP  BY dept.dept_id)
+-- !query 9 schema
+struct
+-- !query 9 output
+3
+
+
+-- !query 10
+SELECT * 
+FROM   bonus 
+WHERE  NOT EXISTS (SELECT 1 
+   FROM   emp 
+   WHERE  emp.emp_name = bonus.emp_name 
+  AND EXISTS (SELECT Max(dept.dept_id) 
+

[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-31 Thread nsyca
Github user nsyca commented on a diff in the pull request:

https://github.com/apache/spark/pull/16760#discussion_r98794624
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-having.sql.out
 ---
@@ -0,0 +1,153 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 8
+
+
+-- !query 0
+CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (200, "emp 2", date "2003-01-01", 200.00D, 10),
+  (300, "emp 3", date "2002-01-01", 300.00D, 20),
+  (400, "emp 4", date "2005-01-01", 400.00D, 30),
+  (500, "emp 5", date "2001-01-01", 400.00D, NULL),
+  (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100),
+  (700, "emp 7", date "2010-01-01", 400.00D, 100),
+  (800, "emp 8", date "2016-01-01", 150.00D, 70)
+AS EMP(id, emp_name, hiredate, salary, dept_id)
+-- !query 0 schema
+struct<>
+-- !query 0 output
+
+
+
+-- !query 1
+CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES
+  (10, "dept 1", "CA"),
+  (20, "dept 2", "NY"),
+  (30, "dept 3", "TX"),
+  (40, "dept 4 - unassigned", "OR"),
+  (50, "dept 5 - unassigned", "NJ"),
+  (70, "dept 7", "FL")
+AS DEPT(dept_id, dept_name, state)
+-- !query 1 schema
+struct<>
+-- !query 1 output
+
+
+
+-- !query 2
+CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES
+  ("emp 1", 10.00D),
+  ("emp 1", 20.00D),
+  ("emp 2", 300.00D),
+  ("emp 2", 100.00D),
+  ("emp 3", 300.00D),
+  ("emp 4", 100.00D),
+  ("emp 5", 1000.00D),
+  ("emp 6 - no dept", 500.00D)
+AS BONUS(emp_name, bonus_amt)
+-- !query 2 schema
+struct<>
+-- !query 2 output
+
+
+
+-- !query 3
+SELECT dept_id, count(*) 
+FROM   emp 
+GROUP  BY dept_id 
+HAVING EXISTS (SELECT 1 
+   FROM   bonus 
+   WHERE  bonus_amt < min(emp.salary))
+-- !query 3 schema
+struct
+-- !query 3 output
+10 3
+1002
+20 1
+30 1
+70 1
+NULL   1
+
+
+-- !query 4
+SELECT * 
+FROM   dept 
+WHERE  EXISTS (SELECT dept_id, 
+  Count(*) 
+   FROM   emp 
+   GROUP  BY dept_id 
+   HAVING EXISTS (SELECT 1 
+  FROM   bonus 
+  WHERE bonus_amt < Min(emp.salary)))
+-- !query 4 schema
+struct
+-- !query 4 output
+10 dept 1  CA
+20 dept 2  NY
+30 dept 3  TX
+40 dept 4 - unassigned OR
+50 dept 5 - unassigned NJ
+70 dept 7  FL
+
+
+-- !query 5
+SELECT dept_id, 
+   Max(salary) 
+FROM   emp gp 
+WHERE  EXISTS (SELECT dept_id, 
+  Count(*) 
+   FROM   emp p
+   GROUP  BY dept_id 
+   HAVING EXISTS (SELECT 1 
+  FROM   bonus 
+  WHERE  bonus_amt < Min(p.salary))) 
+GROUP  BY gp.dept_id
+-- !query 5 schema
+struct
+-- !query 5 output
+10 200.0
+100400.0
+20 300.0
+30 400.0
+70 150.0
+NULL   400.0
+
+
+-- !query 6
+SELECT * 
+FROM   dept 
+WHERE  EXISTS (SELECT dept_id, 
+Count(*) 
+ FROM   emp 
+ GROUP  BY dept_id 
+ HAVING EXISTS (SELECT 1 
+FROM   bonus 
+WHERE  bonus_amt > Min(emp.salary)))
+-- !query 6 schema
+struct
+-- !query 6 output
+10 dept 1  CA
+20 dept 2  NY
+30 dept 3  TX
+40 dept 4 - unassigned OR
+50 dept 5 - unassigned NJ
+70 dept 7  FL
+
+
+-- !query 7
+SELECT * 
+FROM   dept 
+WHERE  EXISTS (SELECT dept_id, 
+  count(emp.dept_id)
+   FROM   emp 
+   WHERE  dept.dept_id = dept_id 
+   GROUP  BY dept_id 
+   HAVING EXISTS (SELECT 1 
+  FROM   bonus 
+  WHERE  ( bonus_amt > min(emp.salary) 
+   AND count(emp.dept_id) > 1 )))
+-- !query 7 schema
+struct
+-- !query 7 output
+10 dept 1  CA
--- End diff --

I have compared the result set matched with the result from DB2.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-31 Thread nsyca
Github user nsyca commented on a diff in the pull request:

https://github.com/apache/spark/pull/16760#discussion_r98794661
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-orderby-limit.sql.out
 ---
@@ -0,0 +1,222 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 12
+
+
+-- !query 0
+CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (200, "emp 2", date "2003-01-01", 200.00D, 10),
+  (300, "emp 3", date "2002-01-01", 300.00D, 20),
+  (400, "emp 4", date "2005-01-01", 400.00D, 30),
+  (500, "emp 5", date "2001-01-01", 400.00D, NULL),
+  (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100),
+  (700, "emp 7", date "2010-01-01", 400.00D, 100),
+  (800, "emp 8", date "2016-01-01", 150.00D, 70)
+AS EMP(id, emp_name, hiredate, salary, dept_id)
+-- !query 0 schema
+struct<>
+-- !query 0 output
+
+
+
+-- !query 1
+CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES
+  (10, "dept 1", "CA"),
+  (20, "dept 2", "NY"),
+  (30, "dept 3", "TX"),
+  (40, "dept 4 - unassigned", "OR"),
+  (50, "dept 5 - unassigned", "NJ"),
+  (70, "dept 7", "FL")
+AS DEPT(dept_id, dept_name, state)
+-- !query 1 schema
+struct<>
+-- !query 1 output
+
+
+
+-- !query 2
+CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES
+  ("emp 1", 10.00D),
+  ("emp 1", 20.00D),
+  ("emp 2", 300.00D),
+  ("emp 2", 100.00D),
+  ("emp 3", 300.00D),
+  ("emp 4", 100.00D),
+  ("emp 5", 1000.00D),
+  ("emp 6 - no dept", 500.00D)
+AS BONUS(emp_name, bonus_amt)
+-- !query 2 schema
+struct<>
+-- !query 2 output
+
+
+
+-- !query 3
+SELECT * 
+FROM   emp 
+WHERE  EXISTS (SELECT dept.dept_id 
+   FROM   dept 
+   WHERE  emp.dept_id = dept.dept_id 
+   ORDER  BY state) 
+ORDER  BY hiredate
+-- !query 3 schema
+struct
+-- !query 3 output
+300emp 3   2002-01-01  300.0   20
+200emp 2   2003-01-01  200.0   10
+100emp 1   2005-01-01  100.0   10
+100emp 1   2005-01-01  100.0   10
+400emp 4   2005-01-01  400.0   30
+800emp 8   2016-01-01  150.0   70
+
+
+-- !query 4
+SELECT id, 
+   hiredate 
+FROM   emp 
+WHERE  EXISTS (SELECT dept.dept_id 
+   FROM   dept 
+   WHERE  emp.dept_id = dept.dept_id 
+   ORDER  BY state) 
+ORDER  BY hiredate DESC
+-- !query 4 schema
+struct
+-- !query 4 output
+8002016-01-01
+1002005-01-01
+1002005-01-01
+4002005-01-01
+2002003-01-01
+3002002-01-01
+
+
+-- !query 5
+SELECT * 
+FROM   emp 
+WHERE  NOT EXISTS (SELECT dept.dept_id 
+   FROM   dept 
+   WHERE  emp.dept_id = dept.dept_id 
+   ORDER  BY state) 
+ORDER  BY hiredate
+-- !query 5 schema
+struct
+-- !query 5 output
+500emp 5   2001-01-01  400.0   NULL
+600emp 6 - no dept 2001-01-01  400.0   100
+700emp 7   2010-01-01  400.0   100
+
+
+-- !query 6
+SELECT emp_name 
+FROM   emp 
+WHERE  NOT EXISTS (SELECT max(dept.dept_id) a 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id 
+   GROUP  BY state 
+   ORDER  BY state)
+-- !query 6 schema
+struct
+-- !query 6 output
+emp 5
+emp 6 - no dept
+emp 7
+
+
+-- !query 7
+SELECT count(*) 
+FROM   emp 
+WHERE  NOT EXISTS (SELECT max(dept.dept_id) a 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id 
+   GROUP  BY dept_id 
+   ORDER  BY dept_id)
+-- !query 7 schema
+struct
+-- !query 7 output
+3
+
+
+-- !query 8
+SELECT * 
+FROM   emp 
+WHERE  EXISTS (SELECT dept.dept_name 
+   FROM   dept 
+   WHERE  dept.dept_id > 10 
+   LIMIT  1)
+-- !query 8 schema
+struct
+-- !query 8 output
+100emp 1   2005-01-01  100.0   10
+100emp 1   2005-01-01  100.0   10
+200emp 2   2003-01-01  200.0   10
+300emp 3   2002-01-01  300.0   20
+400emp 4   2005-01-01  400.0   30
+500emp 5   2001-01-01  400.0   NULL
+600emp 6 - no dept 2001-01-01  400.0   100
+700emp 7   2010-01-01  400.0   100
+800emp 8   2016-01-01  150.0   

[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-31 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/16760

[SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Aggregate, 
Having, Orderby, Limit)



## What changes were proposed in this pull request?
This PR adds the second set of tests for EXISTS subquery.  

File name| Brief description
| -
exists-aggregate.sql  |Tests aggregate expressions in outer 
query and EXISTS subquery.
exists-having.sql|Tests HAVING clause in subquery.
exists-orderby-limit.sql|Tests EXISTS subquery support with ORDER BY and 
LIMIT clauses.

DB2 results are attached here as reference :


[exists-aggregate-db2.txt](https://github.com/apache/spark/files/743287/exists-aggregate-db2.txt)

[exists-having-db2.txt](https://github.com/apache/spark/files/743286/exists-having-db2.txt)

[exists-orderby-limit-db2.txt](https://github.com/apache/spark/files/743288/exists-orderby-limit-db2.txt)


(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark exists-pr2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16760.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16760






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org