spark git commit: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Joins + CTE)

2017-02-15 Thread hvanhovell
Repository: spark
Updated Branches:
  refs/heads/master 5ad10c531 -> a8a139820


[SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Joins + CTE)

## What changes were proposed in this pull request?

This PR adds the third and final set of tests for EXISTS subquery.

File name| Brief description
| -
exists-cte.sql  |Tests Exist subqueries referencing CTE
exists-joins-and-set-ops.sql|Tests Exists subquery used in Joins (Both when 
joins occurs in outer and suquery blocks)

DB2 results are attached here as reference :

[exists-cte-db2.txt](https://github.com/apache/spark/files/752091/exists-cte-db2.txt)
[exists-joins-and-set-ops-db2.txt](https://github.com/apache/spark/files/753283/exists-joins-and-set-ops-db2.txt)
 (updated)

## How was this patch tested?
The test result is compared with the result run from another SQL engine (in 
this case is IBM DB2). If the result are equivalent, we assume the result is 
correct.

Author: Dilip Biswal 

Closes #16802 from dilipbiswal/exists-pr3.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a8a13982
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a8a13982
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a8a13982

Branch: refs/heads/master
Commit: a8a139820c4a77a0b017b621bec6273cc09c8476
Parents: 5ad10c5
Author: Dilip Biswal 
Authored: Wed Feb 15 17:34:05 2017 +0100
Committer: Herman van Hovell 
Committed: Wed Feb 15 17:34:05 2017 +0100

--
 .../subquery/exists-subquery/exists-cte.sql | 142 
 .../exists-joins-and-set-ops.sql| 228 
 .../subquery/exists-subquery/exists-cte.sql.out | 200 ++
 .../exists-joins-and-set-ops.sql.out| 363 +++
 4 files changed, 933 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a8a13982/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-cte.sql
--
diff --git 
a/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-cte.sql
 
b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-cte.sql
new file mode 100644
index 000..c678483
--- /dev/null
+++ 
b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-cte.sql
@@ -0,0 +1,142 @@
+-- Tests EXISTS subquery used along with 
+-- Common Table Expressions(CTE)
+
+CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (200, "emp 2", date "2003-01-01", 200.00D, 10),
+  (300, "emp 3", date "2002-01-01", 300.00D, 20),
+  (400, "emp 4", date "2005-01-01", 400.00D, 30),
+  (500, "emp 5", date "2001-01-01", 400.00D, NULL),
+  (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100),
+  (700, "emp 7", date "2010-01-01", 400.00D, 100),
+  (800, "emp 8", date "2016-01-01", 150.00D, 70)
+AS EMP(id, emp_name, hiredate, salary, dept_id);
+
+CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES
+  (10, "dept 1", "CA"),
+  (20, "dept 2", "NY"),
+  (30, "dept 3", "TX"),
+  (40, "dept 4 - unassigned", "OR"),
+  (50, "dept 5 - unassigned", "NJ"),
+  (70, "dept 7", "FL")
+AS DEPT(dept_id, dept_name, state);
+
+CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES
+  ("emp 1", 10.00D),
+  ("emp 1", 20.00D),
+  ("emp 2", 300.00D),
+  ("emp 2", 100.00D),
+  ("emp 3", 300.00D),
+  ("emp 4", 100.00D),
+  ("emp 5", 1000.00D),
+  ("emp 6 - no dept", 500.00D)
+AS BONUS(emp_name, bonus_amt);
+
+-- CTE used inside subquery with correlated condition 
+-- TC.01.01 
+WITH bonus_cte 
+ AS (SELECT * 
+ FROM   bonus 
+ WHERE  EXISTS (SELECT dept.dept_id, 
+ emp.emp_name, 
+ Max(salary), 
+ Count(*) 
+  FROM   emp 
+ JOIN dept 
+   ON dept.dept_id = emp.dept_id 
+  WHERE  bonus.emp_name = emp.emp_name 
+  GROUP  BY dept.dept_id, 
+emp.emp_name 
+  ORDER  BY emp.emp_name)) 
+SELECT * 
+FROM   bonus a 
+WHERE  a.bonus_amt > 30 
+   AND EXISTS (SELECT 1 
+   FROM   bonus_cte b 
+   WHERE  a.emp_name = b.emp_name); 
+
+-- Inner join between two CTEs with correlated condition
+-- TC.01.02
+WITH emp_cte 
+ AS (SELECT * 
+ FROM   emp 
+ WHERE  id >= 100 
+AND id <= 300), 
+ dept_cte 
+ AS (SELECT * 
+ FROM   dept 
+ WHERE  

spark git commit: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Aggregate, Having, Orderby, Limit)

2017-02-08 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master c618ccdbe -> 64cae22f7


[SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery (Aggregate, 
Having, Orderby, Limit)

## What changes were proposed in this pull request?
This PR adds the second set of tests for EXISTS subquery.

File name| Brief description
| -
exists-aggregate.sql  |Tests aggregate expressions in outer query 
and EXISTS subquery.
exists-having.sql|Tests HAVING clause in subquery.
exists-orderby-limit.sql|Tests EXISTS subquery support with ORDER BY and LIMIT 
clauses.

DB2 results are attached here as reference :

[exists-aggregate-db2.txt](https://github.com/apache/spark/files/743287/exists-aggregate-db2.txt)
[exists-having-db2.txt](https://github.com/apache/spark/files/743286/exists-having-db2.txt)
[exists-orderby-limit-db2.txt](https://github.com/apache/spark/files/743288/exists-orderby-limit-db2.txt)

##  How the patch was tested.
The test result is compared with the result run from another SQL engine (in 
this case is IBM DB2). If the result are equivalent, we assume the result is 
correct.

Author: Dilip Biswal 

Closes #16760 from dilipbiswal/exists-pr2.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/64cae22f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/64cae22f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/64cae22f

Branch: refs/heads/master
Commit: 64cae22f7cbba793e32d2c8ccb4b7981208070fd
Parents: c618ccd
Author: Dilip Biswal 
Authored: Thu Feb 9 00:31:51 2017 -0500
Committer: gatorsmile 
Committed: Thu Feb 9 00:31:51 2017 -0500

--
 .../exists-subquery/exists-aggregate.sql| 115 ++
 .../subquery/exists-subquery/exists-having.sql  |  94 
 .../exists-subquery/exists-orderby-limit.sql| 118 ++
 .../exists-subquery/exists-aggregate.sql.out| 183 +++
 .../exists-subquery/exists-having.sql.out   | 153 +
 .../exists-orderby-limit.sql.out| 222 +++
 6 files changed, 885 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/64cae22f/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-aggregate.sql
--
diff --git 
a/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-aggregate.sql
 
b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-aggregate.sql
new file mode 100644
index 000..b5f458f
--- /dev/null
+++ 
b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-aggregate.sql
@@ -0,0 +1,115 @@
+-- Tests aggregate expressions in outer query and EXISTS subquery.
+
+CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (200, "emp 2", date "2003-01-01", 200.00D, 10),
+  (300, "emp 3", date "2002-01-01", 300.00D, 20),
+  (400, "emp 4", date "2005-01-01", 400.00D, 30),
+  (500, "emp 5", date "2001-01-01", 400.00D, NULL),
+  (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100),
+  (700, "emp 7", date "2010-01-01", 400.00D, 100),
+  (800, "emp 8", date "2016-01-01", 150.00D, 70)
+AS EMP(id, emp_name, hiredate, salary, dept_id);
+
+CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES
+  (10, "dept 1", "CA"),
+  (20, "dept 2", "NY"),
+  (30, "dept 3", "TX"),
+  (40, "dept 4 - unassigned", "OR"),
+  (50, "dept 5 - unassigned", "NJ"),
+  (70, "dept 7", "FL")
+AS DEPT(dept_id, dept_name, state);
+
+CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES
+  ("emp 1", 10.00D),
+  ("emp 1", 20.00D),
+  ("emp 2", 300.00D),
+  ("emp 2", 100.00D),
+  ("emp 3", 300.00D),
+  ("emp 4", 100.00D),
+  ("emp 5", 1000.00D),
+  ("emp 6 - no dept", 500.00D)
+AS BONUS(emp_name, bonus_amt);
+
+-- Aggregate in outer query block.
+-- TC.01.01
+SELECT emp.dept_id, 
+   avg(salary),
+   sum(salary)
+FROM   emp 
+WHERE  EXISTS (SELECT state 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id) 
+GROUP  BY dept_id; 
+
+-- Aggregate in inner/subquery block
+-- TC.01.02
+SELECT emp_name 
+FROM   emp 
+WHERE  EXISTS (SELECT max(dept.dept_id) a 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id 
+   GROUP  BY dept.dept_id); 
+
+-- Aggregate expression in both outer and inner query block.
+-- TC.01.03
+SELECT count(*) 
+FROM   emp 
+WHERE  EXISTS (SELECT max(dept.dept_id) a 
+   FROM   dept 
+   WHERE  dept.dept_id = emp.dept_id 
+   GROUP  BY dept.dept_id); 
+
+-- Nested exists with aggregate expression in inner most query 

spark git commit: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery

2017-01-29 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master f7c07db85 -> e2e7b12ce


[SPARK-18872][SQL][TESTS] New test cases for EXISTS subquery

## What changes were proposed in this pull request?
This PR adds the first set of tests for EXISTS subquery.

File name| Brief description
| -
exists-basic.sql  |Tests EXISTS and NOT EXISTS subqueries with both 
correlated and local predicates.
exists-within-and-or.sql|Tests EXISTS and NOT EXISTS subqueries embedded in AND 
or OR expression.

DB2 results are attached here as reference :

[exists-basic-db2.txt](https://github.com/apache/spark/files/733031/exists-basic-db2.txt)
[exists-and-or-db2.txt](https://github.com/apache/spark/files/733030/exists-and-or-db2.txt)

## How was this patch tested?
This patch is adding tests.

Author: Dilip Biswal 

Closes #16710 from dilipbiswal/exist-basic.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e2e7b12c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e2e7b12c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e2e7b12c

Branch: refs/heads/master
Commit: e2e7b12ce8fdf9d0bf0b7fce9283018c7d805988
Parents: f7c07db
Author: Dilip Biswal 
Authored: Sun Jan 29 12:51:59 2017 -0800
Committer: gatorsmile 
Committed: Sun Jan 29 12:51:59 2017 -0800

--
 .../subquery/exists-subquery/exists-basic.sql   | 123 +++
 .../exists-subquery/exists-within-and-or.sql|  96 +
 .../exists-subquery/exists-basic.sql.out| 214 +++
 .../exists-within-and-or.sql.out| 156 ++
 4 files changed, 589 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/e2e7b12c/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-basic.sql
--
diff --git 
a/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-basic.sql
 
b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-basic.sql
new file mode 100644
index 000..332e858
--- /dev/null
+++ 
b/sql/core/src/test/resources/sql-tests/inputs/subquery/exists-subquery/exists-basic.sql
@@ -0,0 +1,123 @@
+-- Tests EXISTS subquery support. Tests basic form 
+-- of EXISTS subquery (both EXISTS and NOT EXISTS)
+
+CREATE TEMPORARY VIEW EMP AS SELECT * FROM VALUES
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (100, "emp 1", date "2005-01-01", 100.00D, 10),
+  (200, "emp 2", date "2003-01-01", 200.00D, 10),
+  (300, "emp 3", date "2002-01-01", 300.00D, 20),
+  (400, "emp 4", date "2005-01-01", 400.00D, 30),
+  (500, "emp 5", date "2001-01-01", 400.00D, NULL),
+  (600, "emp 6 - no dept", date "2001-01-01", 400.00D, 100),
+  (700, "emp 7", date "2010-01-01", 400.00D, 100),
+  (800, "emp 8", date "2016-01-01", 150.00D, 70)
+AS EMP(id, emp_name, hiredate, salary, dept_id);
+
+CREATE TEMPORARY VIEW DEPT AS SELECT * FROM VALUES
+  (10, "dept 1", "CA"),
+  (20, "dept 2", "NY"),
+  (30, "dept 3", "TX"),
+  (40, "dept 4 - unassigned", "OR"),
+  (50, "dept 5 - unassigned", "NJ"),
+  (70, "dept 7", "FL")
+AS DEPT(dept_id, dept_name, state);
+
+CREATE TEMPORARY VIEW BONUS AS SELECT * FROM VALUES
+  ("emp 1", 10.00D),
+  ("emp 1", 20.00D),
+  ("emp 2", 300.00D),
+  ("emp 2", 100.00D),
+  ("emp 3", 300.00D),
+  ("emp 4", 100.00D),
+  ("emp 5", 1000.00D),
+  ("emp 6 - no dept", 500.00D)
+AS BONUS(emp_name, bonus_amt);
+
+-- uncorrelated exist query 
+-- TC.01.01
+SELECT * 
+FROM   emp 
+WHERE  EXISTS (SELECT 1 
+   FROM   dept 
+   WHERE  dept.dept_id > 10 
+  AND dept.dept_id < 30); 
+
+-- simple correlated predicate in exist subquery
+-- TC.01.02
+SELECT * 
+FROM   emp 
+WHERE  EXISTS (SELECT dept.dept_name 
+   FROM   dept 
+   WHERE  emp.dept_id = dept.dept_id); 
+
+-- correlated outer isnull predicate
+-- TC.01.03
+SELECT * 
+FROM   emp 
+WHERE  EXISTS (SELECT dept.dept_name 
+   FROM   dept 
+   WHERE  emp.dept_id = dept.dept_id 
+   OR emp.dept_id IS NULL);
+
+-- Simple correlation with a local predicate in outer query
+-- TC.01.04
+SELECT * 
+FROM   emp 
+WHERE  EXISTS (SELECT dept.dept_name 
+   FROM   dept 
+   WHERE  emp.dept_id = dept.dept_id) 
+   AND emp.id > 200; 
+
+-- Outer references (emp.id) should not be pruned from outer plan
+-- TC.01.05
+SELECT emp.emp_name 
+FROM   emp 
+WHERE  EXISTS (SELECT dept.state 
+   FROM   dept 
+   WHERE  emp.dept_id = dept.dept_id) 
+   AND emp.id > 200;
+
+-- not exists with correlated predicate
+-- TC.01.06
+SELECT * 
+FROM   dept 
+WHERE  NOT EXISTS