[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24881: [SPARK-23160][SQL][TEST] Port window.sql

GitBox Sat, 15 Jun 2019 20:47:46 -0700

dongjoon-hyun commented on a change in pull request #24881: 
[SPARK-23160][SQL][TEST] Port window.sql
URL: https://github.com/apache/spark/pull/24881#discussion_r294066562


 ##########
 File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/window.sql
 ##########
 @@ -0,0 +1,918 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+-- Window Functions Testing
+-- 
https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/sql/window.sql
+
+CREATE TABLE empsalary (
+    depname string,
+    empno integer,
+    salary int,
+    enroll_date date
+) USING parquet;
+
+INSERT INTO empsalary VALUES
+('develop', 10, 5200, '2007-08-01'),
+('sales', 1, 5000, '2006-10-01'),
+('personnel', 5, 3500, '2007-12-10'),
+('sales', 4, 4800, '2007-08-08'),
+('personnel', 2, 3900, '2006-12-23'),
+('develop', 7, 4200, '2008-01-01'),
+('develop', 9, 4500, '2008-01-01'),
+('sales', 3, 4800, '2007-08-01'),
+('develop', 8, 6000, '2006-10-01'),
+('develop', 11, 5200, '2007-08-15');
+
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM 
empsalary ORDER BY depname, salary;
+
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY 
salary) FROM empsalary;
+
+SELECT four, ten, SUM(SUM(four)) OVER (PARTITION BY four), AVG(ten) FROM tenk1
+GROUP BY four, ten ORDER BY four, ten;
+
+SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS 
(PARTITION BY depname);
+
+-- I get an error when trying `order by rank() over w`, however it works for 
`order by r' if column rank is renamed to r
+SELECT depname, empno, salary, rank() OVER w as r FROM empsalary WINDOW w AS 
(PARTITION BY depname ORDER BY salary) ORDER BY r;
+
+SELECT COUNT(*) OVER () FROM tenk1 WHERE unique2 < 10;
+
+SELECT COUNT(*) OVER w FROM tenk1 WHERE unique2 < 10 WINDOW w AS ();
+
+SELECT four FROM tenk1 WHERE FALSE WINDOW w AS (PARTITION BY ten);
+
+SELECT sum(four) OVER (PARTITION BY ten ORDER BY unique2) AS sum_1, ten, four 
FROM tenk1 WHERE unique2 < 10;
+
+SELECT row_number() OVER (ORDER BY unique2) FROM tenk1 WHERE unique2 < 10;
+
+SELECT rank() OVER (PARTITION BY four ORDER BY ten) AS rank_1, ten, four FROM 
tenk1 WHERE unique2 < 10;
+
+SELECT dense_rank() OVER (PARTITION BY four ORDER BY ten), ten, four FROM 
tenk1 WHERE unique2 < 10;
+
+SELECT percent_rank() OVER (PARTITION BY four ORDER BY ten), ten, four FROM 
tenk1 WHERE unique2 < 10;
+
+SELECT cume_dist() OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 
WHERE unique2 < 10;
+
+SELECT ntile(3) OVER (ORDER BY ten, four), ten, four FROM tenk1 WHERE unique2 
< 10;
+
+-- Spark does not accept null as input for `ntile`
 
 Review comment:
   Please file an Apache Spark JIRA issue if it doesn't exist.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24881: [SPARK-23160][SQL][TEST] Port window.sql

Reply via email to