comphead commented on code in PR #6402:
URL: https://github.com/apache/arrow-datafusion/pull/6402#discussion_r1204659270


##########
docs/source/user-guide/sql/window_functions.md:
##########
@@ -0,0 +1,263 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either expressioness or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Window Functions
+
+A _window function_ performs a calculation across a set of table rows that are 
somehow related to the current row. This is comparable to the type of 
calculation that can be done with an aggregate function. However, window 
functions do not cause rows to become grouped into a single output row like 
non-window aggregate calls would. Instead, the rows retain their separate 
identities. Behind the scenes, the window function is able to access more than 
just the current row of the query result
+
+Here is an example that shows how to compare each employee's salary with the 
average salary in his or her department:
+
+```sql
+SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM 
empsalary;
+
++-----------+-------+--------+-------------------+
+| depname   | empno | salary | avg               |
++-----------+-------+--------+-------------------+
+| personnel | 2     | 3900   | 3700.0            |
+| personnel | 5     | 3500   | 3700.0            |
+| develop   | 8     | 6000   | 5020.0            |
+| develop   | 10    | 5200   | 5020.0            |
+| develop   | 11    | 5200   | 5020.0            |
+| develop   | 9     | 4500   | 5020.0            |
+| develop   | 7     | 4200   | 5020.0            |
+| sales     | 1     | 5000   | 4866.666666666667 |
+| sales     | 4     | 4800   | 4866.666666666667 |
+| sales     | 3     | 4800   | 4866.666666666667 |
++-----------+-------+--------+-------------------+
+```
+
+A window function call always contains an OVER clause directly following the 
window function's name and argument(s). This is what syntactically 
distinguishes it from a normal function or non-window aggregate. The OVER 
clause determines exactly how the rows of the query are split up for processing 
by the window function. The PARTITION BY clause within OVER divides the rows 
into groups, or partitions, that share the same values of the PARTITION BY 
expression(s). For each row, the window function is computed across the rows 
that fall into the same partition as the current row. The previous example 
showed how to count the average of a column per partition.
+
+You can also control the order in which rows are processed by window functions 
using ORDER BY within OVER. (The window ORDER BY does not even have to match 
the order in which the rows are output.) Here is an example:
+
+```sql
+SELECT depname, empno, salary,
+       rank() OVER (PARTITION BY depname ORDER BY salary DESC)
+FROM empsalary;
+
++-----------+-------+--------+--------+
+| depname   | empno | salary | rank   |
++-----------+-------+--------+--------+
+| personnel | 2     | 3900   | 1      |
+| develop   | 8     | 6000   | 1      |
+| develop   | 10    | 5200   | 2      |
+| develop   | 11    | 5200   | 2      |
+| develop   | 9     | 4500   | 4      |
+| develop   | 7     | 4200   | 5      |
+| sales     | 1     | 5000   | 1      |
+| sales     | 4     | 4800   | 2      |
+| personnel | 5     | 3500   | 2      |
+| sales     | 3     | 4800   | 2      |
++-----------+-------+--------+--------+
+```
+
+There is another important concept associated with window functions: for each 
row, there is a set of rows within its partition called its window frame. Some 
window functions act only on the rows of the window frame, rather than of the 
whole partition. Here is an example of using window frames in queries:
+
+```sql
+SELECT depname, empno, salary,
+    avg(salary) OVER(ORDER BY salary ASC ROWS BETWEEN 1 PRECEDING AND 1 
FOLLOWING) AS avg,
+    min(salary) OVER(ORDER BY empno ASC ROWS BETWEEN UNBOUNDED PRECEDING AND 
CURRENT ROW) AS cum_min
+FROM empsalary
+ORDER BY empno ASC;
+
++-----------+-------+--------+--------------------+---------+
+| depname   | empno | salary | avg                | cum_min |
++-----------+-------+--------+--------------------+---------+
+| sales     | 1     | 5000   | 5000.0             | 5000    |
+| personnel | 2     | 3900   | 3866.6666666666665 | 3900    |
+| sales     | 3     | 4800   | 4700.0             | 3900    |
+| sales     | 4     | 4800   | 4866.666666666667  | 3900    |
+| personnel | 5     | 3500   | 3700.0             | 3500    |
+| develop   | 7     | 4200   | 4200.0             | 3500    |
+| develop   | 8     | 6000   | 5600.0             | 3500    |
+| develop   | 9     | 4500   | 4500.0             | 3500    |
+| develop   | 10    | 5200   | 5133.333333333333  | 3500    |
+| develop   | 11    | 5200   | 5466.666666666667  | 3500    |
++-----------+-------+--------+--------------------+---------+
+```
+
+## Syntax
+
+The syntax for the OVER-clause is
+
+```sql
+function([expr])
+  OVER(
+    [PARTITION BY expr[, …]]
+    [ORDER BY expr [ ASC | DESC ][, …]]
+    [ frame_clause ]
+    )
+```
+
+where **frame_clause** is one of:
+
+```sql
+  { RANGE | ROWS | GROUPS } frame_start

Review Comment:
   👍 RANGE and GROUPS require ORDER BY. perhaps we can add such text 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to