vtlim commented on code in PR #14739: URL: https://github.com/apache/druid/pull/14739#discussion_r1282383119
########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. Review Comment: Instead of having this in a note, could this go into a Prerequisites section? ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. Review Comment: ```suggestion The following example groups results with the same `channel` value into windows. For each window, the query returns the rank of each row in ascending order based upon its `delta` value. ``` ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | Review Comment: ```suggestion |`DENSE_RANK()`| Returns the rank for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | ``` Does Druid do any kind of sorting for which tied row gets returned first? ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) Review Comment: Add an explanation on why you would use this query instead of the one above, even though they are functionally the same. Because the previous one is a GROUP BY but this one can be used in other queries such as...? ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: Review Comment: Duplicate? >The PARTITIONED BY clause defines the dimenson to use for the window. ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. Review Comment: ```suggestion You can use the OVER clause to treat other Druid aggregation functions as window functions. For example, the sum of a value for rows within a window. ``` ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) Review Comment: nit: consistency on casing between this query and the preceding query. See `rank/RANK` and `abs/ABS` ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | +|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage according to the formula: `(rank - 1) / (total partition rows - 1)` | None | +|`CUME_DIST()`| Returns the cumulative distribution of the current row within the window calculated as `number of partition rows preceding or peers with current row` / `total partition rows` | None | Review Comment: Since a lot of the descriptions use the term "window" instead of "partition" the formulas should say window as well ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | +|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage according to the formula: `(rank - 1) / (total partition rows - 1)` | None | +|`CUME_DIST()`| Returns the cumulative distribution of the current row within the window calculated as `number of partition rows preceding or peers with current row` / `total partition rows` | None | Review Comment: >preceding or peers with current row Maybe "at the same rank or higher as the current row"? ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | +|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage according to the formula: `(rank - 1) / (total partition rows - 1)` | None | +|`CUME_DIST()`| Returns the cumulative distribution of the current row within the window calculated as `number of partition rows preceding or peers with current row` / `total partition rows` | None | +|`NTILE(tiles)`| Divides the results as evenly as possible into the number of tiles, also called buckets, and returns a value from 1 to the value of `tiles` |None | +|`LAG(expr, offset, default)`| Returns the value evaluated at the row that precedes the current row by the offset number within the window. If there is no such row, returns the default. | None | +|`LEAD(expr, offset, default)`| Returns the value evaluated at the row that followsthe current row by the offset number within the window; if there is no such row, returns the default | None | +|`FIRST_VALUE(expr)`| Returns the value for the expression for the first row within the window| None | +|`LAST_VALUE(expr)`| Returns the value for the expression for the first row within the window | None | Review Comment: ```suggestion |`LAST_VALUE(expr)`| Returns the value for the expression for the last row within the window | None | ``` ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | +|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage according to the formula: `(rank - 1) / (total partition rows - 1)` | None | +|`CUME_DIST()`| Returns the cumulative distribution of the current row within the window calculated as `number of partition rows preceding or peers with current row` / `total partition rows` | None | +|`NTILE(tiles)`| Divides the results as evenly as possible into the number of tiles, also called buckets, and returns a value from 1 to the value of `tiles` |None | +|`LAG(expr, offset, default)`| Returns the value evaluated at the row that precedes the current row by the offset number within the window. If there is no such row, returns the default. | None | Review Comment: ```suggestion |`LAG(expr, offset, default)`| Returns the value evaluated at the row that precedes the current row by the offset number within the window; if there is no such row, returns the given default value | None | ``` ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | +|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage according to the formula: `(rank - 1) / (total partition rows - 1)` | None | +|`CUME_DIST()`| Returns the cumulative distribution of the current row within the window calculated as `number of partition rows preceding or peers with current row` / `total partition rows` | None | +|`NTILE(tiles)`| Divides the results as evenly as possible into the number of tiles, also called buckets, and returns a value from 1 to the value of `tiles` |None | +|`LAG(expr, offset, default)`| Returns the value evaluated at the row that precedes the current row by the offset number within the window. If there is no such row, returns the default. | None | +|`LEAD(expr, offset, default)`| Returns the value evaluated at the row that followsthe current row by the offset number within the window; if there is no such row, returns the default | None | Review Comment: ```suggestion |`LEAD(expr, offset, default)`| Returns the value evaluated at the row that follows the current row by the offset number within the window; if there is no such row, returns the given default value | None | ``` ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | +|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage according to the formula: `(rank - 1) / (total partition rows - 1)` | None | +|`CUME_DIST()`| Returns the cumulative distribution of the current row within the window calculated as `number of partition rows preceding or peers with current row` / `total partition rows` | None | +|`NTILE(tiles)`| Divides the results as evenly as possible into the number of tiles, also called buckets, and returns a value from 1 to the value of `tiles` |None | +|`LAG(expr, offset, default)`| Returns the value evaluated at the row that precedes the current row by the offset number within the window. If there is no such row, returns the default. | None | +|`LEAD(expr, offset, default)`| Returns the value evaluated at the row that followsthe current row by the offset number within the window; if there is no such row, returns the default | None | +|`FIRST_VALUE(expr)`| Returns the value for the expression for the first row within the window| None | +|`LAST_VALUE(expr)`| Returns the value for the expression for the first row within the window | None | + + +## Example + +The following example illustrates all of the built-in window functions to compare the number of characters changed per event for a channel in the Wikipedia data set. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + ROW_NUMBER() OVER w AS row_no, + RANK() OVER w AS rank_no, + DENSE_RANK() OVER w AS dense_rank_no, + PERCENT_RANK() OVER w AS pct_rank, + CUME_DIST() OVER w AS cumlative_dist, + NTILE(4) OVER w AS ntile_val, + LAG(ABS(delta), 1, 0) OVER w AS lag_val, + LEAD(ABS(delta), 1, 0) OVER w AS lead_val, + FIRST_VALUE(ABS(delta)) OVER w AS first_val, + LAST_VALUE(ABS(delta)) OVER w AS last_val +FROM wikipedia +WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia') +GROUP BY channel, ABS(delta), FLOOR(__time TO DAY) +WINDOW w AS (PARTITION BY channel ORDER BY ABS(delta) ASC) +``` + +<details> +<summary> View results </summary> + +|event_time|channel|change|row_no|rank_no|dense_rank_no|pct_rank|cumlative_dist|ntile_val|lag_val|lead_val|first_val|last_val| Review Comment: ```suggestion |event_time|channel|change|row_no|rank_no|dense_rank_no|pct_rank|cumulative_dist|ntile_val|lag_val|lead_val|first_val|last_val| ``` ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | +|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage according to the formula: `(rank - 1) / (total partition rows - 1)` | None | +|`CUME_DIST()`| Returns the cumulative distribution of the current row within the window calculated as `number of partition rows preceding or peers with current row` / `total partition rows` | None | +|`NTILE(tiles)`| Divides the results as evenly as possible into the number of tiles, also called buckets, and returns a value from 1 to the value of `tiles` |None | Review Comment: I'm not sure I understand what this function does even after rereading it a few times. Are the results divided per each window? If I call `NTILES(5)`, it returns a result from 1 to 5, but how do I know what will be returned? ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| Review Comment: Do you need a default column if all the defaults here are None? ########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,164 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +> Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +> This document describes the SQL language. +> +> Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. +> There are known issues where ORDER BY only works on ascending order and certain options may cause errors. +> Set the context parameter `windowsAreForClosers: true` to use window functions. + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example returns the rank of each row in ascending order based upon its `delta` value as related to other rows with the same `channel` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY abs(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| event_time | channel | change | rank_value | +| -- | -- | -- | -- | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 7 | 3 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 56 | 4 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 63 | 6 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 91 | 7 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2440 | 8 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 2703 | 9 | +| 2016-06-27T00:00:00.000Z | #kk.wikipedia | 6900 |10 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 1 | 1 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 2 | 2 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 13 | 3 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 28 | 4 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 53 | 5 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 56 | 6 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 59 | 7 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 391 | 8 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 894 | 9 | +| 2016-06-27T00:00:00.000Z | #lt.wikipedia | 4358 | 10 | + +</details> + +Window functions are similar to [aggregation funtctions](./aggregations.md). + +You can use the OVER clause treat other Druid aggretation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. +- Window functions support aliasing. + +## Define a window with the OVER clause + +The OVER clause defines the query windows for window functions as follows: +- PARTITION BY indicates the dimension that defines the rows within the window +- ORDER BY specifies the order of the rows within the windows. +Sometimes windows are called partitions, but don't confuse them with the segment partitioning feature of Druid. + + +The PARTITIONED BY clause defines the dimenson to use for the window. For example, the following OVER clause sets the window dimension to `channel` and orders the results by the absolute value of `delta` ascending: + +```sql +... +rank() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC) +... +``` + +## Window function reference + +|Function|Notes|Default| +|--------|-----|-------| +| `ROW_NUMBER()`| Returns the number of the row within the window| None | +|`RANK()`| Returns the rank for a row within a window | None | +|`DENSE_RANK()`| Returns the ranks for a row within a window without gaps. For example, if two rows tie for rank of 1, the subsequent row is ranked 2. | None | +|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage according to the formula: `(rank - 1) / (total partition rows - 1)` | None | +|`CUME_DIST()`| Returns the cumulative distribution of the current row within the window calculated as `number of partition rows preceding or peers with current row` / `total partition rows` | None | +|`NTILE(tiles)`| Divides the results as evenly as possible into the number of tiles, also called buckets, and returns a value from 1 to the value of `tiles` |None | +|`LAG(expr, offset, default)`| Returns the value evaluated at the row that precedes the current row by the offset number within the window. If there is no such row, returns the default. | None | +|`LEAD(expr, offset, default)`| Returns the value evaluated at the row that followsthe current row by the offset number within the window; if there is no such row, returns the default | None | +|`FIRST_VALUE(expr)`| Returns the value for the expression for the first row within the window| None | +|`LAST_VALUE(expr)`| Returns the value for the expression for the first row within the window | None | + + +## Example + +The following example illustrates all of the built-in window functions to compare the number of characters changed per event for a channel in the Wikipedia data set. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + ROW_NUMBER() OVER w AS row_no, + RANK() OVER w AS rank_no, + DENSE_RANK() OVER w AS dense_rank_no, + PERCENT_RANK() OVER w AS pct_rank, + CUME_DIST() OVER w AS cumlative_dist, Review Comment: ```suggestion CUME_DIST() OVER w AS cumulative_dist, ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
