This is an automated email from the ASF dual-hosted git repository.
abhishekrb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/master by this push:
new a805c5612e2 Adds Druid SQL query examples for the Stats aggregator
Native Queries (#16277)
a805c5612e2 is described below
commit a805c5612e2d6fe56afae3df3653409160ca087f
Author: Nikhil Rao <[email protected]>
AuthorDate: Mon Apr 15 08:05:34 2024 -0700
Adds Druid SQL query examples for the Stats aggregator Native Queries
(#16277)
* Adds Druid SQL query examples for the Timeseries and GroupBy Native
queries in the stats aggregator docs page
* Updates intervals in Native Query to remove excess Time part in timestamp
* Moves Druid SQL section above Native query because sql used more often by
users
* removes old Druid SQL sections
* Adds TopN Druid SQL query using ORDER BY and LIMIT
* Adds table for Druid SQL variance and standard deviation functions
* Update docs/development/extensions-core/stats.md
Co-authored-by: Abhishek Radhakrishnan <[email protected]>
---------
Co-authored-by: Karan Kumar <[email protected]>
Co-authored-by: Abhishek Radhakrishnan <[email protected]>
---
docs/development/extensions-core/stats.md | 59 +++++++++++++++++++++++++++++--
1 file changed, 56 insertions(+), 3 deletions(-)
diff --git a/docs/development/extensions-core/stats.md
b/docs/development/extensions-core/stats.md
index be4121369a1..e0df34a004e 100644
--- a/docs/development/extensions-core/stats.md
+++ b/docs/development/extensions-core/stats.md
@@ -56,6 +56,19 @@ Numer. Math, 58 (1991) pp. 583--590
precisely the same across query runs.
:::
+### Variance and Standard Deviation SQL Aggregators
+
+You can use the variance and standard deviation aggregation functions in the
SELECT clause of any Druid SQL query.
+
+|Function|Notes|Default|
+|--------|-----|-------|
+|`VAR_POP(expr)`|Computes variance population of `expr`.|`null` or `0` if
`druid.generic.useDefaultValueForNull=true` (deprecated legacy mode)|
+|`VAR_SAMP(expr)`|Computes variance sample of `expr`.|`null` or `0` if
`druid.generic.useDefaultValueForNull=true` (deprecated legacy mode)|
+|`VARIANCE(expr)`|Computes variance sample of `expr`.|`null` or `0` if
`druid.generic.useDefaultValueForNull=true` (deprecated legacy mode)|
+|`STDDEV_POP(expr)`|Computes standard deviation population of `expr`.|`null`
or `0` if `druid.generic.useDefaultValueForNull=true` (deprecated legacy mode)|
+|`STDDEV_SAMP(expr)`|Computes standard deviation sample of `expr`.|`null` or
`0` if `druid.generic.useDefaultValueForNull=true` (deprecated legacy mode)|
+|`STDDEV(expr)`|Computes standard deviation sample of `expr`.|`null` or `0` if
`druid.generic.useDefaultValueForNull=true` (deprecated legacy mode)|
+
### Pre-aggregating variance at ingestion time
To use this feature, an "variance" aggregator must be included at indexing
time.
@@ -107,6 +120,18 @@ To acquire standard deviation from variance, user can use
"stddev" post aggregat
### Timeseries query
+#### Druid SQL
+
+```SQL
+SELECT
+ DATE_TRUNC('day', __time),
+ VARIANCE("index_var") AS index_var
+FROM "testing"
+WHERE TIME_IN_INTERVAL(__time, '2013-03-01/2016-03-20')
+GROUP BY 1
+```
+
+#### Native Query
```json
{
"queryType": "timeseries",
@@ -120,13 +145,28 @@ To acquire standard deviation from variance, user can use
"stddev" post aggregat
}
],
"intervals": [
- "2016-03-01T00:00:00.000/2013-03-20T00:00:00.000"
+ "2016-03-01/2013-03-20"
]
}
```
### TopN query
+#### Druid SQL
+
+```SQL
+SELECT
+ alias,
+ VARIANCE("index") AS index_var
+FROM "testing"
+WHERE TIME_IN_INTERVAL(__time, '2016-03-06/2016-03-07')
+GROUP BY 1
+ORDER BY 2
+LIMIT 5
+```
+
+#### Native Query
+
```json
{
"queryType": "topN",
@@ -149,13 +189,26 @@ To acquire standard deviation from variance, user can use
"stddev" post aggregat
}
],
"intervals": [
- "2016-03-06T00:00:00/2016-03-06T23:59:59"
+ "2016-03-06/2016-03-07"
]
}
```
### GroupBy query
+#### Druid SQL
+
+```SQL
+SELECT
+ alias,
+ VARIANCE("index") AS index_var
+FROM "testing"
+WHERE TIME_IN_INTERVAL(__time, '2016-03-06/2016-03-07')
+GROUP BY alias
+```
+
+#### Native Query
+
```json
{
"queryType": "groupBy",
@@ -177,7 +230,7 @@ To acquire standard deviation from variance, user can use
"stddev" post aggregat
}
],
"intervals": [
- "2016-03-06T00:00:00/2016-03-06T23:59:59"
+ "2016-03-06/2016-03-07"
]
}
```
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]