This is an automated email from the ASF dual-hosted git repository.
techdocsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/master by this push:
new e36e187a632 add tuple examples (#17667)
e36e187a632 is described below
commit e36e187a632ef1b3784a1c0f8b8048d2101070db
Author: Victoria Lim <[email protected]>
AuthorDate: Mon Jan 27 16:45:04 2025 -0800
add tuple examples (#17667)
---
docs/querying/sql-aggregations.md | 5 +-
docs/querying/sql-functions.md | 160 +++++++++++++++++++++++++++++++++++++-
2 files changed, 159 insertions(+), 6 deletions(-)
diff --git a/docs/querying/sql-aggregations.md
b/docs/querying/sql-aggregations.md
index 2af45a530e0..1a90f9c72d6 100644
--- a/docs/querying/sql-aggregations.md
+++ b/docs/querying/sql-aggregations.md
@@ -146,9 +146,8 @@ Load the [DataSketches
extension](../development/extensions-core/datasketches-ex
|Function|Notes|Default|
|--------|-----|-------|
-|`DS_TUPLE_DOUBLES(expr, [nominalEntries])`|Creates a [Tuple
sketch](../development/extensions-core/datasketches-tuple.md) on the values of
`expr` which is a column containing Tuple sketches which contain an array of
double values as their Summary Objects. The `nominalEntries` override parameter
is optional and described in the Tuple sketch documentation.
-|`DS_TUPLE_DOUBLES(dimensionColumnExpr, metricColumnExpr, ...,
[nominalEntries])`|Creates a [Tuple
sketch](../development/extensions-core/datasketches-tuple.md) which contains an
array of double values as its Summary Object based on the dimension value of
`dimensionColumnExpr` and the numeric metric values contained in one or more
`metricColumnExpr` columns. If the last value of the array is a numeric
literal, Druid assumes that the value is an override parameter for [nominal
entries](.. [...]
-
+|`DS_TUPLE_DOUBLES(expr[, nominalEntries])`|Creates a [Tuple
sketch](../development/extensions-core/datasketches-tuple.md) on a precomputed
sketch column `expr`, where the precomputed Tuple sketch contains an array of
double values as its Summary Object. The `nominalEntries` override parameter is
optional and described in the Tuple sketch documentation.
+|`DS_TUPLE_DOUBLES(dimensionColumnExpr, metricColumnExpr1[, metricColumnExpr2,
...], [nominalEntries])`|Creates a [Tuple
sketch](../development/extensions-core/datasketches-tuple.md) on raw data. The
Tuple sketch contains an array of double values as its Summary Object
based on the dimension value of `dimensionColumnExpr` and the numeric metric
values contained in one or more `metricColumnExpr` columns. If the last value
of the array is a numeric literal, Druid assumes that the valu [...]
### T-Digest sketch functions
diff --git a/docs/querying/sql-functions.md b/docs/querying/sql-functions.md
index 75e9d54ed8c..519623c833c 100644
--- a/docs/querying/sql-functions.md
+++ b/docs/querying/sql-functions.md
@@ -2129,12 +2129,34 @@ Returns the following:
## DS_TUPLE_DOUBLES
-Creates a Tuple sketch which contains an array of double values as the Summary
Object. If the last value of the array is a numeric literal, Druid assumes that
the value is an override parameter for [nominal
entries](../development/extensions-core/datasketches-tuple.md).
+Creates a Tuple sketch on raw data or a precomputed sketch column. See
[DataSketches Tuple Sketch
module](../development/extensions-core/datasketches-tuple.md) for a description
of parameters.
-* **Syntax**: `DS_TUPLE_DOUBLES(expr, [nominalEntries])`
- `DS_TUPLE_DOUBLES(dimensionColumnExpr, metricColumnExpr, ...,
[nominalEntries])`
+* **Syntax**: `DS_TUPLE_DOUBLES(expr[, nominalEntries])`
+ `DS_TUPLE_DOUBLES(dimensionColumnExpr, metricColumnExpr1[,
metricColumnExpr2, ...], [nominalEntries])`
* **Function type:** Aggregation
+<details><summary>Example</summary>
+
+The following example creates a Tuple sketch column that stores the arrival
and departure delay minutes for each airline in `flight-carriers`:
+
+```sql
+SELECT
+ "Reporting_Airline",
+ DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes", "DepDelayMinutes")
AS tuples_delay
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 2
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`tuples_delay`|
+|-------------------|--------------|
+|`AA`|`1.0`|
+|`AS`|`1.0`|
+
+</details>
+
[Learn more](sql-aggregations.md)
## DS_TUPLE_DOUBLES_INTERSECT
@@ -2144,6 +2166,37 @@ Returns an intersection of Tuple sketches which each
contain an array of double
* **Syntax**: `DS_TUPLE_DOUBLES_INTERSECT(expr, ..., [nominalEntries])`
* **Function type:** Scalar, sketch
+<details><summary>Example</summary>
+
+The following example calculates the total minutes of arrival delay for
airlines that fly out of both `SFO` and `LAX`.
+An airline that doesn't fly out of both airports returns a value of 0.
+
+```sql
+SELECT
+ "Reporting_Airline",
+ DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(
+ DS_TUPLE_DOUBLES_INTERSECT(
+ DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE
"Origin" = 'SFO'),
+ DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE
"Origin" = 'LAX')
+ )
+ ) AS arrival_delay_sfo_lax
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 5
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`arrival_delay_sfo_lax`|
+|----|---------|
+|`AA`|`[33296]`|
+|`AS`|`[13694]`|
+|`B6`|`[0]`|
+|`CO`|`[13582]`|
+|`DH`|`[0]`|
+
+</details>
+
[Learn more](sql-scalar.md#tuple-sketch-functions)
## DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE
@@ -2153,6 +2206,47 @@ Computes approximate sums of the values contained within
a Tuple sketch which co
* **Syntax**: `DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(expr)`
* **Function type:** Scalar, sketch
+<details><summary>Example</summary>
+
+The following example calculates the sum of arrival and departure delay
minutes for each airline in `flight-carriers`:
+
+```sql
+SELECT
+ "Reporting_Airline",
+ DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(DS_TUPLE_DOUBLES("Reporting_Airline",
"ArrDelayMinutes", "DepDelayMinutes")) AS sum_delays
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 2
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`sum_delays`|
+|----|-----------------|
+|`AA`|`[612831,474309]`|
+|`AS`|`[157340,141462]`|
+
+Compare this example with an analogous SQL statement that doesn't use
approximations:
+
+```sql
+SELECT
+ "Reporting_Airline",
+ SUM("ArrDelayMinutes") AS sum_arrival_delay,
+ SUM("DepDelayMinutes") AS sum_departure_delay
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 2
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`sum_arrival_delay`|`sum_departure_delay`|
+|----|--------|--------|
+|`AA`|`612831`|`475735`|
+|`AS`|`157340`|`143620`|
+
+</details>
+
[Learn more](sql-scalar.md#tuple-sketch-functions)
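To make the comparison between the sketch estimate and the exact `SUM` concrete, the following Python snippet computes the relative difference for the two airlines shown above. It is only an illustration built from the numbers in the two result tables; the helper name `relative_difference` is hypothetical and not part of Druid.

```python
# Values copied from the two result tables above: (sketch estimate, exact SUM).
arrival_delay = {
    "AA": (612831, 612831),
    "AS": (157340, 157340),
}
departure_delay = {
    "AA": (474309, 475735),
    "AS": (141462, 143620),
}


def relative_difference(estimate, exact):
    """Return |estimate - exact| / exact as a fraction (hypothetical helper)."""
    return abs(estimate - exact) / exact


arrival_diff = {k: relative_difference(*v) for k, v in arrival_delay.items()}
departure_diff = {k: relative_difference(*v) for k, v in departure_delay.items()}

for airline in arrival_delay:
    print(
        f"{airline}: arrival {arrival_diff[airline]:.2%}, "
        f"departure {departure_diff[airline]:.2%}"
    )
```

For these rows the arrival sums match exactly, while the departure estimates differ from the exact sums by roughly 0.3% for `AA` and 1.5% for `AS`.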
## DS_TUPLE_DOUBLES_NOT
@@ -2162,6 +2256,36 @@ Returns a set difference of Tuple sketches which each
contain an array of double
* **Syntax**: `DS_TUPLE_DOUBLES_NOT(expr, ..., [nominalEntries])`
* **Function type:** Scalar, sketch
+<details><summary>Example</summary>
+
+The following example calculates the total minutes of arrival delay for
airlines that fly out of `SFO` but not `LAX`.
+
+```sql
+SELECT
+ "Reporting_Airline",
+ DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(
+ DS_TUPLE_DOUBLES_NOT(
+ DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE
"Origin" = 'SFO'),
+ DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE
"Origin" = 'LAX')
+ )
+ ) AS arrival_delay_sfo_lax
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 5
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`arrival_delay_sfo_lax`|
+|----|---------|
+|`AA`|`[0]`|
+|`AS`|`[0]`|
+|`B6`|`[0]`|
+|`CO`|`[0]`|
+|`DH`|`[93]`|
+
+</details>
+
[Learn more](sql-scalar.md#tuple-sketch-functions)
## DS_TUPLE_DOUBLES_UNION
@@ -2171,6 +2295,36 @@ Returns a union of Tuple sketches which each contain an
array of double values a
* **Syntax**: `DS_TUPLE_DOUBLES_UNION(expr, ..., [nominalEntries])`
* **Function type:** Scalar, sketch
+<details><summary>Example</summary>
+
+The following example calculates the total minutes of arrival delay for
airlines flying out of either `SFO` or `LAX`.
+
+```sql
+SELECT
+ "Reporting_Airline",
+ DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(
+ DS_TUPLE_DOUBLES_UNION(
+ DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE
"Origin" = 'SFO'),
+ DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE
"Origin" = 'LAX')
+ )
+ ) AS arrival_delay_sfo_lax
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 5
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`arrival_delay_sfo_lax`|
+|----|---------|
+|`AA`|`[33296]`|
+|`AS`|`[13694]`|
+|`B6`|`[0]`|
+|`CO`|`[13582]`|
+|`DH`|`[93]`|
+
+</details>
+
[Learn more](sql-scalar.md#tuple-sketch-functions)
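The intersect, not, and union examples differ only in which keys survive the set operation. The Python sketch below models that with plain dictionaries instead of real Tuple sketches. It is an exact, in-memory stand-in and not Druid's implementation: the function names and the sample values are hypothetical, and combining the summaries of matching keys by summation is an assumption (it is consistent with the `AA` row returning the same value for both the intersection and the union).

```python
# Exact stand-in for a Tuple sketch: key -> list of double summary values.
# Assumption: summaries for a key present on both sides are summed element-wise.


def combine(a, b):
    """Element-wise sum of two summary arrays."""
    return [x + y for x, y in zip(a, b)]


def union(left, right):
    """Keep every key; sum the summaries of keys present on both sides."""
    result = dict(left)
    for key, summary in right.items():
        result[key] = combine(result[key], summary) if key in result else summary
    return result


def intersect(left, right):
    """Keep only keys present on both sides."""
    return {k: combine(left[k], right[k]) for k in left if k in right}


def a_not_b(left, right):
    """Keep keys of the left side that are absent from the right side."""
    return {k: v for k, v in left.items() if k not in right}


def metrics_sum(sketch, width=1):
    """Sum the summary arrays over all retained keys."""
    totals = [0.0] * width
    for summary in sketch.values():
        totals = combine(totals, summary)
    return totals


# Hypothetical airline "XX" flies out of both airports.
xx_sfo, xx_lax = {"XX": [120.0]}, {"XX": [80.0]}
# Hypothetical airline "YY" flies out of SFO only, so its LAX side is empty.
yy_sfo, yy_lax = {"YY": [93.0]}, {}

for name, sfo, lax in [("XX", xx_sfo, xx_lax), ("YY", yy_sfo, yy_lax)]:
    print(
        name,
        metrics_sum(intersect(sfo, lax)),
        metrics_sum(a_not_b(sfo, lax)),
        metrics_sum(union(sfo, lax)),
    )
```

An airline present at both airports contributes to the intersection and the union but not to the difference, while an airline present only at `SFO` contributes to the difference and the union but not to the intersection, which is the same pattern as the `AA` and `DH` rows in the three result tables.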
## EARLIEST