This is an automated email from the ASF dual-hosted git repository.

techdocsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new e36e187a632 add tuple examples (#17667)
e36e187a632 is described below

commit e36e187a632ef1b3784a1c0f8b8048d2101070db
Author: Victoria Lim <[email protected]>
AuthorDate: Mon Jan 27 16:45:04 2025 -0800

    add tuple examples (#17667)
---
 docs/querying/sql-aggregations.md |   5 +-
 docs/querying/sql-functions.md    | 160 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 159 insertions(+), 6 deletions(-)

diff --git a/docs/querying/sql-aggregations.md 
b/docs/querying/sql-aggregations.md
index 2af45a530e0..1a90f9c72d6 100644
--- a/docs/querying/sql-aggregations.md
+++ b/docs/querying/sql-aggregations.md
@@ -146,9 +146,8 @@ Load the [DataSketches 
extension](../development/extensions-core/datasketches-ex
 
 |Function|Notes|Default|
 |--------|-----|-------|
-|`DS_TUPLE_DOUBLES(expr, [nominalEntries])`|Creates a [Tuple 
sketch](../development/extensions-core/datasketches-tuple.md) on the values of 
`expr` which is a column containing Tuple sketches which contain an array of 
double values as their Summary Objects. The `nominalEntries` override parameter 
is optional and described in the Tuple sketch documentation.
-|`DS_TUPLE_DOUBLES(dimensionColumnExpr, metricColumnExpr, ..., 
[nominalEntries])`|Creates a [Tuple 
sketch](../development/extensions-core/datasketches-tuple.md) which contains an 
array of double values as its Summary Object based on the dimension value of 
`dimensionColumnExpr` and the numeric metric values contained in one or more 
`metricColumnExpr` columns. If the last value of the array is a numeric 
literal, Druid assumes that the value is an override parameter for [nominal 
entries](.. [...]
-
+|`DS_TUPLE_DOUBLES(expr[, nominalEntries])`|Creates a [Tuple 
sketch](../development/extensions-core/datasketches-tuple.md) on a precomputed 
sketch column `expr`, where the precomputed Tuple sketch contains an array of 
double values as its Summary Object. The `nominalEntries` override parameter is 
optional and described in the Tuple sketch documentation.
+|`DS_TUPLE_DOUBLES(dimensionColumnExpr, metricColumnExpr1[, metricColumnExpr2, 
...], [nominalEntries])`|Creates a [Tuple 
sketch](../development/extensions-core/datasketches-tuple.md) on raw data. The 
Tuple sketch contains an array of double values as its Summary Object 
based on the dimension value of `dimensionColumnExpr` and the numeric metric 
values contained in one or more `metricColumnExpr` columns. If the last value 
of the array is a numeric literal, Druid assumes that the valu [...]
 
 ### T-Digest sketch functions
 
diff --git a/docs/querying/sql-functions.md b/docs/querying/sql-functions.md
index 75e9d54ed8c..519623c833c 100644
--- a/docs/querying/sql-functions.md
+++ b/docs/querying/sql-functions.md
@@ -2129,12 +2129,34 @@ Returns the following:
 
 ## DS_TUPLE_DOUBLES
 
-Creates a Tuple sketch which contains an array of double values as the Summary 
Object. If the last value of the array is a numeric literal, Druid assumes that 
the value is an override parameter for [nominal 
entries](../development/extensions-core/datasketches-tuple.md).
+Creates a Tuple sketch on raw data or a precomputed sketch column. See 
[DataSketches Tuple Sketch 
module](../development/extensions-core/datasketches-tuple.md) for a description 
of parameters.
 
-* **Syntax**: `DS_TUPLE_DOUBLES(expr, [nominalEntries])`  
-              `DS_TUPLE_DOUBLES(dimensionColumnExpr, metricColumnExpr, ..., 
[nominalEntries])`
+* **Syntax**: `DS_TUPLE_DOUBLES(expr[, nominalEntries])`  
+              `DS_TUPLE_DOUBLES(dimensionColumnExpr, metricColumnExpr1[, 
metricColumnExpr2, ...], [nominalEntries])`
 * **Function type:** Aggregation
 
+<details><summary>Example</summary>
+
+The following example creates a Tuple sketch column that stores the arrival 
and departure delay minutes for each airline in `flight-carriers`:
+
+```sql
+SELECT
+  "Reporting_Airline",
+  DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes", "DepDelayMinutes") 
AS tuples_delay
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 2
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`tuples_delay`|
+|-------------------|--------------|
+|`AA`|`1.0`|
+|`AS`|`1.0`|
+
+</details>
+
 [Learn more](sql-aggregations.md)
 
 ## DS_TUPLE_DOUBLES_INTERSECT
@@ -2144,6 +2166,37 @@ Returns an intersection of Tuple sketches which each 
contain an array of double
 * **Syntax**: `DS_TUPLE_DOUBLES_INTERSECT(expr, ..., [nominalEntries])`
 * **Function type:** Scalar, sketch
 
+<details><summary>Example</summary>
+
+The following example calculates the total minutes of arrival delay for 
airlines flying out of both `SFO` and `LAX`.
+An airline that doesn't fly out of both airports returns a value of 0.
+
+```sql
+SELECT
+  "Reporting_Airline",
+  DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(
+    DS_TUPLE_DOUBLES_INTERSECT(
+      DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE 
"Origin" = 'SFO'),
+      DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE 
"Origin" = 'LAX')
+    )
+  ) AS arrival_delay_sfo_lax
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 5
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`arrival_delay_sfo_lax`|
+|----|---------|
+|`AA`|`[33296]`|
+|`AS`|`[13694]`|
+|`B6`|`[0]`|
+|`CO`|`[13582]`|
+|`DH`|`[0]`|
+
+</details>
+
 [Learn more](sql-scalar.md#tuple-sketch-functions)
 
 ## DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE
@@ -2153,6 +2206,47 @@ Computes approximate sums of the values contained within 
a Tuple sketch which co
 * **Syntax**: `DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(expr)`
 * **Function type:** Scalar, sketch
 
+<details><summary>Example</summary>
+
+The following example calculates the sum of arrival and departure delay 
minutes for each airline in `flight-carriers`:
+
+```sql
+SELECT
+  "Reporting_Airline",
+  DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(DS_TUPLE_DOUBLES("Reporting_Airline", 
"ArrDelayMinutes", "DepDelayMinutes")) AS sum_delays
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 2
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`sum_delays`|
+|----|-----------------|
+|`AA`|`[612831,474309]`|
+|`AS`|`[157340,141462]`|
+
+Compare this example with an analogous SQL statement that doesn't use 
approximations:
+
+```sql
+SELECT
+  "Reporting_Airline",
+  SUM("ArrDelayMinutes") AS sum_arrival_delay,
+  SUM("DepDelayMinutes") AS sum_departure_delay
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 2
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`sum_arrival_delay`|`sum_departure_delay`|
+|----|--------|--------|
+|`AA`|`612831`|`475735`|
+|`AS`|`157340`|`143620`|
+
+</details>
+
 [Learn more](sql-scalar.md#tuple-sketch-functions)
 
 ## DS_TUPLE_DOUBLES_NOT
@@ -2162,6 +2256,36 @@ Returns a set difference of Tuple sketches which each 
contain an array of double
 * **Syntax**: `DS_TUPLE_DOUBLES_NOT(expr, ..., [nominalEntries])`
 * **Function type:** Scalar, sketch
 
+<details><summary>Example</summary>
+
+The following example calculates the total minutes of arrival delay for 
airlines that fly out of `SFO` but not `LAX`.
+
+```sql
+SELECT
+  "Reporting_Airline",
+  DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(
+    DS_TUPLE_DOUBLES_NOT(
+      DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE 
"Origin" = 'SFO'),
+      DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE 
"Origin" = 'LAX')
+    )
+  ) AS arrival_delay_sfo_lax
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 5
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`arrival_delay_sfo_lax`|
+|----|---------|
+|`AA`|`[0]`|
+|`AS`|`[0]`|
+|`B6`|`[0]`|
+|`CO`|`[0]`|
+|`DH`|`[93]`|
+
+</details>
+
 [Learn more](sql-scalar.md#tuple-sketch-functions)
 
 ## DS_TUPLE_DOUBLES_UNION
@@ -2171,6 +2295,36 @@ Returns a union of Tuple sketches which each contain an 
array of double values a
 * **Syntax**: `DS_TUPLE_DOUBLES_UNION(expr, ..., [nominalEntries])`
 * **Function type:** Scalar, sketch
 
+<details><summary>Example</summary>
+
+The following example calculates the total minutes of arrival delay for 
airlines flying out of either `SFO` or `LAX`.
+
+```sql
+SELECT
+  "Reporting_Airline",
+  DS_TUPLE_DOUBLES_METRICS_SUM_ESTIMATE(
+    DS_TUPLE_DOUBLES_UNION(
+      DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE 
"Origin" = 'SFO'),
+      DS_TUPLE_DOUBLES("Reporting_Airline", "ArrDelayMinutes") FILTER(WHERE 
"Origin" = 'LAX')
+    )
+  ) AS arrival_delay_sfo_lax
+FROM "flight-carriers"
+GROUP BY 1
+LIMIT 5
+```
+
+Returns the following:
+
+|`Reporting_Airline`|`arrival_delay_sfo_lax`|
+|----|---------|
+|`AA`|`[33296]`|
+|`AS`|`[13694]`|
+|`B6`|`[0]`|
+|`CO`|`[13582]`|
+|`DH`|`[93]`|
+
+</details>
+
 [Learn more](sql-scalar.md#tuple-sketch-functions)
 
 ## EARLIEST
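
The intersect, NOT, and union examples in this diff differ only in the set operation applied to the two filtered sketches. As a rough illustration of why an airline that flies out of only one airport yields `[0]` for the intersection but a nonzero value for the union and set difference, here is a minimal Python sketch. It is a toy, exact model and not Druid's implementation: it represents a Tuple sketch as a plain dictionary from key to an array of doubles, ignores sampling entirely, and assumes that summaries of matching keys are added together in both union and intersection (which is consistent with the outputs shown above, but is an assumption here). All function names, field names, and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, List, Mapping

# Toy stand-in for a Tuple sketch: key -> array of double summaries.
# A real sketch samples keys; this model keeps every key exactly.
Sketch = Dict[str, List[float]]


def build(rows: Iterable[Mapping], key: str, metrics: List[str]) -> Sketch:
    """Accumulate the metric values of each row under that row's key."""
    sketch: Sketch = {}
    for row in rows:
        summary = sketch.setdefault(row[key], [0.0] * len(metrics))
        for i, metric in enumerate(metrics):
            summary[i] += row[metric]
    return sketch


def _add(a: List[float], b: List[float]) -> List[float]:
    return [x + y for x, y in zip(a, b)]


def union(a: Sketch, b: Sketch) -> Sketch:
    """Keys from either input; summaries of shared keys are added (assumed)."""
    out = {k: list(v) for k, v in a.items()}
    for k, v in b.items():
        out[k] = _add(out[k], v) if k in out else list(v)
    return out


def intersect(a: Sketch, b: Sketch) -> Sketch:
    """Only keys present in both inputs; summaries are added (assumed)."""
    return {k: _add(a[k], b[k]) for k in a.keys() & b.keys()}


def a_not_b(a: Sketch, b: Sketch) -> Sketch:
    """Keys present in a but absent from b, keeping a's summaries."""
    return {k: list(v) for k, v in a.items() if k not in b}


def metrics_sum(sketch: Sketch, width: int) -> List[float]:
    """Column-wise sum of all summaries; an empty sketch sums to zeros."""
    totals = [0.0] * width
    for summary in sketch.values():
        totals = _add(totals, summary)
    return totals


def report(rows: List[Mapping], airline: str) -> Dict[str, List[float]]:
    """Mimic GROUP BY airline with two FILTER(WHERE origin = ...) sketches."""
    mine = [r for r in rows if r["airline"] == airline]
    sfo = build((r for r in mine if r["origin"] == "SFO"), "airline", ["arr_delay"])
    lax = build((r for r in mine if r["origin"] == "LAX"), "airline", ["arr_delay"])
    return {
        "union": metrics_sum(union(sfo, lax), 1),
        "intersect": metrics_sum(intersect(sfo, lax), 1),
        "not": metrics_sum(a_not_b(sfo, lax), 1),
    }


# Hypothetical rows: airline XX flies out of both airports, YY only out of SFO.
rows = [
    {"airline": "XX", "origin": "SFO", "arr_delay": 10.0},
    {"airline": "XX", "origin": "LAX", "arr_delay": 5.0},
    {"airline": "YY", "origin": "SFO", "arr_delay": 7.0},
]
```

Under these assumptions, `report(rows, "YY")` gives a zero intersection and a nonzero union and set difference, the same pattern as the `DH` rows in the tables above, while `report(rows, "XX")` gives a zero set difference because the key appears in both sketches.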


