FrankChen021 commented on a change in pull request #12091:
URL: https://github.com/apache/druid/pull/12091#discussion_r773563661



##########
File path: docs/querying/sql.md
##########
@@ -328,13 +328,13 @@ Only the COUNT, ARRAY_AGG, and STRING_AGG aggregations 
can accept the DISTINCT k
 |Function|Notes|Default|
 |--------|-----|-------|
 |`COUNT(*)`|Counts the number of rows.|`0`|
-|`COUNT(DISTINCT expr)`|Counts distinct values of expr.<br><br>When 
"useApproximateCountDistinct" is set to "true" (the default), this is an alias 
for APPROX_COUNT_DISTINCT. The specific algorithm that will be used depends on 
the value of 
[`druid.sql.approxCountDistinct.function`](../configuration/index.md#sql). In 
this mode, you can could strings, numbers, or prebuilt sketches. If counting 
prebuilt sketches, the prebuilt sketch type must match the selected 
algorithm.<br><br>When "useApproximateCountDistinct" is set to "false", the 
computation will be exact. In this case, expr must be string or numeric, since 
exact counts are not possible using prebuilt sketches. In exact mode, only one 
distinct count per query is permitted unless "useGroupingSetForExactDistinct" 
is enabled.|
+|`COUNT(DISTINCT expr)`|Counts distinct values of expr.<br><br>When 
"useApproximateCountDistinct" is set to "true" (the default), this is an alias 
for `APPROX_COUNT_DISTINCT`. The specific algorithm that will be used depends 
on the value of 
[`druid.sql.approxCountDistinct.function`](../configuration/index.md#sql). In 
this mode, you can use strings, numbers, or prebuilt sketches. If counting 
prebuilt sketches, the prebuilt sketch type must match the selected 
algorithm.<br><br>When "useApproximateCountDistinct" is set to "false", the 
computation will be exact. In this case, `expr` must be string or numeric, 
since exact counts are not possible using prebuilt sketches. In exact mode, 
only one distinct count per query is permitted unless 
"useGroupingSetForExactDistinct" is enabled.|
 |`SUM(expr)`|Sums numbers.|`null` if 
`druid.generic.useDefaultValueForNull=false`, otherwise `0`|
 |`MIN(expr)`|Takes the minimum of numbers.|`null` if 
`druid.generic.useDefaultValueForNull=false`, otherwise `9223372036854775807` 
(maximum LONG value)|
 |`MAX(expr)`|Takes the maximum of numbers.|`null` if 
`druid.generic.useDefaultValueForNull=false`, otherwise `-9223372036854775808` 
(minimum LONG value)|
 |`AVG(expr)`|Averages numbers.|`null` if 
`druid.generic.useDefaultValueForNull=false`, otherwise `0`|
-|`APPROX_COUNT_DISTINCT(expr)`|Counts distinct values of expr using an 
approximate algorithm. The expr can be a regular column or a prebuilt sketch 
column.<br><br>The specific algorithm that will be used depends on the value of 
[`druid.sql.approxCountDistinct.function`](../configuration/index.md#sql). By 
default, this is `APPROX_COUNT_DISTINCT_BUILTIN`. If the [DataSketches 
extension](../development/extensions-core/datasketches-extension.md) is loaded, 
this can also be set to `APPROX_COUNT_DISTINCT_DS_HLL` or 
`APPROX_COUNT_DISTINCT_DS_THETA`.<br><br>When run on prebuilt sketch columns, 
the sketch column type must match the implementation of this function. For 
example: when `druid.sql.approxCountDistinct.function` is set to 
`APPROX_COUNT_DISTINCT_BUILTIN`, this function will be able to run on prebuilt 
hyperUnique columns, but not on prebuilt HLLSketchBuild columns.|
-|`APPROX_COUNT_DISTINCT_BUILTIN(expr)`|_Usage note:_ consider using 
`APPROX_COUNT_DISTINCT_DS_HLL` instead, which offers better accuracy in many 
cases.<br/><br/>Counts distinct values of expr using Druid's built-in 
"cardinality" or "hyperUnique" aggregators, which implement a variant of 
[HyperLogLog](http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf). The 
expr can be a string, number, or prebuilt hyperUnique column. This is always 
approximate, regardless of the value of "useApproximateCountDistinct".|
+|`APPROX_COUNT_DISTINCT(expr)`|Counts distinct values of `expr` using an 
approximate algorithm. The `expr` can be a regular column or a prebuilt sketch 
column.<br><br>The specific algorithm that will be used depends on the value of 
[`druid.sql.approxCountDistinct.function`](../configuration/index.md#sql). By 
default, this is `APPROX_COUNT_DISTINCT_BUILTIN`. If the [DataSketches 
extension](../development/extensions-core/datasketches-extension.md) is loaded, 
this can also be set to `APPROX_COUNT_DISTINCT_DS_HLL` or 
`APPROX_COUNT_DISTINCT_DS_THETA`.<br><br>When run on prebuilt sketch columns, 
the sketch column type must match the implementation of this function. For 
example: when `druid.sql.approxCountDistinct.function` is set to 
`APPROX_COUNT_DISTINCT_BUILTIN`, this function will be able to run on prebuilt 
hyperUnique columns, but not on prebuilt HLLSketchBuild columns.|
+|`APPROX_COUNT_DISTINCT_BUILTIN(expr)`|_Usage note:_ consider using 
`APPROX_COUNT_DISTINCT_DS_HLL` instead, which offers better accuracy in many 
cases.<br/><br/>Counts distinct values of `expr` using Druid's built-in 
"cardinality" or "hyperUnique" aggregators, which implement a variant of 
[HyperLogLog](http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf). The 
`expr` can be a string, number, or prebuilt hyperUnique column. This is always 
approximate, regardless of the value of "useApproximateCountDistinct".|
 |`APPROX_COUNT_DISTINCT_DS_HLL(expr, [lgK, tgtHllType])`|Counts distinct 
values of expr, which can be a regular column or an [HLL 
sketch](../development/extensions-core/datasketches-hll.md) column. Results are 
always approximate, regardless of the value of 
[`useApproximateCountDistinct`](#connection-context). The `lgK` and 
`tgtHllType` parameters here are, like the equivalents in the 
[aggregator](../development/extensions-core/datasketches-hll.md#aggregators), 
described in the HLL sketch documentation. The [DataSketches 
extension](../development/extensions-core/datasketches-extension.md) must be 
loaded to use this function.   See also `COUNT(DISTINCT expr)`.  |`0`|
 |`APPROX_COUNT_DISTINCT_DS_THETA(expr, [size])`|Counts distinct values of 
expr, which can be a regular column or a [Theta 
sketch](../development/extensions-core/datasketches-theta.md) column. Results 
are always approximate, regardless of the value of 
[`useApproximateCountDistinct`](#connection-context).  The `size` parameter is 
described in the Theta sketch documentation. The [DataSketches 
extension](../development/extensions-core/datasketches-extension.md) must be 
loaded to use this function. See also `COUNT(DISTINCT expr)`. |`0`|
 |`DS_HLL(expr, [lgK, tgtHllType])`|Creates an [HLL 
sketch](../development/extensions-core/datasketches-hll.md) on the values of 
expr, which can be a regular column or a column containing HLL sketches. The 
`lgK` and `tgtHllType` parameters are described in the HLL sketch 
documentation. The [DataSketches 
extension](../development/extensions-core/datasketches-extension.md) must be 
loaded to use this function.|`'0'` (STRING)|

Review comment:
```suggestion
 |`DS_HLL(expr, [lgK, tgtHllType])`|Creates an [HLL sketch](../development/extensions-core/datasketches-hll.md) on the values of `expr`, which can be a regular column or a column containing HLL sketches. The `lgK` and `tgtHllType` parameters are described in the HLL sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.md) must be loaded to use this function.|`'0'` (STRING)|
```
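
For readers following the hunk, here is a minimal sketch of how the exact mode described in the `COUNT(DISTINCT expr)` row could be requested for a single query. It assumes the query goes to Druid's SQL HTTP endpoint (`POST /druid/v2/sql`) with `useApproximateCountDistinct` passed in the query context. The helper name `build_sql_request` and the `visits` / `user_id` identifiers are hypothetical, and the snippet only builds the request body without sending it.

```python
import json


def build_sql_request(sql, approximate=True):
    """Build a JSON-serializable body for a Druid SQL request.

    `approximate` maps to the "useApproximateCountDistinct" context
    parameter discussed in the COUNT(DISTINCT expr) row (default: true).
    """
    return {
        "query": sql,
        "context": {"useApproximateCountDistinct": approximate},
    }


# Hypothetical datasource and column, used only for illustration.
body = build_sql_request(
    "SELECT COUNT(DISTINCT user_id) AS users FROM visits",
    approximate=False,  # ask for an exact distinct count
)
print(json.dumps(body, indent=2))
```

As the row notes, in exact mode `expr` must be string or numeric, and only one distinct count per query is permitted unless "useGroupingSetForExactDistinct" is enabled.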




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


