[
https://issues.apache.org/jira/browse/CALCITE-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated CALCITE-7330:
------------------------------------
Labels: pull-request-available (was: )
> AggregateCaseToFilterRule should not be applied on aggregate functions that
> don't skip NULL inputs
> --------------------------------------------------------------------------------------------------
>
> Key: CALCITE-7330
> URL: https://issues.apache.org/jira/browse/CALCITE-7330
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.41.0
> Reporter: Alessandro Solimando
> Assignee: Alessandro Solimando
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.42.0
>
>
> h2. Problem
> _AggregateCaseToFilterRule_ incorrectly transforms queries for aggregate
> functions where _NULL_ values have semantic significance.
> The rule transforms:
> {noformat}
> AGG(CASE WHEN condition THEN value ELSE NULL END)
> {noformat}
> to:
> {noformat}
> AGG(value) FILTER (WHERE condition)
> {noformat}
> This transformation is valid for standard SQL aggregates (SUM, AVG, MIN, MAX,
> COUNT) which ignore {_}NULL{_}s.
> However, custom user-defined aggregate functions (UDAFs) may treat NULL
> values as semantically significant inputs, making this transformation
> incorrect (an example are Postgres array aggregate functions which are mostly
> not ignoring NULL values, see the
> [doc|https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE]).
> The former passes NULL to the aggregate, the latter filters out those rows
> entirely.
> AggregateCall.ignoreNulls() Is Not Enough
> At first I thought that the existing _AggregateCall.ignoreNulls()_ could be
> used, but this field represents whether the user explicitly specified _IGNORE
> NULLS_ or _RESPECT NULLS_ syntax in their SQL query, which is only valid for
> window functions, while _AggregateCaseToFilterRule_ operates on _Aggregate_
> nodes, so it's not modeling what we are interested into here.
> My proposal is to add a _SqlAggFunction.skipsNullInputs()_ method to express
> the semantic null-handling behavior of aggregate function types:
> - Returns true by default (standard SQL behavior, all built-in aggregates
> ignore NULLs), so *+fully backward compatible+*
> - Custom UDAFs can override to return false to indicate NULLs are
> semantically significant
> - _AggregateCaseToFilterRule_ checks this method before applying the
> transformation
--
This message was sent by Atlassian Jira
(v8.20.10#820010)