Alessandro Solimando created CALCITE-7330:
---------------------------------------------
Summary: AggregateCaseToFilterRule should not be applied on
aggregate functions that don't ignore NULLs
Key: CALCITE-7330
URL: https://issues.apache.org/jira/browse/CALCITE-7330
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.41.0
Environment: +underlined text+
Reporter: Alessandro Solimando
Assignee: Alessandro Solimando
Fix For: 1.42.0
h2. Problem
_AggregateCaseToFilterRule_ incorrectly transforms queries for aggregate
functions where _NULL_ values have semantic significance.
The rule transforms:
{noformat}
AGG(CASE WHEN condition THEN value ELSE NULL END)
{noformat}
to:
{noformat}
AGG(value) FILTER (WHERE condition)
{noformat}
This transformation is valid for standard SQL aggregates (SUM, AVG, MIN, MAX,
COUNT) which ignore {_}NULL{_}s.
However, custom user-defined aggregate functions (UDAFs) may treat NULL values
as semantically significant inputs, making this transformation incorrect (an
example are Postgres array aggregate functions which are mostly not ignoring
NULL values, see the
[doc|https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE]).
The former passes NULL to the aggregate, the latter filters out those rows
entirely.
AggregateCall.ignoreNulls() Is Not Enough
At first I thought that the existing _AggregateCall.ignoreNulls()_ could be
used, but this field represents whether the user explicitly specified _IGNORE
NULLS_ or _RESPECT NULLS_ syntax in their SQL query, which is only valid for
window functions, while _AggregateCaseToFilterRule_ operates on _Aggregate_
nodes, so it's not modeling what we are interested into here.
My proposal is to add a _SqlAggFunction.ignoresNulls()_ method to express the
semantic null-handling behavior of aggregate function types:
- Returns true by default (standard SQL behavior, all built-in aggregates
ignore NULLs), so *+fully backward compatible+*
- Custom UDAFs can override to return false to indicate NULLs are semantically
significant
- _AggregateCaseToFilterRule_ checks this method before applying the
transformation
--
This message was sent by Atlassian Jira
(v8.20.10#820010)