[HACKERS] NULL-handling in aggregate(DISTINCT ...)

Andrew Gierth Wed, 11 Nov 2009 17:28:56 -0800

Quoth the comments in nodeAgg.c:

 * We don't currently implement DISTINCT aggs for aggs having more
 * than one argument.  This isn't required for anything in the SQL
 * spec, but really it ought to be implemented for
 * feature-completeness.  FIXME someday.


and:

 * DISTINCT always suppresses nulls, per SQL spec, regardless of the
 * transition function's strictness.

(What the SQL spec actually says is that aggregate calls which are
<general set operation> ignore all nulls regardless of whether they
are ALL or DISTINCT. Other kinds of aggregates are not permitted by
the spec to use ALL or DISTINCT.)

Currently we have this behaviour:

postgres=# select array_agg(all a) from (values (1),(null)) v(a);
 array_agg 
-----------
 {1,NULL}
(1 row)

postgres=# select array_agg(distinct a) from (values (1),(null)) v(a);
 array_agg 
-----------
 {1}
(1 row)

which personally I feel is somewhat wrong, since 1 and NULL are in
fact distinct, but which is due to the logic expressed in the second
comment above. (The spec does not allow array_agg(distinct a) so it
is no help here.)

Now the question: If the limit of one argument for DISTINCT aggs were
removed (which I'm considering doing as part of an update to the
aggregate ORDER BY patch I posted a while back), what should be the
behaviour of agg(distinct x,y) where one or both of x or y is null?
And should it depend on the strictness of the transition function?

-- 
Andrew (irc:RhodiumToad)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] NULL-handling in aggregate(DISTINCT ...)

Reply via email to