On 6/20/23 12:59, Joel Jacobson wrote:
> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
>> select hashset_contains('{1,2}'::int4hashset,NULL::int);
>> should return null?
>
> I agree, it should.
>
> I've now changed all functions except int4hashset() (the init function)
> and the aggregate functions to be STRICT.
I don't think this is correct / consistent with what we do elsewhere.
IMHO it's perfectly fine to have a hashset containing a NULL value,
because then it can affect results of membership checks.
Consider these IN / ANY queries:
test=# select 4 in (1,2,3);
?column?
----------
f
(1 row)
test=# select 4 = ANY(ARRAY[1,2,3]);
?column?
----------
f
(1 row)
now add a NULL:
test=# select 4 in (1,2,3,null);
?column?
----------
(1 row)
test=# select 4 = ANY(ARRAY[1,2,3,NULL]);
?column?
----------
(1 row)
I don't see why a (hash)set should behave any differently. It's true
arrays don't behave like this:
test=# select array[1,2,3,4,NULL] @> ARRAY[5];
?column?
----------
f
(1 row)
but I'd say that's more an anomaly than something we should replicate.
This is also what the SQL standard does for multisets - there's SQL:20nn
draft at http://www.wiscorp.com/SQLStandards.html, and the <member
predicate> section (p. 475) explains how this should work with NULL.
So if we see a set as a special case of multiset (with no duplicates),
then we have to handle NULLs this way too. It'd be weird to have this
behavior inconsistent.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company