Hi David,

Thanks for review

On 2/19/26 18:38, David Geier wrote:
+1 on the general idea.

For constant arrays I propose adding a simple pre-check before entering
the per-element loop: detect whether the array contains at least one
NULL element (e.g., via memchr() for the deconstructed array case). If
so, and we are in the ALL / NOT IN case, we can immediately return
selectivity = 0.0 and skip all further computation. This would avoid
extra per-element estimation work while preserving semantics.
How much overhead does the memchr() call add? It seems like this
approach optimizes the edge case at the expense of the common case,
which doesn't seem like a good trade-off.

How about instead adding a flag to ArrayType which indicates if the
array contains NULL or not. This flag could be set in
construct_md_array() which already iterates over all elements. The flag
would need to be kept up-to-date, e.g. in array_set_element() and
possibly other functions modifying the array.

It seems we might reinventing the wheel.

There is already ARR_HASNULL() which allows us to detect the presence of NULL in ArrayType.

In cases where array elements are not known to be constants in advance,
such a pre-check is less straightforward, because each element must
first be inspected to determine whether it is a Const and whether it is
NULL. That already requires iterating through the list, so introducing a
separate early pass would not actually reduce the amount of work.
Therefore, it like makes sense to keep the current behavior in that
situation.
Agreed.

After thinking about this more, is seems reasonable to short-circuit еру loop when we detect a NULL element by checking whether the element is a Const and NULL.


I've attached v2 patch.

--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
From 76775013947d9345b4754e33d9b807845f86546f Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <[email protected]>
Date: Fri, 20 Feb 2026 12:03:10 +0300
Subject: [PATCH v2] Reduce planning time for large NOT IN lists containing
 NULL

For x <> ALL (...) / x NOT IN (...), the presence of a NULL element
makes the selectivity 0.0.

The planner currently still iterates over all elements and computes
per-element selectivity, even though the final result is known.

Add an early NULL check for constant arrays and immediately return
0.0 under ALL semantics.

This reduces planning time for large NOT IN / <> ALL lists without
changing semantics.
---
 src/backend/utils/adt/selfuncs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 29fec655593..9770656f125 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -2025,6 +2025,10 @@ scalararraysel(PlannerInfo *root,
 						  elmlen, elmbyval, elmalign,
 						  &elem_values, &elem_nulls, &num_elems);
 
+		/* Selectivity of "WHERE x NOT IN (NULL, ... )" is always 0 "*/
+		if (!useOr && ARR_HASNULL(arrayval))
+			return (Selectivity) 0.0;
+
 		/*
 		 * For generic operators, we assume the probability of success is
 		 * independent for each array element.  But for "= ANY" or "<> ALL",
@@ -2115,6 +2119,9 @@ scalararraysel(PlannerInfo *root,
 			List	   *args;
 			Selectivity s2;
 
+			/* Selectivity of "WHERE x NOT IN (NULL, ... )" is always 0 "*/
+			if (!useOr && IsA(elem, Const) && ((Const *) elem)->constisnull)
+				return (Selectivity) 0.0;
 			/*
 			 * Theoretically, if elem isn't of nominal_element_type we should
 			 * insert a RelabelType, but it seems unlikely that any operator
-- 
2.34.1

Reply via email to