Reduce planning time for large NOT IN lists containing NULL

Ilia Evdokimov Wed, 18 Feb 2026 06:12:15 -0800

Hi hackers,

In this thread [0] an interesting idea came up about avoidingunnecessary work during selectivity estimation for x <> ALL (NULL, ...)or x NOT IN (NULL, ...)

Semantically, if the array contains at least one NULL, the selectivityof x NOT IN (...) is always 0.0, regardless of the other elements in thelist.

Currently, the planner still iterates over all array elements andinvokes the operator's selectivity estimator for each of them. For largeIN / ALL lists, this increases planning time.

For constant arrays I propose adding a simple pre-check before enteringthe per-element loop: detect whether the array contains at least oneNULL element (e.g., via memchr() for the deconstructed array case). Ifso, and we are in the ALL / NOT IN case, we can immediately returnselectivity = 0.0 and skip all further computation. This would avoidextra per-element estimation work while preserving semantics.

In cases where array elements are not known to be constants in advance,such a pre-check is less straightforward, because each element mustfirst be inspected to determine whether it is a Const and whether it isNULL. That already requires iterating through the list, so introducing aseparate early pass would not actually reduce the amount of work.Therefore, it like makes sense to keep the current behavior in thatsituation.


Thoughts?

[0]:https://www.postgresql.org/message-id/CAHza6qdAH99C0xa27YDTixiNVFa99j90QYquUPAxL0JwahmggA%40mail.gmail.com


--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/

From 86914f44a37b3c9afbd18b333eb5c451f41ce65b Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <[email protected]>
Date: Wed, 18 Feb 2026 17:06:35 +0300
Subject: [PATCH v1] Reduce planning time for large NOT IN lists containing
 NULL

For x <> ALL (...) / x NOT IN (...), the presence of a NULL element
makes the selectivity 0.0.

The planner currently still iterates over all elements and computes
per-element selectivity, even though the final result is known.

Add an early NULL check for constant arrays and immediately return
0.0 under ALL semantics.

This reduces planning time for large NOT IN / <> ALL lists without
changing semantics.
---
 src/backend/utils/adt/selfuncs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 29fec655593..a0b57bd57aa 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -2025,6 +2025,9 @@ scalararraysel(PlannerInfo *root,
 						  elmlen, elmbyval, elmalign,
 						  &elem_values, &elem_nulls, &num_elems);
 
+		if (!useOr && memchr(elem_nulls, true, num_elems) != NULL)
+			return (Selectivity) 0.0;
+
 		/*
 		 * For generic operators, we assume the probability of success is
 		 * independent for each array element.  But for "= ANY" or "<> ALL",
-- 
2.34.1

Reduce planning time for large NOT IN lists containing NULL

Reply via email to