25.08.2025 16:28, Richard Guo пишет:
On Sat, Aug 23, 2025 at 12:27 AM Sergey Soloviev
<[email protected]> wrote:
I would like write a test in 'join.sql', but for now it requires patches
to easily reproduce the bug. I appreciate it if someone could find
an easier way to reproduce the bug and write a simple test.
Nice catch!  Here's a query that reproduces the error without needing
to hack the code.

create table t (a int, b int);
create unique index on t (a);

select t1.a from t t1
   left join t t2 on t1.a = t2.a
        join t t3 on true
where exists (select 1 from t t4
                 join t t5 on t4.b = t5.b
                 join t t6 on t5.b = t6.b
               where t1.a = t4.a and t3.a = t5.a and t4.a = 2);
ERROR:  variable not found in subplan target lists

This bug was introduced by commit:

commit a3179ab692be4314d5ee5cd56598976c487d5ef2
Author: Tom Lane <[email protected]>
Date:   Fri Sep 27 16:04:04 2024 -0400

     Recalculate where-needed data accurately after a join removal.

(I'm a bit surprised it took us so long to discover it.)

When distributing qual clause "t1.a = t4.a", distribute_qual_to_rels
adds t1.a and t4.a that are used in this clause to the targetlists of
their relations.  (However, if the clause gets absorbed into an EC
that contains const, this can result in adding Vars to relations that
do not actually require them.  But that's a different problem.)

However, when rebuilding attr_needed bits for t1.a and t4.a after the
join removal, rebuild_eclass_attr_needed fails to restore the bits
because it skips ECs that contain constant, as explained by Sergey.
Later, it turns out that t4.a is needed at the join leval t4/t5/t6 for
the unique-ification of the RHS of the semi-join.

The proposed patch can fix this error.  However, I'm wondering if we
could address it from the unique-ification side instead.  If a Var
we're trying to unique-ify is known to be equal to a constant, then we
shouldn't need to unique-ify that Var -- and if it's not needed at
upper levels, it shouldn't need to be in the targetlist of the unique
path.  For example, in the query above, t4.a does not need to be added
in the targetlist of the unique path, right?

Thanks
Richard
Hi, Richard!

Thanks for finding this example. I have added it to join.sql regress test.
New patch version is attached.

I also thought about case to just check Var to be a constant. But
the main question is 'if it is the only node affected?'. Example shows
that error is caused by Unique path, but maybe there are another
nodes which will cause such error.

Yes only 'create_unique_paths' creates new TargetEntry. But I think
the root cause is that when recreating 'attr_needed', the dependency
between this relation and some other relation is lost.
---
Regards,
Sergey Soloviev
Tantor Labs LLC
From dad22e812b3147e159f0203a41e9bf042a1d25d4 Mon Sep 17 00:00:00 2001
From: Sergey Soloviev <[email protected]>
Date: Mon, 25 Aug 2025 19:53:39 +0300
Subject: [PATCH] fix: variable not found in subplan target lists

When restoring 'attr_needed' after self-join removal we should also
consider EC with constants because some plan nodes can use such constant
attributes.

For example, after converting `EXISTS` to JOIN planner can create
`UniquePath` referencing attribute which is contained in EC with
constant. But because `attr_needed` is NULL then this attribute is not
returned by JOIN, so at `set_plan_references` fails to find this
attribute in targetlist of node below.
---
 src/backend/optimizer/path/equivclass.c | 21 +++++++----
 src/test/regress/expected/join.out      | 47 +++++++++++++++++++++++++
 src/test/regress/sql/join.sql           | 26 ++++++++++++++
 3 files changed, 88 insertions(+), 6 deletions(-)

diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 441f12f6c50..58ca75ab7a8 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2586,18 +2586,27 @@ rebuild_eclass_attr_needed(PlannerInfo *root)
 		 */
 		Assert(ec->ec_childmembers == NULL);
 
-		/* Need do anything only for a multi-member, no-const EC. */
-		if (list_length(ec->ec_members) > 1 && !ec->ec_has_const)
+		/*
+		 * Need do anything only for a multi-member. Note that some
+		 * wrapping plan nodes (i.e. UniquePath) can reference attributes
+		 * contained in EC with constants.
+		 */
+		if (list_length(ec->ec_members) > 1)
 		{
 			ListCell   *lc2;
 
 			foreach(lc2, ec->ec_members)
 			{
 				EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
-				List	   *vars = pull_var_clause((Node *) cur_em->em_expr,
-												   PVC_RECURSE_AGGREGATES |
-												   PVC_RECURSE_WINDOWFUNCS |
-												   PVC_INCLUDE_PLACEHOLDERS);
+				List	   *vars;
+
+				if (cur_em->em_is_const)
+					continue;
+
+				vars = pull_var_clause((Node *) cur_em->em_expr,
+										PVC_RECURSE_AGGREGATES |
+										PVC_RECURSE_WINDOWFUNCS |
+										PVC_INCLUDE_PLACEHOLDERS);
 
 				add_vars_to_attr_needed(root, vars, ec->ec_relids);
 				list_free(vars);
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 98b05c94a11..71654f528b1 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -6500,6 +6500,53 @@ where t1.a = s.c;
 ----------
 (0 rows)
 
+rollback;
+-- yet another join removal bug: EquivalenceClass cleanup must preserve
+-- dependency with JOINs above even if EC has constant
+begin;
+create table t (a int, b int);
+create unique index on t (a);
+explain (costs off)
+select t1.a from t t1
+  left join t t2 on t1.a = t2.a
+       join t t3 on true
+where exists (select 1 from t t4
+                join t t5 on t4.b = t5.b
+                join t t6 on t5.b = t6.b
+              where t1.a = t4.a and t3.a = t5.a and t4.a = 2);
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Nested Loop
+   ->  Nested Loop
+         ->  Index Only Scan using t_a_idx on t t1
+               Index Cond: (a = 2)
+         ->  HashAggregate
+               Group Key: t4.a, t5.a
+               ->  Hash Join
+                     Hash Cond: (t6.b = t4.b)
+                     ->  Seq Scan on t t6
+                     ->  Hash
+                           ->  Hash Join
+                                 Hash Cond: (t5.b = t4.b)
+                                 ->  Seq Scan on t t5
+                                 ->  Hash
+                                       ->  Index Scan using t_a_idx on t t4
+                                             Index Cond: (a = 2)
+   ->  Index Only Scan using t_a_idx on t t3
+         Index Cond: (a = t5.a)
+(18 rows)
+
+select t1.a from t t1
+  left join t t2 on t1.a = t2.a
+       join t t3 on true
+where exists (select 1 from t t4
+                join t t5 on t4.b = t5.b
+                join t t6 on t5.b = t6.b
+              where t1.a = t4.a and t3.a = t5.a and t4.a = 2);
+ a 
+---
+(0 rows)
+
 rollback;
 -- test cases where we can remove a join, but not a PHV computed at it
 begin;
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 5f0a475894d..d62b4abd817 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -2420,6 +2420,32 @@ where t1.a = s.c;
 
 rollback;
 
+-- yet another join removal bug: EquivalenceClass cleanup must preserve
+-- dependency with JOINs above even if EC has constant
+begin;
+
+create table t (a int, b int);
+create unique index on t (a);
+
+explain (costs off)
+select t1.a from t t1
+  left join t t2 on t1.a = t2.a
+       join t t3 on true
+where exists (select 1 from t t4
+                join t t5 on t4.b = t5.b
+                join t t6 on t5.b = t6.b
+              where t1.a = t4.a and t3.a = t5.a and t4.a = 2);
+
+select t1.a from t t1
+  left join t t2 on t1.a = t2.a
+       join t t3 on true
+where exists (select 1 from t t4
+                join t t5 on t4.b = t5.b
+                join t t6 on t5.b = t6.b
+              where t1.a = t4.a and t3.a = t5.a and t4.a = 2);
+
+rollback;
+
 -- test cases where we can remove a join, but not a PHV computed at it
 begin;
 
-- 
2.43.0

Reply via email to