25.08.2025 16:28, Richard Guo пишет:
On Sat, Aug 23, 2025 at 12:27 AM Sergey Soloviev <[email protected]> wrote:I would like write a test in 'join.sql', but for now it requires patches to easily reproduce the bug. I appreciate it if someone could find an easier way to reproduce the bug and write a simple test.Nice catch! Here's a query that reproduces the error without needing to hack the code. create table t (a int, b int); create unique index on t (a); select t1.a from t t1 left join t t2 on t1.a = t2.a join t t3 on true where exists (select 1 from t t4 join t t5 on t4.b = t5.b join t t6 on t5.b = t6.b where t1.a = t4.a and t3.a = t5.a and t4.a = 2); ERROR: variable not found in subplan target lists This bug was introduced by commit: commit a3179ab692be4314d5ee5cd56598976c487d5ef2 Author: Tom Lane <[email protected]> Date: Fri Sep 27 16:04:04 2024 -0400 Recalculate where-needed data accurately after a join removal. (I'm a bit surprised it took us so long to discover it.) When distributing qual clause "t1.a = t4.a", distribute_qual_to_rels adds t1.a and t4.a that are used in this clause to the targetlists of their relations. (However, if the clause gets absorbed into an EC that contains const, this can result in adding Vars to relations that do not actually require them. But that's a different problem.) However, when rebuilding attr_needed bits for t1.a and t4.a after the join removal, rebuild_eclass_attr_needed fails to restore the bits because it skips ECs that contain constant, as explained by Sergey. Later, it turns out that t4.a is needed at the join leval t4/t5/t6 for the unique-ification of the RHS of the semi-join. The proposed patch can fix this error. However, I'm wondering if we could address it from the unique-ification side instead. If a Var we're trying to unique-ify is known to be equal to a constant, then we shouldn't need to unique-ify that Var -- and if it's not needed at upper levels, it shouldn't need to be in the targetlist of the unique path. For example, in the query above, t4.a does not need to be added in the targetlist of the unique path, right? Thanks Richard
Hi, Richard! Thanks for finding this example. I have added it to join.sql regress test. New patch version is attached. I also thought about case to just check Var to be a constant. But the main question is 'if it is the only node affected?'. Example shows that error is caused by Unique path, but maybe there are another nodes which will cause such error. Yes only 'create_unique_paths' creates new TargetEntry. But I think the root cause is that when recreating 'attr_needed', the dependency between this relation and some other relation is lost. --- Regards, Sergey Soloviev Tantor Labs LLC
From dad22e812b3147e159f0203a41e9bf042a1d25d4 Mon Sep 17 00:00:00 2001 From: Sergey Soloviev <[email protected]> Date: Mon, 25 Aug 2025 19:53:39 +0300 Subject: [PATCH] fix: variable not found in subplan target lists When restoring 'attr_needed' after self-join removal we should also consider EC with constants because some plan nodes can use such constant attributes. For example, after converting `EXISTS` to JOIN planner can create `UniquePath` referencing attribute which is contained in EC with constant. But because `attr_needed` is NULL then this attribute is not returned by JOIN, so at `set_plan_references` fails to find this attribute in targetlist of node below. --- src/backend/optimizer/path/equivclass.c | 21 +++++++---- src/test/regress/expected/join.out | 47 +++++++++++++++++++++++++ src/test/regress/sql/join.sql | 26 ++++++++++++++ 3 files changed, 88 insertions(+), 6 deletions(-) diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 441f12f6c50..58ca75ab7a8 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -2586,18 +2586,27 @@ rebuild_eclass_attr_needed(PlannerInfo *root) */ Assert(ec->ec_childmembers == NULL); - /* Need do anything only for a multi-member, no-const EC. */ - if (list_length(ec->ec_members) > 1 && !ec->ec_has_const) + /* + * Need do anything only for a multi-member. Note that some + * wrapping plan nodes (i.e. UniquePath) can reference attributes + * contained in EC with constants. + */ + if (list_length(ec->ec_members) > 1) { ListCell *lc2; foreach(lc2, ec->ec_members) { EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2); - List *vars = pull_var_clause((Node *) cur_em->em_expr, - PVC_RECURSE_AGGREGATES | - PVC_RECURSE_WINDOWFUNCS | - PVC_INCLUDE_PLACEHOLDERS); + List *vars; + + if (cur_em->em_is_const) + continue; + + vars = pull_var_clause((Node *) cur_em->em_expr, + PVC_RECURSE_AGGREGATES | + PVC_RECURSE_WINDOWFUNCS | + PVC_INCLUDE_PLACEHOLDERS); add_vars_to_attr_needed(root, vars, ec->ec_relids); list_free(vars); diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index 98b05c94a11..71654f528b1 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -6500,6 +6500,53 @@ where t1.a = s.c; ---------- (0 rows) +rollback; +-- yet another join removal bug: EquivalenceClass cleanup must preserve +-- dependency with JOINs above even if EC has constant +begin; +create table t (a int, b int); +create unique index on t (a); +explain (costs off) +select t1.a from t t1 + left join t t2 on t1.a = t2.a + join t t3 on true +where exists (select 1 from t t4 + join t t5 on t4.b = t5.b + join t t6 on t5.b = t6.b + where t1.a = t4.a and t3.a = t5.a and t4.a = 2); + QUERY PLAN +---------------------------------------------------------------------------- + Nested Loop + -> Nested Loop + -> Index Only Scan using t_a_idx on t t1 + Index Cond: (a = 2) + -> HashAggregate + Group Key: t4.a, t5.a + -> Hash Join + Hash Cond: (t6.b = t4.b) + -> Seq Scan on t t6 + -> Hash + -> Hash Join + Hash Cond: (t5.b = t4.b) + -> Seq Scan on t t5 + -> Hash + -> Index Scan using t_a_idx on t t4 + Index Cond: (a = 2) + -> Index Only Scan using t_a_idx on t t3 + Index Cond: (a = t5.a) +(18 rows) + +select t1.a from t t1 + left join t t2 on t1.a = t2.a + join t t3 on true +where exists (select 1 from t t4 + join t t5 on t4.b = t5.b + join t t6 on t5.b = t6.b + where t1.a = t4.a and t3.a = t5.a and t4.a = 2); + a +--- +(0 rows) + rollback; -- test cases where we can remove a join, but not a PHV computed at it begin; diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql index 5f0a475894d..d62b4abd817 100644 --- a/src/test/regress/sql/join.sql +++ b/src/test/regress/sql/join.sql @@ -2420,6 +2420,32 @@ where t1.a = s.c; rollback; +-- yet another join removal bug: EquivalenceClass cleanup must preserve +-- dependency with JOINs above even if EC has constant +begin; + +create table t (a int, b int); +create unique index on t (a); + +explain (costs off) +select t1.a from t t1 + left join t t2 on t1.a = t2.a + join t t3 on true +where exists (select 1 from t t4 + join t t5 on t4.b = t5.b + join t t6 on t5.b = t6.b + where t1.a = t4.a and t3.a = t5.a and t4.a = 2); + +select t1.a from t t1 + left join t t2 on t1.a = t2.a + join t t3 on true +where exists (select 1 from t t4 + join t t5 on t4.b = t5.b + join t t6 on t5.b = t6.b + where t1.a = t4.a and t3.a = t5.a and t4.a = 2); + +rollback; + -- test cases where we can remove a join, but not a PHV computed at it begin; -- 2.43.0
