Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-19 Thread Tom Lane
Dean Rasheed  writes:
> Fair enough. I have no further comments.

Pushed then.  Thanks for reviewing!  I gave you credit as co-author.

regards, tom lane




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-19 Thread Dean Rasheed
On Tue, 19 Mar 2024 at 21:40, Tom Lane  wrote:
>
> I'm inclined to leave that alone.  The actual source sub-SELECT
> could only have had one column, so no real confusion is possible.
> Yeah, there's a resjunk grouping column visible in the plan as well,
> but those exist in many other queries, and we've not gotten questions
> about them.
>

Fair enough. I have no further comments.

Regards,
Dean




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-19 Thread Tom Lane
Dean Rasheed  writes:
> One final case that could possibly be improved is this one from 
> aggregates.out:

> explain (verbose, costs off)
> select array(select sum(x+y) s
> from generate_series(1,3) y group by y order by s)
>   from generate_series(1,3) x;
> QUERY PLAN
> ---
>  Function Scan on pg_catalog.generate_series x
>Output: ARRAY(SubPlan 1)
>Function Call: generate_series(1, 3)
>SubPlan 1
>  ->  Sort
>Output: (sum((x.x + y.y))), y.y
>Sort Key: (sum((x.x + y.y)))
>->  HashAggregate
>  Output: sum((x.x + y.y)), y.y
>  Group Key: y.y
>  ->  Function Scan on pg_catalog.generate_series y
>Output: y.y
>Function Call: generate_series(1, 3)

> ARRAY operates on a SELECT with a single targetlist item, but in this
> case it looks like the subplan output has 2 columns, which might
> confuse people.

I'm inclined to leave that alone.  The actual source sub-SELECT
could only have had one column, so no real confusion is possible.
Yeah, there's a resjunk grouping column visible in the plan as well,
but those exist in many other queries, and we've not gotten questions
about them.

(Perhaps some documentation about junk columns needs to be added?
I'm not eager to write it though.)

I had actually had a similar thought about sticking ".col1" onto
EXPR_SUBLINK cases, but I concluded it was mostly pedantry.
Nobody's likely to get confused.

regards, tom lane




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-19 Thread Dean Rasheed
On Tue, 19 Mar 2024 at 16:42, Tom Lane  wrote:
>
> Here's a hopefully-final version that makes that adjustment and
> tweaks a couple of comments.
>

This looks very good to me.

One final case that could possibly be improved is this one from aggregates.out:

explain (verbose, costs off)
select array(select sum(x+y) s
from generate_series(1,3) y group by y order by s)
  from generate_series(1,3) x;
QUERY PLAN
---
 Function Scan on pg_catalog.generate_series x
   Output: ARRAY(SubPlan 1)
   Function Call: generate_series(1, 3)
   SubPlan 1
 ->  Sort
   Output: (sum((x.x + y.y))), y.y
   Sort Key: (sum((x.x + y.y)))
   ->  HashAggregate
 Output: sum((x.x + y.y)), y.y
 Group Key: y.y
 ->  Function Scan on pg_catalog.generate_series y
   Output: y.y
   Function Call: generate_series(1, 3)

ARRAY operates on a SELECT with a single targetlist item, but in this
case it looks like the subplan output has 2 columns, which might
confuse people.

I wonder if we should output "ARRAY((SubPlan 1).col1)" to make it
clearer. Since ARRAY_SUBLINK is a special case, which always collects
the first column's values, we could just always output "col1" for
ARRAY.

Regards,
Dean




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-19 Thread Tom Lane
I wrote:
> I won't update the patch right now, but "(rescan SubPlan N)"
> seems like a winner to me.

Here's a hopefully-final version that makes that adjustment and
tweaks a couple of comments.

regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 58a603ac56..acbbf3b56c 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3062,10 +3062,10 @@ select exists(select 1 from pg_enum), sum(c1) from ft1;
 QUERY PLAN
 --
  Foreign Scan
-   Output: $0, (sum(ft1.c1))
+   Output: (InitPlan 1).col1, (sum(ft1.c1))
Relations: Aggregate on (public.ft1)
Remote SQL: SELECT sum("C 1") FROM "S 1"."T 1"
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Seq Scan on pg_catalog.pg_enum
 (6 rows)
 
@@ -3080,8 +3080,8 @@ select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
 QUERY PLAN 
 ---
  GroupAggregate
-   Output: $0, sum(ft1.c1)
-   InitPlan 1 (returns $0)
+   Output: (InitPlan 1).col1, sum(ft1.c1)
+   InitPlan 1
  ->  Seq Scan on pg_catalog.pg_enum
->  Foreign Scan on public.ft1
  Output: ft1.c1
@@ -3305,10 +3305,10 @@ select sum(c1) filter (where (c1 / c1) * random() <= 1) from ft1 group by c2 ord
 
 explain (verbose, costs off)
 select sum(c2) filter (where c2 in (select c2 from ft1 where c2 < 5)) from ft1;
-QUERY PLAN 

+  QUERY PLAN   
+---
  Aggregate
-   Output: sum(ft1.c2) FILTER (WHERE (hashed SubPlan 1))
+   Output: sum(ft1.c2) FILTER (WHERE (ANY (ft1.c2 = (hashed SubPlan 1).col1)))
->  Foreign Scan on public.ft1
  Output: ft1.c2
  Remote SQL: SELECT c2 FROM "S 1"."T 1"
@@ -6171,9 +6171,9 @@ UPDATE ft2 AS target SET (c2, c7) = (
  Update on public.ft2 target
Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c7 = $3 WHERE ctid = $1
->  Foreign Scan on public.ft2 target
- Output: $1, $2, (SubPlan 1 (returns $1,$2)), target.ctid, target.*
+ Output: (SubPlan 1).col1, (SubPlan 1).col2, (rescan SubPlan 1), target.ctid, target.*
  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 1100)) FOR UPDATE
- SubPlan 1 (returns $1,$2)
+ SubPlan 1
->  Foreign Scan on public.ft2 src
  Output: (src.c2 * 10), src.c7
  Remote SQL: SELECT c2, c7 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
@@ -11685,9 +11685,9 @@ SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt W
QUERY PLAN   
 
  Nested Loop Left Join
-   Output: t1.a, t1.b, t1.c, async_pt.a, async_pt.b, async_pt.c, ($0)
+   Output: t1.a, t1.b, t1.c, async_pt.a, async_pt.b, async_pt.c, ((InitPlan 1).col1)
Join Filter: (t1.a = async_pt.a)
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Aggregate
Output: count(*)
->  Append
@@ -11699,10 +11699,10 @@ SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt W
  Output: t1.a, t1.b, t1.c
->  Append
  ->  Async Foreign Scan on public.async_p1 async_pt_1
-   Output: async_pt_1.a, async_pt_1.b, async_pt_1.c, $0
+   Output: async_pt_1.a, async_pt_1.b, async_pt_1.c, (InitPlan 1).col1
Remote SQL: SELECT a, b, c FROM public.base_tbl1 WHERE ((a < 3000))
  ->  Async Foreign Scan on public.async_p2 async_pt_2
-   Output: async_pt_2.a, async_pt_2.b, async_pt_2.c, $0
+   Output: async_pt_2.a, async_pt_2.b, async_pt_2.c, (InitPlan 1).col1
Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE ((a < 3000))
 (20 rows)
 
@@ -11713,7 +11713,7 @@ SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt W
  Nested Loop Left Join (actual rows=1 loops=1)
Join Filter: (t1.a = async_pt.a)
Rows Removed by Join Filter: 399
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Aggregate (actual rows=1 loops=1)
->  Append (actual rows=400 loops=1)
  ->  Async Foreign Scan on async_p1 async_pt_4 (actual rows=200 loops=1)
@@ -11936,11 +11936,11 @@ CREATE FOREIGN TABLE foreign_tbl2 () INHERITS (foreign_tbl)
   SERVER loopback OPTIONS (table_name 'base_tbl');
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT a FROM base_tbl WHERE (a, random() > 0) IN (SELECT a, random() > 0 FROM foreign_tbl);

Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-18 Thread Tom Lane
Dean Rasheed  writes:
> On Mon, 18 Mar 2024 at 23:19, Tom Lane  wrote:
>> After thinking a bit more, I understood what was bothering me about
>> that notation: it looks too much like a call of a user-defined
>> function named "rescan()".  I think we'd be better off with the
>> all-caps "RESCAN()".

> Or perhaps move the parentheses, and write "(rescan SubPlan N)" or
> "(reset SubPlan N)". Dunno.

Oh, I like that!  It seems rather parallel to the existing "hashed"
annotation.  If I had it to do over, I'd likely do the "hashed"
bit differently --- but as the proposal currently stands, we are
not changing "hashed", so we might as well double down on that.

I won't update the patch right now, but "(rescan SubPlan N)"
seems like a winner to me.

regards, tom lane




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-18 Thread Dean Rasheed
On Mon, 18 Mar 2024 at 23:19, Tom Lane  wrote:
>
> > Hm.  I used "rescan(SubPlan)" in the attached, but it still looks
> > a bit odd to my eye.
>
> After thinking a bit more, I understood what was bothering me about
> that notation: it looks too much like a call of a user-defined
> function named "rescan()".  I think we'd be better off with the
> all-caps "RESCAN()".
>

Or perhaps move the parentheses, and write "(rescan SubPlan N)" or
"(reset SubPlan N)". Dunno.

Regards,
Dean




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-18 Thread Tom Lane
I wrote:
> Hm.  I used "rescan(SubPlan)" in the attached, but it still looks
> a bit odd to my eye.

After thinking a bit more, I understood what was bothering me about
that notation: it looks too much like a call of a user-defined
function named "rescan()".  I think we'd be better off with the
all-caps "RESCAN()".

regards, tom lane




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-18 Thread Tom Lane
Dean Rasheed  writes:
> The get_rule_expr() code could perhaps be simplified a bit, getting
> rid of the show_subplan_name variable and moving the recursive calls
> to get_rule_expr() to after the switch statement -- if testexpr is
> non-NULL, print it, else print the subplan name probably works for all
> subplan types.

Oooh, good idea.  The symmetry wasn't apparent when we started, but
it's there now, and the code does look nicer this way.

> The "colN" notation has grown on me, especially when you look at
> examples like those in partition_prune.out with a mix of Param types.

OK, I've left it like that in the attached v5, but I'm still open
to other opinions.

>> The undecorated reference to (SubPlan 1) is fairly confusing, since
>> it doesn't correspond to anything that will actually get output.
>> I suggest that perhaps instead this should read
>> Output: (SubPlan 1).col1, (SubPlan 1).col2, IGNORE(SubPlan 1), i.tableoid, 
>> i.ctid
>> or
>> Output: (SubPlan 1).col1, (SubPlan 1).col2, RESET(SubPlan 1), i.tableoid, 
>> i.ctid

> I think "RESET()" or "RESCAN()" or something like that is better than
> "INGORE()", because it indicates that it is actually doing something.
> I don't really have a better idea. Perhaps not all uppercase though,
> since that seems to go against the rest of the EXPLAIN output.

Hm.  I used "rescan(SubPlan)" in the attached, but it still looks
a bit odd to my eye.

I did some more work on the documentation too, to show the difference
between hashed and not-hashed subplans.  I feel like we're pretty
close here, with the possible exception of how to show MULTIEXPR.

regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 58a603ac56..8d75ca56c9 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3062,10 +3062,10 @@ select exists(select 1 from pg_enum), sum(c1) from ft1;
 QUERY PLAN
 --
  Foreign Scan
-   Output: $0, (sum(ft1.c1))
+   Output: (InitPlan 1).col1, (sum(ft1.c1))
Relations: Aggregate on (public.ft1)
Remote SQL: SELECT sum("C 1") FROM "S 1"."T 1"
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Seq Scan on pg_catalog.pg_enum
 (6 rows)
 
@@ -3080,8 +3080,8 @@ select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
 QUERY PLAN 
 ---
  GroupAggregate
-   Output: $0, sum(ft1.c1)
-   InitPlan 1 (returns $0)
+   Output: (InitPlan 1).col1, sum(ft1.c1)
+   InitPlan 1
  ->  Seq Scan on pg_catalog.pg_enum
->  Foreign Scan on public.ft1
  Output: ft1.c1
@@ -3305,10 +3305,10 @@ select sum(c1) filter (where (c1 / c1) * random() <= 1) from ft1 group by c2 ord
 
 explain (verbose, costs off)
 select sum(c2) filter (where c2 in (select c2 from ft1 where c2 < 5)) from ft1;
-QUERY PLAN 

+  QUERY PLAN   
+---
  Aggregate
-   Output: sum(ft1.c2) FILTER (WHERE (hashed SubPlan 1))
+   Output: sum(ft1.c2) FILTER (WHERE (ANY (ft1.c2 = (hashed SubPlan 1).col1)))
->  Foreign Scan on public.ft1
  Output: ft1.c2
  Remote SQL: SELECT c2 FROM "S 1"."T 1"
@@ -6171,9 +6171,9 @@ UPDATE ft2 AS target SET (c2, c7) = (
  Update on public.ft2 target
Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c7 = $3 WHERE ctid = $1
->  Foreign Scan on public.ft2 target
- Output: $1, $2, (SubPlan 1 (returns $1,$2)), target.ctid, target.*
+ Output: (SubPlan 1).col1, (SubPlan 1).col2, rescan(SubPlan 1), target.ctid, target.*
  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 1100)) FOR UPDATE
- SubPlan 1 (returns $1,$2)
+ SubPlan 1
->  Foreign Scan on public.ft2 src
  Output: (src.c2 * 10), src.c7
  Remote SQL: SELECT c2, c7 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
@@ -11685,9 +11685,9 @@ SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt W
QUERY PLAN   
 
  Nested Loop Left Join
-   Output: t1.a, t1.b, t1.c, async_pt.a, async_pt.b, async_pt.c, ($0)
+   Output: t1.a, t1.b, t1.c, async_pt.a, async_pt.b, async_pt.c, ((InitPlan 1).col1)
Join Filter: (t1.a = async_pt.a)
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Aggregate
Output: count(*)
->  Append
@@ -11699,10 +11699,10 @@ SELECT * FROM local_tbl t1 LEFT JOIN 

Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-18 Thread Dean Rasheed
On Sun, 17 Mar 2024 at 19:39, Tom Lane  wrote:
>
> Here's a cleaned-up version that seems to successfully resolve
> PARAM_EXEC references in all cases.  I haven't changed the
> "(plan_name).colN" notation, but that's an easy fix once we have
> consensus on the spelling.

I took a more detailed look at this and the code and doc changes all
look good to me.

There's a comment at the end of find_param_generator() that should
probably say "No generator found", rather than "No referent found".

The get_rule_expr() code could perhaps be simplified a bit, getting
rid of the show_subplan_name variable and moving the recursive calls
to get_rule_expr() to after the switch statement -- if testexpr is
non-NULL, print it, else print the subplan name probably works for all
subplan types.

The "colN" notation has grown on me, especially when you look at
examples like those in partition_prune.out with a mix of Param types.
Not using "$n" for 2 different purposes is good, and I much prefer
this to the original "FROM [HASHED] SubPlan N (returns ...)" notation.

> There are two other loose ends bothering me:
>
> 1.  I see that Gather nodes sometimes print things like
>
>->  Gather (actual rows=N loops=N)
>  Workers Planned: 2
>  Params Evaluated: $0, $1
>  Workers Launched: N
>
> I propose we nuke the "Params Evaluated" output altogether.

+1

> 2.  MULTIEXPR_SUBLINK subplans now result in EXPLAIN output like
>
>->  Result
>  Output: (SubPlan 1).col1, (SubPlan 1).col2, (SubPlan 1), i.tableoid, 
> i.ctid
>
> The undecorated reference to (SubPlan 1) is fairly confusing, since
> it doesn't correspond to anything that will actually get output.
> I suggest that perhaps instead this should read
>
>  Output: (SubPlan 1).col1, (SubPlan 1).col2, IGNORE(SubPlan 1), 
> i.tableoid, i.ctid
>
> or
>
>  Output: (SubPlan 1).col1, (SubPlan 1).col2, RESET(SubPlan 1), 
> i.tableoid, i.ctid

I think "RESET()" or "RESCAN()" or something like that is better than
"INGORE()", because it indicates that it is actually doing something.
I don't really have a better idea. Perhaps not all uppercase though,
since that seems to go against the rest of the EXPLAIN output.

Regards,
Dean




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-17 Thread Tom Lane
I wrote:
> Dean Rasheed  writes:
>> Overall, I think this is heading in the right direction. I think we
>> just need a good way to say "the n'th output column of the subplan",
>> that can't be confused with anything else in the output.

> We could consider notations like "(SubPlan 1 column 2)", which
> couldn't be confused with anything else, if only because a name
> like that would have to be double-quoted.  It's a little more
> verbose but not that much.  I fear "(SubPlan 1 col 2)" is too
> short to be clear.

Here's a cleaned-up version that seems to successfully resolve
PARAM_EXEC references in all cases.  I haven't changed the
"(plan_name).colN" notation, but that's an easy fix once we have
consensus on the spelling.

There are two other loose ends bothering me:

1.  I see that Gather nodes sometimes print things like

   ->  Gather (actual rows=N loops=N)
 Workers Planned: 2
 Params Evaluated: $0, $1
 Workers Launched: N

This now sticks out like a sore thumb, because there's no other
reference to $0 or $1 in the EXPLAIN output.  We could possibly
adjust the code to print something like

Params Evaluated: (InitPlan 1).col1, (InitPlan 2).col1

but I think that's pretty silly.  This looks to me like a code
debugging aid that shouldn't have survived past initial development.
It's of zero use to end users, and it doesn't correspond to anything
we bother to mention in EXPLAIN output in any other case: initplans
just magically get evaluated at the right times.  I propose we
nuke the "Params Evaluated" output altogether.

2.  MULTIEXPR_SUBLINK subplans now result in EXPLAIN output like

explain (verbose, costs off)
update inhpar i set (f1, f2) = (select i.f1, i.f2 || '-' from int4_tbl limit 1);
  ...
   ->  Result
 Output: (SubPlan 1).col1, (SubPlan 1).col2, (SubPlan 1), i.tableoid, 
i.ctid

The undecorated reference to (SubPlan 1) is fairly confusing, since
it doesn't correspond to anything that will actually get output.
I suggest that perhaps instead this should read

 Output: (SubPlan 1).col1, (SubPlan 1).col2, IGNORE(SubPlan 1), 
i.tableoid, i.ctid

or

 Output: (SubPlan 1).col1, (SubPlan 1).col2, RESET(SubPlan 1), 
i.tableoid, i.ctid

the point of "RESET()" being that what the executor actually does
there is to re-arm the SubPlan to be evaluated again on the next
pass through the targetlist.  I'm not greatly in love with either
of those ideas, though.  Any thoughts?

regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 58a603ac56..2609255da6 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3062,10 +3062,10 @@ select exists(select 1 from pg_enum), sum(c1) from ft1;
 QUERY PLAN
 --
  Foreign Scan
-   Output: $0, (sum(ft1.c1))
+   Output: (InitPlan 1).col1, (sum(ft1.c1))
Relations: Aggregate on (public.ft1)
Remote SQL: SELECT sum("C 1") FROM "S 1"."T 1"
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Seq Scan on pg_catalog.pg_enum
 (6 rows)
 
@@ -3080,8 +3080,8 @@ select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
 QUERY PLAN 
 ---
  GroupAggregate
-   Output: $0, sum(ft1.c1)
-   InitPlan 1 (returns $0)
+   Output: (InitPlan 1).col1, sum(ft1.c1)
+   InitPlan 1
  ->  Seq Scan on pg_catalog.pg_enum
->  Foreign Scan on public.ft1
  Output: ft1.c1
@@ -3305,10 +3305,10 @@ select sum(c1) filter (where (c1 / c1) * random() <= 1) from ft1 group by c2 ord
 
 explain (verbose, costs off)
 select sum(c2) filter (where c2 in (select c2 from ft1 where c2 < 5)) from ft1;
-QUERY PLAN 

+  QUERY PLAN   
+---
  Aggregate
-   Output: sum(ft1.c2) FILTER (WHERE (hashed SubPlan 1))
+   Output: sum(ft1.c2) FILTER (WHERE (ANY (ft1.c2 = (hashed SubPlan 1).col1)))
->  Foreign Scan on public.ft1
  Output: ft1.c2
  Remote SQL: SELECT c2 FROM "S 1"."T 1"
@@ -6171,9 +6171,9 @@ UPDATE ft2 AS target SET (c2, c7) = (
  Update on public.ft2 target
Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c7 = $3 WHERE ctid = $1
->  Foreign Scan on public.ft2 target
- Output: $1, $2, (SubPlan 1 (returns $1,$2)), target.ctid, target.*
+ Output: (SubPlan 1).col1, (SubPlan 1).col2, (SubPlan 1), target.ctid, target.*
  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 1100)) FOR UPDATE
- SubPlan 1 (returns $1,$2)
+ SubPlan 1
->  

Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-17 Thread Tom Lane
Dean Rasheed  writes:
> On Sat, 16 Mar 2024 at 17:25, Tom Lane  wrote:
>> After looking at your draft some more, it occurred to me that we're
>> not that far from getting rid of showing "$n" entirely in this
>> context.  Instead of (subplan_name).$n, we could write something like
>> (subplan_name).colN or (subplan_name).columnN or (subplan_name).fN,

> I think it would be confusing if there were tables whose columns are
> named "colN". In that case, a SQL qual like "t1.col2 = t2.col2" might
> be output as something like "t1.col2 = (SubPlan 1).col3", since the
> subplan's targetlist wouldn't necessarily just output the table
> columns in order.

Perhaps.  I think people who are using columns named like that are
already accustomed to having to pay close attention to which table
the column is shown as qualified by.  So I'm not sure there'd really
be much problem in practice.

> I actually think "$n" is not so bad (especially if we make n the
> column number). The fact that it's qualified by the subplan name ought
> to be sufficient to avoid it being confused with an external
> parameter.

That's an interesting compromise position, but I'm not sure that
it buys much compared to the approach shown in your draft (that
is, continuing to use the real param IDs).  The real IDs at least
have the advantage of being unique.

> Whatever name is chosen, I think we should still output "(returns
> ...)" on the subplan nodes.

We should do that if we continue to show real param IDs, but
if we change to using column numbers then I think it's pointless.
Output like

SubPlan 1 (returns $1,$2)
  ...
SubPlan 2 (returns $1,$2,$3)

seems to me that it'd be more confusing not less so.  Does SubPlan 2
return the same values as SubPlan 1 plus more?

> Overall, I think this is heading in the right direction. I think we
> just need a good way to say "the n'th output column of the subplan",
> that can't be confused with anything else in the output.

We could consider notations like "(SubPlan 1 column 2)", which
couldn't be confused with anything else, if only because a name
like that would have to be double-quoted.  It's a little more
verbose but not that much.  I fear "(SubPlan 1 col 2)" is too
short to be clear.

regards, tom lane




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-17 Thread Dean Rasheed
On Sat, 16 Mar 2024 at 17:25, Tom Lane  wrote:
>
> After looking at your draft some more, it occurred to me that we're
> not that far from getting rid of showing "$n" entirely in this
> context.  Instead of (subplan_name).$n, we could write something like
> (subplan_name).colN or (subplan_name).columnN or (subplan_name).fN,
> depending on your taste for verboseness.  "fN" would correspond to the
> names we assign to columns of anonymous record types, but it hasn't
> got much else to recommend it.  In the attached I used "colN";
> "columnN" would be my next choice.

Using the column number rather than the parameter index looks a lot
neater, especially in output with multiple subplans. Of those choices,
"colN" looks nicest, however...

I think it would be confusing if there were tables whose columns are
named "colN". In that case, a SQL qual like "t1.col2 = t2.col2" might
be output as something like "t1.col2 = (SubPlan 1).col3", since the
subplan's targetlist wouldn't necessarily just output the table
columns in order.

I actually think "$n" is not so bad (especially if we make n the
column number). The fact that it's qualified by the subplan name ought
to be sufficient to avoid it being confused with an external
parameter. Maybe there are other options, but I think it's important
to choose something that's unlikely to be confused with a real column
name.

Whatever name is chosen, I think we should still output "(returns
...)" on the subplan nodes. In a complex query there might be a lot of
output columns, and the expressions might be quite complex, making it
hard to see how many columns the subplan is returning. Besides,
without that, it might not be obvious to everyone what "colN" (or
whatever we settle on) means in places that refer to the subplan.

> You could also imagine trying to use the sub-SELECT's actual output
> column names, but I fear that would be ambiguous --- too often it'd
> be "?column?" or some other non-unique name.

Yeah, I think that's a non-starter, because the output column names
aren't necessarily unique.

> The attached proof-of-concept is incomplete: it fails to replace some
> $n occurrences with subplan references, as is visible in some of the
> test cases.  I believe your quick hack in get_parameter() is not
> correct in detail, but for the moment I didn't bother to debug it.

Yeah, that's exactly what it was, a quick hack. I just wanted to get
some output to see what it would look like in a few real cases.

Overall, I think this is heading in the right direction. I think we
just need a good way to say "the n'th output column of the subplan",
that can't be confused with anything else in the output.

Regards,
Dean




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-16 Thread Tom Lane
I wrote:
> Dean Rasheed  writes:
>> One thing that concerns me about making even greater use of "$n" is
>> the potential for confusion with generic plan parameters.

> True.

After looking at your draft some more, it occurred to me that we're
not that far from getting rid of showing "$n" entirely in this
context.  Instead of (subplan_name).$n, we could write something like
(subplan_name).colN or (subplan_name).columnN or (subplan_name).fN,
depending on your taste for verboseness.  "fN" would correspond to the
names we assign to columns of anonymous record types, but it hasn't
got much else to recommend it.  In the attached I used "colN";
"columnN" would be my next choice.

You could also imagine trying to use the sub-SELECT's actual output
column names, but I fear that would be ambiguous --- too often it'd
be "?column?" or some other non-unique name.

> Hmm.  I guess what bothers me about that is that it could be read to
> suggest that the initplan or subplan is evaluated again for each
> output parameter.

This objection seems like it could be solved through documentation,
so I wrote some.

The attached proof-of-concept is incomplete: it fails to replace some
$n occurrences with subplan references, as is visible in some of the
test cases.  I believe your quick hack in get_parameter() is not
correct in detail, but for the moment I didn't bother to debug it.
I'm just presenting this as a POC to see if this is the direction
people would like to go in.  If there's not objections, I'll try to
make a bulletproof implementation.

regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 58a603ac56..585ba60ae3 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3062,10 +3062,10 @@ select exists(select 1 from pg_enum), sum(c1) from ft1;
 QUERY PLAN
 --
  Foreign Scan
-   Output: $0, (sum(ft1.c1))
+   Output: (InitPlan 1).col1, (sum(ft1.c1))
Relations: Aggregate on (public.ft1)
Remote SQL: SELECT sum("C 1") FROM "S 1"."T 1"
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Seq Scan on pg_catalog.pg_enum
 (6 rows)
 
@@ -3080,8 +3080,8 @@ select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
 QUERY PLAN 
 ---
  GroupAggregate
-   Output: $0, sum(ft1.c1)
-   InitPlan 1 (returns $0)
+   Output: (InitPlan 1).col1, sum(ft1.c1)
+   InitPlan 1
  ->  Seq Scan on pg_catalog.pg_enum
->  Foreign Scan on public.ft1
  Output: ft1.c1
@@ -3305,10 +3305,10 @@ select sum(c1) filter (where (c1 / c1) * random() <= 1) from ft1 group by c2 ord
 
 explain (verbose, costs off)
 select sum(c2) filter (where c2 in (select c2 from ft1 where c2 < 5)) from ft1;
-QUERY PLAN 

+  QUERY PLAN   
+---
  Aggregate
-   Output: sum(ft1.c2) FILTER (WHERE (hashed SubPlan 1))
+   Output: sum(ft1.c2) FILTER (WHERE (ANY (ft1.c2 = (hashed SubPlan 1).col1)))
->  Foreign Scan on public.ft1
  Output: ft1.c2
  Remote SQL: SELECT c2 FROM "S 1"."T 1"
@@ -6171,9 +6171,9 @@ UPDATE ft2 AS target SET (c2, c7) = (
  Update on public.ft2 target
Remote SQL: UPDATE "S 1"."T 1" SET c2 = $2, c7 = $3 WHERE ctid = $1
->  Foreign Scan on public.ft2 target
- Output: $1, $2, (SubPlan 1 (returns $1,$2)), target.ctid, target.*
+ Output: $1, $2, (SubPlan 1), target.ctid, target.*
  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" WHERE (("C 1" > 1100)) FOR UPDATE
- SubPlan 1 (returns $1,$2)
+ SubPlan 1
->  Foreign Scan on public.ft2 src
  Output: (src.c2 * 10), src.c7
  Remote SQL: SELECT c2, c7 FROM "S 1"."T 1" WHERE (($1::integer = "C 1"))
@@ -11687,7 +11687,7 @@ SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt W
  Nested Loop Left Join
Output: t1.a, t1.b, t1.c, async_pt.a, async_pt.b, async_pt.c, ($0)
Join Filter: (t1.a = async_pt.a)
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Aggregate
Output: count(*)
->  Append
@@ -11713,7 +11713,7 @@ SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt W
  Nested Loop Left Join (actual rows=1 loops=1)
Join Filter: (t1.a = async_pt.a)
Rows Removed by Join Filter: 399
-   InitPlan 1 (returns $0)
+   InitPlan 1
  ->  Aggregate (actual rows=1 loops=1)
->  Append (actual rows=400 loops=1)
  ->  Async Foreign Scan on async_p1 async_pt_4 

Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-15 Thread Tom Lane
Dean Rasheed  writes:
> One thing that concerns me about making even greater use of "$n" is
> the potential for confusion with generic plan parameters.

True.

> Another possibility is to put the SubPlan and InitPlan names inline,
> rather than outputting "FROM SubPlan ...". I had a go at hacking that
> up and this was the result:

>Output: (($3 = (InitPlan 1).$0) AND ($4 = (InitPlan 1).$1) AND
> ((($1 = (SubPlan 2).$3) AND ($2 = (SubPlan 2).$4

Hmm.  I guess what bothers me about that is that it could be read to
suggest that the initplan or subplan is evaluated again for each
output parameter.  Perhaps it'll be sufficiently clear as long as
we keep the labeling

>InitPlan 1 (returns $0,$1)
>SubPlan 2 (returns $3,$4)

but I'm not sure.  Anybody else have an opinion?

(I didn't read your changes to the code yet --- I think at this
point we can just debate proposed output without worrying about
how to implement it.)

regards, tom lane




Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-09 Thread Dean Rasheed
On Fri, 16 Feb 2024 at 19:39, Tom Lane  wrote:
>
> > So now I'm thinking that we do have enough detail in the present
> > proposal, and we just need to think about whether there's some
> > nicer way to present it than the particular spelling I used here.
>

One thing that concerns me about making even greater use of "$n" is
the potential for confusion with generic plan parameters. Maybe it's
always possible to work out which is which from context, but still it
looks messy:

drop table if exists foo;
create table foo(id int, x int, y int);

explain (verbose, costs off, generic_plan)
select row($3,$4) = (select x,y from foo where id=y) and
   row($1,$2) = (select min(x+y),max(x+y) from generate_series(1,3) x)
from generate_series(1,3) y;

  QUERY PLAN
---
 Function Scan on pg_catalog.generate_series y
   Output: (($3 = $0) AND ($4 = $1) AND (ROWCOMPARE (($1 = $3) AND ($2
= $4)) FROM SubPlan 2 (returns $3,$4)))
   Function Call: generate_series(1, 3)
   InitPlan 1 (returns $0,$1)
 ->  Seq Scan on public.foo
   Output: foo.x, foo.y
   Filter: (foo.id = foo.y)
   SubPlan 2 (returns $3,$4)
 ->  Aggregate
   Output: min((x.x + y.y)), max((x.x + y.y))
   ->  Function Scan on pg_catalog.generate_series x
 Output: x.x
 Function Call: generate_series(1, 3)

Another odd thing about that is the inconsistency between how the
SubPlan and InitPlan expressions are displayed. I think "ROWCOMPARE"
is really just an internal detail that could be omitted without losing
anything. But the "FROM SubPlan ..." is useful to work out where it's
coming from. Should it also output "FROM InitPlan ..."? I think that
would risk making it harder to read.

Another possibility is to put the SubPlan and InitPlan names inline,
rather than outputting "FROM SubPlan ...". I had a go at hacking that
up and this was the result:

  QUERY PLAN
---
 Function Scan on pg_catalog.generate_series y
   Output: (($3 = (InitPlan 1).$0) AND ($4 = (InitPlan 1).$1) AND
((($1 = (SubPlan 2).$3) AND ($2 = (SubPlan 2).$4
   Function Call: generate_series(1, 3)
   InitPlan 1 (returns $0,$1)
 ->  Seq Scan on public.foo
   Output: foo.x, foo.y
   Filter: (foo.id = foo.y)
   SubPlan 2 (returns $3,$4)
 ->  Aggregate
   Output: min((x.x + y.y)), max((x.x + y.y))
   ->  Function Scan on pg_catalog.generate_series x
 Output: x.x
 Function Call: generate_series(1, 3)

It's a little more verbose in this case, but in a lot of other cases
it ended up being more compact.

The code is a bit messy, but I think the regression test output
(attached) is clearer and easier to interpret. SubPlans and InitPlans
are displayed consistently, and it's easier to distinguish
SubPlan/InitPlan outputs from external parameters.

There are a few more regression test changes, corresponding to cases
where InitPlans are referenced, such as:

  Seq Scan on document
-   Filter: ((dlevel <= $0) AND f_leak(dtitle))
+   Filter: ((dlevel <= (InitPlan 1).$0) AND f_leak(dtitle))
InitPlan 1 (returns $0)
  ->  Index Scan using uaccount_pkey on uaccount
Index Cond: (pguser = CURRENT_USER)

but I think that's useful extra clarification.

Regards,
Dean
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
new file mode 100644
index c355e8f..f0ff936
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3021,7 +3021,7 @@ select exists(select 1 from pg_enum), su
 QUERY PLAN
 --
  Foreign Scan
-   Output: $0, (sum(ft1.c1))
+   Output: (InitPlan 1).$0, (sum(ft1.c1))
Relations: Aggregate on (public.ft1)
Remote SQL: SELECT sum("C 1") FROM "S 1"."T 1"
InitPlan 1 (returns $0)
@@ -3039,7 +3039,7 @@ select exists(select 1 from pg_enum), su
 QUERY PLAN 
 ---
  GroupAggregate
-   Output: $0, sum(ft1.c1)
+   Output: (InitPlan 1).$0, sum(ft1.c1)
InitPlan 1 (returns $0)
  ->  Seq Scan on pg_catalog.pg_enum
->  Foreign Scan on public.ft1
@@ -3264,14 +3264,14 @@ select sum(c1) filter (where (c1 / c1) *
 
 explain (verbose, costs off)
 select sum(c2) filter (where c2 in (select c2 from ft1 where c2 < 5)) from ft1;
-QUERY PLAN 

+ QUERY PLAN  

Re: Improving EXPLAIN's display of SubPlan nodes

2024-03-03 Thread Andrey M. Borodin



> On 17 Feb 2024, at 00:38, Tom Lane  wrote:
> 
> Here's a rebase over 9f1337639 --- no code changes, but this affects
> some of the new or changed expected outputs from that commit.

Aleksander, as long as your was reviewing this previously, I’ve added you to 
reviewers of this CF entry [0]. Please, ping me or remove yourself, it it’s 
actually not a case.

BTW, as long as there’s a new version and some input from Tom, would you be 
willing to post fleshier review?

Thanks for working on this!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4782/



Re: Improving EXPLAIN's display of SubPlan nodes

2024-02-16 Thread Tom Lane
I wrote:
> So now I'm thinking that we do have enough detail in the present
> proposal, and we just need to think about whether there's some
> nicer way to present it than the particular spelling I used here.

Here's a rebase over 9f1337639 --- no code changes, but this affects
some of the new or changed expected outputs from that commit.

regards, tom lane

From f84343864e74501df627f986e326f766072398cd Mon Sep 17 00:00:00 2001
From: Tom Lane 
Date: Fri, 16 Feb 2024 14:32:23 -0500
Subject: [PATCH v2] Improve EXPLAIN's display of SubPlan nodes.

Represent the SubLinkType as best we can, and show the testexpr
where relevant.  To aid in interpreting testexprs, add the output
parameter IDs to the subplan name for subplans as well as initplans.

Discussion: https://postgr.es/m/2838538.1705692...@sss.pgh.pa.us
---
 .../postgres_fdw/expected/postgres_fdw.out|  16 +-
 src/backend/optimizer/plan/subselect.c|  30 ++--
 src/backend/utils/adt/ruleutils.c |  45 -
 src/test/regress/expected/aggregates.out  |   2 +-
 src/test/regress/expected/insert_conflict.out |   2 +-
 src/test/regress/expected/join.out|  24 +--
 src/test/regress/expected/memoize.out |   2 +-
 src/test/regress/expected/rowsecurity.out |  56 +++---
 src/test/regress/expected/select_parallel.out |  22 +--
 src/test/regress/expected/subselect.out   | 159 --
 src/test/regress/expected/updatable_views.out |  24 +--
 src/test/regress/sql/subselect.sql|  14 ++
 12 files changed, 255 insertions(+), 141 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index c355e8f3f7..bf067738e3 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3264,14 +3264,14 @@ select sum(c1) filter (where (c1 / c1) * random() <= 1) from ft1 group by c2 ord
 
 explain (verbose, costs off)
 select sum(c2) filter (where c2 in (select c2 from ft1 where c2 < 5)) from ft1;
-QUERY PLAN 

+ QUERY PLAN  
+-
  Aggregate
-   Output: sum(ft1.c2) FILTER (WHERE (hashed SubPlan 1))
+   Output: sum(ft1.c2) FILTER (WHERE (ANY (ft1.c2 = $0) FROM HASHED SubPlan 1 (returns $0)))
->  Foreign Scan on public.ft1
  Output: ft1.c2
  Remote SQL: SELECT c2 FROM "S 1"."T 1"
-   SubPlan 1
+   SubPlan 1 (returns $0)
  ->  Foreign Scan on public.ft1 ft1_1
Output: ft1_1.c2
Remote SQL: SELECT c2 FROM "S 1"."T 1" WHERE ((c2 < 5))
@@ -11895,12 +11895,12 @@ CREATE FOREIGN TABLE foreign_tbl2 () INHERITS (foreign_tbl)
   SERVER loopback OPTIONS (table_name 'base_tbl');
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT a FROM base_tbl WHERE (a, random() > 0) IN (SELECT a, random() > 0 FROM foreign_tbl);
- QUERY PLAN  
--
+QUERY PLAN
+--
  Seq Scan on public.base_tbl
Output: base_tbl.a
-   Filter: (SubPlan 1)
-   SubPlan 1
+   Filter: (ANY ((base_tbl.a = $1) AND ((random() > '0'::double precision) = $2)) FROM SubPlan 1 (returns $1,$2))
+   SubPlan 1 (returns $1,$2)
  ->  Result
Output: base_tbl.a, (random() > '0'::double precision)
->  Append
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 47e14723d2..d5919a9164 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -326,6 +326,7 @@ build_subplan(PlannerInfo *root, Plan *plan, PlannerInfo *subroot,
 	Node	   *result;
 	SubPlan*splan;
 	bool		isInitPlan;
+	StringInfoData splanname;
 	ListCell   *lc;
 
 	/*
@@ -560,22 +561,31 @@ build_subplan(PlannerInfo *root, Plan *plan, PlannerInfo *subroot,
    splan->plan_id);
 
 	/* Label the subplan for EXPLAIN purposes */
-	splan->plan_name = palloc(32 + 12 * list_length(splan->setParam));
-	sprintf(splan->plan_name, "%s %d",
-			isInitPlan ? "InitPlan" : "SubPlan",
-			splan->plan_id);
+	initStringInfo();
+	appendStringInfo(, "%s %d",
+	 isInitPlan ? "InitPlan" : "SubPlan",
+	 splan->plan_id);
 	if (splan->setParam)
 	{
-		char	   *ptr = splan->plan_name + strlen(splan->plan_name);
-
-		ptr += sprintf(ptr, " (returns ");
+		appendStringInfoString(, " (returns ");
 		foreach(lc, splan->setParam)
 		{
-			ptr += sprintf(ptr, "$%d%s",
-		   lfirst_int(lc),
-		 

Re: Improving EXPLAIN's display of SubPlan nodes

2024-01-22 Thread Tom Lane
I wrote:
> The main thing that's still missing compared to what is in the plan
> data structure is information about which Param is which.  I think
> we have the subplan output Params relatively well covered through
> the expedient of listing them in the generated plan_name, but it's
> still not apparent to me how we could shoehorn subplan input
> Params into this (or whether it's worth doing).

Actually ... it looks like it probably isn't worth doing, because
it's already the case that we don't expose input Params as such.
EXPLAIN searches for the referent of an input Param and displays
that (cf find_param_referent()).  Just for experimental purposes,
I wrote a follow-on patch to add printout of the parParam and args
list (attached, as .txt so the cfbot doesn't think it's a patch).
This produces results like

explain (verbose, costs off)
select array(select sum(x+y) s
from generate_series(1,3) y group by y order by s)
  from generate_series(1,3) x;
QUERY PLAN 
---
 Function Scan on pg_catalog.generate_series x
   Output: ARRAY(SubPlan 1 PASSING $0 := x.x)
   ^ added by delta patch
   Function Call: generate_series(1, 3)
   SubPlan 1
 ->  Sort
   Output: (sum((x.x + y.y))), y.y
   Sort Key: (sum((x.x + y.y)))
   ->  HashAggregate
 Output: sum((x.x + y.y)), y.y
  ^^^ actual reference to $0
 Group Key: y.y
 ->  Function Scan on pg_catalog.generate_series y
   Output: y.y
   Function Call: generate_series(1, 3)
(13 rows)

As you can see, it's not necessary to explain what $0 is because
that name isn't shown anywhere else --- the references to "x.x" in
the subplan are actually uses of $0.

So now I'm thinking that we do have enough detail in the present
proposal, and we just need to think about whether there's some
nicer way to present it than the particular spelling I used here.

One idea I considered briefly is to pull the same trick with
regards to output parameters --- that is, instead of adding all
the "returns $n" annotations to subplans, drill down and print
the subplan's relevant targetlist expression instead of "$n".
On balance I think that might be more confusing not less so,
though.  SQL users are used to the idea that a sub-select can
"see" variables from the outer query, but not at all vice-versa.
I think it probably wouldn't be formally ambiguous, because
ruleutils already de-duplicates table aliases across the whole
tree, but it still seems likely to be confusing.  Also, people
are already pretty used to seeing $n to represent the outputs
of InitPlans, and I've not heard many complaints suggesting
that we should change that.

regards, tom lane

diff --git a/src/backend/utils/adt/ruleutils.c 
b/src/backend/utils/adt/ruleutils.c
index 9ee35640ba..b3da98a2f7 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -8911,9 +8911,30 @@ get_rule_expr(Node *node, deparse_context *context,
break;
}
if (subplan->useHashTable)
-   appendStringInfo(buf, "HASHED %s)", 
subplan->plan_name);
+   appendStringInfo(buf, "HASHED %s", 
subplan->plan_name);
else
-   appendStringInfo(buf, "%s)", 
subplan->plan_name);
+   appendStringInfo(buf, "%s", 
subplan->plan_name);
+   if (subplan->parParam)
+   {
+   ListCell   *lc3;
+   ListCell   *lc4;
+   bool first = true;
+
+   appendStringInfoString(buf, " PASSING 
");
+   forboth(lc3, subplan->parParam, lc4, 
subplan->args)
+   {
+   int paramid 
= lfirst_int(lc3);
+   Node   *arg = (Node *) 
lfirst(lc4);
+
+   if (first)
+   first = false;
+   else
+   
appendStringInfoString(buf, ", ");
+   appendStringInfo(buf, "$%d := 
", paramid);
+   get_rule_expr(arg, context, 
showimplicit);
+   }
+   }
+

Re: Improving EXPLAIN's display of SubPlan nodes

2024-01-22 Thread Chantal Keller

Hi Aleksander and Tom

I do confirm that I requested to get this information, in order to 
recover the formula to filter on.


Thanks to both of you
Chantal




Le 22/01/2024 à 18:07, Tom Lane a écrit :

Aleksander Alekseev  writes:

Although something like:



```
+   Filter: (ANY (base_tbl.a = $1) FROM SubPlan 1 (returns $1))
+   SubPlan 1 (returns $1)
```



... arguably doesn't give much more information to the user comparing
to what we have now:



```
-   Filter: (SubPlan 1)
-   SubPlan 1
```


Yeah, I would probably not have thought to do this on my own; but
we had an actual user request for it.  I think arguably the main
benefit is to confirm "yes, this is the sub-select you think it is".

The main thing that's still missing compared to what is in the plan
data structure is information about which Param is which.  I think
we have the subplan output Params relatively well covered through
the expedient of listing them in the generated plan_name, but it's
still not apparent to me how we could shoehorn subplan input
Params into this (or whether it's worth doing).

regards, tom lane





Re: Improving EXPLAIN's display of SubPlan nodes

2024-01-22 Thread Tom Lane
Aleksander Alekseev  writes:
> Although something like:

> ```
> +   Filter: (ANY (base_tbl.a = $1) FROM SubPlan 1 (returns $1))
> +   SubPlan 1 (returns $1)
> ```

> ... arguably doesn't give much more information to the user comparing
> to what we have now:

> ```
> -   Filter: (SubPlan 1)
> -   SubPlan 1
> ```

Yeah, I would probably not have thought to do this on my own; but
we had an actual user request for it.  I think arguably the main
benefit is to confirm "yes, this is the sub-select you think it is".

The main thing that's still missing compared to what is in the plan
data structure is information about which Param is which.  I think
we have the subplan output Params relatively well covered through
the expedient of listing them in the generated plan_name, but it's
still not apparent to me how we could shoehorn subplan input
Params into this (or whether it's worth doing).

regards, tom lane




Re: Improving EXPLAIN's display of SubPlan nodes

2024-01-22 Thread Aleksander Alekseev
Hi,

> EXPLAIN has always been really poor at displaying SubPlan nodes
> in expressions: you don't get much more than "(SubPlan N)".
> This is mostly because every time I thought about it, I despaired
> of trying to represent all the information in a SubPlan accurately.
> However, a recent discussion[1] made me realize that we could do
> a lot better just by displaying the SubLinkType and the testexpr
> (if relevant).  So here's a proposed patch.  You can see what
> it does by examining the regression test changes.
>
> There's plenty of room to bikeshed about exactly how to display
> this stuff, and I'm open to suggestions.
>
> BTW, I was somewhat depressed to discover that we have exactly
> zero regression coverage of the ROWCOMPARE_SUBLINK code paths;
> not only was EXPLAIN output not covered, but the planner and
> executor too.  So I added some simple tests for that.  Otherwise
> I think existing coverage is enough for this.

I reviewed the code and tested the patch on MacOS. It looks good to me.

Although something like:

```
+   Filter: (ANY (base_tbl.a = $1) FROM SubPlan 1 (returns $1))
+   SubPlan 1 (returns $1)
```

... arguably doesn't give much more information to the user comparing
to what we have now:

```
-   Filter: (SubPlan 1)
-   SubPlan 1
```

... I believe this is the right step toward more detailed EXPLAINs,
and perhaps could be useful for debugging and/or educational purposes.
Also the patch improves code coverage.

-- 
Best regards,
Aleksander Alekseev