Re: [HACKERS] Oddity in EXPLAIN for foreign/custom join pushdown plans

Etsuro Fujita Thu, 28 Jul 2016 05:05:15 -0700

On 2016/07/28 10:01, Kouhei Kaigai wrote:

What I'm saying is here:

EXPLAIN (COSTS false, VERBOSE)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY
t1.c3, t1.c1 OFFSET 100 LIMIT 10;
                      QUERY PLAN
-------------------------------------------
    Limit
      Output: t1.c1, t2.c1, t1.c3
      ->  Foreign Scan
                  ^^^^
            Output: t1.c1, t2.c1, t1.c3
            Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
            Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T
1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")))) ORDER BY
r1.c3 ASC N\
ULLS LAST, r1."C 1" ASC NULLS LAST
(6 rows)

Your concern is that EXPLAIN shows ForeignScan node as "Foreign Scan"
although it actually works as "Foreign Join".

That may be so, but my point is that the target relations involved inthe foreign join (ie, ft1 and ft2) should be printed somewhere in theEXPLAIN output by core, as in EXPLAIN for a simple foreign table scan.

My suggestion makes EXPLAIN print "Foreign %s" according to the label
assigned by the extension. Only FDW driver knows how this ForeignScan
node actually performs on, thus, only FDW driver can set meaningful label.

This label may be "Join", may be "Partial Aggregate + Join", or something
others according to the actual job of this node.

OK, thanks for that. But sorry, I'm not sure that's a good idea. Onereason for that is that FDWs/extentions might give different labels tothe same upper-planner action, which would lead to confusing EXPLAINs.Another is that labeling might be annoying, so some FDWs/extensions maynot want to do that.

Couldn't we print EXPLAINs in a more unified way? A simple idea is tointroduce a broad label, eg, "Foreign Processing", as a unified label;if the FDW/extension performs any upper-planner actions remotely, thenthe label "Foreign Processing" will be printed by core, and followingthat, the FDW/extension would print what they want, usingExplainForeignScan/ExplainCustomScan. Probably something like this:


  Foreign Processing
    Remote Operations: ...

In the Remote Operations line, the FDW/extension could print any infoabout remote operations, eg, "Scan/Join + Aggregate". Different datasources have different concepts/terms, so it seems reasonable to me thatthere would be different descriptions by different data sources, to thesame plan actions done remotely.

 - A flag to turn on/off printing relation(s) name


ISTM that the relation information should be always printed even in the
case of scanrelid=0 by the core, because that would be essential for
analyzing EXPLAIN results.

We have no information which relations are actually scanned by ForeignScan
and CustomScan. Your patch abuses fs_relids and custom_relids, however,
these fields don't indicate the relations actually scan by these nodes.
It just tracks relations processed by this node or its children.

Maybe my explanation was not enough, but I just mean that *joiningrelations* should be printed somewhere in the EXPLAIN output for aforeign/custom join as well, by core.

postgres=# explain select id from t0 natural join t1 natural join t2;
                                QUERY PLAN
---------------------------------------------------------------------------
 Custom Scan (GpuJoin)  (cost=12385.67..291245.35 rows=9815465 width=4)
   GPU Projection: t0.id
   Depth 1: GpuHashJoin, HashKeys: (t0.bid)
            JoinQuals: (t0.bid = t2.bid)
            Nrows (in/out: 98.15%), KDS-Hash (size: 13.47MB, nbatches: 1)
   Depth 2: GpuHashJoin, HashKeys: (t0.aid)
            JoinQuals: (t0.aid = t1.aid)
            Nrows (in/out: 100.00%), KDS-Hash (size: 13.47MB, nbatches: 1)
   ->  Seq Scan on t0  (cost=0.00..183333.96 rows=9999996 width=12)
   ->  Seq Scan on t2  (cost=0.00..1935.00 rows=100000 width=4)
   ->  Seq Scan on t1  (cost=0.00..1935.00 rows=100000 width=4)
(11 rows)

If EXPLAIN command attaches something like "on t0,t1,t2" after the GpuJoin
*with no choice*, it is problematic and misleading.
It shall be controllable by the extension which knows what tables are
actually scanned. (Please note that I never says extension makes the string.
It is a job of core explain. I suggest it needs to be controllable.)

Again, I just mean that the info "on t0,t1,t2" should be printed afterthe GpuJoin label, as its joining relations.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Oddity in EXPLAIN for foreign/custom join pushdown plans

Reply via email to