Hi,

On 12/7/21 10:44, 曾文旌(义从) wrote:
> Hi Hackers
> 
> For my previous proposal, I developed a prototype and passed
> regression testing. It works similarly to subquery's qual pushdown.
> We know that sublink expands at the beginning of each level of
> query. At this stage, The query's conditions and equivalence classes
> are not processed. But after generate_base_implied_equalities the
> conditions are processed,  which is why qual can push down to 
> subquery but sublink not.
> 
> My POC implementation chose to delay the sublink expansion in the
> SELECT clause (targetList) and where clause. Specifically, it is
> delayed after generate_base_implied_equalities. Thus, the equivalent
> conditions already established in the Up level query can be easily
> obtained in the sublink expansion process (make_subplan).
> 
> For example, if the up level query has a.id = 10 and the sublink
> query has a.id = b.id, then we get b.id = 10 and push it down to the
> sublink quey. If b is a partitioned table and is partitioned by id,
> then a large number of unrelated subpartitions are pruned out, This
> optimizes a significant amount of Planner and SQL execution time, 
> especially if the partitioned table has a large number of
> subpartitions and is what I want.
> 
> Currently, There were two SQL failures in the regression test,
> because the expansion order of sublink was changed, which did not
> affect the execution result of SQL.
> 
> Look forward to your suggestions on this proposal.
> 

I took a quick look, and while I don't see / can't think of any problems
with delaying it until after generating implied equalities, there seems
to be a number of gaps.


1) Are there any regression tests exercising this modified behavior?
Maybe there are, but if the only changes are due to change in order of
targetlist entries, that doesn't seem like a clear proof.

It'd be good to add a couple tests exercising both the positive and
negative case (i.e. when we can and can't pushdown a qual).


2) apparently, contrib/postgres_fdw does crash like this:

  #3  0x000000000077b412 in adjust_appendrel_attrs_mutator
      (node=0x13f7ea0, context=0x7fffc3351b30) at appendinfo.c:470
  470          Assert(!IsA(node, SubLink));
  (gdb) p node
  $1 = (Node *) 0x13f7ea0
  (gdb) p *node
  $2 = {type = T_SubLink}

  Backtrace attached.

3) various parts of the patch really need at least some comments, like:

  - try_push_outer_qual_to_sublink_query really needs some docs

  - new stuff at the end of initsplan.c


4) generate_base_implied_equalities

   shouldn't this

        if (ec->ec_processed)
            ;

   really be?

        if (ec->ec_processed)
            continue;

5) I'm not sure why we need the new ec_processed flag.

6) So we now have lazy_process_sublink callback? Does that mean we
expand sublinks in two places - sometimes lazily, sometimes not?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Core was generated by `postgres: user contrib_regression [local] EXPLAIN        
                     '.
Program terminated with signal SIGABRT, Aborted.
#0  0x000079a07cfd79e5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install 
glibc-2.32-10.fc33.x86_64 libgcc-10.3.1-1.fc33.x86_64
(gdb) bt
#0  0x000079a07cfd79e5 in raise () from /lib64/libc.so.6
#1  0x000079a07cfc08a4 in abort () from /lib64/libc.so.6
#2  0x000000000094f22a in ExceptionalCondition 
(conditionName=conditionName@entry=0xab44ca "!IsA(node, SubLink)", 
errorType=errorType@entry=0x9a2017 "FailedAssertion", 
fileName=fileName@entry=0xab55cf "appendinfo.c", 
lineNumber=lineNumber@entry=470) at assert.c:69
#3  0x000000000077b412 in adjust_appendrel_attrs_mutator (node=0x13f7ea0, 
context=0x7fffc3351b30) at appendinfo.c:470
#4  0x000000000071f8f9 in expression_tree_mutator (node=0x13f7f90, 
mutator=0x77ae20 <adjust_appendrel_attrs_mutator>, context=0x7fffc3351b30) at 
nodeFuncs.c:3240
#5  0x000000000077b025 in adjust_appendrel_attrs_mutator (node=0x13f7f90, 
context=0x7fffc3351b30) at appendinfo.c:390
#6  0x0000000000720066 in expression_tree_mutator (node=0x13eca78, 
mutator=0x77ae20 <adjust_appendrel_attrs_mutator>, context=0x7fffc3351b30) at 
nodeFuncs.c:3109
#7  0x000000000077b512 in adjust_appendrel_attrs (root=root@entry=0x13be5b0, 
node=<optimized out>, nappinfos=nappinfos@entry=1, 
appinfos=appinfos@entry=0x7fffc3351bd0) at appendinfo.c:210
#8  0x000000000073b88c in set_append_rel_size (rte=<optimized out>, rti=4, 
rel=0xf19538, root=0x13be5b0) at allpaths.c:1056
#9  set_rel_size (root=0x13be5b0, rel=0xf19538, rti=4, rte=<optimized out>) at 
allpaths.c:386
#10 0x000000000073e250 in set_base_rel_sizes (root=<optimized out>) at 
allpaths.c:326
#11 make_one_rel (root=root@entry=0x13be5b0, joinlist=joinlist@entry=0x13edf10) 
at allpaths.c:188
#12 0x0000000000763fde in query_planner (root=root@entry=0x13be5b0, 
qp_callback=qp_callback@entry=0x764eb0 <standard_qp_callback>, 
qp_extra=qp_extra@entry=0x7fffc3351d90, lps_callback=<optimized out>) at 
planmain.c:286
#13 0x00000000007694fa in grouping_planner (root=<optimized out>, 
tuple_fraction=<optimized out>) at planner.c:1459
#14 0x000000000076bea6 in subquery_planner (glob=glob@entry=0x12f1038, 
parse=parse@entry=0x13e8980, parent_root=parent_root@entry=0x0, 
hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) 
at planner.c:1028
#15 0x000000000076c4f3 in standard_planner (parse=0x13e8980, 
query_string=<optimized out>, cursorOptions=2048, boundParams=<optimized out>) 
at planner.c:407
#16 0x000000000083c0f8 in pg_plan_query (querytree=0x13e8980, 
querytree@entry=0x7fffc3351fc0, query_string=query_string@entry=0xe16b20 
"EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, 
(SELECT count(*) FROM async_pt WHERE a < 3000) FROM async_pt WHERE a < 3000) t2 
ON t1.a = t2.a;", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) 
at postgres.c:848
#17 0x000000000065cb71 in ExplainOneQuery (query=0x7fffc3351fc0, 
cursorOptions=2048, into=0x0, es=0x12f0d50, queryString=0xe16b20 "EXPLAIN 
(VERBOSE, COSTS OFF)\nSELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT 
count(*) FROM async_pt WHERE a < 3000) FROM async_pt WHERE a < 3000) t2 ON t1.a 
= t2.a;", 
    params=0x0, queryEnv=0x0) at explain.c:397
#18 0x000000000065d2ee in ExplainQuery (pstate=0x13c1c90, stmt=0x13e87a0, 
params=0x0, dest=0x13c1bf8) at ../../../src/include/nodes/nodes.h:603
#19 0x00000000008419d6 in standard_ProcessUtility (pstmt=0x1322428, 
queryString=0xe16b20 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM local_tbl t1 
LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt WHERE a < 3000) FROM 
async_pt WHERE a < 3000) t2 ON t1.a = t2.a;", readOnlyTree=<optimized out>, 
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x13c1bf8, 
qc=0x7fffc3352240) at utility.c:863
#20 0x000000000083fe9f in PortalRunUtility (portal=portal@entry=0xe7b060, 
pstmt=0x1322428, isTopLevel=<optimized out>, 
setHoldSnapshot=setHoldSnapshot@entry=true, dest=dest@entry=0x13c1bf8, 
qc=qc@entry=0x7fffc3352240) at pquery.c:1155
#21 0x0000000000840360 in FillPortalStore (portal=0xe7b060, 
isTopLevel=<optimized out>) at ../../../src/include/nodes/nodes.h:603
#22 0x000000000084066d in PortalRun (portal=portal@entry=0xe7b060, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, 
run_once=run_once@entry=true, dest=dest@entry=0x1322518, 
altdest=altdest@entry=0x1322518, qc=0x7fffc3352430) at pquery.c:760
#23 0x000000000083c5d4 in exec_simple_query (query_string=0xe16b20 "EXPLAIN 
(VERBOSE, COSTS OFF)\nSELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT 
count(*) FROM async_pt WHERE a < 3000) FROM async_pt WHERE a < 3000) t2 ON t1.a 
= t2.a;") at postgres.c:1215
#24 0x000000000083e057 in PostgresMain (dbname=<optimized out>, 
username=<optimized out>) at postgres.c:4498
#25 0x00000000007b189b in BackendRun (port=<optimized out>, port=<optimized 
out>) at postmaster.c:4594
#26 BackendStartup (port=<optimized out>) at postmaster.c:4322
#27 ServerLoop () at postmaster.c:1802
#28 0x00000000007b27b1 in PostmasterMain (argc=argc@entry=8, 
argv=argv@entry=0xe105d0) at postmaster.c:1474
#29 0x000000000050a35c in main (argc=8, argv=0xe105d0) at main.c:198

Reply via email to