Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

2021-12-07 Thread Tomas Vondra
Hi,

On 12/7/21 10:44, 曾文旌(义从) wrote:
> Hi Hackers
> 
> For my previous proposal, I developed a prototype and passed
> regression testing. It works similarly to subquery's qual pushdown.
> We know that sublink expands at the beginning of each level of
> query. At this stage, The query's conditions and equivalence classes
> are not processed. But after generate_base_implied_equalities the
> conditions are processed,  which is why qual can push down to 
> subquery but sublink not.
> 
> My POC implementation chose to delay the sublink expansion in the
> SELECT clause (targetList) and where clause. Specifically, it is
> delayed after generate_base_implied_equalities. Thus, the equivalent
> conditions already established in the Up level query can be easily
> obtained in the sublink expansion process (make_subplan).
> 
> For example, if the up level query has a.id = 10 and the sublink
> query has a.id = b.id, then we get b.id = 10 and push it down to the
> sublink quey. If b is a partitioned table and is partitioned by id,
> then a large number of unrelated subpartitions are pruned out, This
> optimizes a significant amount of Planner and SQL execution time, 
> especially if the partitioned table has a large number of
> subpartitions and is what I want.
> 
> Currently, There were two SQL failures in the regression test,
> because the expansion order of sublink was changed, which did not
> affect the execution result of SQL.
> 
> Look forward to your suggestions on this proposal.
> 

I took a quick look, and while I don't see / can't think of any problems
with delaying it until after generating implied equalities, there seems
to be a number of gaps.


1) Are there any regression tests exercising this modified behavior?
Maybe there are, but if the only changes are due to change in order of
targetlist entries, that doesn't seem like a clear proof.

It'd be good to add a couple tests exercising both the positive and
negative case (i.e. when we can and can't pushdown a qual).


2) apparently, contrib/postgres_fdw does crash like this:

  #3  0x0077b412 in adjust_appendrel_attrs_mutator
  (node=0x13f7ea0, context=0x7fffc3351b30) at appendinfo.c:470
  470  Assert(!IsA(node, SubLink));
  (gdb) p node
  $1 = (Node *) 0x13f7ea0
  (gdb) p *node
  $2 = {type = T_SubLink}

  Backtrace attached.

3) various parts of the patch really need at least some comments, like:

  - try_push_outer_qual_to_sublink_query really needs some docs

  - new stuff at the end of initsplan.c


4) generate_base_implied_equalities

   shouldn't this

if (ec->ec_processed)
;

   really be?

if (ec->ec_processed)
continue;

5) I'm not sure why we need the new ec_processed flag.

6) So we now have lazy_process_sublink callback? Does that mean we
expand sublinks in two places - sometimes lazily, sometimes not?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL CompanyCore was generated by `postgres: user contrib_regression [local] EXPLAIN
 '.
Program terminated with signal SIGABRT, Aborted.
#0  0x79a07cfd79e5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install 
glibc-2.32-10.fc33.x86_64 libgcc-10.3.1-1.fc33.x86_64
(gdb) bt
#0  0x79a07cfd79e5 in raise () from /lib64/libc.so.6
#1  0x79a07cfc08a4 in abort () from /lib64/libc.so.6
#2  0x0094f22a in ExceptionalCondition 
(conditionName=conditionName@entry=0xab44ca "!IsA(node, SubLink)", 
errorType=errorType@entry=0x9a2017 "FailedAssertion", 
fileName=fileName@entry=0xab55cf "appendinfo.c", 
lineNumber=lineNumber@entry=470) at assert.c:69
#3  0x0077b412 in adjust_appendrel_attrs_mutator (node=0x13f7ea0, 
context=0x7fffc3351b30) at appendinfo.c:470
#4  0x0071f8f9 in expression_tree_mutator (node=0x13f7f90, 
mutator=0x77ae20 , context=0x7fffc3351b30) at 
nodeFuncs.c:3240
#5  0x0077b025 in adjust_appendrel_attrs_mutator (node=0x13f7f90, 
context=0x7fffc3351b30) at appendinfo.c:390
#6  0x00720066 in expression_tree_mutator (node=0x13eca78, 
mutator=0x77ae20 , context=0x7fffc3351b30) at 
nodeFuncs.c:3109
#7  0x0077b512 in adjust_appendrel_attrs (root=root@entry=0x13be5b0, 
node=, nappinfos=nappinfos@entry=1, 
appinfos=appinfos@entry=0x7fffc3351bd0) at appendinfo.c:210
#8  0x0073b88c in set_append_rel_size (rte=, rti=4, 
rel=0xf19538, root=0x13be5b0) at allpaths.c:1056
#9  set_rel_size (root=0x13be5b0, rel=0xf19538, rti=4, rte=) at 
allpaths.c:386
#10 0x0073e250 in set_base_rel_sizes (root=) at 
allpaths.c:326
#11 make_one_rel (root=root@entry=0x13be5b0, joinlist=joinlist@entry=0x13edf10) 
at allpaths.c:188
#12 0x00763fde in query_planner (root=root@entry=0x13be5b0, 
qp_callback=qp_callback@entry=0x764eb0 , 
qp_extra=qp_extra@entry=0x7fffc3351d90, lps_callback=) at 
planmain.c:286
#13 0x007694fa in grouping_planner (root=, 
tuple_fraction=) at planner.c:145

Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

2021-12-11 Thread Zhihong Yu
On Sat, Dec 11, 2021 at 7:31 AM 曾文旌(义从)  wrote:

>
>
> --原始邮件 --
> *发件人:*Tomas Vondra 
> *发送时间:*Wed Dec 8 11:26:35 2021
> *收件人:*曾文旌(义从) , shawn wang <
> shawn.wang...@gmail.com>, ggys...@gmail.com ,
> PostgreSQL Hackers 
> *抄送:*wjzeng 
> *主题:*Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?
>
>> Hi,
>>
>> On 12/7/21 10:44, 曾文旌(义从) wrote:
>> > Hi Hackers
>> >
>> > For my previous proposal, I developed a prototype and passed
>> > regression testing. It works similarly to subquery's qual pushdown.
>> > We know that sublink expands at the beginning of each level of
>> > query. At this stage, The query's conditions and equivalence classes
>> > are not processed. But after generate_base_implied_equalities the
>> > conditions are processed,  which is why qual can push down to
>> > subquery but sublink not.
>> >
>> > My POC implementation chose to delay the sublink expansion in the
>> > SELECT clause (targetList) and where clause. Specifically, it is
>> > delayed after generate_base_implied_equalities. Thus, the equivalent
>> > conditions already established in the Up level query can be easily
>> > obtained in the sublink expansion process (make_subplan).
>> >
>> > For example, if the up level query has a.id = 10 and the sublink
>> > query has a.id = b.id, then we get b.id = 10 and push it down to the
>> > sublink quey. If b is a partitioned table and is partitioned by id,
>> > then a large number of unrelated subpartitions are pruned out, This
>> > optimizes a significant amount of Planner and SQL execution time,
>> > especially if the partitioned table has a large number of
>> > subpartitions and is what I want.
>> >
>> > Currently, There were two SQL failures in the regression test,
>> > because the expansion order of sublink was changed, which did not
>> > affect the execution result of SQL.
>> >
>> > Look forward to your suggestions on this proposal.
>> >
>>
>> I took a quick look, and while I don't see / can't think of any problems
>> with delaying it until after generating implied equalities, there seems
>> to be a number of gaps.
>>
>> *Thank you for your attention.*
>>
>> 1) Are there any regression tests exercising this modified behavior?
>> Maybe there are, but if the only changes are due to change in order of
>> targetlist entries, that doesn't seem like a clear proof.
>>
>> It'd be good to add a couple tests exercising both the positive and
>> negative case (i.e. when we can and can't pushdown a qual).
>>
>> *I added several samples to the regress(qual_pushdown_to_sublink.sql). *
>> *and I
>> used the partition table to show the plan status of qual being pushed down 
>> into sublink.*
>>
>> *Hopefully this will help you understand the details of this patch. Later, I 
>> will add more cases.*
>>
>> 2) apparently, contrib/postgres_fdw does crash like this:
>>
>>   #3  0x0077b412 in adjust_appendrel_attrs_mutator
>>   (node=0x13f7ea0, context=0x7fffc3351b30) at appendinfo.c:470
>>   470  Assert(!IsA(node, SubLink));
>>   (gdb) p node
>>   $1 = (Node *) 0x13f7ea0
>>   (gdb) p *node
>>   $2 = {type = T_SubLink}
>>
>>   Backtrace attached.
>>
>> *For the patch attached in the last email, I passed all the tests under
>> src/test/regress.*
>> *As you pointed out, there was a problem with regression under contrib(in
>> contrib/postgres_fdw). *
>> *This time I fixed it and the current patch (V2) can pass the
>> check-world.*
>>
>> 3) various parts of the patch really need at least some comments, like:
>>
>>   - try_push_outer_qual_to_sublink_query really needs some docs
>>
>>   - new stuff at the end of initsplan.c
>>
>> *Ok, I added some comments and will
>> add more. If you have questions about any details,*
>> *please point them out directly.*
>>
>> 4) generate_base_implied_equalities
>>
>>shouldn't this
>>
>> if (ec->ec_processed)
>> ;
>>
>>really be?
>>
>> if (ec->ec_processed)
>> continue;
>>
>> *You are right. I fixed it.*
>>
>> 5) I'm not sure why we need the new ec_processed flag.
>>
>>
>> *I did this to eliminate duplicate equalities from the two 
>> generate_base_implied_equalities calls*
>>
>&

回复:Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

2021-12-11 Thread 曾文旌(义从)



 --原始邮件 --
发件人:Tomas Vondra 
发送时间:Wed Dec 8 11:26:35 2021
收件人:曾文旌(义从) , shawn wang 
, ggys...@gmail.com , PostgreSQL 
Hackers 
抄送:wjzeng 
主题:Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

Hi,

On 12/7/21 10:44, 曾文旌(义从) wrote:
> Hi Hackers
> 
> For my previous proposal, I developed a prototype and passed
> regression testing. It works similarly to subquery's qual pushdown.
> We know that sublink expands at the beginning of each level of
> query. At this stage, The query's conditions and equivalence classes
> are not processed. But after generate_base_implied_equalities the
> conditions are processed,  which is why qual can push down to 
> subquery but sublink not.
> 
> My POC implementation chose to delay the sublink expansion in the
> SELECT clause (targetList) and where clause. Specifically, it is
> delayed after generate_base_implied_equalities. Thus, the equivalent
> conditions already established in the Up level query can be easily
> obtained in the sublink expansion process (make_subplan).
> 
> For example, if the up level query has a.id = 10 and the sublink
> query has a.id = b.id, then we get b.id = 10 and push it down to the
> sublink quey. If b is a partitioned table and is partitioned by id,
> then a large number of unrelated subpartitions are pruned out, This
> optimizes a significant amount of Planner and SQL execution time, 
> especially if the partitioned table has a large number of
> subpartitions and is what I want.
> 
> Currently, There were two SQL failures in the regression test,
> because the expansion order of sublink was changed, which did not
> affect the execution result of SQL.
> 
> Look forward to your suggestions on this proposal.
> 

I took a quick look, and while I don't see / can't think of any problems
with delaying it until after generating implied equalities, there seems
to be a number of gaps.

Thank you for your attention.

1) Are there any regression tests exercising this modified behavior?
Maybe there are, but if the only changes are due to change in order of
targetlist entries, that doesn't seem like a clear proof.

It'd be good to add a couple tests exercising both the positive and
negative case (i.e. when we can and can't pushdown a qual).

I added several samples to the regress(qual_pushdown_to_sublink.sql). 
and I used the partition table to show the plan status of qual being pushed 
down into sublink.
Hopefully this will help you understand the details of this patch. Later, I 
will add more cases.
2) apparently, contrib/postgres_fdw does crash like this:

  #3  0x0077b412 in adjust_appendrel_attrs_mutator
  (node=0x13f7ea0, context=0x7fffc3351b30) at appendinfo.c:470
  470  Assert(!IsA(node, SubLink));
  (gdb) p node
  $1 = (Node *) 0x13f7ea0
  (gdb) p *node
  $2 = {type = T_SubLink}

  Backtrace attached.

For the patch attached in the last email, I passed all the tests under 
src/test/regress.
As you pointed out, there was a problem with regression under contrib(in 
contrib/postgres_fdw). 
This time I fixed it and the current patch (V2) can pass the check-world.


3) various parts of the patch really need at least some comments, like:

  - try_push_outer_qual_to_sublink_query really needs some docs

  - new stuff at the end of initsplan.c

Ok, I added some comments and will add more. If you have questions about any 
details,
please point them out directly.

4) generate_base_implied_equalities

   shouldn't this

if (ec->ec_processed)
;

   really be?

if (ec->ec_processed)
continue;

You are right. I fixed it.

5) I'm not sure why we need the new ec_processed flag.

I did this to eliminate duplicate equalities from the two 
generate_base_implied_equalities calls
1) I need the base equivalent expression generated after 
generate_base_implied_equalities,
which is used to pushdown qual to sublink(lazy_process_sublinks)
2) The expansion of sublink may result in an equivalent expression with 
parameters, such as a = $1,
which needs to deal with the equivalence classes again.
3) So, I added ec_processed and asked to process it again 
(generate_base_implied_equalities)
after the equivalence class changed (add_eq_member/process_equivalence).

Maybe you have a better suggestion, please let me know.

6) So we now have lazy_process_sublink callback? Does that mean we
expand sublinks in two places - sometimes lazily, sometimes not?

Yes, not all sublink is delayed. Let me explain this:
1) I added a GUC switch enable_lazy_process_sublink. If it is turned off, all 
lazy process sublink will not happen,
qual pushdown to sublink depend on lazy procee sublink, which means no quals 
will be pushed down.
2) Even  if enable_lazy_process_sublink = true If Query in this level contains 
some complex features,
sublink in this level quer

回复:Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

2021-12-12 Thread 曾文旌(义从)



 --原始邮件 --
发件人:Zhihong Yu 
发送时间:Sun Dec 12 01:13:11 2021
收件人:曾文旌(义从) 
抄送:Tomas Vondra , wjzeng , 
PostgreSQL Hackers , shawn wang 
, ggys...@gmail.com 
主题:Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?



On Sat, Dec 11, 2021 at 7:31 AM 曾文旌(义从)  wrote:




 --原始邮件 --
发件人:Tomas Vondra 
发送时间:Wed Dec 8 11:26:35 2021
收件人:曾文旌(义从) , shawn wang 
, ggys...@gmail.com , PostgreSQL 
Hackers 
抄送:wjzeng 
主题:Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

Hi,

On 12/7/21 10:44, 曾文旌(义从) wrote:
> Hi Hackers
> 
> For my previous proposal, I developed a prototype and passed
> regression testing. It works similarly to subquery's qual pushdown.
> We know that sublink expands at the beginning of each level of
> query. At this stage, The query's conditions and equivalence classes
> are not processed. But after generate_base_implied_equalities the
> conditions are processed,  which is why qual can push down to 
> subquery but sublink not.
> 
> My POC implementation chose to delay the sublink expansion in the
> SELECT clause (targetList) and where clause. Specifically, it is
> delayed after generate_base_implied_equalities. Thus, the equivalent
> conditions already established in the Up level query can be easily
> obtained in the sublink expansion process (make_subplan).
> 
> For example, if the up level query has a.id = 10 and the sublink
> query has a.id = b.id, then we get b.id = 10 and push it down to the
> sublink quey. If b is a partitioned table and is partitioned by id,
> then a large number of unrelated subpartitions are pruned out, This
> optimizes a significant amount of Planner and SQL execution time, 
> especially if the partitioned table has a large number of
> subpartitions and is what I want.
> 
> Currently, There were two SQL failures in the regression test,
> because the expansion order of sublink was changed, which did not
> affect the execution result of SQL.
> 
> Look forward to your suggestions on this proposal.
> 

I took a quick look, and while I don't see / can't think of any problems
with delaying it until after generating implied equalities, there seems
to be a number of gaps.

Thank you for your attention.

1) Are there any regression tests exercising this modified behavior?
Maybe there are, but if the only changes are due to change in order of
targetlist entries, that doesn't seem like a clear proof.

It'd be good to add a couple tests exercising both the positive and
negative case (i.e. when we can and can't pushdown a qual).

I added several samples to the regress(qual_pushdown_to_sublink.sql). 
and I used the partition table to show the plan status of qual being pushed 
down into sublink.
Hopefully this will help you understand the details of this patch. Later, I 
will add more cases.
2) apparently, contrib/postgres_fdw does crash like this:

  #3  0x0077b412 in adjust_appendrel_attrs_mutator
  (node=0x13f7ea0, context=0x7fffc3351b30) at appendinfo.c:470
  470  Assert(!IsA(node, SubLink));
  (gdb) p node
  $1 = (Node *) 0x13f7ea0
  (gdb) p *node
  $2 = {type = T_SubLink}

  Backtrace attached.

For the patch attached in the last email, I passed all the tests under 
src/test/regress.
As you pointed out, there was a problem with regression under contrib(in 
contrib/postgres_fdw). 
This time I fixed it and the current patch (V2) can pass the check-world.


3) various parts of the patch really need at least some comments, like:

  - try_push_outer_qual_to_sublink_query really needs some docs

  - new stuff at the end of initsplan.c

Ok, I added some comments and will add more. If you have questions about any 
details,
please point them out directly.

4) generate_base_implied_equalities

   shouldn't this

if (ec->ec_processed)
;

   really be?

if (ec->ec_processed)
continue;

You are right. I fixed it.

5) I'm not sure why we need the new ec_processed flag.

I did this to eliminate duplicate equalities from the two 
generate_base_implied_equalities calls
1) I need the base equivalent expression generated after 
generate_base_implied_equalities,
which is used to pushdown qual to sublink(lazy_process_sublinks)
2) The expansion of sublink may result in an equivalent expression with 
parameters, such as a = $1,
which needs to deal with the equivalence classes again.
3) So, I added ec_processed and asked to process it again 
(generate_base_implied_equalities)
after the equivalence class changed (add_eq_member/process_equivalence).

Maybe you have a better suggestion, please let me know.

6) So we now have lazy_process_sublink callback? Does that mean we
expand sublinks in two places - sometimes lazily, sometimes not?

Yes, not all sublink is delayed. Let me explain this:
1) I added a GUC switch enable_lazy_p

Re: 回复:Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

2021-12-24 Thread Zhihong Yu
On Thu, Dec 23, 2021 at 3:52 AM 曾文旌(义从)  wrote:

>
> Fixed a bug found during testing.
>
>
> Wenjing
>
>
>>> Hi,
+   if (condition_is_safe_pushdown_to_sublink(rinfo,
expr_info->outer))
+   {
+   /* replace qual expr from outer var = const to var = const
and push down to sublink query */
+   sublink_query_push_qual(subquery, (Node
*)copyObject(rinfo->clause), expr_info->outer, expr_info->inner);

Since sublink_query_push_qual() is always guarded
by condition_is_safe_pushdown_to_sublink(), it seems
sublink_query_push_qual() can be folded into
condition_is_safe_pushdown_to_sublink().

For generate_base_implied_equalities():

+   if (ec->ec_processed)
+   {
+   ec_index++;
+   continue;
+   }
+   else if (list_length(ec->ec_members) > 1)

Minor comment: the keyword `else` can be omitted (due to `continue` above).

+* Since there may be an unexpanded sublink in the targetList,
+* we'll skip it for now.

Since there may be an -> If there is an

+   {"lazy_process_sublink", PGC_USERSET, QUERY_TUNING_METHOD,
+   gettext_noop("enable lazy process sublink."),

Looking at existing examples from src/backend/utils/misc/guc.c,
enable_lazy_sublink_processing seems to be consistent with existing guc
variable naming.

+lazy_process_sublinks(PlannerInfo *root, bool single_result_rte)

lazy_process_sublinks -> lazily_process_sublinks

+   else
+   {
/* There shouldn't be any OJ info to translate, as yet */
Assert(subroot->join_info_list == NIL);

Indentation for the else block is off.

+   if (istop)
+   f->quals = preprocess_expression_ext(root, f->quals,
EXPRKIND_QUAL, false);
+   else
+   f->quals = preprocess_expression_ext(root, f->quals,
EXPRKIND_QUAL, true);

The above can be written as:

+   f->quals = preprocess_expression_ext(root, f->quals,
EXPRKIND_QUAL, !istop);

For find_equal_conditions_contain_uplevelvar_in_sublink_query():
+   context.has_unexpected_expr == false &&
`!context.has_unexpected_expr` should suffice

equal_expr_safety_check -> is_equal_expr_safe

Cheers


Re: 回复:Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

2022-01-05 Thread wenjing
I corrected it according to your suggestion.

thanks

Wenjing.


Zhihong Yu  于2021年12月25日周六 02:26写道:

>
>
> On Thu, Dec 23, 2021 at 3:52 AM 曾文旌(义从) 
> wrote:
>
>>
>> Fixed a bug found during testing.
>>
>>
>> Wenjing
>>
>>
 Hi,
> +   if (condition_is_safe_pushdown_to_sublink(rinfo,
> expr_info->outer))
> +   {
> +   /* replace qual expr from outer var = const to var = const
> and push down to sublink query */
> +   sublink_query_push_qual(subquery, (Node
> *)copyObject(rinfo->clause), expr_info->outer, expr_info->inner);
>
> Since sublink_query_push_qual() is always guarded
> by condition_is_safe_pushdown_to_sublink(), it seems
> sublink_query_push_qual() can be folded into
> condition_is_safe_pushdown_to_sublink().
>
> For generate_base_implied_equalities():
>
> +   if (ec->ec_processed)
> +   {
> +   ec_index++;
> +   continue;
> +   }
> +   else if (list_length(ec->ec_members) > 1)
>
> Minor comment: the keyword `else` can be omitted (due to `continue` above).
>
> +* Since there may be an unexpanded sublink in the targetList,
> +* we'll skip it for now.
>
> Since there may be an -> If there is an
>
> +   {"lazy_process_sublink", PGC_USERSET, QUERY_TUNING_METHOD,
> +   gettext_noop("enable lazy process sublink."),
>
> Looking at existing examples from src/backend/utils/misc/guc.c,
> enable_lazy_sublink_processing seems to be consistent with existing guc
> variable naming.
>
> +lazy_process_sublinks(PlannerInfo *root, bool single_result_rte)
>
> lazy_process_sublinks -> lazily_process_sublinks
>
> +   else
> +   {
> /* There shouldn't be any OJ info to translate, as yet */
> Assert(subroot->join_info_list == NIL);
>
> Indentation for the else block is off.
>
> +   if (istop)
> +   f->quals = preprocess_expression_ext(root, f->quals,
> EXPRKIND_QUAL, false);
> +   else
> +   f->quals = preprocess_expression_ext(root, f->quals,
> EXPRKIND_QUAL, true);
>
> The above can be written as:
>
> +   f->quals = preprocess_expression_ext(root, f->quals,
> EXPRKIND_QUAL, !istop);
>
> For find_equal_conditions_contain_uplevelvar_in_sublink_query():
> +   context.has_unexpected_expr == false &&
> `!context.has_unexpected_expr` should suffice
>
> equal_expr_safety_check -> is_equal_expr_safe
>
> Cheers
>
>


0001-poc-pushdown-qual-to-sublink-v5.patch
Description: Binary data


回复:回复:Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

2021-12-23 Thread 曾文旌(义从)

Fixed a bug found during testing.


Wenjing



 --原始邮件 --
发件人:曾文旌(义从) 
发送时间:Sun Dec 12 20:51:08 2021
收件人:Zhihong Yu 
抄送:Tomas Vondra , wjzeng , 
PostgreSQL Hackers , shawn wang 
, ggys...@gmail.com 
主题:回复:Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?




 --原始邮件 --
发件人:Zhihong Yu 
发送时间:Sun Dec 12 01:13:11 2021
收件人:曾文旌(义从) 
抄送:Tomas Vondra , wjzeng , 
PostgreSQL Hackers , shawn wang 
, ggys...@gmail.com 
主题:Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?



On Sat, Dec 11, 2021 at 7:31 AM 曾文旌(义从)  wrote:




 --原始邮件 --
发件人:Tomas Vondra 
发送时间:Wed Dec 8 11:26:35 2021
收件人:曾文旌(义从) , shawn wang 
, ggys...@gmail.com , PostgreSQL 
Hackers 
抄送:wjzeng 
主题:Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?

Hi,

On 12/7/21 10:44, 曾文旌(义从) wrote:
> Hi Hackers
> 
> For my previous proposal, I developed a prototype and passed
> regression testing. It works similarly to subquery's qual pushdown.
> We know that sublink expands at the beginning of each level of
> query. At this stage, The query's conditions and equivalence classes
> are not processed. But after generate_base_implied_equalities the
> conditions are processed,  which is why qual can push down to 
> subquery but sublink not.
> 
> My POC implementation chose to delay the sublink expansion in the
> SELECT clause (targetList) and where clause. Specifically, it is
> delayed after generate_base_implied_equalities. Thus, the equivalent
> conditions already established in the Up level query can be easily
> obtained in the sublink expansion process (make_subplan).
> 
> For example, if the up level query has a.id = 10 and the sublink
> query has a.id = b.id, then we get b.id = 10 and push it down to the
> sublink quey. If b is a partitioned table and is partitioned by id,
> then a large number of unrelated subpartitions are pruned out, This
> optimizes a significant amount of Planner and SQL execution time, 
> especially if the partitioned table has a large number of
> subpartitions and is what I want.
> 
> Currently, There were two SQL failures in the regression test,
> because the expansion order of sublink was changed, which did not
> affect the execution result of SQL.
> 
> Look forward to your suggestions on this proposal.
> 

I took a quick look, and while I don't see / can't think of any problems
with delaying it until after generating implied equalities, there seems
to be a number of gaps.

Thank you for your attention.

1) Are there any regression tests exercising this modified behavior?
Maybe there are, but if the only changes are due to change in order of
targetlist entries, that doesn't seem like a clear proof.

It'd be good to add a couple tests exercising both the positive and
negative case (i.e. when we can and can't pushdown a qual).

I added several samples to the regress(qual_pushdown_to_sublink.sql). 
and I used the partition table to show the plan status of qual being pushed 
down into sublink.
Hopefully this will help you understand the details of this patch. Later, I 
will add more cases.
2) apparently, contrib/postgres_fdw does crash like this:

  #3  0x0077b412 in adjust_appendrel_attrs_mutator
  (node=0x13f7ea0, context=0x7fffc3351b30) at appendinfo.c:470
  470  Assert(!IsA(node, SubLink));
  (gdb) p node
  $1 = (Node *) 0x13f7ea0
  (gdb) p *node
  $2 = {type = T_SubLink}

  Backtrace attached.

For the patch attached in the last email, I passed all the tests under 
src/test/regress.
As you pointed out, there was a problem with regression under contrib(in 
contrib/postgres_fdw). 
This time I fixed it and the current patch (V2) can pass the check-world.


3) various parts of the patch really need at least some comments, like:

  - try_push_outer_qual_to_sublink_query really needs some docs

  - new stuff at the end of initsplan.c

Ok, I added some comments and will add more. If you have questions about any 
details,
please point them out directly.

4) generate_base_implied_equalities

   shouldn't this

if (ec->ec_processed)
;

   really be?

if (ec->ec_processed)
continue;

You are right. I fixed it.

5) I'm not sure why we need the new ec_processed flag.

I did this to eliminate duplicate equalities from the two 
generate_base_implied_equalities calls
1) I need the base equivalent expression generated after 
generate_base_implied_equalities,
which is used to pushdown qual to sublink(lazy_process_sublinks)
2) The expansion of sublink may result in an equivalent expression with 
parameters, such as a = $1,
which needs to deal with the equivalence classes again.
3) So, I added ec_processed and asked to process it again 
(generate_base_implied_equalities)
after the equivalence class changed (add_eq_memb