On Tue, Jan 31, 2017 at 4:16 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Wed, Dec 28, 2016 at 5:20 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
>> The drawback of the second approach is
>> that we need to evaluate the initplan before it is actually required
>> which means that we might evaluate it even when it is not required.  I
>> am not sure if it is always safe to assume that we can evaluate the
>> initplan before pushing it to workers especially for the cases when it
>> is far enough down in the plan tree which we are parallelizing,
>>
>
> I think we can always pull up un-correlated initplans at Gather node,
> however, if there is a correlated initplan, then it is better not to
> allow such initplans for being pushed below gather.  Ex. of correlated
> initplans:
>
> postgres=# explain (costs off) select * from t1 where t1.i in (select
> t2.i from t2 where t1.k = (select max(k) from t3 where t3.i=t1.i));
>                   QUERY PLAN
> ----------------------------------------------
>  Seq Scan on t1
>    Filter: (SubPlan 2)
>    SubPlan 2
>      ->  Gather
>            Workers Planned: 1
>            Params Evaluated: $1
>            InitPlan 1 (returns $1)
>              ->  Aggregate
>                    ->  Seq Scan on t3
>                          Filter: (i = t1.i)
>            ->  Result
>                  One-Time Filter: (t1.k = $1)
>                  ->  Parallel Seq Scan on t2
> (13 rows)
>
> It might be safe to allow above plan, but in general, such plans
> should not be allowed, because it might not be feasible to compute
> such initplan references at Gather node.  I am still thinking on the
> best way to deal with such initplans.
>

I can see two possibilities for determining whether the plan (for which
we are going to generate an initplan) contains a reference to a
correlated var param node.  One is to write a plan or path walker that
finds any such reference, and the second is to keep the information
about the correlated param in the path node.  I think the drawback of
the first approach is that traversing the path tree during generation
of the initplan can be costly, so for now I have kept the information in
the path node to prohibit generating parallel initplans which contain a
reference to correlated vars.  I think we can go with the first approach
of using a path walker if people feel that is better than maintaining a
reference in the path.

Attached patch prohibit_parallel_correl_params_v1.patch implements the
second approach of keeping the correlated var param reference in the
path node, and pq_pushdown_initplan_v2.patch uses that to generate
parallel initplans.

Thoughts?

These patches build on top of the parallel subplan patch [1].

[1] - https://www.postgresql.org/message-id/caa4ek1kyqjqzqmpez+qra2fmim386gqlqbef+p2wmtqjh1r...@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

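
To make the trade-off between the two approaches concrete, here is a
minimal standalone sketch.  It is not taken from the attached patches
and does not use PostgreSQL's real Path struct: ToyPath, its fields and
all toy_* functions are hypothetical names invented for illustration.
The walker rescans the subtree on every call, while the flag is computed
once when a node is built from children that are already finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy path node; NOT PostgreSQL's Path struct. */
typedef struct ToyPath
{
    struct ToyPath *left;       /* first child, or NULL */
    struct ToyPath *right;      /* second child, or NULL */
    bool        refs_correlated_param;  /* this node itself references a
                                         * param fed by an outer-query var */
    bool        subtree_has_correlated; /* cached: this node or any
                                         * descendant has such a reference */
} ToyPath;

/* Approach 1: walk the tree on demand; cost grows with subtree size. */
static bool
toy_walk_has_correlated(const ToyPath *path)
{
    if (path == NULL)
        return false;
    if (path->refs_correlated_param)
        return true;
    return toy_walk_has_correlated(path->left) ||
        toy_walk_has_correlated(path->right);
}

/*
 * Approach 2: fill in the cached flag once, at node creation time.
 * Assumes the children were already passed through this function.
 */
static void
toy_finish_path(ToyPath *path)
{
    assert(path != NULL);
    path->subtree_has_correlated =
        path->refs_correlated_param ||
        (path->left != NULL && path->left->subtree_has_correlated) ||
        (path->right != NULL && path->right->subtree_has_correlated);
}

/* With the cached flag, the check for a candidate initplan is O(1). */
static bool
toy_can_push_initplan_below_gather(const ToyPath *path)
{
    assert(path != NULL);
    return !path->subtree_has_correlated;
}
```

In this toy model the cached flag costs one bool per node plus a little
work at node creation, and the walker costs nothing until it is called
but pays for a full traversal each time.  Whether that matches the real
planner costs is an assumption of the sketch.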
prohibit_parallel_correl_params_v1.patch
Description: Binary data
pq_pushdown_initplan_v2.patch
Description: Binary data
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers