On Wed, Oct 4, 2017 at 3:40 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Oct 3, 2017 at 7:33 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> Having said all that, I think that this patch only wants to handle the
> subset of cases (2) and (4) where the relevant InitPlan is attached
> ABOVE the Gather node -- which seems very reasonable, because
> evaluating an InitPlan at a level of the plan tree above the level at
> which it is defined sounds like it might be complex.  But I still
> don't quite see why we need these tests.  I mean, if we only allow
> Param references from a set of safe parameter IDs, and the safe
> parameter IDs include only those IDs that can be generated in a
> worker, then won't other InitPlans in the workers anyway be ruled out?

It doesn't happen always.  There are cases when the part of required
conditions are pushed one query level below, so when we check during
max_parallel_hazard_walker, they look safe, but actually, they are
not.  I will try to explain by example:

postgres=# explain (costs off, verbose) select * from t1 where t1.i in
( select 1 + (select max(j) from t3));
                              QUERY PLAN
----------------------------------------------------------------------
 Hash Semi Join
   Output: t1.i, t1.j, t1.k
   Hash Cond: (t1.i = ((1 + $1)))
   ->  Seq Scan on public.t1
         Output: t1.i, t1.j, t1.k
   ->  Hash
         Output: ((1 + $1))
         ->  Result
               Output: (1 + $1)
               InitPlan 1 (returns $1)
                 ->  Finalize Aggregate
                       Output: max(t3.j)
                       ->  Gather
                             Output: (PARTIAL max(t3.j))
                             Workers Planned: 2
                             ->  Partial Aggregate
                                   Output: PARTIAL max(t3.j)
                                   ->  Parallel Seq Scan on public.t3
                                         Output: t3.j
(19 rows)

In the above example, you can see that the condition referring to
initplan (1 + $1) is pushed one level below.  So when it tries to
check parallel safety for the join condition, it won't see Param node.
Now, consider if we don't check contains_parallel_unsafe_param during
generate_gather_paths, then it will lead to below plan.


postgres=# explain (costs off, verbose) select * from t1 where t1.i in
( select 1 + (select max(j) from t3));
                                 QUERY PLAN
----------------------------------------------------------------------------
 Gather
   Output: t1.i, t1.j, t1.k
   Workers Planned: 2
   ->  Hash Semi Join
         Output: t1.i, t1.j, t1.k
         Hash Cond: (t1.i = ((1 + $1)))
         ->  Parallel Seq Scan on public.t1
               Output: t1.i, t1.j, t1.k
         ->  Hash
               Output: ((1 + $1))
               ->  Result
                     Output: (1 + $1)
                     InitPlan 1 (returns $1)
                       ->  Finalize Aggregate
                             Output: max(t3.j)
                             ->  Gather
                                   Output: (PARTIAL max(t3.j))
                                   Workers Planned: 2
                                   ->  Partial Aggregate
                                         Output: PARTIAL max(t3.j)
                                         ->  Parallel Seq Scan on public.t3
                                               Output: t3.j
(22 rows)

This is wrong because when we will try to evaluate params that are
required at gather node, we won't get the required param as there is
no initplan at that level.

>
> If I am all mixed up, please help straighten me out.
>

I think whatever you said is right and very clear.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to