Re: [HACKERS] Parallel Hash take II

Thomas Munro Fri, 01 Sep 2017 15:34:22 -0700

On Sat, Sep 2, 2017 at 5:13 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Aug 31, 2017 at 8:53 AM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
>> Check out ExecReScanGather(): it shuts down and waits for all workers
>> to complete, which makes the assumptions in ExecReScanHashJoin() true.
>> If a node below Gather but above Hash Join could initiate a rescan
>> then the assumptions would not hold.  I am not sure what it would mean
>> though and we don't generate any such plans today to my knowledge.  It
>> doesn't seem to make sense for the inner side of Nested Loop to be
>> partial.  Have I missed something here?
>
> I bet this could happen, although recent commits have demonstrated
> that my knowledge of how PostgreSQL handles rescans is less than
> compendious.  Suppose there's a Nested Loop below the Gather and above
> the Hash Join, implementing a join condition that can't give rise to a
> parameterized path, like a.x + b.x = 0.


Hmm.  I still don't see how that could produce a rescan of a partial
path without an intervening Gather, and I would really like to get to
the bottom of this.

At the risk of mansplaining the code that you wrote and turning out to
be wrong:  A Nested Loop can't ever have a partial path on the inner
side.  Under certain circumstances it can have a partial path on the
outer side, because its own results are partial, but for each outer
row it needs to do a total (non-partial) scan of the inner side so
that it can reliably find or not find matches.  Therefore we'll never
rescan partial paths directly; we'll only ever rescan partial paths
indirectly via a Gatheroid node that will synchronise the rescan of
all children to produce a non-partial result.
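
To make the argument above concrete, here is a toy model (not PostgreSQL
code) of a Nested Loop running in two workers with the a.x + b.x = 0 join
condition from the quoted example.  The helpers partial_scan, nested_loop
and gather are hypothetical stand-ins invented for this sketch, and a
"partial" scan is assumed to simply hand each worker a disjoint slice of
the rows.

```python
# Toy model only: relations are lists of ints, and a "partial" scan gives
# each worker a disjoint slice of the rows.  None of these names exist in
# PostgreSQL; they are assumptions made for illustration.

def partial_scan(rows, worker, nworkers):
    """Return the slice of rows that this worker would see."""
    return rows[worker::nworkers]

def nested_loop(outer, inner):
    """Join on a.x + b.x = 0, scanning all of `inner` for each outer row."""
    return [(a, b) for a in outer for b in inner if a + b == 0]

def gather(per_worker_results):
    """Concatenate the workers' outputs, roughly as a Gather node would."""
    return sorted(t for result in per_worker_results for t in result)

a = [1, 2, 3, 4]
b = [-4, -3, -2, -1]
nworkers = 2

# Reference answer: a plain, non-parallel join.
expected = sorted(nested_loop(a, b))

# Partial outer, complete inner: every worker scans all of b.
ok = gather(
    nested_loop(partial_scan(a, w, nworkers), b)
    for w in range(nworkers)
)

# Partial outer AND partial inner: each worker sees only a slice of b,
# so an outer row can miss a match that landed in another worker's slice.
broken = gather(
    nested_loop(partial_scan(a, w, nworkers), partial_scan(b, w, nworkers))
    for w in range(nworkers)
)
```

With this data, ok equals expected, while broken is empty: each worker's
slice of b happens to contain none of the rows its slice of a needs.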

There may be more reasons to rescan that I'm not thinking of.  But the
whole idea of a rescan seems to make sense only for non-partial paths.
What would it even mean for a worker process to decide to rescan (say)
a Seq Scan without any kind of consensus?

Thought experiment: I suppose we could consider replacing Gather's
clunky shut-down-and-relaunch-workers synchronisation technique with a
new protocol where the Gather node sends a 'rescan!' message to each
worker and then discards their tuples until it receives 'OK, rescan
starts here', and then each parallel-aware node type supplies its own
rescan synchronisation logic as appropriate.  For example, Seq Scan
would somehow need to elect one participant to run
heap_parallelscan_reinitialize while the others wait until it has
finished.  This might not be worth the effort, but thinking about this
problem helped me see that rescan of a partial plan without a Gather
node to coordinate doesn't make any sense.
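
A minimal sketch of the thought experiment above, under heavy assumptions:
threads stand in for worker processes, per-worker queue.Queue objects
stand in for tuple queues, and SharedScan, RESCAN_ACK, worker and
gather_with_rescan are hypothetical names made up for this example.
SharedScan.reinitialize only plays the role that
heap_parallelscan_reinitialize would play.  The "elect one participant"
step is modelled with threading.Barrier, whose action callback is run by
exactly one thread while the others wait.

```python
import queue
import threading

RESCAN_ACK = "OK, rescan starts here"  # hypothetical marker message

class SharedScan:
    """Stand-in for a parallel scan's shared state: a cursor over blocks."""
    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.next_block = 0
        self.reinit_count = 0
        self.lock = threading.Lock()

    def reinitialize(self):
        # Runs in one elected thread while all others wait at the barrier.
        self.next_block = 0
        self.reinit_count += 1

    def claim_block(self):
        with self.lock:
            if self.next_block >= self.nblocks:
                return None
            block = self.next_block
            self.next_block += 1
            return block

def worker(scan, rescan_requested, barrier, out):
    # First pass: emit blocks until exhausted or a rescan is requested.
    while not rescan_requested.is_set():
        block = scan.claim_block()
        if block is None:
            break
        out.put(block)
    rescan_requested.wait()   # this demo always requests a rescan
    barrier.wait()            # one thread runs scan.reinitialize here
    out.put(RESCAN_ACK)       # everything after this belongs to the rescan
    for block in iter(scan.claim_block, None):
        out.put(block)
    out.put(None)             # end-of-stream sentinel

def gather_with_rescan(nblocks=8, nworkers=3):
    scan = SharedScan(nblocks)
    rescan_requested = threading.Event()
    barrier = threading.Barrier(nworkers, action=scan.reinitialize)
    queues = [queue.Queue() for _ in range(nworkers)]
    threads = [
        threading.Thread(target=worker,
                         args=(scan, rescan_requested, barrier, q))
        for q in queues
    ]
    for t in threads:
        t.start()
    rescan_requested.set()    # the 'rescan!' message to every worker
    tuples = []
    for q in queues:
        for _stale in iter(q.get, RESCAN_ACK):
            pass              # discard tuples produced before the rescan
        for block in iter(q.get, None):
            tuples.append(block)
    for t in threads:
        t.join()
    return sorted(tuples), scan.reinit_count
```

However the threads are scheduled, the gathered result contains each
block exactly once and the scan is reinitialised exactly once, because no
worker can claim a block while the elected one is resetting the cursor.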

Am I wrong?

-- 
Thomas Munro
http://www.enterprisedb.com


