On 2 November 2014 10:39, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 15 October 2014 05:03, Robert Haas <robertmh...@gmail.com> wrote:
>
>> At least to me, that simple scenario is clear-cut[1], but what do we
>> do in more complicated situations?  For example, suppose backends A
>> and B are members of the same locking group.  A locks a relation with
>> AccessShareLock, an unrelated process X queues up waiting for an
>> AccessExclusiveLock, and then B also requests AccessShareLock.  The
>> normal behavior here is that B should wait for X to go first, but here
>> that's a problem.  If A is the user backend and B is a worker backend,
>> A will eventually wait for B, which is waiting for X, which is waiting
>> for A: deadlock.
>
> Yes, deadlock.
>
> My understanding would be that the lead process would wait on a latch,
> not a heavyweight lock. So it would never perform a deadlock
> detection. Which leaves only X and B to perform the deadlock check.
>
> Are you aware that the deadlock detector will reorder the lock queue,
> if that presents a possible solution to the deadlock?
>
> Would the above example not be resolved simply with the existing code?

Answering my own question: only if the lock request is not self-conflicting.
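
The cycle in Robert's example can be sketched as a toy wait-for graph
(a simplified illustration, not PostgreSQL code; the real deadlock
detector in deadlock.c is considerably more involved):

```python
# Toy sketch: model the wait-for edges from the scenario above and
# look for a cycle. A is the leader, B its worker, X the unrelated
# backend queued for AccessExclusiveLock.

def find_cycle(waits_for):
    """Return a list of nodes forming a cycle, or None (simple walk;
    assumes each backend waits for at most one other)."""
    for start in waits_for:
        path, node = [], start
        while node is not None:
            if node in path:
                return path[path.index(node):]
            path.append(node)
            node = waits_for.get(node)
    return None

# A waits for its worker B, B is queued behind X, and X waits for the
# AccessExclusiveLock that A's held AccessShareLock blocks.
edges = {"A": "B", "B": "X", "X": "A"}
print(find_cycle(edges))  # -> ['A', 'B', 'X']
```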

Reading the patch (yay!) you say

*   ...However, if a conflicting lock is present that is
* not held by another member of our lock group, then we can't do this.
* In that case we'll have to wait despite the deadlock risk and hope for
* the best.

I wasn't sure what that meant.

Actually, I like the rest of the patch, assuming I understand it. Not
sure that changes anything I've said about wanting to see something
actually working, even as a prototype. For example, how will you test
that the code in this patch actually works? The best way is to make
parallel query work in a restricted form and then drop that into the
isolation tester; otherwise you'll have to waste time on a test
harness.

The procgloballist changes should be split out into a separate patch;
that part I agree with.

The main question is still why any of this is necessary? If we are
only acquiring fast path locks in workers, what need is there for any
of this? The leader and workers never conflict with each other, so why
would they ever wait for each other (except at a point where they are
actively transferring data), which is the necessary pre-condition for
a deadlock?

Yes, running a command with a conflicting lock would disrupt the start
of a parallel operation, but as I said above, the deadlock detector
will eventually see that and re-arrange us so there is no deadlock.

The only need I can see is if one of the workers or the leader
requests a strong lock on an object, someone else requests a
conflicting lock on that object, and then we perform a parallel
operation ourselves. We could easily solve that by saying that you
can't go parallel if you hold a strong lock AND that workers can't
take anything higher than RowShareLock, like HS backends. That's good
enough for now.
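
A minimal sketch of that restriction (hypothetical helper names, not
PostgreSQL APIs; the cutoff chosen for "strong" here is an assumption
for illustration):

```python
# Lock modes ordered weakest to strongest, as in lock.h.
LOCK_MODES = [
    "AccessShareLock", "RowShareLock", "RowExclusiveLock",
    "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
    "ExclusiveLock", "AccessExclusiveLock",
]

# Assumption for this sketch: treat ShareUpdateExclusiveLock and
# above as "strong"; cap workers at RowShareLock, like HS backends.
STRONG_CUTOFF = LOCK_MODES.index("ShareUpdateExclusiveLock")
WORKER_MAX = LOCK_MODES.index("RowShareLock")

def can_go_parallel(held_modes):
    """Hypothetical check: refuse to launch workers while the leader
    holds any strong lock."""
    return all(LOCK_MODES.index(m) < STRONG_CUTOFF for m in held_modes)

def worker_may_take(mode):
    """Hypothetical check: workers limited to RowShareLock or weaker."""
    return LOCK_MODES.index(mode) <= WORKER_MAX

print(can_go_parallel(["AccessShareLock"]))  # -> True
print(can_go_parallel(["ShareLock"]))        # -> False
print(worker_may_take("RowShareLock"))       # -> True
print(worker_may_take("RowExclusiveLock"))   # -> False
```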

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers