Re: [HACKERS] Early locking option to parallel backup

2018-03-02 Thread Andres Freund
On 2018-03-02 11:42:06 -0500, Robert Haas wrote:
> On Fri, Mar 2, 2018 at 2:29 AM, Andres Freund  wrote:
> > There seems to be consensus in this thread that the approach Lucas
> > proposed isn't what we want, and that instead some shared lock based
> > approach is desirable.  As that has been the case for ~1.5 months, I
> > propose we mark this as returned with feedback?
> 
> Yes, that seems pretty clear-cut to me.  It would be totally unfair if
> the author of a patch that hasn't been updated since November were
> allowed to submit a new version after the start of the final
> CommitFest.  We shouldn't be working on anything now that hasn't been
> under active development recently; we have enough things (and then
> some) that have.

Done.



Re: [HACKERS] Early locking option to parallel backup

2018-03-02 Thread Robert Haas
On Fri, Mar 2, 2018 at 2:29 AM, Andres Freund  wrote:
> There seems to be consensus in this thread that the approach Lucas
> proposed isn't what we want, and that instead some shared lock based
> approach is desirable.  As that has been the case for ~1.5 months, I
> propose we mark this as returned with feedback?

Yes, that seems pretty clear-cut to me.  It would be totally unfair if
the author of a patch that hasn't been updated since November were
allowed to submit a new version after the start of the final
CommitFest.  We shouldn't be working on anything now that hasn't been
under active development recently; we have enough things (and then
some) that have.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Early locking option to parallel backup

2018-03-01 Thread Andres Freund
On 2018-01-15 12:12:26 -0500, Robert Haas wrote:
> On Thu, Jan 11, 2018 at 9:02 AM, Stephen Frost  wrote:
> > Parallel pg_dump is based on synchronized transactions though, and we
> > have a bunch of checks in ImportSnapshot() because a pg_dump parallel
> > worker also can't really be quite the same as a normal backend.  Perhaps
> > we could add more restrictions in ImportSnapshot() to match the
> > restrictions for parallel-mode workers?  If we think there are other
> > users of SET TRANSACTION SNAPSHOT then we might need to extend that
> > command for this case, but that seems relatively straightforward.  I
> > don't know how reasonable it is, though, to take a normally-started
> > backend and, when a SET TRANSACTION SNAPSHOT PARALLEL (or whatever)
> > happens, make it close enough to a parallel worker that it can skip
> > the lock fairness.
> 
> It seems pretty tricky to me.  I actually don't think this use case
> has all that much in common with the parallel query case.  Both can be
> addressed by tweaking the lock manager, but I think this needs a
> different set of tweaks than that did.

There seems to be consensus in this thread that the approach Lucas
proposed isn't what we want, and that instead some shared lock based
approach is desirable.  As that has been the case for ~1.5 months, I
propose we mark this as returned with feedback?

Greetings,

Andres Freund



Re: [HACKERS] Early locking option to parallel backup

2018-01-15 Thread Robert Haas
On Thu, Jan 11, 2018 at 9:02 AM, Stephen Frost  wrote:
> Parallel pg_dump is based on synchronized transactions though, and we
> have a bunch of checks in ImportSnapshot() because a pg_dump parallel
> worker also can't really be quite the same as a normal backend.  Perhaps
> we could add more restrictions in ImportSnapshot() to match the
> restrictions for parallel-mode workers?  If we think there are other
> users of SET TRANSACTION SNAPSHOT then we might need to extend that
> command for this case, but that seems relatively straightforward.  I
> don't know how reasonable it is, though, to take a normally-started
> backend and, when a SET TRANSACTION SNAPSHOT PARALLEL (or whatever)
> happens, make it close enough to a parallel worker that it can skip
> the lock fairness.

It seems pretty tricky to me.  I actually don't think this use case
has all that much in common with the parallel query case.  Both can be
addressed by tweaking the lock manager, but I think this needs a
different set of tweaks than that did.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Early locking option to parallel backup

2018-01-11 Thread Stephen Frost
Lucas, Robert, all,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Nov 6, 2017 at 4:43 AM, Tom Lane  wrote:
> > I wonder if we couldn't somehow repurpose the work that was done for
> > parallel workers' locks.  Lots of security-type issues to be handled
> > if we're to open that up to clients, but maybe it's solvable.  For
> > instance, maybe only allowing it to clients sharing the same snapshot
> > would help.
> 
> Interesting idea.  There's a bunch of holes that would need to be
> patched there; for instance, you can't have one session running DDL
> while somebody else has AccessShareLock.  Parallel query relies on the
> parallel-mode restrictions to prevent that kind of thing from
> happening, but it would be strange (and likely somewhat broken) to try
> to enforce those here.  It would be strange and probably bad if LOCK
> TABLE a; LOCK TABLE b in one session and LOCK TABLE b; LOCK TABLE a in
> another session failed to deadlock.  In short, there's a big
> difference between a single session using multiple processes and
> multiple closely coordinated sessions.
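
To make that concrete, the ordering Robert describes is the textbook
deadlock; a minimal SQL sketch using the same hypothetical tables a and
b (LOCK TABLE defaults to ACCESS EXCLUSIVE, so the requests conflict):

    -- session 1:
    BEGIN;
    LOCK TABLE a;
    -- session 2:
    BEGIN;
    LOCK TABLE b;
    LOCK TABLE a;   -- blocks, waiting for session 1
    -- session 1:
    LOCK TABLE b;   -- deadlock detected; one transaction is aborted

If the two sessions were treated as a single lock group, neither LOCK
TABLE would wait and the deadlock would go unnoticed, which is the
"strange and probably bad" outcome Robert describes.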

Parallel pg_dump is based on synchronized transactions though, and we
have a bunch of checks in ImportSnapshot() because a pg_dump parallel
worker also can't really be quite the same as a normal backend.  Perhaps
we could add more restrictions in ImportSnapshot() to match the
restrictions for parallel-mode workers?  If we think there are other
users of SET TRANSACTION SNAPSHOT then we might need to extend that
command for this case, but that seems relatively straightforward.  I
don't know how reasonable it is, though, to take a normally-started
backend and, when a SET TRANSACTION SNAPSHOT PARALLEL (or whatever)
happens, make it close enough to a parallel worker that it can skip
the lock fairness.
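
For reference, the synchronized-transaction machinery mentioned above is
the snapshot export/import that parallel pg_dump already relies on; a
minimal sketch (the snapshot identifier is illustrative, and the
PARALLEL variant is the hypothetical extension discussed here, not an
existing command):

    -- leader (pg_dump's main connection):
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT pg_export_snapshot();            -- returns e.g. '000003A1-1'

    -- each worker connection:
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '000003A1-1';
    -- hypothetical: SET TRANSACTION SNAPSHOT '000003A1-1' PARALLEL,
    -- which would additionally tie the worker to the leader's lock group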

> Also, even if you did it, you still need a lot of PROCLOCKs.  Workers
> don't need to take all locks up front because they can be assured of
> getting them later, but they've still got to lock the objects they
> actually want to access.  Group locking aims to prevent deadlocks
> between cooperating processes; it is not a license to skip locking
> altogether.

Workers locking the objects they're actually working on wouldn't be any
different from what's happening today in pg_dump though, so I'm not sure
why this would be an issue?  The proposed patch, by contrast, locks
everything in every worker, which is quite different from the main
process locking everything and then the individual workers locking only
the objects they're working on.
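
To put rough, purely illustrative numbers on that difference: with
10,000 relations and 8 workers, the current scheme takes about 10,000
locks in the main process plus roughly another 10,000 as the workers
lock the objects they dump (the "#relations * 2" Andres mentions below),
whereas having every worker lock everything would need on the order of
(#workers + 1) * #relations = 9 * 10,000 = 90,000 lock table entries.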

* Lucas B (luca...@gmail.com) wrote:
> Em 05/11/2017 21:09, Andres Freund escreveu:
> >Well, the current approach afaics requires #relations * 2 locks, whereas
> >acquiring them in every worker would scale that with the number of
> >workers.
> 
> Yes, that is why I proposed it as an option. As an option, it will not
> affect anyone who does not want to use it.

That's not actually correct: additional options add maintenance overhead
for hackers and more complexity for users to deal with, and there's at
least a clear idea of how we could make this "just work" without adding
another option.  Based on that and the discussion thus far, it seems the
next step is to work through a way to let workers skip the lock
fairness, and that the current patch isn't going to be accepted.

> It seems natural to think of several connections in a synchronized snapshot
> as the same connection. Then it may be reasonable to grant a shared lock out
> of turn if any connection of the same shared snapshot already has a granted
> lock on the same relation. Last year Tom mentioned that there is already
> queue-jumping logic of that sort in the lock manager for other purposes.
> Although it seems conceptually simple, I suspect the implementation is not.
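
For what it's worth, the state Lucas describes (one connection of the
shared snapshot already holding a granted lock on a relation while
another queues behind a conflicting waiter) can be observed in pg_locks;
a sketch, with a hypothetical table name:

    SELECT pid, mode, granted
      FROM pg_locks
     WHERE locktype = 'relation'
       AND relation = 'some_table'::regclass;

The queue-jumping idea would amount to granting the shared lock
immediately whenever such a granted row belongs to a backend sharing the
same exported snapshot.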

The implementation is almost certainly not simple, but that doesn't mean
we should go in another direction and require far more locks, or add an
option to do so.

I'm going to move this patch into 'Waiting for Author' since it's
clearly gotten a good bit of review and discussion already.  If there's
no interest in further exploring the approach of changing the locking
logic then we should probably mark the patch as Returned with Feedback
(RWF).

If you aren't going to pursue it, though, this seems like a good project
for the TODO list: further explore how to let parallel pg_dump processes
jump the lock queue, with a reference back to this thread.

Thanks!

Stephen

