Re: [HACKERS] Early locking option to parallel backup
On 11/06/2017 12:30 PM, Stephen Frost wrote:
> * Lucas (luca...@gmail.com) wrote:
>> pg_dump was taking more than 24 hours to complete in one of my databases,
>> so I began to research alternatives. Parallel backup reduced the backup
>> time to a little less than an hour, but it failed almost every time
>> because of concurrent queries that generated exclusive locks. It is
>> difficult to guarantee that my applications will not issue queries such
>> as DROP TABLE, ALTER TABLE, TRUNCATE TABLE, CREATE INDEX or DROP INDEX
>> for an hour, and I prefer not to create control mechanisms to that end
>> if I can work around it.
>
> I certainly understand the value of pg_dump-based backups, but have you
> considered doing file-based backups? That would avoid the need to do any
> in-database locking at all, and would give you the ability to do PITR too.
> Further, you could actually restore that backup to another system and then
> do a pg_dump there to get a logical representation (and this would test
> your physical database backup/restore process too...).

Yes, point-in-time recovery has the advantage of keeping the backup more up to date, but it has the disadvantage of being more expensive and complex. In my case, point-in-time recovery would require an upgrade of 10 TB of storage space, and my stakeholders have not approved this investment yet.

I suspect that there are lots of users who use pg_dump as their primary backup tool and who would benefit from a more reliable parallel backup.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
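[Editor's note: the retry-on-failure workaround Lucas alludes to ("my backup scripts ... retry the backup") can be sketched as a small wrapper. This is an illustrative sketch only; the pg_dump invocation, database name, and backup path in the comment are hypothetical, not from the thread.]

```python
import subprocess
import sys
import time

def run_with_retries(cmd, attempts=3, delay=1.0):
    """Run cmd, retrying on a non-zero exit status.

    A parallel pg_dump that loses a lock race tends to fail within the
    first few seconds, so simply re-running it is a workable (if crude)
    recovery strategy.
    """
    for i in range(1, attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return True
        print(f"attempt {i} failed (exit {result.returncode})")
        if i < attempts:
            time.sleep(delay)
    return False

# In practice cmd would be something like (all names hypothetical):
#   ["pg_dump", "--format=directory", "--jobs=8",
#    "--file=/backup/mydb.dir", "mydb"]
# Demonstrated here with a stand-in command that succeeds:
ok = run_with_retries([sys.executable, "-c", "raise SystemExit(0)"], delay=0)
print("backup ok" if ok else "backup failed")  # backup ok
```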
Re: [HACKERS] Early locking option to parallel backup
On 05/11/2017 21:09, Andres Freund wrote:
> On 2017-11-05 17:38:39 -0500, Robert Haas wrote:
>> On Sun, Nov 5, 2017 at 5:17 AM, Lucas wrote:
>>> The patch creates a "--lock-early" option which will make pg_dump issue
>>> shared locks on all tables in the backup TOC as each parallel worker
>>> starts. That way, the backup has a very small chance of failing. When
>>> it does fail, it happens in the first few seconds of the backup job. My
>>> backup scripts (not included here) are aware of that and retry the
>>> backup in case of failure.
>>
>> I wonder why we don't do this already ... and by default.
>
> Well, the current approach afaics requires #relations * 2 locks, whereas
> acquiring them in every worker would scale that with the number of
> workers.

Yes, that is why I proposed it as an option. As an option it will not affect anyone who does not want to use it.

> IIUC the problem here is that even though a lock is already held by the
> main backend, an independent locker's request will prevent the on-demand
> lock by the dump worker from being granted. It seems to me the correct
> fix here would be to somehow avoid the fairness logic in the parallel
> dump case - although I don't quite know how to best do so.

It seems natural to think of several connections in a synchronized snapshot as the same connection. Then it may be reasonable to grant a shared lock out of turn if any connection of the same shared snapshot already has a granted lock on the same relation. Last year Tom mentioned that there is already queue-jumping logic of that sort in the lock manager for other purposes. Although it seems conceptually simple, I suspect the implementation is not. On the other hand, the lock-early option is very simple and has no impact on anyone who does not want to use it.

---
Lucas
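[Editor's note: the fairness problem, and the snapshot-sibling exception Lucas proposes, can be illustrated with a toy model of a single relation's lock queue. This is a deliberately simplified simulation, not PostgreSQL's actual lock manager; the two-mode conflict rule and the "snapshot/session" naming are assumptions made for the sketch.]

```python
# Toy model of a FIFO lock queue for one relation, reduced to two modes:
# ACCESS SHARE (compatible with itself) and ACCESS EXCLUSIVE (conflicts
# with everything).
SHARE, EXCLUSIVE = "share", "exclusive"

def grant(holders, waiters, requester, mode, snapshot_sibling_jump=False):
    """Decide whether requester's lock is granted immediately.

    Fair rule: a request must queue behind any earlier waiter whose mode
    conflicts with it, even if the currently *held* locks are compatible.
    With snapshot_sibling_jump=True, a share request is granted out of
    turn when a session sharing its snapshot (same prefix before "/")
    already holds the lock - the behaviour proposed above for parallel
    pg_dump workers.
    """
    conflicts_with_holders = any(
        mode == EXCLUSIVE or m == EXCLUSIVE for _, m in holders)
    if snapshot_sibling_jump and mode == SHARE and any(
            s.split("/")[0] == requester.split("/")[0] for s, _ in holders):
        return not conflicts_with_holders  # jump the queue
    conflicts_with_waiters = any(
        mode == EXCLUSIVE or m == EXCLUSIVE for _, m in waiters)
    return not conflicts_with_holders and not conflicts_with_waiters

# Leader "dump/leader" holds ACCESS SHARE; an ALTER TABLE is already
# queued for ACCESS EXCLUSIVE; now worker "dump/w1" asks for ACCESS SHARE.
holders = [("dump/leader", SHARE)]
waiters = [("app", EXCLUSIVE)]

print(grant(holders, waiters, "dump/w1", SHARE))        # False: queued
print(grant(holders, waiters, "dump/w1", SHARE, True))  # True: jumps
```

Under the fair rule the worker blocks behind the exclusive waiter (and the backup eventually fails); with the sibling exception it is granted immediately because a connection of the same snapshot already holds a compatible lock.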
Re: [HACKERS] Early locking option to parallel backup
Lucas,

* Lucas (luca...@gmail.com) wrote:
> pg_dump was taking more than 24 hours to complete in one of my databases,
> so I began to research alternatives. Parallel backup reduced the backup
> time to a little less than an hour, but it failed almost every time
> because of concurrent queries that generated exclusive locks. It is
> difficult to guarantee that my applications will not issue queries such
> as DROP TABLE, ALTER TABLE, TRUNCATE TABLE, CREATE INDEX or DROP INDEX
> for an hour, and I prefer not to create control mechanisms to that end
> if I can work around it.

I certainly understand the value of pg_dump-based backups, but have you considered doing file-based backups? That would avoid the need to do any in-database locking at all, and would give you the ability to do PITR too. Further, you could actually restore that backup to another system and then do a pg_dump there to get a logical representation (and this would test your physical database backup/restore process too...).

Thanks!

Stephen
Re: [HACKERS] Early locking option to parallel backup
On 2017-11-05 22:43:34 -0500, Tom Lane wrote:
>> IIUC the problem here is that even though a lock is already held by the
>> main backend an independent locker's request will prevent the on-demand
>> lock by the dump worker from being granted. It seems to me the correct
>> fix here would be to somehow avoid the fairness logic in the parallel
>> dump case - although I don't quite know how to best do so.
>
> I wonder if we couldn't somehow repurpose the work that was done for
> parallel workers' locks. Lots of security-type issues to be handled if
> we're to open that up to clients, but maybe it's solvable. For instance,
> maybe only allowing it to clients sharing the same snapshot would help.

Yea, I'd been wondering the same. I'm slightly worried that somehow tying multiple clients into parallel mode would cause a bunch of problems - that's not really the purpose of the code, and a number of its assumptions aren't quite right for that.

I'm not sure it really buys us much in contrast to just allowing a locker to specify that it's allowed to jump the lock queue for an ASL if it has 'backup' rights or such. Or actually, just allow it as a general option to LOCK; there are plenty of other operational cases where the current "fair" behaviour is really annoying, e.g. when executing operational DDL/DML and such.

Greetings,

Andres Freund
Re: [HACKERS] Early locking option to parallel backup
On Mon, Nov 6, 2017 at 4:43 AM, Tom Lane wrote:
> I wonder if we couldn't somehow repurpose the work that was done for
> parallel workers' locks. Lots of security-type issues to be handled if
> we're to open that up to clients, but maybe it's solvable. For instance,
> maybe only allowing it to clients sharing the same snapshot would help.

Interesting idea. There's a bunch of holes that would need to be patched there; for instance, you can't have one session running DDL while somebody else has AccessShareLock. Parallel query relies on the parallel-mode restrictions to prevent that kind of thing from happening, but it would be strange (and likely somewhat broken) to try to enforce those here. It would be strange and probably bad if LOCK TABLE a; LOCK TABLE b in one session and LOCK TABLE b; LOCK TABLE a in another session failed to deadlock. In short, there's a big difference between a single session using multiple processes and multiple closely coordinated sessions.

Also, even if you did it, you still need a lot of PROCLOCKs. Workers don't need to take all locks up front because they can be assured of getting them later, but they've still got to lock the objects they actually want to access. Group locking aims to prevent deadlocks between cooperating processes; it is not a license to skip locking altogether.

None of which is to say that the problems don't feel related somehow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
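[Editor's note: the LOCK TABLE a / LOCK TABLE b ordering Robert describes is the textbook lock-ordering deadlock. It shows up as a cycle in the wait-for graph, which is the shape of check the server's deadlock detector performs; a minimal sketch of that check:]

```python
def has_cycle(waits_for):
    """Detect a cycle in a wait-for graph (dict mapping a session to the
    sessions it is waiting on) with a depth-first search."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / done
    color = {}

    def dfs(node):
        color[node] = GRAY
        for nxt in waits_for.get(node, ()):
            c = color.get(nxt, WHITE)
            if c == GRAY:            # back edge: node waits on its own chain
                return True
            if c == WHITE and dfs(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and dfs(n) for n in waits_for)

# Session 1: LOCK TABLE a; LOCK TABLE b  -> holds a, waits on s2 for b
# Session 2: LOCK TABLE b; LOCK TABLE a  -> holds b, waits on s1 for a
print(has_cycle({"s1": ["s2"], "s2": ["s1"]}))  # True: deadlock
print(has_cycle({"s1": ["s2"], "s2": []}))      # False: s2 will finish
```

The point above is that coordinated sessions which skip queueing against each other would make this cycle invisible to the detector, so the two sessions would hang instead of one being cancelled.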
Re: [HACKERS] Early locking option to parallel backup
Andres Freund writes:
> Well, the current approach afaics requires #relations * 2 locks, whereas
> acquiring them in every worker would scale that with the number of
> workers.

Yeah, that's gonna be a problem with this proposal.

> IIUC the problem here is that even though a lock is already held by the
> main backend an independent locker's request will prevent the on-demand
> lock by the dump worker from being granted. It seems to me the correct
> fix here would be to somehow avoid the fairness logic in the parallel
> dump case - although I don't quite know how to best do so.

I wonder if we couldn't somehow repurpose the work that was done for parallel workers' locks. Lots of security-type issues to be handled if we're to open that up to clients, but maybe it's solvable. For instance, maybe only allowing it to clients sharing the same snapshot would help.

			regards, tom lane
Re: [HACKERS] Early locking option to parallel backup
On 2017-11-05 17:38:39 -0500, Robert Haas wrote:
> On Sun, Nov 5, 2017 at 5:17 AM, Lucas wrote:
>> The patch creates a "--lock-early" option which will make pg_dump issue
>> shared locks on all tables in the backup TOC as each parallel worker
>> starts. That way, the backup has a very small chance of failing. When
>> it does fail, it happens in the first few seconds of the backup job. My
>> backup scripts (not included here) are aware of that and retry the
>> backup in case of failure.
>
> I wonder why we don't do this already ... and by default.

Well, the current approach afaics requires #relations * 2 locks, whereas acquiring them in every worker would scale that with the number of workers.

IIUC the problem here is that even though a lock is already held by the main backend, an independent locker's request will prevent the on-demand lock by the dump worker from being granted. It seems to me the correct fix here would be to somehow avoid the fairness logic in the parallel dump case - although I don't quite know how to best do so.

Greetings,

Andres Freund
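[Editor's note: a back-of-the-envelope reading of the scaling concern above. The accounting is the editor's interpretation, not from the thread, and the exact bookkeeping in pg_dump differs.]

```python
def lock_acquisitions(n_relations, n_workers, lock_early=False):
    """Rough count of ACCESS SHARE lock acquisitions over a parallel dump.

    Today the leader locks every relation once, and each relation is
    locked once more by whichever worker dumps it: about 2 per relation.
    With --lock-early, every worker locks every relation up front, so
    the count scales with the number of workers instead.
    """
    if lock_early:
        return n_relations * (1 + n_workers)
    return n_relations * 2

print(lock_acquisitions(10_000, 8))                   # 20000
print(lock_acquisitions(10_000, 8, lock_early=True))  # 90000
```

This is one reason acquiring the locks in every worker was raised as a concern: the shared lock table (sized by max_locks_per_transaction) has to accommodate far more simultaneous entries.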
Re: [HACKERS] Early locking option to parallel backup
On Sun, Nov 5, 2017 at 5:17 AM, Lucas wrote:
> The patch creates a "--lock-early" option which will make pg_dump issue
> shared locks on all tables in the backup TOC as each parallel worker
> starts. That way, the backup has a very small chance of failing. When
> it does fail, it happens in the first few seconds of the backup job. My
> backup scripts (not included here) are aware of that and retry the
> backup in case of failure.

I wonder why we don't do this already ... and by default.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Early locking option to parallel backup
On Sun, Nov 5, 2017 at 7:17 PM, Lucas wrote:
> The patch creates a "--lock-early" option which will make pg_dump issue
> shared locks on all tables in the backup TOC as each parallel worker
> starts. That way, the backup has a very small chance of failing. When
> it does fail, it happens in the first few seconds of the backup job. My
> backup scripts (not included here) are aware of that and retry the
> backup in case of failure.

You should register your patch with the next open commitfest, which will begin in January, if you are looking for feedback and review: https://commitfest.postgresql.org/16/

--
Michael