On Mon, Dec 6, 2010 at 7:22 PM, Tom Lane <[email protected]> wrote:
> Josh Berkus <[email protected]> writes:
> >> However, if you were doing something like parallel pg_dump you could
> >> just run the parent and child instances all against the slave, so the
> >> pg_dump scenario doesn't seem to offer much of a supporting use-case for
> >> worrying about this. When would you really need to be able to do it?
>
> > If you had several standbys, you could distribute the work of the
> > pg_dump among them. This would be a huge speedup for a large database,
> > potentially, thanks to parallelization of I/O and network. Imagine
> > doing a pg_dump of a 300GB database in 10min.
>
> That does sound kind of attractive. But to do that I think we'd have to
> go with the pass-the-snapshot-through-the-client approach. Shipping
> internal snapshot files through the WAL stream doesn't seem attractive
> to me.
>
> While I see Robert's point about preferring not to expose the snapshot
> contents to clients, I don't think it outweighs all other considerations
> here; and every other one is pointing to doing it the other way.

How about this: the publishing transaction puts the snapshot in a (new)
system table and passes a UUID to its children, and the joining
transactions look up that UUID in the system table using a dirty snapshot
(SnapshotAny), via a security-definer function owned by a superuser.
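To make the shape of that concrete, here is a minimal sketch of the
publishing side as a backend C function. Everything in it is hypothetical:
the pg_published_snapshots table, its (uuid, snapshot) column layout, and
the function itself are invented for illustration, and UUID generation and
snapshot serialization are assumed to happen elsewhere.

    /*
     * Hypothetical sketch only: "pg_published_snapshots" and this
     * function are not in the sources.  The publisher inserts its
     * serialized snapshot under a UUID; the row stays uncommitted,
     * so only a dirty-snapshot lookup can see it.
     */
    #include "postgres.h"

    #include "access/heapam.h"
    #include "catalog/indexing.h"
    #include "catalog/namespace.h"
    #include "catalog/pg_namespace.h"
    #include "fmgr.h"
    #include "utils/rel.h"

    PG_MODULE_MAGIC;

    PG_FUNCTION_INFO_V1(pg_publish_snapshot);

    Datum
    pg_publish_snapshot(PG_FUNCTION_ARGS)
    {
        Datum       values[2];
        bool        nulls[2] = {false, false};
        Oid         snaprelid;
        Relation    rel;
        HeapTuple   tup;

        values[0] = PG_GETARG_DATUM(0); /* UUID handed to the children */
        values[1] = PG_GETARG_DATUM(1); /* serialized snapshot contents */

        /* Insert the row; it remains uncommitted until we commit. */
        snaprelid = get_relname_relid("pg_published_snapshots",
                                      PG_CATALOG_NAMESPACE);
        rel = heap_open(snaprelid, RowExclusiveLock);
        tup = heap_form_tuple(RelationGetDescr(rel), values, nulls);
        simple_heap_insert(rel, tup);
        CatalogUpdateIndexes(rel, tup);
        heap_close(rel, RowExclusiveLock);

        PG_RETURN_VOID();
    }

A child backend would then call a matching security-definer lookup
function with the UUID to fetch the row under the dirty snapshot.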
No shared memory would be used, and if the insert is WAL-logged, the
snapshot would get to the slaves too.

I realize SnapshotAny wouldn't be sufficient, since we want the tuple to
become invisible as soon as the publishing transaction ends (commit or
rollback); hence something akin to a (new) HeapTupleSatisfiesStillRunning()
would be needed.
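For illustration, here is roughly what I have in mind, patterned on the
existing HeapTupleSatisfies* routines in tqual.c. The function is of
course hypothetical, and the sketch deliberately ignores hint bits,
subtransactions, and deleted tuples:

    #include "postgres.h"

    #include "access/xact.h"
    #include "storage/procarray.h"
    #include "utils/tqual.h"

    /*
     * Hypothetical visibility routine: a tuple qualifies only while the
     * inserting (publishing) transaction is still in progress, so the
     * published snapshot row disappears the moment the publisher commits
     * or aborts.
     */
    bool
    HeapTupleSatisfiesStillRunning(HeapTupleHeader tuple, Snapshot snapshot,
                                   Buffer buffer)
    {
        TransactionId xmin = HeapTupleHeaderGetXmin(tuple);

        if (TransactionIdIsCurrentTransactionId(xmin))
            return true;        /* our own in-progress insert */

        if (TransactionIdIsInProgress(xmin))
            return true;        /* publisher still running: visible */

        return false;           /* publisher committed or aborted: gone */
    }

On a standby, TransactionIdIsInProgress should answer based on the
known-assigned-xids tracking fed by the WAL stream, so the row would, I
believe, vanish on the slaves at the publisher's commit or abort as well.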
Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device