* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Thu, Sep 3, 2015 at 11:20 AM, Stephen Frost wrote:
> >> Not only, +clog, configuration files, etc.
> >
> > Configuration files? Perhaps you could elaborate?
>
> Sure. Sorry for being unclear. It copies everything that is not a
> relation file, a kind of base backup without the relation files then.

How does that work on systems where the configuration files aren't
stored under PGDATA (Debian and derivatives, at least)? I guess I don't
quite see why it's necessary for pg_rewind to copy the configuration
files in the first place, it doesn't have the same role as
pg_basebackup, at least as I understand it.

> >> The problem when using differential backups in this case is
> >> performance as mentioned above. We would need to scan the whole target
> >> cluster, which may take time, the current approach of pg_rewind only
> >> needs to scan WAL records to find the list of blocks modified, and
> >> directly requests them from the source. I would expect pg_rewind to be
> >> as quick as possible.
> >
> > I don't follow why the current approach of pg_rewind would have to
> > change. All I'm suggesting is that we have a different way, one which
> > is much more restricted, for pg_rewind to request exactly the
> > information it needs for efficient operation.
>
> Ah, OK. I thought that you were referring to a protocol where caller
> sends a single LSN from which it gets a differential backup that needs
> to scan all the relation files of the source cluster to get the data
> blocks with an LSN newer than the one sent, and then sends them back
> to the caller.

No, apologies, I was simply pointing out that we might want this kind of
a capability at the protocol level to support other replication protocol
clients.

> I guess that what you are suggesting instead is an approach where
> caller sends something like that through the replication protocol with
> a relation OID and a block list:
> BLOCK_DIFF relation_oid BLOCK_LIST m,n,[o, ...]

Right, something along those lines is what I had been thinking. We
would probably need to provide independent commands for the different
file types, with the parameters expressed in terms appropriate for each
kind of file (block numbers for heap, XIDs for WAL and CLOG?).
Essentially, whatever API would be both simple for pg_rewind and general
enough to be useful for other clients in the future. At least, I
imagine that pg_rewind would be a bit simpler if it could communicate
with the backend in the 'language of PG' rather than having to specify
file names and paths. Other clients that might find such an interface
useful are incremental pg_basebackup or possibly parallel pg_basebackup.

> Which is close to what pg_read_binary_file does now for a superuser.

I really don't see them as being all that close. Further, I worry a bit
that users would abuse the replication role to grant access to these
functions for non-superusers to be able to access non-PG files (but ones
which happen to be under PGDATA, or through a symlink are somewhere
else..).

> We would need as well to extend BASE_BACKUP so as it does not include
> relation files though for this use case.

... huh? I'm not following this comment at all. We might need to
provide explicit start/stop backup commands and/or extend BASE_BACKUP
for things like parallel pg_basebackup, but I'm not following why we
would need to change it for pg_rewind. Further BASE_BACKUP clearly does
include relation files today..

Thanks!

Stephen
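
To make the block-list idea from the thread concrete, here is a minimal
Python sketch of how a command of the form
"BLOCK_DIFF relation_oid BLOCK_LIST m,n,..." might be parsed. This
command is only a proposal from the discussion above and is not part of
PostgreSQL's replication protocol. The names BlockDiffRequest and
parse_block_diff are hypothetical, and deduplicating and sorting the
block numbers is an assumption made for the sketch, not something the
thread specifies.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BlockDiffRequest:
    """Hypothetical parsed form of the proposed BLOCK_DIFF command."""
    relation_oid: int
    blocks: Tuple[int, ...]


def parse_block_diff(command: str) -> BlockDiffRequest:
    """Parse 'BLOCK_DIFF <oid> BLOCK_LIST <m,n,...>' (assumed syntax)."""
    # Split into at most four fields; the last one is the raw block list.
    parts = command.split(None, 3)
    if len(parts) != 4 or parts[0] != "BLOCK_DIFF" or parts[2] != "BLOCK_LIST":
        raise ValueError(f"malformed command: {command!r}")

    # int() raises ValueError on empty or non-numeric fields.
    relation_oid = int(parts[1])
    # Assumption: duplicates are dropped and blocks are returned in order.
    blocks = tuple(sorted({int(b) for b in parts[3].split(",")}))

    if relation_oid < 0 or any(b < 0 for b in blocks):
        raise ValueError("OID and block numbers must be non-negative")
    return BlockDiffRequest(relation_oid, blocks)
```

A caller would pass the relation OID and the block numbers it collected
from WAL, for example parse_block_diff("BLOCK_DIFF 16384 BLOCK_LIST 7,3,12"),
and never mention a file name or path.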