0. Introduction:
=================================

This is a proposal for adding incremental backup support to the streaming
protocol and hence to the pg_basebackup command.

1. Proposal
=================================

Our proposal is to introduce the concept of a backup profile. The backup
profile is a file with one line per file in the backup, detailing tablespace,
path, modification time, size and checksum. Using that file, the BASE_BACKUP
command can decide which files need to be sent again and which are unchanged.
The algorithm should be very similar to rsync, but since our data files are
never bigger than 1 GB each, that is probably granular enough that we do not
need to worry about copying parts of files, just whole files.

This way of operating also has some advantages over using rsync to take a
physical backup: it does not require the files from the previous backup to be
checksummed again, and they could even reside on some form of long-term,
not-directly-accessible storage, like a tape cartridge or somewhere in the
cloud (e.g. Amazon S3 or Amazon Glacier).

It could also be used in 'refresh' mode, by allowing the pg_basebackup command
to 'refresh' an old backup directory with a new backup.

The final piece of this architecture is a new program called pg_restorebackup,
which is able to operate on a "chain of incremental backups", allowing the
user to build a usable PGDATA from them or to execute maintenance operations
such as verifying the checksums or estimating the final size of the recovered
PGDATA.

We created a wiki page with all implementation details at
https://wiki.postgresql.org/wiki/Incremental_backup

2. Goals
=================================

The main goal of incremental backup is to reduce the size of the backup. A
secondary goal is to reduce backup time as well.

3. Development plan
=================================

Our proposed development plan is divided into four phases:

Phase 1: Add 'PROFILE' option to 'BASE_BACKUP'
Phase 2: Add 'INCREMENTAL' option to 'BASE_BACKUP'
Phase 3: Support of PROFILE and INCREMENTAL for pg_basebackup
Phase 4: pg_restorebackup

We would like to reach consensus on our design here before we start
implementing it.

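To make the backup profile idea from section 1 a bit more concrete, here is a
rough Python sketch of the whole-file comparison. It is only an illustration
and not the proposed implementation: the tab-separated line layout, the names
ProfileEntry, parse_profile and files_to_send, and the rule that any difference
in modification time, size or checksum marks a file as changed are all
assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ProfileEntry:
    """One profile line. Field order and types are assumptions."""
    tablespace: str
    path: str
    mtime: int      # modification time, seconds since the epoch
    size: int       # size in bytes
    checksum: str   # hex digest; the checksum algorithm is left open


def parse_profile(text: str) -> Dict[Tuple[str, str], ProfileEntry]:
    """Parse a profile in a hypothetical tab-separated layout, one file per line."""
    entries: Dict[Tuple[str, str], ProfileEntry] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        tablespace, path, mtime, size, checksum = line.split("\t")
        entries[(tablespace, path)] = ProfileEntry(
            tablespace, path, int(mtime), int(size), checksum
        )
    return entries


def files_to_send(
    previous: Dict[Tuple[str, str], ProfileEntry],
    current: Dict[Tuple[str, str], ProfileEntry],
) -> List[ProfileEntry]:
    """Return the files that are new, or whose entry differs from the previous
    profile. Whole files only: no attempt is made to send partial files."""
    changed = [
        entry
        for key, entry in current.items()
        if previous.get(key) != entry  # missing (None) or any field differs
    ]
    return sorted(changed, key=lambda e: (e.tablespace, e.path))
```

Given the profile of the previous backup and the current state of the cluster,
files_to_send() returns only the entries that would have to be transferred
again; files with identical entries are skipped without reading the old backup.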
Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it