On Tue, Jul 29, 2014 at 12:35 PM, Marco Nenciarini <marco.nenciar...@2ndquadrant.it> wrote:
>> I agree with much of that. However, I'd question whether we can
>> really seriously expect to rely on file modification times for
>> critical data-integrity operations. I wouldn't like it if somebody
>> ran ntpdate to fix the time while the base backup was running, and it
>> set the time backward, and the next differential backup consequently
>> omitted some blocks that had been modified during the base backup.
>
> Our proposal doesn't rely on file modification times for data integrity.

Good.

> We are using the file mtime only as a fast indication that the file has
> changed, and transfer it again without performing the checksum.
> If timestamp and size match we rely on *checksums* to decide if it has
> to be sent.

So an incremental backup reads every block in the database and
transfers only those that have changed? (BTW, I'm just asking.
That's OK with me for a first version; we can improve it, shall we
say, incrementally.)

Why checksums (which have an arbitrarily-small chance of indicating a
match that doesn't really exist) rather than LSNs (which have no
chance of making that mistake)?

> In "SMART MODE" we would use the file mtime to skip the checksum check
> in some cases, but it wouldn't be the default operation mode and it will
> have all the necessary warnings attached. However the "SMART MODE" isn't
> a core part of our proposal, and can be delayed until we agree on the
> safest way to bring it to the end user.

That's not a mode I'd feel comfortable calling "smart". More like
"roulette mode".

IMV, the way to eventually make this efficient is to have a background
process that reads the WAL and figures out which data blocks have been
modified, and tracks that someplace. Then we can send a precisely
accurate backup without relying on either modification times or
reading the full database. If Heikki's patch to standardize the way
this kind of information is represented in WAL gets committed, this
should get a lot easier to implement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
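
For illustration only, the LSN-based alternative to checksums raised above
could be sketched roughly as follows. This is not code from the proposal
or from PostgreSQL. It assumes the default 8192-byte block size, relies on
each page header starting with an 8-byte pd_lsn stored as two native-endian
32-bit halves, and uses hypothetical helper names (page_lsn, changed_blocks,
make_page). It still reads every block, as in the first-version behaviour
discussed above, and it ignores cases a real tool would have to handle,
such as new all-zero pages and changes that do not advance the page LSN.

```python
import struct

BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def page_lsn(page):
    """Return the LSN stored in the first 8 bytes of a page header."""
    # pd_lsn is kept as two native-endian 32-bit halves: high, then low.
    xlogid, xrecoff = struct.unpack_from("=II", page, 0)
    return (xlogid << 32) | xrecoff


def changed_blocks(relation_bytes, since_lsn):
    """List block numbers whose page LSN is at or past since_lsn.

    since_lsn is assumed to be the start LSN of the previous backup.
    Using >= rather than > errs on the side of sending a block.
    """
    if len(relation_bytes) % BLCKSZ != 0:
        raise ValueError("relation data is not a whole number of blocks")
    changed = []
    for blkno in range(len(relation_bytes) // BLCKSZ):
        page = relation_bytes[blkno * BLCKSZ:(blkno + 1) * BLCKSZ]
        if page_lsn(page) >= since_lsn:
            changed.append(blkno)
    return changed


def make_page(lsn):
    """Hypothetical test helper: a zero-filled page carrying only an LSN."""
    header = struct.pack("=II", lsn >> 32, lsn & 0xFFFFFFFF)
    return header + bytes(BLCKSZ - len(header))
```

Unlike a checksum comparison, this test cannot report a false match
between two different versions of a block, since it compares a position
in the WAL stream rather than a digest of the block contents.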