Re: [HACKERS] Changeset Extraction v7.0 (was logical changeset generation)

Robert Haas Wed, 22 Jan 2014 10:01:42 -0800

On Wed, Jan 22, 2014 at 10:48 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> Yes, individual operations should be, but you cannot be sure whether a
> rename()/unlink() will survive a crash until the directory is
> fsync()ed. So, what is one going to do if the unlink succeeded, but the
> fsync didn't?


Well, apparently, one is going to PANIC and reinitialize the system.
I presume that upon reinitialization we'll decide that the slot is
gone, and thus won't recreate it in shared memory.  Of course, if the
entire system suffers a hard power failure after that and before the
directory is successfully fsync'd, then the slot could reappear on the
next startup.  Which is also exactly what would happen if we removed
the slot from shared memory after doing the unlink, and then the
system suffered a hard power failure before the directory contents
made it to disk.  Except that we also panicked.
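
To make the window under discussion concrete, here is a minimal Python
sketch of "unlink, then fsync the containing directory". It is not
PostgreSQL code: durable_unlink is a hypothetical helper, and treating a
failed directory fsync as a False return value (rather than a PANIC) is
an assumption made purely for illustration.

```python
import os
import tempfile


def durable_unlink(path):
    """Unlink `path`, then fsync its parent directory.

    Returns True if the directory fsync succeeded. Returns False if the
    unlink happened but the directory could not be opened or fsync'd,
    i.e. the removal is visible now but may not survive a crash.
    """
    parent = os.path.dirname(os.path.abspath(path))
    os.unlink(path)  # the file is gone from the running system's view
    try:
        fd = os.open(parent, os.O_RDONLY)
    except OSError:
        return False
    try:
        os.fsync(fd)  # ask the OS to persist the directory entry change
    except OSError:
        return False  # unlink succeeded, fsync did not: the awkward case
    finally:
        os.close(fd)
    return True


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    p = os.path.join(d, "slot")
    open(p, "w").close()
    print(durable_unlink(p))
```

The False branch is the state the question above is about: the caller
can no longer undo the unlink, and cannot yet promise it is durable.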

In the case of shared buffers, the way we handle fsync failures is by
not allowing the system to checkpoint until all of the fsyncs succeed.
If there's an OS-level reset before that happens, WAL replay will
perform the same buffer modifications over again and the next
checkpoint will again try to flush them to disk and will not complete
unless it does.  That forms a closed system where we never advance the
redo pointer over the covering WAL record until the changes it covers
are on the disk.  But I don't think this code has any similar
interlock; if it does, I missed it.
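
The shared-buffers interlock described above can be sketched as a toy
model. This is only an illustration of the rule "do not advance the redo
pointer until every pending fsync has succeeded"; ToyCheckpointer, its
fields, and the fsync_file callback are all hypothetical names and do
not correspond to PostgreSQL's actual data structures.

```python
class ToyCheckpointer:
    """Toy model: the redo pointer only moves when all fsyncs succeed."""

    def __init__(self):
        self.redo_pointer = 0  # replay after a crash starts from here
        self.pending = set()   # files with changes not yet known durable

    def mark_dirty(self, name):
        self.pending.add(name)

    def checkpoint(self, candidate_redo, fsync_file):
        """Try to fsync every pending file via fsync_file(name) -> bool.

        Files whose fsync fails stay in `pending`. The redo pointer is
        advanced to `candidate_redo` only if nothing is left pending.
        """
        for name in sorted(self.pending):  # iterate over a copy
            if fsync_file(name):
                self.pending.discard(name)
        if self.pending:
            return False  # checkpoint does not complete; pointer stays put
        self.redo_pointer = candidate_redo
        return True
```

In this model a crash after a failed checkpoint replays from the old
redo pointer, so the changes are redone and the next checkpoint tries
the same fsyncs again. The slot unlink path has no equivalent of the
pending set here, which is the gap the paragraph above points at.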

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

