On 02.12.2010 12:31, Heikki Linnakangas wrote:
On 02.12.2010 13:25, Simon Riggs wrote:
On Thu, 2010-12-02 at 12:41 +0200, Heikki Linnakangas wrote:
On 02.12.2010 11:02, Simon Riggs wrote:
The cause of the issue is that replay starts at one LSN and there is a
delay until the RunningXacts WAL record occurs. If there was no delay,
there would be no issue at all. In CreateCheckpoint() we start by
grabbing the WALInsertLock and later recording that pointer as part of
the checkpoint record. My proposal is to replace the "grab the lock"
code with the insert of the RunningXacts WAL record (when wal_level is
set), so that recovery always starts with that record type.

Oh, interesting idea. But AFAICS closing the gap between acquiring the
running-xacts snapshot and writing it to the log is sufficient, I don't
see what moving the running-xacts record buys us. Does it allow some
further simplifications somewhere?

Your patch is quite long and you do a lot more than just alter the
locking. I don't think we need those changes at all and especially would
not wish to backpatch that.

Most of the changes to procarray.c were about removing code that's no
longer necessary when we close the gap between acquiring and writing the
running-xacts WAL record. You can leave it as it is as a historical
curiosity, but I'd prefer to simplify it, given that we now know that it
doesn't actually work correctly if the gap is not closed.
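
To illustrate the gap being discussed, here is a toy model in Python. It is
only a sketch under stated assumptions: the function names, the tuple-based
"WAL", and the xid values are all hypothetical and are not taken from
procarray.c or the committed patch. It shows one way the gap can matter: a
transaction that commits between taking the running-xacts snapshot and
writing the record ends up listed as running after its commit was already
logged.

```python
# Toy model only: every name here is hypothetical, not PostgreSQL code.

def log_running_xacts(wal, running, commits_in_gap, close_gap):
    """Append a running-xacts record to `wal` and return the final WAL.

    commits_in_gap: xids that try to commit between snapshot and WAL insert.
    close_gap: if True, snapshot and insert are treated as one atomic step,
    so those commits are ordered after the record instead of before it.
    """
    snapshot = sorted(running)            # acquire the snapshot
    if not close_gap:
        for xid in commits_in_gap:        # commits slip into the gap
            running.discard(xid)
            wal.append(("commit", xid))
    wal.append(("running_xacts", tuple(snapshot)))
    if close_gap:
        for xid in commits_in_gap:        # commits land after the record
            running.discard(xid)
            wal.append(("commit", xid))
    return wal


def stale_xids(wal):
    """Replay `wal` and return xids that a running-xacts record lists as
    running even though their commit record was already replayed."""
    committed, stale = set(), set()
    for kind, payload in wal:
        if kind == "commit":
            committed.add(payload)
        else:
            stale |= committed & set(payload)
    return stale


gap_open = log_running_xacts([], {100, 101}, [100], close_gap=False)
gap_closed = log_running_xacts([], {100, 101}, [100], close_gap=True)
print(stale_xids(gap_open))    # {100}
print(stale_xids(gap_closed))  # set()
```

With the gap open, replay sees the commit of xid 100 before a record that
still claims 100 is running. With the gap closed, the record always matches
its position in the log, so the compensating code is not needed.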

Ok, I've committed this patch now.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)