I'm seeing a problem on my benchmark machine: checkpoints stop happening after the ramp-up period.

It looks like the bgwriter gets starved waiting on the CheckpointStartLock. Committing backends hold the CheckpointStartLock in shared mode across an XLogFlush, and on an extremely busy system like a benchmark that is always long enough for another transaction to acquire the lock in shared mode before the previous holder releases it, so the lock is never free for the bgwriter to take.

I'm running another test with more logging to confirm that's what's happening, but I'm pretty sure that's it...

As a proposed fix, instead of acquiring the CheckpointStartLock in RecordTransactionCommit, we would set a flag in MyProc saying "commit in progress". After computing the new redo record pointer, the checkpoint would scan through the procarray, make note of any transactions with a commit in progress, and wait for all of them to finish before flushing clog.
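
To make the idea concrete, here is a rough single-threaded sketch of the flag-and-scan scheme. It is not PostgreSQL code: every name in it (SketchProc, sketch_commit_begin, and so on) is hypothetical, and the per-backend commit counter is an assumption of the sketch, added so the checkpoint waits only for the commits it saw during the scan and not for later ones on the same backend. A real patch would also need proper locking or memory ordering around the flag, which this ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PROCS 8      /* hypothetical procarray size */

/* Hypothetical stand-in for the per-backend entry (MyProc). */
typedef struct
{
    bool          commit_in_progress;   /* set for the duration of a commit */
    unsigned long commit_seq;           /* assumption: counts commits started */
} SketchProc;

/* What the checkpoint remembers about one commit it must wait for. */
typedef struct
{
    int           procno;
    unsigned long commit_seq;
} SketchPendingCommit;

static SketchProc sketch_procs[SKETCH_MAX_PROCS];

/* Backend side: mark the start of a commit instead of taking a lock. */
void
sketch_commit_begin(int procno)
{
    assert(procno >= 0 && procno < SKETCH_MAX_PROCS);
    sketch_procs[procno].commit_seq++;
    sketch_procs[procno].commit_in_progress = true;
}

/* Backend side: the commit record has been flushed, clear the flag. */
void
sketch_commit_end(int procno)
{
    assert(procno >= 0 && procno < SKETCH_MAX_PROCS);
    sketch_procs[procno].commit_in_progress = false;
}

/*
 * Checkpoint side: called after the new redo pointer is computed.
 * Records every commit in progress right now; returns how many were found.
 */
size_t
sketch_collect_pending(SketchPendingCommit *out, size_t cap)
{
    size_t n = 0;

    for (int i = 0; i < SKETCH_MAX_PROCS && n < cap; i++)
    {
        if (sketch_procs[i].commit_in_progress)
        {
            out[n].procno = i;
            out[n].commit_seq = sketch_procs[i].commit_seq;
            n++;
        }
    }
    return n;
}

/*
 * Checkpoint side: true while any recorded commit is still running.
 * Commits started after the scan have a different commit_seq (or are on
 * a backend that was not recorded), so they never delay the checkpoint.
 */
bool
sketch_still_pending(const SketchPendingCommit *pending, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        const SketchProc *p = &sketch_procs[pending[i].procno];

        if (p->commit_in_progress && p->commit_seq == pending[i].commit_seq)
            return true;
    }
    return false;
}
```

The checkpoint would call sketch_collect_pending once and then poll sketch_still_pending (sleeping in between) until it returns false, and only then flush clog. The point is that the set of commits to wait for is fixed at scan time, so a steady stream of new commits can no longer starve the checkpoint.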

Unless someone has a better idea, I'll write a patch to do the above.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
