Re: [PATCHES] Load Distributed Checkpoints, take 3

Heikki Linnakangas Thu, 21 Jun 2007 01:35:50 -0700

Tom Lane wrote:

Heikki Linnakangas <[EMAIL PROTECTED]> writes:
I added a spinlock to protect the signaling fields between bgwriter andbackends. The current non-locking approach gets really difficult as thepatch adds two new flags, and both are more important than the existingckpt_time_warn flag.
That may be, but you could minimize the cost and notational ugliness by
not using the spinlock where you don't have to.  Put the sig_atomic_t
fields back the way they were, and many of the uses of the spinlock go
away.  All you really need it for is the places where the additional
flags are set or read.

I find it easier to understand if it's used whenever any of the fieldsare accessed. You don't need to read and write ckpt_failed andckpt_started/ckpt_done in specific order anymore, for example.

Some other comments:

I tend to agree with whoever said upthread that the combination of GUC
variables proposed here is confusing and ugly.  It'd make more sense to
have min and max checkpoint rates in KB/s, with the max checkpoint rate
only honored when we are predicting we'll finish before the next
checkpoint time.

Really? I thought everyone is happy with the current combination, andthat it was just the old trio of parameters controlling the write, napand sync phases that was ugly. One particularly nice thing aboutdefining the duration as a fraction of checkpoint interval is that wecan come up with a reasonable default value that doesn't depend on yourhardware.


How would a min and max rate work?

Anyone else have an opinion on the parameters?

The flow of control and responsibility between xlog.c, bgwriter.c and
bufmgr.c seems to have reached the breaking point of unintelligibility.
Can you think of a refactoring that would simplify it?  We might have
to resign ourselves to this much messiness, but I'd rather not...

The problem we're trying to solve is doing a checkpoint while runningthe normal bgwriter activity at the same time. The normal way to do twothings simultaneously is to have two different processes (or threads). Ithought about having a separate checkpoint process, but I gave up onthat thought because the communication needed between backends, bgwriterand the checkpointer seems like a mess. The checkpointer would need thependingOpsTable so that it can do the fsyncs, and it would also need toreceive the forget-messages to that table. We could move that tableentirely to the checkpointer, but since bgwriter is presumably doingmost of the writes, there would be a lot of chatter between bgwriter andthe checkpointer.

The current approach is like co-operative multitasking. BufferSyncsyields control to bgwriter every now and then.

The division of labor between xlog.c and other modules is not that bad,IMO. CreateCheckPoint is the main entry point to create a checkpoint,and it calls other modules to do their stuff, including bufmgr.c.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Re: [PATCHES] Load Distributed Checkpoints, take 3

Reply via email to