Hello,

It applies the master and compiled cleanly and no error by
regtest.  (I didn't confirmed that the problem is still fixed but
seemingly no problem)

At Mon, 14 Nov 2016 15:09:09 +0900, Michael Paquier <michael.paqu...@gmail.com> 
wrote in <cab7npqrhzs0fnhnaatrce+cqdokkw+kyrazy5o_r-7zqucg...@mail.gmail.com>
> On Sat, Nov 12, 2016 at 9:01 PM, Andres Freund <and...@anarazel.de> wrote:
> > On 2016-11-11 16:42:43 +0900, Michael Paquier wrote:
> >
> >> + * This takes also
> >> + * advantage to avoid 8-byte torn reads on some platforms by using the
> >> + * fact that each insert lock is located on the same cache line.
> >
> > Something residing on the same cache line doens't provide that guarantee
> > on all platforms.
> 
> OK. Let's remove it then.
> 
> >> + * XXX: There is still room for more improvements here, particularly
> >> + * WAL operations related to unlogged relations (INIT_FORKNUM) should not
> >> + * update the progress LSN as those relations are reset during crash
> >> + * recovery so enforcing buffers of such relations to be flushed for
> >> + * example in the case of a load only on unlogged relations is a waste
> >> + * of disk write.
> >
> > Commit records still have to be written, everything else doesn't write
> > WAL. So I'm doubtful this matters much?
> 
> Hm, okay. In most cases this may not matter... Let's rip that off.
> 
> >> @@ -997,6 +1022,24 @@ XLogInsertRecord(XLogRecData *rdata, XLogRecPtr 
> >> fpw_lsn)
> >>               inserted = true;
> >>       }
> >>
> >> +     /*
> >> +      * Update the LSN progress positions. At least one WAL insertion lock
> >> +      * is already taken appropriately before doing that, and it is 
> >> simpler
> >> +      * to do that here when the WAL record data and type are at hand.
> >
> > But we don't use the "WAL record data and type"?
> 
> Yes, at some point this patch did so...
> 
> >> + * GetProgressRecPtr -- Returns the newest WAL activity position, or in
> >> + * other words any activity not referring to standby logging or segment
> >> + * switches. Finding the last activity position is done by scanning each
> >> + * WAL insertion lock by taking directly the light-weight lock associated
> >> + * to it.
> >> + */
> >
> > I'd prefer not to list the specific records here - that's just
> > guaranteed to get out of date. Why not say something "any activity not
> > requiring a checkpoint to be triggered" or such?
> 
> OK. Makes sense to minimize maintenance.
> 
> >> +      * If this isn't a shutdown or forced checkpoint, and if there has 
> >> been no
> >> +      * WAL activity, skip the checkpoint.  The idea here is to avoid 
> >> inserting
> >> +      * duplicate checkpoints when the system is idle. That wastes log 
> >> space,
> >> +      * and more importantly it exposes us to possible loss of both 
> >> current and
> >> +      * previous checkpoint records if the machine crashes just as we're 
> >> writing
> >> +      * the update.
> >
> > Shouldn't this mention archiving and also that we also ignore some forms
> > of WAL activity?
> 
> I have reworded that as:
> "If this isn't a shutdown or forced checkpoint, and if there has been
> no WAL activity requiring a checkpoint, skip it."
> 
> >> -/* Should the in-progress insertion log the origin? */
> >> -static bool include_origin = false;
> >> +/* status flags of the in-progress insertion */
> >> +static uint8 status_flags = 0;
> >
> > that seems a bit too generic a name. 'curinsert_flags'?
> 
> OK.
> 
> >>                       /*
> >> -                      * only log if enough time has passed and some xlog 
> >> record has
> >> -                      * been inserted.
> >> +                      * Only log if enough time has passed, that some WAL 
> >> activity
> >> +                      * has happened since last checkpoint, and that some 
> >> new WAL
> >> +                      * records have been inserted since the last time we 
> >> came here.
> >
> > I think that sentence needs some polish.
> 
> Let's do this better:
>             /*
> -            * only log if enough time has passed and some xlog record has
> -            * been inserted.
> +            * Only log if one of the following conditions is satisfied since
> +            * the last time we came here::
> +            * - timeout has been reached.
> +            * - WAL activity has happened since last checkpoint.
> +            * - New WAL records have been inserted.
>              */
> 
> >>                        */
> >>                       if (now >= timeout &&
> >> -                             last_snapshot_lsn != GetXLogInsertRecPtr())
> >> +                             GetLastCheckpointRecPtr() < 
> >> current_progress_lsn &&
> >> +                             last_progress_lsn < current_progress_lsn)
> >>                       {
> >
> > Hm. I don't think it's correct to use GetLastCheckpointRecPtr() here?
> > Don't we need to do the comparisons here (and when doing the checkpoint
> > itself) with the REDO pointer of the last checkpoint?
> 
> Hm? The progress pointer is pointing to the lastly inserted LSN, which
> is not the position of the REDO pointer, but the one of the checkpoint
> record. Doing a comparison of the REDO pointer would be a moot
> operation, because as the checkpoint completes, the progress LSN will
> be updated as well. Or do you mean that the progress LSN should *not*
> be updated for a checkpoint record? It seems to me that it should
> but...
> 
> >> diff --git a/src/backend/postmaster/checkpointer.c 
> >> b/src/backend/postmaster/checkpointer.c
> >> index 397267c..7ecc00e 100644
> >> --- a/src/backend/postmaster/checkpointer.c
> >> +++ b/src/backend/postmaster/checkpointer.c
> >> @@ -164,6 +164,7 @@ static double ckpt_cached_elapsed;
> >>
> >>  static pg_time_t last_checkpoint_time;
> >>  static pg_time_t last_xlog_switch_time;
> >> +static XLogRecPtr last_xlog_switch_lsn = InvalidXLogRecPtr;
> >
> > Hm. Is it a good idea to use a static for this? Did you consider
> > checkpointer restarts?
> 
> Indeed, I forgot about that and the current approach is not solid. The
> best way to do things then is to track the LSN position of the last
> switched segment in XLogCtl..

If I'm not missing something, at the worst we have a checkpoint
after a checkpointer restart that should have been supressed. Is
it worth picking it up for the complexity?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to