On Tue, Mar 29, 2016 at 6:40 PM, Magnus Hagander <mag...@hagander.net>
wrote:

>
>
> On Tue, Mar 29, 2016 at 4:36 PM, David Steele <da...@pgmasters.net> wrote:
>
>> On 3/22/16 12:31 PM, Magnus Hagander wrote:
>>
>> On Tue, Mar 22, 2016 at 5:27 PM, David Steele <da...@pgmasters.net
>>> <mailto:da...@pgmasters.net>> wrote:
>>>
>> >
>>
>>>     > Adding the stop time column should be a simple addition and I
>>> don't see
>>>     > a problem with that. I think I misunderstood your original request
>>> on
>>>     > that. Because you are just talking about returning a timestamptz
>>> with
>>>     > the "right now" value for when you called pg_stop_backup()? Or to
>>> be
>>>     > specific, just before pg_Stop_backup *finished*. Or do you mean
>>> when
>>>     > pg_stop_backup() started?
>>>
>>>     What would be ideal is the minimum time that could be used for
>>> PITR.  In
>>>     an exclusive backup that's the time the end-of-backup record is
>>> written
>>>     to WAL.  In a non-exlusive backup I'm not quite sure how that works.
>>>
>>
>> I guess I was hoping that you would know.  I fine with just getting the
>> current timestamp as is currently done in do_pg_stop_backup().  It's not
>> perfect but it will be pretty close.
>>
>> I thought some more about putting STOP_WAL_LOCATION into the backup label
>> and I think this is an important step.  Without that the recovery process
>> needs to use pg_control to determine when the database has reach
>> consistency and that will only work if pg_control was copied last.
>>
>> In summary, I think the patch should be updated to include stop_time as a
>> column and add STOP_WAL_LOCATION and STOP_TIME to backup_label.  The
>> recovery process should read STOP_WAL_LOCATION from backup_label and use
>> that to decide when the database has become consistent.
>>
>
> Is that the only reason we need pg_control to be copied last? I thought
> there were other reasons for that.
>
> I was chatting with Stephen about this earlier, and one thing we
> considered was that maybe we should return pg_control as a bytea field from
> pg_stop_backup() thereby strongly hinting to tools that they should write
> it out from there rather than copy it from the data directory (we can't
> stop them from copying it from the data directory like we do for
> backup_label of course, but if we return the contents and document that's
> how it should be done, it's pretty clear).
>
> But if we can actually remove the whole requirement on the order of the
> copy of pg_control by doing that it certainly seems worthwhile.
>
>
> So - I can definitely see the argument for returning the stop wal
> *location*. But I'm still not sure what the definition of the time would
> be? We can't return it before we know what it means...
>


I had a chat with Heikki, and here's another suggestion:

1. We don't touch the current exclusive backups at all, as previously
discussed, other than deprecating their use. For backwards compat.

2. For new backups, we return the contents of pg_control as a bytea from
pg_stop_backup(). We tell backup programs they are supposed to write this
out as pg_control.backup, *not* as pg_control.

3a. On recovery, if it's an exlcusive backup, we do as we did before.

3b. on recovery, in non-exclusive backups (determined from backup_label),
we check that pg_control.backup exists *and* that pg_control does *not*
exist. That guards us reasonably against backup programs that do the wrong
thing, and we know we get the correct version of pg_control.

4. (we can still add the stop location to the backup_label file in case
backup programs find it useful, but we don't use it in recovery)

Thoughts about this approach?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Reply via email to