Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

MauMau Fri, 24 Jan 2014 05:32:26 -0800

From: "Fujii Masao" <[email protected]>

On Wed, Jan 22, 2014 at 6:37 AM, Heikki Linnakangas

Thanks! The patch looks good to me. Attached is the updated version of
the patch. I added the comments.

Thank you very much. Your comment looks great. I tested some recovery situations, and confirmed that WAL segments were kept/removed as intended. I'll update the CommitFest entry with this patch.

<[email protected]> wrote:

Sorry for reacting so slowly, but I'm not sure I like this patch. It's a
quite useful property that all the WAL files that are needed for recovery
are copied into pg_xlog, even when restoring from archive, even when not
doing cascading replication. It guarantees that you can restart the standby,
even if the connection to the archive is lost for some reason. I
intentionally changed the behavior for archive recovery too, when it was
introduced for cascading replication. Also, I think it's good that the
behavior does not depend on whether cascading replication is enabled - it's
a quite subtle difference.

So, IMHO this is not a bug, it's a feature.


Yep.


I understood the benefit for the standby recovery.

To solve the original problem of running out of disk space in archive
recovery, I wonder if we should perform restartpoints more aggressively. We
intentionally don't trigger restartpoints by checkpoint_segments, only
checkpoint_timeout, but I wonder if there should be an option for that.
That's an option.
MauMau, did you try simply reducing checkpoint_timeout, while doing
recovery?
The problem is, we might not be able to perform restartpoints more aggressively even if we reduce checkpoint_timeout in the server under the archive recovery. That is because the frequency of restartpoints depends not only on that checkpoint_timeout but also on the checkpoints which happened while the server was running.
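
The dependency described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: the function name and its parameters are hypothetical, and it assumes a restartpoint can be taken only at a checkpoint record that has already been replayed, with the timer firing every checkpoint_timeout seconds.

```python
# Toy model only: count_restartpoints and its arguments are hypothetical names,
# not part of PostgreSQL. Times are in seconds of replay.
def count_restartpoints(checkpoint_records, timeout, replay_end):
    """Count restartpoints taken while replaying WAL.

    checkpoint_records: sorted replay times at which a checkpoint record
        written by the original server is replayed.
    timeout: checkpoint_timeout on the recovering server.
    replay_end: total replay duration.
    """
    restartpoints = 0
    last_used = None  # checkpoint record used by the previous restartpoint
    t = timeout
    while t <= replay_end:
        # Latest checkpoint record replayed so far, if any.
        replayed = [c for c in checkpoint_records if c <= t]
        latest = replayed[-1] if replayed else None
        # Assumption: a restartpoint needs a checkpoint record newer than
        # the one the previous restartpoint was based on.
        if latest is not None and latest != last_used:
            restartpoints += 1
            last_used = latest
        t += timeout
    return restartpoints


# Checkpoint records replayed at 600 s, 1200 s and 1800 s.
records = [600, 1200, 1800]
print(count_restartpoints(records, 300, 1800))
print(count_restartpoints(records, 30, 1800))
```

In this model, cutting the timeout from 300 to 30 seconds does not yield any additional restartpoints, because the checkpoint records in the replayed WAL set the upper bound.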

I haven't tried reducing checkpoint_timeout. I think we cannot take that approach, because we cannot suggest an appropriate checkpoint_timeout to users. That is, what checkpoint_timeout setting can we suggest so that WAL doesn't accumulate in pg_xlog/ more than in 9.1?

In addition, as Fujii-san said, it doesn't seem we can fully control when restartpoints occur. Plus, if we can cause restartpoints frequently, the recovery would take (much?) longer, because shared buffers are flushed more frequently.

So, how about just removing AllowCascadeReplication() condition from the patch? That allows WAL to accumulate in pg_xlog/ during standby recovery but not during archive recovery.


Regards
MauMau



