Re: [HACKERS] Avoiding adjacent checkpoint records

2012-08-30 Thread Robert Haas
On Mon, Aug 13, 2012 at 6:19 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Sat, Jun 9, 2012 at 5:43 AM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: So now the standard for my patches is that I must consider what will happen if the xlog is deleted? When you're

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-12 Thread Bruce Momjian
On Wed, Jun 06, 2012 at 06:46:37PM -0400, Tom Lane wrote: I wrote: Actually, it looks like there is an extremely simple way to handle this, which is to move the call of LogStandbySnapshot (which generates the WAL record in question) to before the checkpoint's REDO pointer is set, but

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-12 Thread Tom Lane
Bruce Momjian br...@momjian.us writes: Stupid question, but why are we not just setting a boolean variable in shared memory if we WAL-write a non-XLOG_RUNNING_XACTS record, and only checkpoint if that is true? Well, (1) we are trying to avoid adding such logic to the critical section inside
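
The suggestion quoted above (set a boolean whenever a record other than XLOG_RUNNING_XACTS is written, and checkpoint only if it is set) can be sketched roughly as follows. This is a toy model, not PostgreSQL code: `RecKind`, `sim_wal_insert`, `sim_timed_checkpoint`, and `sim_checkpoints_done` are hypothetical names invented for illustration, and a plain static variable stands in for shared memory. The reply also raises an objection about adding such logic to a critical section, which this sketch does not address.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record kinds; only the distinction between a
 * running-xacts snapshot record and "anything else" matters here. */
typedef enum { REC_RUNNING_XACTS, REC_OTHER } RecKind;

/* Stand-in for a flag that would live in shared memory (assumption). */
static bool wal_activity_since_checkpoint = false;
static int checkpoints_done = 0;

/* Called for every WAL record written in this toy model.
 * Only records other than the running-xacts snapshot set the flag. */
void sim_wal_insert(RecKind kind)
{
    if (kind != REC_RUNNING_XACTS)
        wal_activity_since_checkpoint = true;
}

/* A timed (non-forced) checkpoint: skipped unless the flag is set.
 * Returns true if a checkpoint was actually performed. */
bool sim_timed_checkpoint(void)
{
    if (!wal_activity_since_checkpoint)
        return false;               /* nothing but snapshot records: skip */
    wal_activity_since_checkpoint = false;
    checkpoints_done++;
    return true;
}

/* Number of checkpoints performed so far in the model. */
int sim_checkpoints_done(void)
{
    return checkpoints_done;
}
```

In this model, a stream consisting only of `REC_RUNNING_XACTS` records never triggers a checkpoint, while a single `REC_OTHER` record causes exactly one.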

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: So now the standard for my patches is that I must consider what will happen if the xlog is deleted? When you're messing around with something that affects data integrity, yes. The long and the short of it is that this patch does reduce our ability to

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-08 Thread Simon Riggs
On 8 June 2012 05:01, Tom Lane t...@sss.pgh.pa.us wrote: Peter Geoghegan pe...@2ndquadrant.com writes: On 7 June 2012 18:03, Robert Haas robertmh...@gmail.com wrote: On Thu, Jun 7, 2012 at 12:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Clearly, delaying checkpoint indefinitely would be a

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-08 Thread Robert Haas
On Thu, Jun 7, 2012 at 9:25 PM, Simon Riggs si...@2ndquadrant.com wrote: The only risk of data loss is in the case where someone deletes their pg_xlog and didn't take a backup in all that time, which is hardly recommended behaviour. We're at exactly the same risk of data loss if someone

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-08 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: if the database has checkpointed I haven't been exactly clear on the risks about which Tom and Robert have been concerned; is it a question about whether we change the meaning of these settings to something more complicated?: checkpoint_segments

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-08 Thread Robert Haas
On Fri, Jun 8, 2012 at 12:24 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: I haven't been exactly clear on the risks about which Tom and Robert have been concerned; is it a question about whether we change the meaning of these settings to something more complicated?:

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Simon Riggs
On 6 June 2012 20:08, Tom Lane t...@sss.pgh.pa.us wrote: In commit 18fb9d8d21a28caddb72c7ffbdd7b96d52ff9724, Simon modified the rule for when to skip checkpoints on the grounds that not enough activity has happened since the last one. However, that commit left the comment block about it in a

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: there is no guarantee that we'll manage to reach a database state that is consistent with data already flushed out to disk during the last checkpoint. Robert Haas robertmh...@gmail.com wrote: I know of real customers who would have suffered real data

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Robert Haas
On Wed, Jun 6, 2012 at 6:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Actually, it looks like there is an extremely simple way to handle this, which is to move the call of LogStandbySnapshot (which generates the WAL record in question) to before the checkpoint's REDO pointer is set, but

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Wed, Jun 6, 2012 at 6:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: If we don't like that, I can think of a couple of other ways to get there, but they have their own downsides: * Instead of trying to detect after-the-fact whether any concurrent WAL

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Simon Riggs
On 7 June 2012 14:59, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Tom Lane t...@sss.pgh.pa.us wrote: there is no guarantee that we'll manage to reach a database state that is consistent with data already flushed out to disk during the last checkpoint. Robert Haas robertmh...@gmail.com

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: Robert Haas robertmh...@gmail.com wrote: I know of real customers who would have suffered real data loss had this code been present in the server version they were using. If that is the concern, then it's a one line fix to add the missing clog flush.

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Simon Riggs
On 7 June 2012 17:27, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: Robert Haas robertmh...@gmail.com wrote: I know of real customers who would have suffered real data loss had this code been present in the server version they were using. If that is the

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Robert Haas
On Thu, Jun 7, 2012 at 12:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Clearly, delaying checkpoint indefinitely would be a high risk choice. But they won't be delayed indefinitely, since changes cause WAL records to be written and that would soon cause another checkpoint. But that's

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: On 7 June 2012 17:27, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: If that is the concern, then it's a one line fix to add the missing clog flush. To where, and what performance impact will that have? To the point

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: ... It's better to have a few unnecessary checkpoints than to risk losing somebody's data, especially since the unnecessary checkpoints only happen with wal_level=hot_standby, but the data loss risk exists for everyone. Yeah, that's another point

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Simon Riggs
On 7 June 2012 18:03, Robert Haas robertmh...@gmail.com wrote: On Thu, Jun 7, 2012 at 12:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Clearly, delaying checkpoint indefinitely would be a high risk choice. But they won't be delayed indefinitely, since changes cause WAL records to be written

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Peter Geoghegan
On 7 June 2012 18:03, Robert Haas robertmh...@gmail.com wrote: On Thu, Jun 7, 2012 at 12:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Clearly, delaying checkpoint indefinitely would be a high risk choice. But they won't be delayed indefinitely, since changes cause WAL records to be written

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-07 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes: On 7 June 2012 18:03, Robert Haas robertmh...@gmail.com wrote: On Thu, Jun 7, 2012 at 12:52 PM, Simon Riggs si...@2ndquadrant.com wrote: Clearly, delaying checkpoint indefinitely would be a high risk choice. But they won't be delayed indefinitely,

[HACKERS] Avoiding adjacent checkpoint records

2012-06-06 Thread Tom Lane
In commit 18fb9d8d21a28caddb72c7ffbdd7b96d52ff9724, Simon modified the rule for when to skip checkpoints on the grounds that not enough activity has happened since the last one. However, that commit left the comment block about it in a nonsensical state: * If this isn't a shutdown or forced

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-06 Thread Robert Haas
On Wed, Jun 6, 2012 at 3:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: In commit 18fb9d8d21a28caddb72c7ffbdd7b96d52ff9724, Simon modified the rule for when to skip checkpoints on the grounds that not enough activity has happened since the last one. However, that commit left the comment block

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Wed, Jun 6, 2012 at 3:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: In commit 18fb9d8d21a28caddb72c7ffbdd7b96d52ff9724, Simon modified the rule for when to skip checkpoints on the grounds that not enough activity has happened since the last one. IIRC,

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-06 Thread Robert Haas
On Wed, Jun 6, 2012 at 4:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: I felt (and still feel) that this was misguided. Looking at it again, I'm inclined to agree. The behavior was entirely correct up until somebody decided to emit a continuing stream of XLOG_RUNNING_XACTS WAL records even when

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Wed, Jun 6, 2012 at 4:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: I felt (and still feel) that this was misguided. Looking at it again, I'm inclined to agree. The behavior was entirely correct up until somebody decided to emit a continuing stream of

Re: [HACKERS] Avoiding adjacent checkpoint records

2012-06-06 Thread Tom Lane
I wrote: Actually, it looks like there is an extremely simple way to handle this, which is to move the call of LogStandbySnapshot (which generates the WAL record in question) to before the checkpoint's REDO pointer is set, but after we have decided that we need a checkpoint. On further