Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-12-01 Thread Heikki Linnakangas
On 24.11.2010 12:48, Heikki Linnakangas wrote: When recovery starts, we fetch the oldestActiveXid from the checkpoint record. Let's say that it's 100. We then start replaying WAL records from the Redo pointer, and the first record (heap insert in your case) contains an Xid that's much larger
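The excerpt above is cut off, so the following is only a toy model of the situation it starts to describe, not PostgreSQL code. It assumes that replay must note every XID between the last one it has observed and the one it meets in a WAL record, and that the tracking array has a fixed capacity. The class and function names are hypothetical; the starting XID 100 comes from the excerpt, the capacity 6890 from the error report in this thread, and the first-record XID 10078 is an illustrative value. XID wraparound is ignored.

```python
class KnownXidTracker:
    """Toy bounded tracker of in-progress XIDs (hypothetical, not PostgreSQL code)."""

    def __init__(self, capacity, latest_observed_xid):
        self.capacity = capacity
        self.latest_observed_xid = latest_observed_xid
        self.xids = []

    def record(self, xid):
        """Note every XID after the last observed one, up to and including xid."""
        if xid <= self.latest_observed_xid:
            return  # already covered, nothing new to track
        new_xids = range(self.latest_observed_xid + 1, xid + 1)
        if len(self.xids) + len(new_xids) > self.capacity:
            raise OverflowError(
                f"tracker full: need {len(self.xids) + len(new_xids)}, "
                f"capacity {self.capacity}"
            )
        self.xids.extend(new_xids)
        self.latest_observed_xid = xid


def first_record_fits(latest_observed_xid, first_record_xid, capacity):
    """Return True if the first replayed record's XID can be tracked."""
    tracker = KnownXidTracker(capacity, latest_observed_xid)
    try:
        tracker.record(first_record_xid)
    except OverflowError:
        return False
    return True


# Starting far behind the first record's XID leaves a gap larger than the array.
print(first_record_fits(100, 10078, 6890))
# Starting just behind it leaves a gap of one entry.
print(first_record_fits(10077, 10078, 6890))
```

In this model the size of the gap, not the number of transactions actually running, decides whether the array overflows.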

Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-24 Thread Heikki Linnakangas
On 24.11.2010 06:56, Joachim Wieland wrote: On Tue, Nov 23, 2010 at 8:45 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 19.11.2010 23:46, Joachim Wieland wrote: FATAL: too many KnownAssignedXids. head: 0, tail: 0, nxids: 9978, pArray->maxKnownAssignedXids: 6890 Hmm,

Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-24 Thread Heikki Linnakangas
On 24.11.2010 12:48, Heikki Linnakangas wrote: On 24.11.2010 06:56, Joachim Wieland wrote: On Tue, Nov 23, 2010 at 8:45 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 19.11.2010 23:46, Joachim Wieland wrote: FATAL: too many KnownAssignedXids. head: 0, tail: 0, nxids:

Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-24 Thread Heikki Linnakangas
On 24.11.2010 13:38, Heikki Linnakangas wrote: "It's dangerous to initialize latestObservedXid to anything to an older value." older value than the nextXid-1 from the checkpoint record, I meant to say. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-24 Thread Simon Riggs
On Wed, 2010-11-24 at 12:48 +0200, Heikki Linnakangas wrote: When recovery starts, we fetch the oldestActiveXid from the checkpoint record. Let's say that it's 100. We then start replaying WAL records from the Redo pointer, and the first record (heap insert in your case) contains an Xid

Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-23 Thread Joachim Wieland
On Tue, Nov 23, 2010 at 8:45 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 19.11.2010 23:46, Joachim Wieland wrote: FATAL:  too many KnownAssignedXids. head: 0, tail: 0, nxids: 9978, pArray->maxKnownAssignedXids: 6890 Hmm, that's a lot of entries in KnownAssignedXids.

Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-22 Thread Joachim Wieland
On Sun, Nov 21, 2010 at 11:48 PM, Fujii Masao masao.fu...@gmail.com wrote: -- If you suspect a bug in Hot Standby, please set trace_recovery_messages = DEBUG2 in postgresql.conf and repeat the action Always useful to know * max_connections * current number of sessions *

Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-22 Thread Heikki Linnakangas
On 19.11.2010 23:46, Joachim Wieland wrote: FATAL: too many KnownAssignedXids. head: 0, tail: 0, nxids: 9978, pArray->maxKnownAssignedXids: 6890 Hmm, that's a lot of entries in KnownAssignedXids. Can you recompile with WAL_DEBUG, and run the recovery again with wal_debug=on? That will print

Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-21 Thread Fujii Masao
On Sat, Nov 20, 2010 at 6:46 AM, Joachim Wieland j...@mcknight.de wrote: I still have the server, if you want me to debug anything or send a patch against 9.0.1 that gives more output, just let me know. Per Simon's previous comment, the following information would be useful.

[HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-19 Thread Joachim Wieland
Hi, I am seeing the following here on 9.0.1 on Linux x86-64: LOG: redo starts at 1F8/FC00E978 FATAL: too many KnownAssignedXids CONTEXT: xlog redo insert: rel 1663/16384/18373; tid 3829898/23 and this is the complete history: postgres was running as HS in foreground, Ctrl-C'ed it for a

Re: [HACKERS] Hot Standby b-tree delete records review

2010-11-09 Thread Simon Riggs
On Tue, 2010-11-09 at 13:34 +0200, Heikki Linnakangas wrote: (cleaning up my inbox, and bumped into this..) On 22.04.2010 12:31, Simon Riggs wrote: On Thu, 2010-04-22 at 12:18 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:

Re: [HACKERS] Hot Standby b-tree delete records review

2010-11-09 Thread Heikki Linnakangas
(cleaning up my inbox, and bumped into this..) On 22.04.2010 12:31, Simon Riggs wrote: On Thu, 2010-04-22 at 12:18 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote: If none of the removed heap tuples were present anymore, we

Re: [HACKERS] Hot Standby tuning for btree_xlog_vacuum()

2010-09-27 Thread Robert Haas
On Thu, Apr 29, 2010 at 4:12 PM, Simon Riggs si...@2ndquadrant.com wrote: Simple tuning of btree_xlog_vacuum() using an idea I had a while back, just never implemented. XXX comments removed. Allows us to avoid reading in blocks during VACUUM replay that are only required for correctness of

[HACKERS] Hot Standby performance and deadlocking

2010-05-25 Thread Simon Riggs
Some performance problems have been reported on HS from two users: Erik and Stefan. The characteristics of those issues have been that performance is * sporadically reduced, though mostly runs at full speed * context switch storms reported as being associated So we're looking for something that

Re: [HACKERS] Hot Standby tuning for btree_xlog_vacuum()

2010-05-17 Thread Jim Nasby
On Apr 29, 2010, at 3:20 PM, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: Objections to commit? This is not the time to be hacking stuff like this. You haven't even demonstrated that there's a significant performance issue here. I tend to agree that this point of the cycle

Re: [HACKERS] Hot Standby tuning for btree_xlog_vacuum()

2010-05-17 Thread Tom Lane
Jim Nasby deci...@decibel.org writes: On Apr 29, 2010, at 3:20 PM, Tom Lane wrote: This is not the time to be hacking stuff like this. You haven't even demonstrated that there's a significant performance issue here. I tend to agree that this point of the cycle isn't a good one to be making

Re: [HACKERS] Hot Standby tuning for btree_xlog_vacuum()

2010-05-17 Thread Simon Riggs
On Mon, 2010-05-17 at 16:10 -0400, Tom Lane wrote: Jim Nasby deci...@decibel.org writes: On Apr 29, 2010, at 3:20 PM, Tom Lane wrote: This is not the time to be hacking stuff like this. You haven't even demonstrated that there's a significant performance issue here. I tend to agree

[HACKERS] Hot Standby tuning for btree_xlog_vacuum()

2010-04-29 Thread Simon Riggs
Simple tuning of btree_xlog_vacuum() using an idea I had a while back, just never implemented. XXX comments removed. Allows us to avoid reading in blocks during VACUUM replay that are only required for correctness of index scans. Objections to commit? -- Simon Riggs

Re: [HACKERS] Hot Standby tuning for btree_xlog_vacuum()

2010-04-29 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: Objections to commit? This is not the time to be hacking stuff like this. You haven't even demonstrated that there's a significant performance issue here. regards, tom lane -- Sent via pgsql-hackers mailing list

[HACKERS] Hot Standby b-tree delete records review

2010-04-22 Thread Heikki Linnakangas
btree_redo: case XLOG_BTREE_DELETE: /* * Btree delete records can conflict with standby queries. You * might think that vacuum records would conflict as well, but * we've handled that already. XLOG_HEAP2_CLEANUP_INFO records

Re: [HACKERS] Hot Standby b-tree delete records review

2010-04-22 Thread Simon Riggs
On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote: btree_redo: case XLOG_BTREE_DELETE: /* * Btree delete records can conflict with standby queries. You * might think that vacuum records would conflict as well, but *

Re: [HACKERS] Hot Standby b-tree delete records review

2010-04-22 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote: btree_redo: /* * Note that if all heap tuples were LP_DEAD then we will be * returning InvalidTransactionId here. This seems very unlikely * in practice. */ If none of the removed heap

Re: [HACKERS] Hot Standby b-tree delete records review

2010-04-22 Thread Simon Riggs
On Thu, 2010-04-22 at 11:28 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote: btree_redo: /* * Note that if all heap tuples were LP_DEAD then we will be * returning InvalidTransactionId here. This seems very unlikely

Re: [HACKERS] Hot Standby b-tree delete records review

2010-04-22 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2010-04-22 at 11:28 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2010-04-22 at 10:24 +0300, Heikki Linnakangas wrote: btree_redo: /* * Note that if all heap tuples were LP_DEAD then we will be * returning InvalidTransactionId here. This

Re: [HACKERS] Hot Standby b-tree delete records review

2010-04-22 Thread Simon Riggs
On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote: If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only queries wouldn't have

Re: [HACKERS] Hot Standby b-tree delete records review

2010-04-22 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote: If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries. But if none of the tuples were present anymore, the read-only

Re: [HACKERS] Hot Standby b-tree delete records review

2010-04-22 Thread Simon Riggs
On Thu, 2010-04-22 at 12:18 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote: If none of the removed heap tuples were present anymore, we currently return InvalidTransactionId, which kills/waits out all read-only queries.

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-14 Thread Simon Riggs
On Tue, 2010-04-13 at 17:18 +0300, Heikki Linnakangas wrote: I've reviewed your changes and they look correct to me; the main chunk of code is mine and that was tested by me. Ok, committed after fixing an obsoleted comment and other small editorialization. Looks good, thanks. -- Simon

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-13 Thread Heikki Linnakangas
Simon Riggs wrote: On Thu, 2010-04-08 at 19:02 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: OK, that seems better. I'm happy with that instead. Have you tested this? Is it ready to commit? Only very briefly. I think the code is ready, but please review and test to see I didn't miss

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-08 Thread Simon Riggs
On Tue, 2010-04-06 at 10:22 +0100, Simon Riggs wrote: Initial patch. I will be testing over next day. No commit before at least midday on Wed 7 Apr. Various previous discussions sidelined a very important point: what exactly does it mean to start recovery from a shutdown checkpoint? If

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-08 Thread Heikki Linnakangas
Simon Riggs wrote: On Tue, 2010-04-06 at 10:22 +0100, Simon Riggs wrote: Initial patch. I will be testing over next day. No commit before at least midday on Wed 7 Apr. Various previous discussions sidelined a very important point: what exactly does it mean to start recovery from a

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-08 Thread Simon Riggs
On Thu, 2010-04-08 at 13:33 +0300, Heikki Linnakangas wrote: If standby_mode is enabled and there is no source of WAL, then we get a stream of messages saying LOG: record with zero length at 0/C88 ... but most importantly we never get to the main recovery loop, so Hot

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-08 Thread Robert Haas
On Thu, Apr 8, 2010 at 6:16 AM, Simon Riggs si...@2ndquadrant.com wrote: If standby_mode is enabled and there is no source of WAL, then we get a stream of messages saying LOG:  record with zero length at 0/C88 ... but most importantly we never get to the main recovery loop, so Hot

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-08 Thread Heikki Linnakangas
Simon Riggs wrote: In StartupXlog() when we get to the point where we Find the first record that logically follows the checkpoint, in the current code ReadRecord() loops forever, spitting out LOG: record with zero length at 0/C88 ... That prevents us from going further down

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-08 Thread Simon Riggs
On Thu, 2010-04-08 at 18:35 +0300, Heikki Linnakangas wrote: So I have introduced the new mode (snapshot mode) to enter hot standby anyway. That avoids us having to screw around with the loop logic for redo. I don't see any need to support the case of where we have no WAL source

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-08 Thread Heikki Linnakangas
Simon Riggs wrote: OK, that seems better. I'm happy with that instead. Have you tested this? Is it ready to commit? Only very briefly. I think the code is ready, but please review and test to see I didn't miss anything. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-08 Thread Simon Riggs
On Thu, 2010-04-08 at 19:02 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: OK, that seems better. I'm happy with that instead. Have you tested this? Is it ready to commit? Only very briefly. I think the code is ready, but please review and test to see I didn't miss anything. I'm

[HACKERS] Hot Standby: Startup at shutdown checkpoint

2010-04-06 Thread Simon Riggs
Initial patch. I will be testing over next day. No commit before at least midday on Wed 7 Apr. The existing call to PrescanPreparedTransactions() looks correct to me but the comment is wrong. I will change that also, if we agree. -- Simon Riggs www.2ndQuadrant.com diff --git

[HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-03-02 Thread Marc Munro
On Mon, 2010-03-01 at 16:12 -0400, pgsql-hackers-ow...@postgresql.org wrote: . . . However there is a concern with max_standby_age. If you set it to, say, 300s. Then run a 300s query on the slave which causes the slave to fall 299s behind. Now you start a new query on the slave -- it gets a

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-03-02 Thread Josh Berkus
On 3/2/10 12:47 PM, Marc Munro wrote: To take it further still, if vacuum on the master could be prevented from touching records that are less than max_standby_delay seconds old, it would be safe to apply WAL from the very latest vacuum. I guess HOT could be handled similarly though that may

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-03-02 Thread Simon Riggs
On Tue, 2010-03-02 at 12:47 -0800, Marc Munro wrote: IIUC this is only a problem for WAL from HOT updates and vacuums. If no vacuums or HOT updates have been performed, there is no risk of returning bad data. So WAL that does not contain HOT updates or vacuums could be applied on the

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-28 Thread Simon Riggs
On Fri, 2010-02-26 at 16:44 -0500, Tom Lane wrote: Greg Stark gsst...@mit.edu writes: On Fri, Feb 26, 2010 at 9:19 PM, Tom Lane t...@sss.pgh.pa.us wrote: There's *definitely* not going to be enough information in the WAL stream coming from a master that doesn't think it has HS slaves. We

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-28 Thread Simon Riggs
On Fri, 2010-02-26 at 03:33 -0500, Greg Smith wrote: I really hope this discussion can stay focused on if and how it's possible to improve this area, with the goal being to deliver a product everyone can be proud of with the full feature set that makes this next release a killer one. The

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Heikki Linnakangas
Dimitri Fontaine wrote: Bruce Momjian br...@momjian.us writes: Doesn't the system already adjust the delay based on the length of slave transactions, e.g. max_standby_delay. It seems there is no need for a user switch --- just max_standby_delay really high. Well that GUC looks like it

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Dimitri Fontaine wrote: Bruce Momjian br...@momjian.us writes: Doesn't the system already adjust the delay based on the length of slave transactions, e.g. max_standby_delay. It seems there is no need for a user switch --- just max_standby_delay really high. Well

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Bruce Momjian
Greg Smith wrote: Joshua D. Drake wrote: On Sat, 27 Feb 2010 00:43:48 +, Greg Stark gsst...@mit.edu wrote: I want my ability to run large batch queries without any performance or reliability impact on the primary server. +1 I can use any number of other technologies

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Josh Berkus
Greg, If you think of it in those terms, the idea that you need to run PITR backup/archive recovery to not get that behavior isn't an important distinction anymore. If you run SR with the option enabled you could get it, any other setup and you won't. +1. I always expected that we'd get

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Greg Smith
Josh Berkus wrote: Now that I think about it, the xmin thing really doesn't seem conceptually difficult. If the slave just opens a 2nd, special query connection back to the master and publishes its oldest xmin there, as far as the master is concerned, it's just another query backend. Could it

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Josh Berkus
The part I still don't have good visibility on is how much of the necessary SR infrastructure needed to support this communications channel is already available in some form. I had thought the walsender on the master was already receiving messages sometimes from the walreceiver on the

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Josh Berkus
Thank you for combining a small personal attack with a selfish commentary about how yours is the only valid viewpoint. Saves me a lot of trouble replying to your messages, can just ignore them instead if this is how you're going to act. Hey, take it easy! I read Stark's post as

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Greg Smith
Josh Berkus wrote: Hey, take it easy! I read Stark's post as tongue-in-cheek, which I think it was. Yeah, I didn't get that. We've already exchanged mutual off-list apologies for the misunderstanding in both directions, I stopped just short of sending flowers. I did kick off this

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Bruce Momjian
Greg Smith wrote: Josh Berkus wrote: Now that I think about it, the xmin thing really doesn't seem conceptually difficult. If the slave just opens a 2nd, special query connection back to the master and publishes its oldest xmin there, as far as the master is concerned, it's just

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Greg Smith
Bruce Momjian wrote: The first option is to connect to the primary server and keep a query active for as long as needed to run queries on the standby. This guarantees that a WAL cleanup record is never generated and query conflicts do not occur, as described above. This could be done using

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-27 Thread Greg Smith
Greg Stark wrote: On Sun, Feb 28, 2010 at 5:28 AM, Greg Smith g...@2ndquadrant.com wrote: The idea of the workaround is that if you have a single long-running query to execute, and you want to make sure it doesn't get canceled because of a vacuum cleanup, you just have it connect back to the

[HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
I'm happy to see we've crossed the point where the worst of the Hot Standby and Streaming Replication issues are sorted out. A look at the to-do lists: http://wiki.postgresql.org/wiki/Hot_Standby_TODO http://wiki.postgresql.org/wiki/Streaming_Replication show no Must-fix items and 5 Serious

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Heikki Linnakangas
Greg Smith wrote: Attached is a tar file with some test case demo scripts that demonstrate the worst of the problems here IMHO. Thanks for that! We've been discussing this for ages, so it's nice to have a concrete example. I don't want to belittle that work because it's been important to make

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Richard Huxton
On 26/02/10 08:33, Greg Smith wrote: There are a number of HS tunables that interact with one another, and depending on your priorities a few ways you can try to optimize the configuration for what I expect to be common use cases for this feature. I've written a blog entry at

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Richard Huxton
On 26/02/10 14:10, Heikki Linnakangas wrote: Ideally the standby would stash away the old pages or tuples somewhere so that it can still access them even after replaying the WAL records that remove them from the main storage. I realize that's not going to happen any time soon because it's hard

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Heikki Linnakangas
Richard Huxton wrote: On 26/02/10 08:33, Greg Smith wrote: I'm not sure what you might be expecting from the above combination, but what actually happens is that many of the SELECT statements on the table *that isn't even being updated* are canceled. You see this in the logs: Hmm - this I'd

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Richard Huxton
On 26/02/10 14:45, Heikki Linnakangas wrote: Richard Huxton wrote: On 26/02/10 08:33, Greg Smith wrote: I'm not sure what you might be expecting from the above combination, but what actually happens is that many of the SELECT statements on the table *that isn't even being updated* are

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Heikki Linnakangas
Richard Huxton wrote: Can we not wait to cancel the transaction until *any* new lock is attempted though? That should protect all the single-statement long-running transactions that are already underway. Aggregates etc. Hmm, that's an interesting thought. You'll still need to somehow tell the

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Richard Huxton
Replying to my own post - first sign of madness... Let's see if I've got the concepts clear here, and hopefully my thinking it through will help others reading the archives. There are two queues: 1. Cleanup on the master 2. Replay on the slave Running write queries on the master adds to both

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Robert Haas
On Fri, Feb 26, 2010 at 10:21 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Richard Huxton wrote: Can we not wait to cancel the transaction until *any* new lock is attempted though? That should protect all the single-statement long-running transactions that are already

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Bruce Momjian
Heikki Linnakangas wrote: How to handle situations where the standby goes away for a while, such as a network outage, so that it doesn't block the master from ever cleaning up dead tuples is a concern. Yeah, that's another issue that needs to be dealt with. You'd probably need some kind

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Josh Berkus
On 2/26/10 6:57 AM, Richard Huxton wrote: Can we not wait to cancel the transaction until *any* new lock is attempted though? That should protect all the single-statement long-running transactions that are already underway. Aggregates etc. I like this approach. Is it fragile in some

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: On Fri, Feb 26, 2010 at 7:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: I don't see a substantial additional burden there.  What I would imagine is needed is that the slave transmits a single number back --- its current oldest xmin --- and the walsender process

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Josh Berkus
Well, as Heikki said, a stop-and-go WAL management approach could deal with that use-case. What I'm concerned about here is the complexity, reliability, maintainability of trying to interlock WAL application with slave queries in any sort of fine-grained fashion. This sounds a bit

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: Why shouldn't it have any queries at walreceiver startup? It has any xlog segments that were copied from the master and any it can find in the archive, it could easily reach a consistent point long before it needs to connect to the master. If you really want

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Bruce Momjian
bruce wrote: 4 The standby waiting longer than max_standby_delay to acquire a ... #4 can be controlled by max_standby_delay, where a large value only delays playback during crash recovery --- again, a rare occurrence. One interesting feature is that max_standby_delay will _only_ delay

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes: Well, as Heikki said, a stop-and-go WAL management approach could deal with that use-case. What I'm concerned about here is the complexity, reliability, maintainability of trying to interlock WAL application with slave queries in any sort of fine-grained

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: On Fri, Feb 26, 2010 at 9:19 PM, Tom Lane t...@sss.pgh.pa.us wrote: There's *definitely* not going to be enough information in the WAL stream coming from a master that doesn't think it has HS slaves. We can't afford to record all that extra stuff in

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Bruce Momjian
Dimitri Fontaine wrote: Tom Lane t...@sss.pgh.pa.us writes: Well, as Heikki said, a stop-and-go WAL management approach could deal with that use-case. What I'm concerned about here is the complexity, reliability, maintainability of trying to interlock WAL application with slave queries

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Dimitri Fontaine
Bruce Momjian br...@momjian.us writes: Doesn't the system already adjust the delay based on the length of slave transactions, e.g. max_standby_delay. It seems there is no need for a user switch --- just max_standby_delay really high. Well that GUC looks like it allows to set a compromise

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Bruce Momjian
Dimitri Fontaine wrote: Bruce Momjian br...@momjian.us writes: Doesn't the system already adjust the delay based on the length of slave transactions, e.g. max_standby_delay. It seems there is no need for a user switch --- just max_standby_delay really high. Well that GUC looks like it

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Bruce Momjian wrote: Doesn't the system already adjust the delay based on the length of slave transactions, e.g. max_standby_delay. It seems there is no need for a user switch --- just max_standby_delay really high. The first issue is that you're basically saying I don't care about high

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Bruce Momjian wrote: 5 Early cleanup of data still visible to the current query's snapshot #5 could be handled by using vacuum_defer_cleanup_age on the master. Why is vacuum_defer_cleanup_age not listed in postgresql.conf? I noticed that myself and fired off a

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Bruce Momjian
Greg Smith wrote: Bruce Momjian wrote: Doesn't the system already adjust the delay based on the length of slave transactions, e.g. max_standby_delay. It seems there is no need for a user switch --- just max_standby_delay really high. The first issue is that you're basically saying

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Stark
On Fri, Feb 26, 2010 at 11:56 PM, Greg Smith g...@2ndquadrant.com wrote: This is also the reason why the whole pause recovery idea is a fruitless path to wander down.  The whole point of this feature is that people have a secondary server available for high-availability, *first and foremost*,

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Bruce Momjian
Greg Smith wrote: You can think of the idea of passing an xmin back from the standby as being like an auto-tuning vacuum_defer_cleanup_age. It's 0 when no standby queries are running, but grows in size to match longer ones. And you don't have to have to know anything to set it correctly;
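The "auto-tuning vacuum_defer_cleanup_age" analogy in the excerpt above can be read as a small toy model. This is only a sketch under stated assumptions, not a description of how any PostgreSQL release implements it: the function name and parameters are hypothetical, XIDs are treated as plain integers, and wraparound is ignored. It assumes the standby reports its oldest xmin (or nothing when it runs no queries) and that the master holds cleanup back by however far that xmin lags its own.

```python
def effective_defer_age(master_oldest_xmin, standby_oldest_xmin=None):
    """Toy model: how many XIDs of cleanup the master would defer.

    standby_oldest_xmin is None when no standby queries are running
    (hypothetical convention for this sketch).
    """
    if standby_oldest_xmin is None:
        return 0  # nothing on the standby needs old rows
    # Defer only as far as the standby's oldest snapshot lags the master.
    return max(0, master_oldest_xmin - standby_oldest_xmin)


# No standby queries: no deferral.
print(effective_defer_age(5000))
# A long-running standby query with an older snapshot: deferral grows to match.
print(effective_defer_age(5000, 4200))
```

The point of the model is that the deferral is derived from what the standby is actually running rather than from a value the administrator has to guess in advance.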

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Greg Stark wrote: Well you can go sit in the same corner as Simon with your high availability servers. I want my ability to run large batch queries without any performance or reliability impact on the primary server. Thank you for combining a small personal attack with a selfish

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Stark
On Sat, Feb 27, 2010 at 1:53 AM, Greg Smith g...@2ndquadrant.com wrote: Greg Stark wrote: Well you can go sit in the same corner as Simon with your high availability servers. I want my ability to run large batch queries without any performance or reliability impact on the primary server.

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Bruce Momjian wrote: Well, I think the choice is either you delay vacuum on the master for 8 hours or pile up 8 hours of WAL files on the slave, and delay application, and make recovery much slower. It is not clear to me which option a user would prefer because the bloat on the master might be

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Heikki Linnakangas wrote: One such landmine is that the keepalives need to flow from client to server while the WAL records are flowing from server to client. We'll have to crack that problem for synchronous replication too, but I think that alone is a big enough problem to make this 9.1

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Bruce Momjian
Greg Smith wrote: Heikki Linnakangas wrote: One such landmine is that the keepalives need to flow from client to server while the WAL records are flowing from server to client. We'll have to crack that problem for synchronous replication too, but I think that alone is a big enough problem

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Greg Stark wrote: Eh? That's not what I meant at all. Actually it's kind of the exact opposite of what I meant. Sorry about that--I think we just hit one of those language usage drift bits of confusion. Sit in the corner has a very negative tone to it in US English and I interpreted your

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Stark
On Sat, Feb 27, 2010 at 2:43 AM, Greg Smith g...@2ndquadrant.com wrote: But if you're running the 8 hour report on the master right now, aren't you already exposed to a similar pile of bloat issues while it's going?  If I have the choice between sometimes queries will get canceled vs.

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Joshua D. Drake
On Sat, 27 Feb 2010 00:43:48 +, Greg Stark gsst...@mit.edu wrote: On Fri, Feb 26, 2010 at 11:56 PM, Greg Smith g...@2ndquadrant.com wrote: This is also the reason why the whole pause recovery idea is a fruitless path to wander down.  The whole point of this feature is that people have a

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Joshua D. Drake wrote: On Sat, 27 Feb 2010 00:43:48 +, Greg Stark gsst...@mit.edu wrote: I want my ability to run large batch queries without any performance or reliability impact on the primary server. +1 I can use any number of other technologies for high availability.

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Greg Stark wrote: But if they move from having a plain old PITR warm standby to having one they can run queries on they might well assume that the big advantage of having the standby to play with is precisely that they can do things there that they have never been able to do on the master

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Aidan Van Dyk
* Greg Smith g...@2ndquadrant.com [100226 23:39]: Just not having the actual query running on the master is such a reduction in damage that I think it's delivering the essence of what people are looking for regardless. That it might be possible in some cases to additionally avoid the

Re: [HACKERS] Hot Standby query cancellation and Streaming Replication integration

2010-02-26 Thread Greg Smith
Aidan Van Dyk wrote: Would we (ya, the royal we) be willing to say that if you want the benefit of removing the MVCC overhead of long-running queries you need to run PITR backup/archive recovery, and if you want SR, you get a closed-loop master-follows-save-xmin behaviour? To turn that

Re: [HACKERS] Hot standby documentation

2010-02-08 Thread Bruce Momjian
Markus Wanner wrote: Bruce, Bruce Momjian wrote: Ah, I now realize it only mentions warm standby, not hot, so I just updated the documentation to reflect that; you can see it here: Maybe the table below also needs an update, because unlike Warm Standby using PITR, a hot standby

Re: [HACKERS] Hot standby documentation

2010-02-08 Thread Fujii Masao
On Mon, Feb 8, 2010 at 10:34 PM, Bruce Momjian br...@momjian.us wrote: Ahh, good point.  I had not considered the table would change.  What I did was to mark Slaves accept read-only queries as Hot only. Can the warm standby still reside in v9.0? If not, the mark of Hot only seems odd for me.

Re: [HACKERS] Hot standby documentation

2010-02-08 Thread Bruce Momjian
Fujii Masao wrote: On Mon, Feb 8, 2010 at 10:34 PM, Bruce Momjian br...@momjian.us wrote: Ahh, good point. I had not considered the table would change. What I did was to mark Slaves accept read-only queries as Hot only. Can the warm standby still reside in v9.0? If not, the mark of Hot

Re: [HACKERS] Hot standby documentation

2010-02-07 Thread Markus Wanner
Bruce, Bruce Momjian wrote: Ah, I now realize it only mentions warm standby, not hot, so I just updated the documentation to reflect that; you can see it here: Maybe the table below also needs an update, because unlike Warm Standby using PITR, a hot standby accepts read-only queries and can

Re: [HACKERS] Hot standby documentation

2010-02-07 Thread Robert Haas
On Sun, Feb 7, 2010 at 4:41 AM, Markus Wanner mar...@bluegap.ch wrote: Bruce Momjian wrote: Do we want to call the feature hot standby?  Is a read-only standby a standby or a slave? I think hot standby is pretty much the term, now. See here for the previous iteration of this discussion:

Re: [HACKERS] Hot Standby and DROP DATABASE

2010-02-07 Thread Simon Riggs
On Sat, 2010-02-06 at 17:32 +0100, Andres Freund wrote: So it seems at least the behavior is quite different from what the docs state. Am I missing something here? It's a small bug/typo in standby.c:ResolveRecoveryConflictWithDatabase The line: CancelDBBackends(dbid,

Re: [HACKERS] Hot standby documentation

2010-02-07 Thread Josh Berkus
I've always thought this feature was misnamed and nothing has happened to change my mind, but it's not clear whether I'm in the majority. I'm afraid force of habit is more powerful than correctness on this one. It's going to be HS/SR whether that's perfectly correct or not. --Josh Berkus

Re: [HACKERS] Hot standby documentation

2010-02-07 Thread David E. Wheeler
On Feb 7, 2010, at 12:35 PM, Josh Berkus wrote: I've always thought this feature was misnamed and nothing has happened to change my mind, but it's not clear whether I'm in the majority. I'm afraid force of habit is more powerful than correctness on this one. It's going to be HS/SR whether

Re: [HACKERS] Hot Standby and DROP DATABASE

2010-02-06 Thread Andres Freund
On Saturday 06 February 2010 02:25:33 Tatsuo Ishii wrote: Hi, While testing Hot Standby, I have encountered strange behavior with DROP DATABASE command. 1) connect to test database at standby via psql 2) issue DROP DATABASE test command to primary 3) session #1 works fine 4) close
