Re: [HACKERS] Hot Standby remaining issues

2009-12-04 Thread Heikki Linnakangas
Regarding this item from the wiki page:
 The standby delay is measured as current timestamp - timestamp of last 
 replayed commit record. If there's little activity in the master, that can 
 lead to surprising results. For example, imagine that max_standby_delay is 
 set to 8 hours. The standby is fully up-to-date with the master, and there's 
 no write activity in master. After 10 hours, a long reporting query is 
 started in the standby. Ten minutes later, a small transaction is executed in 
 the master that conflicts with the reporting query. I would expect the 
 reporting query to be canceled 8 hours after the conflicting transaction 
 began, but it is in fact canceled immediately, because it's over 8 hours 
 since the last commit record was replayed.
 
 * Simon says... changed to allow checkpoints to update recoveryLastXTime 
 (Simon DONE) 

Update recoveryLastXTime at checkpoints doesn't help when the master is
completely idle, because we skip checkpoints in that case. It's better
than nothing, of course.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-04 Thread Simon Riggs
On Fri, 2009-12-04 at 10:37 +0200, Heikki Linnakangas wrote:
 Regarding this item from the wiki page:
  The standby delay is measured as current timestamp - timestamp of last 
  replayed commit record. If there's little activity in the master, that can 
  lead to surprising results. For example, imagine that max_standby_delay is 
  set to 8 hours. The standby is fully up-to-date with the master, and 
  there's no write activity in master. After 10 hours, a long reporting query 
  is started in the standby. Ten minutes later, a small transaction is 
  executed in the master that conflicts with the reporting query. I would 
  expect the reporting query to be canceled 8 hours after the conflicting 
  transaction began, but it is in fact canceled immediately, because it's 
  over 8 hours since the last commit record was replayed.
  
  * Simon says... changed to allow checkpoints to update 
  recoveryLastXTime (Simon DONE) 
 
 Update recoveryLastXTime at checkpoints doesn't help when the master is
 completely idle, because we skip checkpoints in that case. It's better
 than nothing, of course.

Not if archive_timeout is set, which it would be in warm standby case.
We can do even better than this with SR.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-04 Thread Heikki Linnakangas
Simon Riggs wrote:
 On Fri, 2009-12-04 at 10:37 +0200, Heikki Linnakangas wrote:
 Regarding this item from the wiki page:
 The standby delay is measured as current timestamp - timestamp of last 
 replayed commit record. If there's little activity in the master, that can 
 lead to surprising results. For example, imagine that max_standby_delay is 
 set to 8 hours. The standby is fully up-to-date with the master, and 
 there's no write activity in master. After 10 hours, a long reporting query 
 is started in the standby. Ten minutes later, a small transaction is 
 executed in the master that conflicts with the reporting query. I would 
 expect the reporting query to be canceled 8 hours after the conflicting 
 transaction began, but it is in fact canceled immediately, because it's 
 over 8 hours since the last commit record was replayed.

 * Simon says... changed to allow checkpoints to update 
 recoveryLastXTime (Simon DONE) 
 Update recoveryLastXTime at checkpoints doesn't help when the master is
 completely idle, because we skip checkpoints in that case. It's better
 than nothing, of course.
 
 Not if archive_timeout is set, which it would be in warm standby case.
 We can do even better than this with SR.

If the system is completely idle, and no WAL is written, we skip xlog
switches too, even if archive_timeout is set . It would be pointless to
create a stream of WAL files with no content except for the XLOG_SWITCH
records.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-04 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 
 If the system is completely idle, and no WAL is written, we skip
 xlog switches too, even if archive_timeout is set . It would be
 pointless to create a stream of WAL files with no content except
 for the XLOG_SWITCH records.
 
It's not pointless if you want to monitor that your backup system is
healthy.  This was previously mentioned a couple years ago:
 
http://archives.postgresql.org/pgsql-general/2007-10/msg01448.php
 
It turns out that it's been working fine under 8.3.  Of course, we
can always add a crontab job to do some small bogus update to force
WAL switches, so it's not the end of the world if we lose the 8.3
behavior; but my preference would be that if a WAL switch interval
is specified, the WAL files switch at least that often.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-02 Thread Heikki Linnakangas
Heikki Linnakangas wrote:
 Simon Riggs wrote:
 @@ -654,10 +656,13 @@ LockAcquire(const LOCKTAG *locktag,
  elog(PANIC, lock table corrupted);
  }
  LWLockRelease(partitionLock);
 -ereport(ERROR,
 -(errcode(ERRCODE_OUT_OF_MEMORY),
 - errmsg(out of shared memory),
 -  errhint(You might need to increase 
 max_locks_per_transaction.)));
 +if (reportLockTableError)
 +ereport(ERROR,
 +(errcode(ERRCODE_OUT_OF_MEMORY),
 + errmsg(out of shared memory),
 +  errhint(You might need to increase 
 max_locks_per_transaction.)));
 +else
 +return LOCKACQUIRE_NOT_AVAIL;
  }
  locallock-proclock = proclock;
  
 
 That seems dangerous when dontWait==false.

Ah, I see now that you're only setting reportLockTableError just before
you call LockAcquire, and reset it afterwards. It's safe then, but it
should rather be another argument to the function, as how the global
variable is really being used.

The patch doesn't actually fix the issue it was supposed to fix. If a
read-only transaction holds a lot of locks, consuming so much lock space
that there's none left for the startup process to hold the lock it
wants, it will abort and bring down postmaster. The patch attempts to
kill any conflicting lockers, but those are handled fine already (if
there's any conflicting locks, LockAcquire will return
LOCKACQUIRE_NOT_AVAIL anyway). The problem is with non-conflicting locks
using up the lock space.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-02 Thread Simon Riggs
On Wed, 2009-12-02 at 12:49 +0200, Heikki Linnakangas wrote:

 If a read-only transaction holds a lot of locks, consuming so much
 lock space that there's none left for the startup process to hold the
 lock it wants, it will abort and bring down postmaster. The patch
 attempts to kill any conflicting lockers, but those are handled fine
 already (if there's any conflicting locks, LockAcquire will return
 LOCKACQUIRE_NOT_AVAIL anyway). The problem is with non-conflicting
 locks using up the lock space.

Oh dear, another nuke 'em all from orbit scenario. Will do.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-02 Thread Heikki Linnakangas
Simon Riggs wrote:
 On Wed, 2009-12-02 at 12:49 +0200, Heikki Linnakangas wrote:
 
 If a read-only transaction holds a lot of locks, consuming so much
 lock space that there's none left for the startup process to hold the
 lock it wants, it will abort and bring down postmaster. The patch
 attempts to kill any conflicting lockers, but those are handled fine
 already (if there's any conflicting locks, LockAcquire will return
 LOCKACQUIRE_NOT_AVAIL anyway). The problem is with non-conflicting
 locks using up the lock space.
 
 Oh dear, another nuke 'em all from orbit scenario. Will do.

Yeah. This case is much like the OOM killer on Linux. Not really nuke
'em all but nuke someone, don't care who..

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-02 Thread Simon Riggs
On Wed, 2009-12-02 at 16:41 +, Simon Riggs wrote:
 On Tue, 2009-12-01 at 20:26 +0200, Heikki Linnakangas wrote:
  Simon Riggs wrote:
   commit 02c3eadb766201db084b668daa271db4a900adc9
   Author: Simon Riggs sri...@ebony.(none)
   Date:   Sat Nov 28 06:23:33 2009 +
   
   Added wal_standby_info GUC to turn RM_STANDBY_ID messages on/off.
   Various comments added also.
   
  
  This patch makes it unsafe to start hot standby mode from a shutdown
  checkpoint, because we don't know if wal_standby_info was enabled in the
  master.

Hmm, what happens if someone enables wal_standby_info in postgresql.conf
while the server is shutdown. It would still be a valid starting point
in that case. I'll just make a note, I think.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-02 Thread Heikki Linnakangas
Simon Riggs wrote:
 Hmm, what happens if someone enables wal_standby_info in postgresql.conf
 while the server is shutdown. It would still be a valid starting point
 in that case.

Yeah, true.

 I'll just make a note, I think.

Yeah, a manual (or automatic, if you just wait) checkpoint will produce
a new checkpoint record showing that it's safe to start standby again.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-12-01 Thread Heikki Linnakangas
Simon Riggs wrote:
 commit 02c3eadb766201db084b668daa271db4a900adc9
 Author: Simon Riggs sri...@ebony.(none)
 Date:   Sat Nov 28 06:23:33 2009 +
 
 Added wal_standby_info GUC to turn RM_STANDBY_ID messages on/off.
 Various comments added also.
 

This patch makes it unsafe to start hot standby mode from a shutdown
checkpoint, because we don't know if wal_standby_info was enabled in the
master.

I still don't think we need the GUC. But for future-proofing, perhaps we
should add a flag to shutdown checkpoint records, indicating whether
it's safe to start hot standby from it. That way, if we decide to add a
GUC like that at a later stage, we don't need to change the on-disk format.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-11-30 Thread Heikki Linnakangas
Simon Riggs wrote:
 @@ -654,10 +656,13 @@ LockAcquire(const LOCKTAG *locktag,
   elog(PANIC, lock table corrupted);
   }
   LWLockRelease(partitionLock);
 - ereport(ERROR,
 - (errcode(ERRCODE_OUT_OF_MEMORY),
 -  errmsg(out of shared memory),
 -   errhint(You might need to increase 
 max_locks_per_transaction.)));
 + if (reportLockTableError)
 + ereport(ERROR,
 + (errcode(ERRCODE_OUT_OF_MEMORY),
 +  errmsg(out of shared memory),
 +   errhint(You might need to increase 
 max_locks_per_transaction.)));
 + else
 + return LOCKACQUIRE_NOT_AVAIL;
   }
   locallock-proclock = proclock;
  

That seems dangerous when dontWait==false.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-11-27 Thread Simon Riggs
On Wed, 2009-11-25 at 13:00 +0200, Heikki Linnakangas wrote:
 I've put up a wiki page with the issues I see with the patch as it
 stands. They're roughly categorized by seriousness.
 
 http://wiki.postgresql.org/wiki/Hot_Standby_TODO
 
 New issues can and probably will still pop up, let's add them to the
 list as they're found so that we know what still needs to be done.
 
 You had a list of work items at the hot standby main page, but I believe
 it's badly out-of-date. Please move any still relevant items to the
 above list, if any.

4 changes on TODO included here, including all must-fix issues. This is
a combined patch. Will push changes to git also, so each commit is
visible separately.

Lots of wiki comments added.

-- 
 Simon Riggs   www.2ndQuadrant.com
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 75aca23..5af05fe 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1985,7 +1985,7 @@ if (!triggered)
  /listitem
 	 listitem
 	  para
-   LOCK TABLE, though only when explicitly IN ACCESS SHARE MODE
+   LOCK TABLE, though only when explicitly in one of these modes: ACCESS SHARE, ROW SHARE or ROW EXCLUSIVE.
   /para
  /listitem
 	 listitem
@@ -2033,7 +2033,7 @@ if (!triggered)
 	 listitem
 	  para
LOCK TABLE, in short default form, since it requests ACCESS EXCLUSIVE MODE.
-   LOCK TABLE that explicitly requests a lock other than ACCESS SHARE MODE.
+   LOCK TABLE that explicitly requests a mode higher than ROW EXCLUSIVE MODE.
   /para
  /listitem
 	 listitem
@@ -2110,10 +2110,10 @@ if (!triggered)
 
para
 	In recovery, transactions will not be permitted to take any table lock
-	higher than AccessShareLock. In addition, transactions may never assign
+	higher than RowExclusiveLock. In addition, transactions may never assign
 	a TransactionId and may never write WAL.
 	Any LOCK TABLE command that runs on the standby and requests a specific
-	lock type other than AccessShareLock will be rejected.
+	lock mode higher than ROW EXCLUSIVE MODE will be rejected.
/para
 
para
@@ -2207,15 +2207,16 @@ if (!triggered)
 
para
 	We have a number of choices for resolving query conflicts.  The default
-	is that we wait and hope the query completes. The server will wait automatically until the lag between
-	primary and standby is at most max_standby_delay seconds. Once that grace
-	period expires, we take one of the following actions:
+	is that we wait and hope the query completes. The server will wait
+	automatically until the lag between primary and standby is at most 
+	max_standby_delay seconds. Once that grace period expires, we take one
+	of the following actions:
 
 	  itemizedlist
 	   listitem
 	para
-		 If the conflict is caused by a lock, we cancel the standby transaction
-		 immediately, even if it is idle-in-transaction.
+		 If the conflict is caused by a lock, we cancel the conflicting standby
+		 transaction immediately, even if it is idle-in-transaction.
 	/para
 	   /listitem
 	   listitem
@@ -2224,7 +2225,9 @@ if (!triggered)
 		 that a conflict has occurred and that it must cancel itself to avoid the
 		 risk that it silently fails to read relevant data because
 		 that data has been removed. (This is very similar to the much feared
-		 error message snapshot too old).
+		 error message snapshot too old). Some cleanup records only cause
+		 conflict with older queries, though some types of cleanup record
+		 affect all queries.
 	/para
 
 	para
@@ -2364,6 +2367,9 @@ LOG:  database system is ready to accept read only connections
para
 	As a result, you cannot create additional indexes that exist solely
 	on the standby, nor can statistics that exist solely on the standby.
+	If these administrator commands are needed they should be executed
+	on the primary so that the changes will propagate through to the
+	standby.
/para
 
para
@@ -2373,7 +2379,8 @@ LOG:  database system is ready to accept read only connections
 	managed by the Startup process that owns all AccessExclusiveLocks held
 	by transactions being replayed by recovery. pg_stat_activity does not
 	show an entry for the Startup process, nor do recovering transactions
-	show as active.
+	show as active. Note that Startup process does not acquire locks to
+	make database changes and thus no locks are shown in pg_locks.
/para
 
para
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 364756d..05e1c5e 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -659,7 +659,10 @@ _bt_page_recyclable(Page page)
  * order when replaying the effects of a VACUUM, just as we do for the
  * original VACUUM itself. lastBlockVacuumed allows us to tell whether an
  * intermediate range of blocks has had no changes at all by VACUUM,
- * and so must be scanned anyway during replay.
+ * and so must be scanned anyway during replay. We always 

[HACKERS] Hot Standby remaining issues

2009-11-25 Thread Heikki Linnakangas
I've put up a wiki page with the issues I see with the patch as it
stands. They're roughly categorized by seriousness.

http://wiki.postgresql.org/wiki/Hot_Standby_TODO

New issues can and probably will still pop up, let's add them to the
list as they're found so that we know what still needs to be done.

You had a list of work items at the hot standby main page, but I believe
it's badly out-of-date. Please move any still relevant items to the
above list, if any.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby remaining issues

2009-11-25 Thread Simon Riggs
On Wed, 2009-11-25 at 13:00 +0200, Heikki Linnakangas wrote:
 I've put up a wiki page with the issues I see with the patch as it
 stands. They're roughly categorized by seriousness.
 
 http://wiki.postgresql.org/wiki/Hot_Standby_TODO
 
 New issues can and probably will still pop up, let's add them to the
 list as they're found so that we know what still needs to be done.
 
 You had a list of work items at the hot standby main page, but I believe
 it's badly out-of-date. Please move any still relevant items to the
 above list, if any.

I've linked the two pages together and identified the ones I'm currently
working on, plus added a few comments.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers