[GENERAL] Failover, Wal Logging, and Multiple Spares

2009-08-16 Thread Bryan Murphy
Assuming we are running a Postgres instance that is shipping log files to 2
or more warm spares, is there a way I can fail over to one of the spares,
and have the second spare start receiving updates from the new master
without missing a beat?  I can live with losing the old master, and at least
at the moment it would be a controlled failover, but I would like to know
if it's possible during an uncontrolled failover as well (catastrophic
hardware failure).

Right now, we have just that setup, but every time I've failed over to the
new master, we've had to rebuild our spares from scratch, and unfortunately
this is a multi-hour process.  We can't afford the risk of not having a
warm spare for that length of time.  We're planning to move entirely to a
Slony cluster, but I'd like to fail over to a more powerful machine before
we begin the Slony migration, as the current server is already overloaded.

Thanks,
Bryan


Re: [GENERAL] Failover, Wal Logging, and Multiple Spares

2009-08-17 Thread Bryan Murphy
Ok, I've asked this a few times, but nobody ever responded.  I think I
finally got it though; could somebody confirm my logic?  Basically, you
set up a chain of servers, and when one fails you replicate to the next link
in the chain, like so:

Master (A) --> Warm Standby (B) --> Warm Standby (C)  --> etc.

Master Fails, now becomes:

Old Master (A)  x> New Master (B) --> Warm Standby (C)

And, of course, you might have an additional replication chain from Master
(A) just in case you goof something up in the failover process, but that's
the basic idea.

Thanks,
Bryan




Re: [GENERAL] Failover, Wal Logging, and Multiple Spares

2009-08-17 Thread Yaroslav Tykhiy

On 18/08/2009, at 9:36 AM, Bryan Murphy wrote:

> Ok, I've asked this a few times, but nobody ever responded.  I think
> I finally got it though; could somebody confirm my logic?
> Basically, you set up a chain of servers, and when one fails you
> replicate to the next link in the chain, like so:
>
> Master (A) --> Warm Standby (B) --> Warm Standby (C)  --> etc.
>
> Master Fails, now becomes:
>
> Old Master (A)  x> New Master (B) --> Warm Standby (C)
>
> And, of course, you might have an additional replication chain from
> Master (A) just in case you goof something up in the failover
> process, but that's the basic idea.


Excuse me, but I fail to see how you are going to replicate from one  
warm standby to another warm standby.  I don't think PostgreSQL can do  
that.  That said, the idea of just partially degrading a warm standby  
cluster by electing a new master node looked very attractive to me, too.


> On Sun, Aug 16, 2009 at 9:35 PM, Bryan Murphy wrote:
> Assuming we are running a Postgres instance that is shipping log
> files to 2 or more warm spares, is there a way I can fail over to
> one of the spares, and have the second spare start receiving updates
> from the new master without missing a beat?  I can live with losing
> the old master, and at least at the moment it would be a controlled
> failover, but I would like to know if it's possible during an
> uncontrolled failover as well (catastrophic hardware failure).
>
> Right now, we have just that setup, but every time I've failed over
> to the new master, we've had to rebuild our spares from scratch, and
> unfortunately this is a multi-hour process.  We can't afford
> the risk of not having a warm spare for that length of time.  We're
> planning to move entirely to a Slony cluster, but I'd like to fail
> over to a more powerful machine before we begin the Slony migration,
> as the current server is already overloaded.


Encouraged by Bruce Momjian, I tried and had some success in this  
area.  It was a controlled failover but it worked like a charm.  An  
obvious condition was that the warm standbys be in perfect sync; you  
can't do the trick if some of them received the last WAL segment while  
the others didn't.


Please see
http://archives.postgresql.org/pgsql-general/2009-07/msg00215.php
for my report.  Of course, questions and comments are welcome.
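
The "perfect sync" condition above can be checked mechanically before a
controlled failover.  The following is a minimal Python sketch, not part of
the original report.  It assumes the operator has already collected, by
whatever means, the file name of the last WAL segment each standby restored
(for example from each standby's restore log).  The function name
standbys_in_sync and the standby labels B and C are hypothetical.

```python
import re

# WAL segment file names are 24 hex digits: timeline, log id, segment number.
SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def standbys_in_sync(last_segments):
    """Return True if every standby reports the same last restored segment.

    last_segments maps a (hypothetical) standby label to the file name of
    the last WAL segment that standby restored.  Collecting those names is
    left to the operator and is not shown here.
    """
    if not last_segments:
        raise ValueError("no standbys given")
    for host, name in last_segments.items():
        if not SEGMENT_NAME.fullmatch(name):
            raise ValueError(f"{host}: not a WAL segment name: {name!r}")
    # In sync means exactly one distinct segment name across all standbys.
    return len(set(last_segments.values())) == 1


if __name__ == "__main__":
    report = {
        "B": "000000010000000A000000F3",
        "C": "000000010000000A000000F3",
    }
    print(standbys_in_sync(report))
```

If the check fails, at least one standby missed the final segment, and the
trick described above should not be attempted as-is.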


Cheers,
Yar

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Failover, Wal Logging, and Multiple Spares

2009-08-17 Thread Greg Stark
On Tue, Aug 18, 2009 at 1:25 AM, Yaroslav Tykhiy wrote:
> Encouraged by Bruce Momjian, I tried and had some success in this area.  It
> was a controlled failover but it worked like a charm.  An obvious condition
> was that the warm standbys be in perfect sync; you can't do the trick if
> some of them received the last WAL segment while the others didn't.

It seems like it should be possible to weaken this constraint, as long
as you're careful to fail over to the slave which is the furthest
ahead in replaying WAL. All the other slaves must switch to replaying
logs from the new master before the point where it took over.
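
Picking the slave that is furthest ahead amounts to comparing WAL
locations. A minimal Python sketch under stated assumptions: each slave's
replay position is available as a PostgreSQL-style WAL location string of
the form X/Y (two hexadecimal numbers), gathered by some means not shown
here. The function names and the host labels are hypothetical. The
locations are compared as numbers, because a plain string comparison would
rank 9/FFFFFFFF above 10/0.

```python
def parse_wal_location(text):
    """Turn an 'X/Y' WAL location (two hex numbers) into a comparable tuple."""
    log_id, offset = text.strip().split("/")
    return (int(log_id, 16), int(offset, 16))


def furthest_ahead(replay_positions):
    """Return (best, lagging) for a dict of slave label -> WAL location.

    best is the label of the slave with the greatest replay position and is
    the candidate new master.  lagging lists the slaves strictly behind it,
    which would have to catch up from the new master's logs.
    """
    if not replay_positions:
        raise ValueError("no slaves given")
    parsed = {h: parse_wal_location(p) for h, p in replay_positions.items()}
    best = max(parsed, key=parsed.get)
    lagging = sorted(h for h, pos in parsed.items() if pos < parsed[best])
    return best, lagging


if __name__ == "__main__":
    # Hypothetical positions, for illustration only.
    positions = {"B": "A/F3000020", "C": "A/F2FFFFE0", "D": "9/FFFFFFFF"}
    best, lagging = furthest_ahead(positions)
    print("promote:", best, "catch up:", lagging)
```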

This does seem like a very useful area to explore.


-- 
greg
http://mit.edu/~gsstark/resume.pdf
