Re: [GENERAL] Timeline Conflict

2011-08-02 Thread Merlin Moncure
On Tue, Aug 2, 2011 at 12:59 AM, senthilnathan
senthilnatha...@gmail.com wrote:
 We have system(Cluster) with Master replicating to 2 stand by servers.

 i.e

 M   |--- S1

      |--- S2

 If master failed, we do a trigger file at S1 to take over as master. Now we
 need to re-point the standby S2 as slave for the new master (i.e S1)

 While trying to start standby S2,there is a conflict in timelines, since on
 recovery it generates a new line.

 Is there any way to solve this issue?

AFAIK, the only solution is to follow the initial standby setup
process to bring the standby up to sync with the new master.  One
small comfort is that since the standby is mostly in the state it
needs to be, an rsync based process might happen fairly quickly.  This
of course means that if you lose the new master before the standby is
up to speed you are facing data loss.  I'm really curious if anyone
has figured out a potential solution to this problem.
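
As a rough sketch only, the rsync-based resync described above might look like the script below. The host name, data directory, and backup label are hypothetical placeholders, and the script prints the commands unless DRY_RUN=0 is set, so it can be reviewed before anything is run. The window of exposure remains: until the copy finishes and S2 catches up, the new master is the only current copy.

```shell
#!/bin/sh
# Sketch: re-seed standby S2 from the promoted master S1 using rsync.
# NEW_MASTER and PGDATA are hypothetical placeholders; adjust to your setup.
NEW_MASTER="${NEW_MASTER:-s1.example.com}"
PGDATA="${PGDATA:-/var/lib/pgsql/9.0/data}"

# Print each command instead of executing it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

resync_standby() {
    # 1. Stop the stale standby before touching its data directory.
    run pg_ctl -D "$PGDATA" stop -m fast
    # 2. Put the new master into backup mode (second argument requests a fast checkpoint).
    run psql -h "$NEW_MASTER" -U postgres -c "SELECT pg_start_backup('resync_s2', true)"
    # 3. Copy only what differs; keep S2's own recovery.conf and skip the master's pid file.
    run rsync -a --delete --exclude=postmaster.pid --exclude=recovery.conf "$NEW_MASTER:$PGDATA/" "$PGDATA/"
    # 4. End backup mode on the new master.
    run psql -h "$NEW_MASTER" -U postgres -c "SELECT pg_stop_backup()"
    # 5. Start S2 again; its recovery.conf must already point at the new master.
    run pg_ctl -D "$PGDATA" start
}
```

Calling resync_standby with the defaults prints the five commands in order; setting DRY_RUN=0 would execute them on the standby host.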

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Timeline Conflict

2011-08-02 Thread Simon Riggs
On Tue, Aug 2, 2011 at 2:55 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Aug 2, 2011 at 12:59 AM, senthilnathan
 senthilnatha...@gmail.com wrote:
 We have system(Cluster) with Master replicating to 2 stand by servers.

 i.e

 M   |--- S1

      |--- S2

 If master failed, we do a trigger file at S1 to take over as master. Now we
 need to re-point the standby S2 as slave for the new master (i.e S1)

 While trying to start standby S2,there is a conflict in timelines, since on
 recovery it generates a new line.

 Is there any way to solve this issue?

 AFAIK, the only solution is to follow the initial standby setup
 process to bring the standby up to sync with the new master.  One
 small comfort is that since the standby is mostly in the state it
 needs to be, an rsync based process might happen fairly quickly.  This
 of course means that if you lose the new master before the standby is
 up to speed you are facing data loss.  I'm really curious if anyone
 has figured out a potential solution to this problem.

http://projects.2ndquadrant.com/repmgr

solves the problem

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [GENERAL] Timeline Conflict

2011-08-02 Thread Pedro Sam
I've been trying to use repmgr for just that purpose.  Looks like it simply 
creates/modifies a recovery.conf pointing primary_conninfo to the new master, 
and then restart.  It does not seem to have the ability to resolve any timeline 
conflicts at all.

Am I using repmgr incorrectly?



Re: [GENERAL] Timeline Conflict

2011-08-02 Thread Merlin Moncure
On Tue, Aug 2, 2011 at 2:17 PM, Pedro Sam pe...@rim.com wrote:
 I've been trying to use repmgr for just that purpose.  Looks like it simply 
 creates/modifies a recovery.conf pointing primary_conninfo to the new master, 
 and then restart.  It does not seem to have the ability to resolve any 
 timeline conflicts at all.

It does not; however, it does simplify the process and reduces the
downtime a little.  Reading the README:

And if a previously failed node becomes available again, such as the
lost node1 above, you can get it to resynchronize by only copying over
changes made while it was down. That happens with what's called a
forced clone, which overwrites existing data rather than assuming it
starts with an empty database directory tree:

repmgr -D /var/lib/pgsql/9.0 --force standby clone node1

This can be much faster than creating a brand new node that must copy
over every file in the database.

Basically this is formalizing good practice for failing over nodes and
re-syncing to a promoted master.  I will say though that one
unfortunate side effect of using HS/SR for HA is that you need *four*
servers to really protect yourself against data loss -- one master and
three standbys.  With a master and two standbys, you face a risk of
significant loss if the promoted master dies while the remaining
standby is syncing up to it.  What you are looking for is a 'hot sync'
so that standbys could be promoted in such a way that does not require
a full sync -- that doesn't exist right now AFAIK.

merlin



Re: [GENERAL] Timeline Conflict

2011-08-02 Thread Simon Riggs
On Tue, Aug 2, 2011 at 8:17 PM, Pedro Sam pe...@rim.com wrote:
 I've been trying to use repmgr for just that purpose.  Looks like it simply 
 creates/modifies a recovery.conf pointing primary_conninfo to the new master, 
 and then restart.  It does not seem to have the ability to resolve any 
 timeline conflicts at all.

 Am I using repmgr incorrectly?

It would appear so.

repmgr is not a fix for a problem situation, it is a management system
that will avoid the problems in the first place.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [GENERAL] Timeline Conflict

2011-08-02 Thread Simon Riggs
On Tue, Aug 2, 2011 at 8:41 PM, Merlin Moncure mmonc...@gmail.com wrote:

 Basically this is formalizing good practice for failing over nodes and
 re-syncing to a promoted master.  I will say though that one
 unfortunate side effect of using HS/SR for HA is that you need *four*
 servers to really protect yourself against data loss -- one master and
 three standbys.  With a master and two standbys, you face a risk of
 significant loss if the promoted master dies while the remaining
 standby is syncing up to it.  What you are looking for is a 'hot sync'
 so that standbys could be promoted in such a way that does not require
 a full sync -- that doesn't exist right now AFAIK.

repmgr is specifically designed to reduce the time for a follow
action to a very small amount.

There is no risk of significant loss.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [GENERAL] Timeline Conflict

2011-08-02 Thread Fujii Masao
On Tue, Aug 2, 2011 at 2:59 PM, senthilnathan senthilnatha...@gmail.com wrote:
 We have system(Cluster) with Master replicating to 2 stand by servers.

 i.e

 M   |--- S1

      |--- S2

 If master failed, we do a trigger file at S1 to take over as master. Now we
 need to re-point the standby S2 as slave for the new master (i.e S1)

 While trying to start standby S2,there is a conflict in timelines, since on
 recovery it generates a new line.

 Is there any way to solve this issue?

Basically you need to take a fresh backup from the new master and restart
the standby using it. But if S1 and S2 share the archive, S1 is ahead of S2
(i.e., the replay location of S1 is greater than or equal to that of S2), and
recovery_target_timeline is set to 'latest' in S2's recovery.conf, you can
skip taking a fresh backup from the new master. In this case, you can
re-point S2 as a standby just by changing primary_conninfo in S2's
recovery.conf and restarting S2. When S2 restarts, it reads the timeline
history file that S1 created at failover and adjusts its timeline ID to
S1's, so the timeline conflict doesn't happen.
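
A minimal sketch of the two steps above, under stated assumptions: lsn_ge compares two replay locations of the form returned by pg_last_xlog_replay_location() (for example 0/3000148), and write_recovery_conf writes a recovery.conf for S2. The function names, host name, replication user, and archive path are hypothetical, and the restore_command assumes the shared archive is a plain directory reachable with cp and has no spaces in its path.

```shell
#!/bin/sh
# Sketch: re-point S2 at the promoted S1 without a fresh base backup.

# Succeed if WAL location $1 is greater than or equal to location $2.
# Locations have the form HI/LO in hexadecimal, e.g. 0/3000148.
lsn_ge() {
    a_hi=$((0x${1%%/*})); a_lo=$((0x${1##*/}))
    b_hi=$((0x${2%%/*})); b_lo=$((0x${2##*/}))
    [ "$a_hi" -gt "$b_hi" ] || { [ "$a_hi" -eq "$b_hi" ] && [ "$a_lo" -ge "$b_lo" ]; }
}

# Write recovery.conf into data directory $1, following new master $2
# and restoring WAL and timeline history files from shared archive $3.
write_recovery_conf() {
    pgdata=$1; new_master=$2; archive=$3
    cat > "$pgdata/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$new_master port=5432 user=replication'
restore_command = 'cp $archive/%f %p'
recovery_target_timeline = 'latest'
EOF
}

# Example usage (placeholder values; the two locations would be read
# from S1 and S2 before S1 was promoted):
#   if lsn_ge "$S1_REPLAY_LOCATION" "$S2_REPLAY_LOCATION"; then
#       write_recovery_conf /var/lib/pgsql/9.0/data s1.example.com /mnt/wal_archive
#       pg_ctl -D /var/lib/pgsql/9.0/data restart
#   fi
```

If lsn_ge fails, S2 has replayed past the point where S1 branched off, and the fresh-backup route is the fallback.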

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [GENERAL] Timeline Conflict

2011-08-02 Thread Simon Riggs
On Wed, Aug 3, 2011 at 2:38 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Aug 2, 2011 at 2:59 PM, senthilnathan senthilnatha...@gmail.com 
 wrote:
 We have system(Cluster) with Master replicating to 2 stand by servers.

 i.e

 M   |--- S1

      |--- S2

 If master failed, we do a trigger file at S1 to take over as master. Now we
 need to re-point the standby S2 as slave for the new master (i.e S1)

 While trying to start standby S2,there is a conflict in timelines, since on
 recovery it generates a new line.

 Is there any way to solve this issue?

 Basically you need to take a fresh backup from new master and restart
 the standby using it. But, if S1 and S2 share the archive, S1 is ahead
 of S2 (i.e., the replay location of S1 is bigger than or equal to that
 of S2), and recovery_target_timeline is set to 'latest' in S2's
 recovery.conf, you can skip taking a fresh backup from new master.
 In this case, you can re-point S2 as a standby just by changing
 primary_conninfo in S2's recovery.conf and restarting S2. When S2
 restarts, S2 reads the timeline history file which was created by S1
 at failover and adjust its timeline ID to S1's. So timeline conflict
 doesn't happen.

Though this relies upon a shared archive which gives a single point of failure.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
