Re: [Pacemaker] Postgresql streaming replication failover - RA needed

Attila Megyeri Fri, 25 Nov 2011 01:12:47 -0800

A quick snippet from the corosync.log

Nov 23 05:43:05 psql1 pgsql[2845]: DEBUG: Checking right of master.
Nov 23 05:43:05 psql1 pgsql[2845]: INFO: My data status=.
Nov 23 05:43:05 psql1 pgsql[2845]: INFO: psql1 xlog location : 000000000D000000
Nov 23 05:43:05 psql1 pgsql[2845]: INFO: psql2 xlog location : 0000000008000000


As you can see, "My data status" is logged as an empty string.
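
For anyone reading the two xlog lines above, here is a minimal sketch of how they could be compared by hand. It assumes the values are plain hexadecimal strings exactly as logged; the helper names `xlog_to_int` and `newest_node` are made up for illustration and are not taken from the RA, whose actual comparison logic is not shown in this thread.

```python
def xlog_to_int(location):
    """Parse an xlog location logged as a plain hex string (assumed format)."""
    return int(location, 16)


def newest_node(locations):
    """Return the node name whose xlog location is furthest ahead.

    `locations` maps node name -> hex string as it appears in the log.
    On a tie, the first node in the mapping is returned.
    """
    return max(locations, key=lambda node: xlog_to_int(locations[node]))


# Values copied from the log snippet above.
locations = {
    "psql1": "000000000D000000",
    "psql2": "0000000008000000",
}
print(newest_node(locations))  # psql1
```

By this comparison psql1 holds the newer data, so the xlog values alone do not explain why nothing is promoted.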


-----Original Message-----
From: Attila Megyeri [mailto:amegy...@minerva-soft.com] 
Sent: 2011. november 25. 9:28
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Postgresql streaming replication failover - RA needed

Hi Takatoshi,

I have restored the PSQL to run without corosync so I cannot send you the 
crm_mon output now.

What I can say for sure:
- The RA never promoted any of the nodes, no matter what the status was. It 
also did not promote a node when it was the only one.
- I believe the issue is in the comparison of the xlogs. How could I 
troubleshoot that? I see from the logs that crm NEVER tried to invoke pgsql 
with "promote".
- I previously tried the crm_mon -A option, but there was never a 
"pgsql-data-status" attribute. The other attributes were there, including 
HS:alone.
- In the corosync log the only relevant RA message I see is "Master is not 
exist." I never saw a message like "My data is out-of-date".

Thank you!

Attila


-----Original Message-----
From: Takatoshi MATSUO [mailto:matsuo....@gmail.com]
Sent: 2011. november 25. 8:56
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Postgresql streaming replication failover - RA needed

Hi Attila

2011/11/24 Attila Megyeri <amegy...@minerva-soft.com>:
> Hi Takatoshi, All,
>
> Thanks for your reply.
> I see that you have invested significant effort in the development of the RA. 
> I spent the last day trying to set up the RA, but without much success.
>
> My infrastructure is very similar to yours, except for the fact that 
> currently I am testing with a single network adapter.
>
> Replication works nicely when I start the databases manually, not using 
> corosync.
>
> When I try to start using corosync, I see that the ping resources start 
> normally, but the msPostgresql starts on both nodes in slave mode, and I see 
> "HS:alone"

Seeing "HS:alone" is normal.
The RA compares xlog locations and promotes the PostgreSQL instance that has 
the newer data.

> In the Wiki you state that if I start on a single node only, PSQL should 
> start in Master mode (PRI), but this is not the case.

If the data is old, the node can't be master.
To become master, a node needs pgsql-data-status="LATEST" or "STREAMING|SYNC".
Please check it using "crm_mon -A".

Also, going from stopped to master takes a few minutes, because the RA 
compares xlog locations on monitor.
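
The condition on pgsql-data-status described above can be sketched as a simple check. This is only an illustration of the rule as stated in this mail, not the RA's code; `PROMOTABLE_STATUSES` and `can_be_master` are hypothetical names.

```python
# Status values that allow promotion, as stated in this mail.
PROMOTABLE_STATUSES = {"LATEST", "STREAMING|SYNC"}


def can_be_master(data_status):
    """Return True if the given pgsql-data-status value allows promotion."""
    return data_status.strip() in PROMOTABLE_STATUSES


print(can_be_master("LATEST"))          # True
print(can_be_master("STREAMING|SYNC"))  # True
print(can_be_master(""))                # False
```

Under this rule an empty or missing pgsql-data-status never qualifies a node for promotion.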


> The recovery.conf file is created immediately, and from the logs I see no 
> attempt at all to promote the node.
> In the postgres logs I see that node1, which is supposed to be a master, 
> tries to connect to the vip-rep IP address, which is NOT brought up, because 
> it depends on the Master role...
>
> Do you have any idea?

Please check the HA log.
My RA outputs "My data is out-of-date. status=********" to log if the data is 
old.

Regards,
Takatoshi MATSUO

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org 
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
