Hello.

We have been looking into using the Corosync/Pacemaker stack to create a
high-availability cluster of PostgreSQL servers with automatic failover.

We are using Corosync (2.3.4) as the messaging layer and the stateful
master/slave resource agent (pgsql) with Pacemaker (1.1.12) on CentOS 7.1.

Things work well for a static cluster, where membership is defined up front. However, we need to be able to seamlessly add new machines (nodes) to the cluster and remove existing ones without service interruption, and here we ran into a problem.

Is it possible to add a new node dynamically, without interruption?

Is there a command or procedure for adding a new node to the cluster
without this disruption?

On 05.10.2015 13:19, Nikolay Popov wrote:
Hello.

I get STOP cluster status when I add/delete a new cluster node (<pi05>) after running the <update pgsql> command:

How can I add a node without the cluster going to STOP?

These are the commands I run, step by step:

# pcs cluster auth pi01 pi02 pi03 pi05 -u hacluster -p hacluster

pi01: Authorized
pi02: Authorized
pi03: Authorized
pi05: Authorized

# pcs cluster node add pi05 --start

pi01: Corosync updated
pi02: Corosync updated
pi03: Corosync updated
pi05: Succeeded
pi05: Starting Cluster...
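At this point, before touching the resource configuration, we verify that the new node has actually joined the membership. A minimal sketch of the checks we run (node names as above; exact output varies by Corosync/pcs version):

```shell
# Corosync's view of the membership, including the new node's nodeid
corosync-cmapctl | grep members

# Pacemaker's view: the new node should be listed as online
pcs status nodes

# Quorum state after the membership change
corosync-quorumtool -l
```

Only once all three agree that pi05 is a full member do we proceed to the resource updates below.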

# pcs resource show --full

 Group: master-group
  Resource: vip-master (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=192.168.242.100 nic=eth0 cidr_netmask=24
   Operations: start interval=0s timeout=60s on-fail=restart (vip-master-start-interval-0s)
               monitor interval=10s timeout=60s on-fail=restart (vip-master-monitor-interval-10s)
               stop interval=0s timeout=60s on-fail=block (vip-master-stop-interval-0s)
  Resource: vip-rep (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=192.168.242.101 nic=eth0 cidr_netmask=24
   Meta Attrs: migration-threshold=0
   Operations: start interval=0s timeout=60s on-fail=stop (vip-rep-start-interval-0s)
               monitor interval=10s timeout=60s on-fail=restart (vip-rep-monitor-interval-10s)
               stop interval=0s timeout=60s on-fail=ignore (vip-rep-stop-interval-0s)
 Master: msPostgresql
  Meta Attrs: master-max=1 master-node-max=1 clone-max=3 clone-node-max=1 notify=true
  Resource: pgsql (class=ocf provider=heartbeat type=pgsql)
   Attributes: pgctl=/usr/pgsql-9.5/bin/pg_ctl psql=/usr/pgsql-9.5/bin/psql
               pgdata=/var/lib/pgsql/9.5/data/ rep_mode=sync node_list="pi01 pi02 pi03"
               restore_command="cp /var/lib/pgsql/9.5/data/wal_archive/%f %p"
               primary_conninfo_opt="user=repl password=super-pass-for-repl keepalives_idle=60 keepalives_interval=5 keepalives_count=5"
               master_ip=192.168.242.100 restart_on_promote=true check_wal_receiver=true
   Operations: start interval=0s timeout=60s on-fail=restart (pgsql-start-interval-0s)
               monitor interval=4s timeout=60s on-fail=restart (pgsql-monitor-interval-4s)
               monitor role=Master timeout=60s on-fail=restart interval=3s (pgsql-monitor-interval-3s-role-Master)
               promote interval=0s timeout=60s on-fail=restart (pgsql-promote-interval-0s)
               demote interval=0s timeout=60s on-fail=stop (pgsql-demote-interval-0s)
               stop interval=0s timeout=60s on-fail=block (pgsql-stop-interval-0s)
               notify interval=0s timeout=60s (pgsql-notify-interval-0s)


# pcs resource update msPostgresql pgsql master-max=1 master-node-max=1 clone-max=4 clone-node-max=1 notify=true

# pcs resource update pgsql pgsql node_list="pi01 pi02 pi03 pi05"
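One variant we have been considering (we are not sure it is the intended procedure) is to wrap these two updates in maintenance mode, so that Pacemaker does not react to the configuration change while it is in flight. A sketch, using the same resource names; `maintenance-mode` is a standard cluster property:

```shell
# Tell Pacemaker to stop managing resources temporarily
pcs property set maintenance-mode=true

# Apply the same two updates as above while resources are unmanaged
pcs resource update msPostgresql pgsql master-max=1 master-node-max=1 clone-max=4 clone-node-max=1 notify=true
pcs resource update pgsql pgsql node_list="pi01 pi02 pi03 pi05"

# Hand control back; Pacemaker re-probes current state instead of restarting
pcs property set maintenance-mode=false
```

We would welcome confirmation whether this is safe with the pgsql RA's master/slave state, or whether there is a better-supported way.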

# crm_mon -Afr1

Last updated: Fri Oct  2 17:07:05 2015
Last change: Fri Oct  2 17:06:37 2015 by root via cibadmin on pi01
Stack: corosync
Current DC: pi02 (version 1.1.13-a14efad) - partition with quorum
4 nodes and 9 resources configured

Online: [ pi01 pi02 pi03 pi05 ]

Full list of resources:

 Resource Group: master-group
     vip-master (ocf::heartbeat:IPaddr2):       Stopped
     vip-rep    (ocf::heartbeat:IPaddr2):       Stopped
 Master/Slave Set: msPostgresql [pgsql]
     Slaves: [ pi02 ]
     Stopped: [ pi01 pi03 pi05 ]
 fence-pi01     (stonith:fence_ssh):    Started pi02
 fence-pi02     (stonith:fence_ssh):    Started pi01
 fence-pi03     (stonith:fence_ssh):    Started pi01

Node Attributes:
* Node pi01:
    + master-pgsql                      : -INFINITY
    + pgsql-data-status                 : STREAMING|SYNC
    + pgsql-status                      : STOP
* Node pi02:
    + master-pgsql                      : -INFINITY
    + pgsql-data-status                 : LATEST
    + pgsql-status                      : STOP
* Node pi03:
    + master-pgsql                      : -INFINITY
    + pgsql-data-status                 : STREAMING|POTENTIAL
    + pgsql-status                      : STOP
* Node pi05:
    + master-pgsql                      : -INFINITY
    + pgsql-status                      : STOP

Migration Summary:
* Node pi01:
* Node pi03:
* Node pi02:
* Node pi05:

After some time, it started working:

Every 2.0s: crm_mon -Afr1 Fri Oct 2 17:04:36 2015

Last updated: Fri Oct  2 17:04:36 2015
Last change: Fri Oct  2 17:04:07 2015 by root via cibadmin on pi01
Stack: corosync
Current DC: pi02 (version 1.1.13-a14efad) - partition with quorum
4 nodes and 9 resources configured

Online: [ pi01 pi02 pi03 pi05 ]

Full list of resources:

 Resource Group: master-group
     vip-master (ocf::heartbeat:IPaddr2):       Started pi02
     vip-rep    (ocf::heartbeat:IPaddr2):       Started pi02
 Master/Slave Set: msPostgresql [pgsql]
     Masters: [ pi02 ]
     Slaves: [ pi01 pi03 pi05 ]

 fence-pi01     (stonith:fence_ssh):    Started pi02
 fence-pi02     (stonith:fence_ssh):    Started pi01
 fence-pi03     (stonith:fence_ssh):    Started pi01

Node Attributes:
* Node pi01:
    + master-pgsql                      : 100
    + pgsql-data-status                 : STREAMING|SYNC
    + pgsql-receiver-status             : normal
    + pgsql-status                      : HS:sync
* Node pi02:
    + master-pgsql                      : 1000
    + pgsql-data-status                 : LATEST
    + pgsql-master-baseline             : 0000000008000098
    + pgsql-receiver-status             : ERROR
    + pgsql-status                      : PRI
* Node pi03:
    + master-pgsql                      : -INFINITY
    + pgsql-data-status                 : STREAMING|POTENTIAL
    + pgsql-receiver-status             : normal
    + pgsql-status                      : HS:potential
* Node pi05:
    + master-pgsql                      : -INFINITY
    + pgsql-data-status                 : STREAMING|POTENTIAL
    + pgsql-receiver-status             : normal
    + pgsql-status                      : HS:potential

Migration Summary:
* Node pi01:
* Node pi03:
* Node pi02:
* Node pi05:


--
Nikolay Popov


_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

--
Nikolay Popov
n.po...@postgrespro.ru
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
