Dear list,

I am quite new to Pacemaker, and I am configuring a two-node active/active cluster that basically consists of the following.
My whole configuration is this:

Stack: corosync
Current DC: pbx2vs3 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
2 nodes and 10 resources configured

Online: [ pbx1vs3 pbx2vs3 ]

Full list of resources:

 Clone Set: dlm-clone [dlm]
     Started: [ pbx1vs3 pbx2vs3 ]
 Clone Set: asteriskfs-clone [asteriskfs]
     Started: [ pbx1vs3 pbx2vs3 ]
 Clone Set: asterisk-clone [asterisk]
     Started: [ pbx1vs3 pbx2vs3 ]
 fence_pbx2_xvm (stonith:fence_xvm): Started pbx2vs3
 fence_pbx1_xvm (stonith:fence_xvm): Started pbx1vs3
 Clone Set: clvmd-clone [clvmd]
     Started: [ pbx1vs3 pbx2vs3 ]

PCSD Status:
  pbx1vs3: Online
  pbx2vs3: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

[root@pbx1 ~]# pcs config show
Cluster Name: asteriskcluster
Corosync Nodes:
 pbx1vs3 pbx2vs3
Pacemaker Nodes:
 pbx1vs3 pbx2vs3

Resources:
 Clone: dlm-clone
  Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Attributes: allow_stonith_disabled=false
   Operations: start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s on-fail=fence (dlm-stop-interval-0s)
               monitor interval=60s on-fail=fence (dlm-monitor-interval-60s)
 Clone: asteriskfs-clone
  Meta Attrs: interleave=true clone-max=2 clone-node-max=1
  Resource: asteriskfs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/vg_san1/lv_pbx directory=/mnt/asterisk fstype=gfs2
   Operations: start interval=0s timeout=60 (asteriskfs-start-interval-0s)
               stop interval=0s on-fail=fence (asteriskfs-stop-interval-0s)
               monitor interval=60s on-fail=fence (asteriskfs-monitor-interval-60s)
 Clone: asterisk-clone
  Meta Attrs: interleaved=true sipp_monitor=/root/scripts/haasterisk.sh sipp_binary=/usr/local/src/sipp-3.4.1/bin/sipp globally-unique=false ordered=false interleave=true clone-max=2 clone-node-max=1 notify=true
  Resource: asterisk (class=ocf provider=heartbeat type=asterisk)
   Attributes: user=root group=root config=/mnt/asterisk/etc/asterisk.conf sipp_monitor=/root/scripts/haasterisk.sh sipp_binary=/usr/local/src/sipp-3.4.1/bin/sipp maxfiles=65535
   Operations: start interval=0s timeout=40s (asterisk-start-interval-0s)
               stop interval=0s on-fail=fence (asterisk-stop-interval-0s)
               monitor interval=10s (asterisk-monitor-interval-10s)
 Clone: clvmd-clone
  Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Operations: start interval=0s timeout=90 (clvmd-start-interval-0s)
               monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
               stop interval=0s on-fail=fence (clvmd-stop-interval-0s)

Stonith Devices:
 Resource: fence_pbx2_xvm (class=stonith type=fence_xvm)
  Attributes: port=tegamjg_pbx2 pcmk_host_list=pbx2vs3
  Operations: monitor interval=60s (fence_pbx2_xvm-monitor-interval-60s)
 Resource: fence_pbx1_xvm (class=stonith type=fence_xvm)
  Attributes: port=tegamjg_pbx1 pcmk_host_list=pbx1vs3
  Operations: monitor interval=60s (fence_pbx1_xvm-monitor-interval-60s)
Fencing Levels:

Location Constraints:
Ordering Constraints:
  start fence_pbx1_xvm then start fence_pbx2_xvm (kind:Mandatory) (id:order-fence_pbx1_xvm-fence_pbx2_xvm-mandatory)
  start fence_pbx2_xvm then start dlm-clone (kind:Mandatory) (id:order-fence_pbx2_xvm-dlm-clone-mandatory)
  start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
  start clvmd-clone then start asteriskfs-clone (kind:Mandatory) (id:order-clvmd-clone-asteriskfs-clone-mandatory)
  start asteriskfs-clone then start asterisk-clone (kind:Mandatory) (id:order-asteriskfs-clone-asterisk-clone-mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
  asteriskfs-clone with clvmd-clone (score:INFINITY) (id:colocation-asteriskfs-clone-clvmd-clone-INFINITY)
  asterisk-clone with asteriskfs-clone (score:INFINITY) (id:colocation-asterisk-clone-asteriskfs-clone-INFINITY)

Resources Defaults:
 migration-threshold: 2
 failure-timeout: 10m
 start-failure-is-fatal: false
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: asteriskcluster
 dc-version: 1.1.13-10.el7_2.2-44eb2dd
 have-watchdog: false
 last-lrm-refresh: 1468598829
 no-quorum-policy: ignore
 stonith-action: reboot
 stonith-enabled: true

Now my problem is that, for example, when I fence one of the nodes, the other one stops every clone resource and then starts them again. The same thing happens when I stop pacemaker and corosync on one node only (pcs cluster stop). That means that if I had a problem on one of my Asterisk nodes (for example in the DLM or CLVMD resource) that required fencing right away, say on node pbx2vs3, the other node (pbx1vs3) would restart every service, which would drop all my calls on a well-functioning node. More generally, this happens every time a resource needs a stop/start or restart on any node: it gets done on every node in the cluster.

All this leads to a basic question: is this strictly how clone resources must behave? Is it possible to configure them so that each instance behaves, dare I say, more independently? (I know about the globally-unique option, but as far as I understand it, that does not do the job.) I have been reading about clone resources for a while, but there are not many examples of what they cannot do. Also, some of the meta attributes above do not make sense; sorry about that, the problem is that I do not know how to delete them with pcs :).

Now, I found something interesting about ordering constraints with clone resources in the "Pacemaker Explained" documentation, which describes something like this:

<constraints>
  <rsc_location id="clone-prefers-node1" rsc="apache-clone" node="node1" score="500"/>
  <rsc_colocation id="stats-with-clone" rsc="apache-stats" with="apache-clone"/>
  <rsc_order id="start-clone-then-stats" first="apache-clone" then="apache-stats"/>
</constraints>

"Ordering constraints behave slightly differently for clones.
In the example above, apache-stats will wait until all copies of apache-clone that need to be started have done so before being started itself. Only if no copies can be started will apache-stats be prevented from being active. Additionally, the clone will wait for apache-stats to be stopped before stopping itself."

I am not sure whether that has something to do with it, but I cannot destroy the whole cluster to test it, and it would probably be in vain.

Thank you very much.

Regards,
Alejandro
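P.S. On the meta attributes I could not delete: as far as I can tell, pcs removes a meta attribute when it is set to an empty value. A minimal sketch, assuming the attribute names shown in the config above apply and that this pcs version supports the empty-value form (worth checking against `pcs resource --help` first):

```shell
# Remove the leftover meta attributes from asterisk-clone by assigning
# them empty values (pcs deletes a meta attribute given name= with no value).
pcs resource meta asterisk-clone sipp_monitor= sipp_binary= interleaved=

# Verify they are gone from the clone's Meta Attrs line.
pcs resource show asterisk-clone
```

The duplicate interleaved=true can go too, since the correctly spelled interleave=true is already set on the clone.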
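P.P.S. On testing without destroying the cluster: crm_simulate can replay failure scenarios against a saved copy of the CIB, so the live cluster is never touched. A minimal sketch, assuming crm_simulate from the pacemaker CLI tools is available (option names may vary slightly between versions, so check `crm_simulate --help`):

```shell
# Dump the current CIB to a file; this is a read-only operation.
pcs cluster cib > /tmp/test-cib.xml

# Simulate node pbx2vs3 failing and print the transition the cluster
# would execute, including any restarts on the surviving node.
crm_simulate --simulate --xml-file /tmp/test-cib.xml --node-fail pbx2vs3
```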
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org