Hi,

We're having an issue with our cluster where, after a reboot of our system, a
location constraint reappears for the ClusterIP resource. This causes a
problem because we have a daemon that checks the cluster state and waits
until ClusterIP is started before it kicks off our application. We didn't
have this issue with an earlier version of pacemaker. Here is the constraint
as shown by pcs:

[root@g5se-f3efce cib]# pcs constraint
Location Constraints:
  Resource: ClusterIP
    Disabled on: g5se-f3efce (role: Started)
Ordering Constraints:
Colocation Constraints:

...and here is our cluster status, with ClusterIP Stopped:

[root@g5se-f3efce cib]# pcs status
Cluster name: cl-g5se-f3efce
Last updated: Thu Feb 18 11:36:01 2016
Last change: Thu Feb 18 10:48:33 2016 via crm_resource on g5se-f3efce
Stack: cman
Current DC: g5se-f3efce - partition with quorum
Version: 1.1.11-97629de
1 Nodes configured
4 Resources configured


Online: [ g5se-f3efce ]

Full list of resources:

sw-ready-g5se-f3efce   (ocf::pacemaker:GBmon): Started g5se-f3efce
meta-data      (ocf::pacemaker:GBmon): Started g5se-f3efce
netmon (ocf::heartbeat:ethmonitor):    Started g5se-f3efce
ClusterIP      (ocf::heartbeat:IPaddr2):       Stopped


The cluster really just has one node at this time.
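
In case it helps, the raw XML behind that constraint can also be dumped
straight from the CIB; I believe this cibadmin invocation is the right way to
pull just the constraints section:

[root@g5se-f3efce cib]# cibadmin -Q -o constraints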

I retrieve the constraint ID, remove the constraint, verify that ClusterIP is
started, and then reboot:

[root@g5se-f3efce cib]# pcs constraint ref ClusterIP
Resource: ClusterIP
  cli-ban-ClusterIP-on-g5se-f3efce
[root@g5se-f3efce cib]# pcs constraint remove cli-ban-ClusterIP-on-g5se-f3efce

[root@g5se-f3efce cib]# pcs status
Cluster name: cl-g5se-f3efce
Last updated: Thu Feb 18 11:45:09 2016
Last change: Thu Feb 18 11:44:53 2016 via crm_resource on g5se-f3efce
Stack: cman
Current DC: g5se-f3efce - partition with quorum
Version: 1.1.11-97629de
1 Nodes configured
4 Resources configured


Online: [ g5se-f3efce ]

Full list of resources:

sw-ready-g5se-f3efce   (ocf::pacemaker:GBmon): Started g5se-f3efce
meta-data      (ocf::pacemaker:GBmon): Started g5se-f3efce
netmon (ocf::heartbeat:ethmonitor):    Started g5se-f3efce
ClusterIP      (ocf::heartbeat:IPaddr2):       Started g5se-f3efce


[root@g5se-f3efce cib]# reboot

...after the reboot, I log in, and the constraint is back and ClusterIP has
not started.
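
As far as I understand, a constraint id starting with "cli-ban-" is the kind
that crm_resource --ban (or pcs resource ban) creates, so I'm assuming the
pcs removal above is equivalent to clearing it directly with something like:

[root@g5se-f3efce cib]# crm_resource --clear --resource ClusterIP

(I've only been removing it through pcs, though.)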


I have noticed in /var/lib/pacemaker/cib that the cib-x.raw files get created 
when there are changes to the cib (cib.xml). After a reboot, I see the 
constraint being added in a diff between .raw files:

[root@g5se-f3efce cib]# diff cib-7.raw cib-8.raw
1c1
< <cib epoch="239" num_updates="0" admin_epoch="0" 
validate-with="pacemaker-1.2" cib-last-written="Thu Feb 18 11:44:53 2016" 
update-origin="g5se-f3efce" update-client="crm_resource" 
crm_feature_set="3.0.9" have-quorum="1" dc-uuid="g5se-f3efce">
---
> <cib epoch="240" num_updates="0" admin_epoch="0" 
> validate-with="pacemaker-1.2" cib-last-written="Thu Feb 18 11:46:49 2016" 
> update-origin="g5se-f3efce" update-client="crm_resource" 
> crm_feature_set="3.0.9" have-quorum="1" dc-uuid="g5se-f3efce">
50c50,52
<     <constraints/>
---
>     <constraints>
>       <rsc_location id="cli-ban-ClusterIP-on-g5se-f3efce" rsc="ClusterIP" 
> role="Started" node="g5se-f3efce" score="-INFINITY"/>
>     </constraints>


I have also looked in /var/log/cluster/corosync.log and seen messages where
it seems the cib is getting updated. In the diff above, the new CIB header
shows update-client="crm_resource", so presumably something is invoking
crm_resource, but I'm not sure whether the constraint is being put back in at
shutdown or at startup. I just don't understand why it's being put back in. I
don't think our daemon code or other scripts are doing this, but it is
something I could verify, e.g. with the greps below.
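
For example, something along these lines should show whether anything on the
box re-creates the ban at boot (the directories below are just guesses at
where our init scripts might live):

[root@g5se-f3efce cib]# grep -rn -e crm_resource -e "resource ban" -e cli-ban /etc/init.d /etc/rc.d 2>/dev/null
[root@g5se-f3efce cib]# grep cli-ban-ClusterIP-on-g5se-f3efce /var/log/cluster/corosync.log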

********************************

>From "yum info pacemaker", my current version is:

Name        : pacemaker
Arch        : x86_64
Version     : 1.1.12
Release     : 8.el6_7.2

My earlier version was:

Name        : pacemaker
Arch        : x86_64
Version     : 1.1.10
Release     : 1.el6_4.4

I'm still using an earlier version of pcs, because the new one seems to have
issues with python:

Name        : pcs
Arch        : noarch
Version     : 0.9.90
Release     : 1.0.1.el6.centos

*******************************

If anyone has ideas on the cause, or any thoughts on this at all, it would be
greatly appreciated.

Thanks!



Jeremy Matthews


