Update: I tried to narrow down the problem by running two Wheezy virtual machines configured with Debian APT pinning like this:

# cat /etc/apt/preferences
Package: *
Pin: release a=wheezy
Pin-Priority: 900

Package: *
Pin: release a=squeeze
Pin-Priority: 800

# aptitude install corosync
# aptitude install pacemaker/squeeze

which gives:

root@pcmk2:/etc/corosync# dpkg -l | grep pacem
ii  pacemaker     1.0.9.1+hg15626-1  amd64  HA cluster resource manager
root@pcmk2:/etc/corosync# dpkg -l | grep corosync
ii  corosync      1.4.2-3            amd64  Standards-based cluster framework (daemon and modules)
ii  libcorosync4  1.4.2-3            all    Standards-based cluster framework (transitional package)
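(Side note: apt-cache policy is a quick way to confirm which repository each package will be pulled from under this pinning; I'm only quoting the command here, not its output:

# apt-cache policy pacemaker corosync
)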
With these versions the problem did not occur:

root@pcmk1:~/pacemaker# crm_mon -1
============
Last updated: Thu Aug 29 05:53:50 2013
Stack: openais
Current DC: pcmk1 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ pcmk2 pcmk1 ]

ip      (ocf::heartbeat:IPaddr2):       Started pcmk1
Clone Set: mysql-mm (unmanaged)
        mysql:0 (ocf::heartbeat:mysql): Started pcmk2 (unmanaged)
        mysql:1 (ocf::heartbeat:mysql): Started pcmk1 (unmanaged)

root@pcmk2:/etc/corosync# /etc/init.d/mysql stop
[ ok ] Stopping MySQL database server: mysqld.

root@pcmk1:~/pacemaker# crm_mon -1
============
Last updated: Thu Aug 29 05:55:39 2013
Stack: openais
Current DC: pcmk1 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ pcmk2 pcmk1 ]

ip      (ocf::heartbeat:IPaddr2):       Started pcmk1
Clone Set: mysql-mm (unmanaged)
        mysql:0 (ocf::heartbeat:mysql): Started pcmk2 (unmanaged) FAILED
        mysql:1 (ocf::heartbeat:mysql): Started pcmk1 (unmanaged)

Failed actions:
    mysql:0_monitor_15000 (node=pcmk2, call=5, rc=7, status=complete): not running

root@pcmk2:/etc/corosync# /etc/init.d/mysql start
[ ok ] Starting MySQL database server: mysqld ..
[info] Checking for tables which need an upgrade, are corrupt or were not closed cleanly..

root@pcmk1:~/pacemaker# crm_mon -1
============
Last updated: Thu Aug 29 05:56:34 2013
Stack: openais
Current DC: pcmk1 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ pcmk2 pcmk1 ]

ip      (ocf::heartbeat:IPaddr2):       Started pcmk1
Clone Set: mysql-mm (unmanaged)
        mysql:0 (ocf::heartbeat:mysql): Started pcmk2 (unmanaged)
        mysql:1 (ocf::heartbeat:mysql): Started pcmk1 (unmanaged)

-----

What I noticed: with pacemaker 1.1.7, crm_mon reports 3 resources configured, while 1.0.9 reports 2 resources, for the exact same configuration.
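If anyone wants to cross-check that count, the resources the cluster actually knows about can be dumped straight from the CIB with the standard tools (I have not yet diffed the two outputs between versions myself):

# crm_resource --list
# cibadmin -Q -o resources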
2013/8/27 tetsuo shima <tetsuo.41.sh...@gmail.com>

> Hi list!
>
> I'm having an issue with corosync; here is the scenario:
>
> # crm_mon -1
> ============
> Last updated: Tue Aug 27 09:50:13 2013
> Last change: Mon Aug 26 16:06:01 2013 via cibadmin on node2
> Stack: openais
> Current DC: node1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> ============
>
> Online: [ node2 node1 ]
>
> ip      (ocf::heartbeat:IPaddr2):       Started node1
> Clone Set: mysql-mm [mysql] (unmanaged)
>         mysql:0 (ocf::heartbeat:mysql): Started node1 (unmanaged)
>         mysql:1 (ocf::heartbeat:mysql): Started node2 (unmanaged)
>
> # /etc/init.d/mysql stop
> [ ok ] Stopping MySQL database server: mysqld.
>
> # crm_mon -1
> ============
> Last updated: Tue Aug 27 09:50:30 2013
> Last change: Mon Aug 26 16:06:01 2013 via cibadmin on node2
> Stack: openais
> Current DC: node1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> ============
>
> Online: [ node2 node1 ]
>
> ip      (ocf::heartbeat:IPaddr2):       Started node1
> Clone Set: mysql-mm [mysql] (unmanaged)
>         mysql:0 (ocf::heartbeat:mysql): Started node1 (unmanaged)
>         mysql:1 (ocf::heartbeat:mysql): Started node2 (unmanaged) FAILED
>
> Failed actions:
>     mysql:0_monitor_15000 (node=node2, call=27, rc=7, status=complete): not running
>
> # /etc/init.d/mysql start
> [ ok ] Starting MySQL database server: mysqld ..
> [info] Checking for tables which need an upgrade, are corrupt or were not closed cleanly..
>
> # sleep 60 && crm_mon -1
> ============
> Last updated: Tue Aug 27 09:51:54 2013
> Last change: Mon Aug 26 16:06:01 2013 via cibadmin on node2
> Stack: openais
> Current DC: node1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> ============
>
> Online: [ node2 node1 ]
>
> ip      (ocf::heartbeat:IPaddr2):       Started node1
> Clone Set: mysql-mm [mysql] (unmanaged)
>         mysql:0 (ocf::heartbeat:mysql): Started node1 (unmanaged)
>         mysql:1 (ocf::heartbeat:mysql): Started node2 (unmanaged) FAILED
>
> Failed actions:
>     mysql:0_monitor_15000 (node=node2, call=27, rc=7, status=complete): not running
>
> As you can see, every time I stop MySQL (which is unmanaged), the resource is marked as failed:
>
> crmd: [1828]: info: process_lrm_event: LRM operation mysql:0_monitor_15000 (call=4, rc=7, cib-update=10, confirmed=false) not running
>
> When I restart the resource:
>
> crmd: [1828]: info: process_lrm_event: LRM operation mysql:0_monitor_15000 (call=4, rc=0, cib-update=11, confirmed=false) ok
>
> The resource stays in the failed state and does not recover until I manually clean it up.
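For reference, the manual cleanup mentioned above is the usual crm shell one (crm_resource --cleanup --resource mysql-mm should be the equivalent low-level call); after running it the failed action disappears from crm_mon:

# crm resource cleanup mysql-mm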
> # crm_mon --one-shot --operations
> ============
> Last updated: Tue Aug 27 10:17:30 2013
> Last change: Mon Aug 26 16:06:01 2013 via cibadmin on node2
> Stack: openais
> Current DC: node1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> ============
>
> Online: [ node2 node1 ]
>
> ip      (ocf::heartbeat:IPaddr2):       Started node1
> Clone Set: mysql-mm [mysql] (unmanaged)
>         mysql:0 (ocf::heartbeat:mysql): Started node1 (unmanaged)
>         mysql:1 (ocf::heartbeat:mysql): Started node2 (unmanaged) FAILED
>
> Operations:
> * Node node1:
>    ip: migration-threshold=1
>     + (57) probe: rc=0 (ok)
>    mysql:0: migration-threshold=1 fail-count=1
>     + (58) probe: rc=0 (ok)
>     + (59) monitor: interval=15000ms rc=0 (ok)
> * Node node2:
>    mysql:0: migration-threshold=1 fail-count=3
>     + (27) monitor: interval=15000ms rc=7 (not running)
>     + (27) monitor: interval=15000ms rc=0 (ok)
>
> Failed actions:
>     mysql:0_monitor_15000 (node=node2, call=27, rc=7, status=complete): not running
>
> ---
>
> Here are some details about my configuration:
>
> # cat /etc/debian_version
> 7.1
>
> # dpkg -l | grep corosync
> ii  corosync   1.4.2-3  amd64  Standards-based cluster framework
>
> # dpkg -l | grep pacem
> ii  pacemaker  1.1.7-1  amd64  HA cluster resource manager
>
> # crm configure show
> node node2 \
>         attributes standby="off"
> node node1
> primitive ip ocf:heartbeat:IPaddr2 \
>         params ip="192.168.0.20" cidr_netmask="255.255.0.0" nic="eth2.2755" iflabel="mysql" \
>         meta is-managed="true" target-role="Started" \
>         meta resource-stickiness="100"
> primitive mysql ocf:heartbeat:mysql \
>         op monitor interval="15" timeout="30"
> clone mysql-mm mysql \
>         meta is-managed="false"
> location cli-prefer-ip ip 50: node1
> colocation ip-on-mysql-mm 200: ip mysql-mm
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1377513557" \
>         start-failure-is-fatal="false"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="1" \
>         migration-threshold="1"
>
> ---
>
> Does anyone know what is wrong with my configuration?
>
> Thanks for the help,
>
> Best regards.
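PS: one thing still on my list to test is whether an automatic failure expiry would avoid the manual cleanups, via the standard failure-timeout resource meta attribute. Untested on this setup, and note that expired failures are only acted on at the next cluster recheck (see cluster-recheck-interval), but the syntax would be:

# crm resource meta mysql-mm set failure-timeout 60s

or, equivalently, in the configuration itself:

clone mysql-mm mysql \
        meta is-managed="false" failure-timeout="60s"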