On 11/09/2016 12:27 PM, CART Andreas wrote: > Hi again > > > > Sorry for missing the omission of the master role within the colocation > constraint. > > I added it - but unfortunately still no success. > > > > (In the meantime I added 2 additional filesystem resources on top of the > NFSServer, but that should not change anything regarding the root > problem that I miss the demote of DRBDClone.) > > > > I again started with all resources located at ventsi-clst1 and issued a > 'pcs resource move DRBD_global_clst' (the resource next collocated next > to the DRBDClone). > > > > With that I end up with all primitive resources stopped and the > DRBDClone resource still being master at ventsi-clst1. > > Here is what pacemaker pretends has to be done: > > ================================================================== > > [root@ventsi-clst2 ~]# crm_simulate -Ls > > > > Current cluster status: > > Online: [ ventsi-clst1-sync ventsi-clst2-sync ] > > > > ipmi-fence-clst1 (stonith:fence_ipmilan): Started > ventsi-clst2-sync > > ipmi-fence-clst2 (stonith:fence_ipmilan): Started > ventsi-clst1-sync > > IPaddrNFS (ocf::heartbeat:IPaddr2): Stopped > > NFSServer (ocf::heartbeat:nfsserver): Stopped > > Master/Slave Set: DRBDClone [DRBD] > > Masters: [ ventsi-clst1-sync ] <=== still not demoted > > Slaves: [ ventsi-clst2-sync ] > > DRBD_global_clst (ocf::heartbeat:Filesystem): Stopped > > NFS_global_clst (ocf::heartbeat:Filesystem): Stopped > > BIND_global_clst (ocf::heartbeat:Filesystem): Stopped > > > > Allocation scores: > > native_color: ipmi-fence-clst1 allocation score on ventsi-clst1-sync: > -INFINITY > > native_color: ipmi-fence-clst1 allocation score on ventsi-clst2-sync: > INFINITY > > native_color: ipmi-fence-clst2 allocation score on ventsi-clst1-sync: > INFINITY > > native_color: ipmi-fence-clst2 allocation score on ventsi-clst2-sync: > -INFINITY > > clone_color: DRBDClone allocation score on ventsi-clst1-sync: 0 > > clone_color: DRBDClone allocation score on 
ventsi-clst2-sync: 0 > > clone_color: DRBD:0 allocation score on ventsi-clst1-sync: INFINITY > > clone_color: DRBD:0 allocation score on ventsi-clst2-sync: 0 > > clone_color: DRBD:1 allocation score on ventsi-clst1-sync: 0 > > clone_color: DRBD:1 allocation score on ventsi-clst2-sync: INFINITY > > native_color: DRBD:0 allocation score on ventsi-clst1-sync: INFINITY > > native_color: DRBD:0 allocation score on ventsi-clst2-sync: 0 > > native_color: DRBD:1 allocation score on ventsi-clst1-sync: -INFINITY > > native_color: DRBD:1 allocation score on ventsi-clst2-sync: INFINITY > > DRBD:1 promotion score on ventsi-clst2-sync: 10000 > > DRBD:0 promotion score on ventsi-clst1-sync: 1 > > native_color: DRBD_global_clst allocation score on ventsi-clst1-sync: > -INFINITY > > native_color: DRBD_global_clst allocation score on ventsi-clst2-sync: > INFINITY > > native_color: IPaddrNFS allocation score on ventsi-clst1-sync: -INFINITY > > native_color: IPaddrNFS allocation score on ventsi-clst2-sync: 0 > > native_color: NFSServer allocation score on ventsi-clst1-sync: -INFINITY > > native_color: NFSServer allocation score on ventsi-clst2-sync: 0 > > native_color: NFS_global_clst allocation score on ventsi-clst1-sync: 0 > > native_color: NFS_global_clst allocation score on ventsi-clst2-sync: > -INFINITY > > native_color: BIND_global_clst allocation score on ventsi-clst1-sync: > -INFINITY > > native_color: BIND_global_clst allocation score on ventsi-clst2-sync: 0 > > > > Transition Summary: > > * Start IPaddrNFS (ventsi-clst2-sync) > > * Start NFSServer (ventsi-clst2-sync) > > * Demote DRBD:0 (Master -> Slave ventsi-clst1-sync) <=== this > demote never happens > > * Promote DRBD:1 (Slave -> Master ventsi-clst2-sync) > > * Start DRBD_global_clst (ventsi-clst2-sync) > > * Start NFS_global_clst (ventsi-clst1-sync) > > * Start BIND_global_clst (ventsi-clst2-sync)
Strangely, this sequence appears to be ignoring the constraint "start DRBD_global_clst then start IPaddrNFS". Can you open a bug report at http://bugs.clusterlabs.org/ and attach the CIB (or pe-input file) in use at this time? For testing purposes, you may want to try replacing the "start DRBD_global_clst then start IPaddrNFS" constraint with "promote DRBDClone then start IPaddrNFS" to see whether that makes a difference. > And this is the executed transaction: > > ================================================================== > > [root@ventsi-clst2 ~]# crm_simulate --xml-file > /var/lib/pacemaker/pengine/pe-input-1157.bz2 --save-graph problem5.graph > --save-dotfile problem5.dot -V --simulate > > Using the original execution date of: 2016-11-09 17:54:10Z > > > > Current cluster status: > > Online: [ ventsi-clst1-sync ventsi-clst2-sync ] > > > > ipmi-fence-clst1 (stonith:fence_ipmilan): Started > ventsi-clst2-sync > > ipmi-fence-clst2 (stonith:fence_ipmilan): Started > ventsi-clst1-sync > > IPaddrNFS (ocf::heartbeat:IPaddr2): Started ventsi-clst1-sync > > NFSServer (ocf::heartbeat:nfsserver): Started ventsi-clst1-sync > > Master/Slave Set: DRBDClone [DRBD] > > Masters: [ ventsi-clst1-sync ] > > Slaves: [ ventsi-clst2-sync ] > > DRBD_global_clst (ocf::heartbeat:Filesystem): Started > ventsi-clst1-sync > > NFS_global_clst (ocf::heartbeat:Filesystem): Started > ventsi-clst2-sync > > BIND_global_clst (ocf::heartbeat:Filesystem): Started > ventsi-clst1-sync > > > > Transition Summary: > > * Stop IPaddrNFS (ventsi-clst1-sync) > > * Stop NFSServer (ventsi-clst1-sync) > > * Stop DRBD_global_clst (ventsi-clst1-sync) > > * Stop NFS_global_clst (Started ventsi-clst2-sync) > > * Stop BIND_global_clst (ventsi-clst1-sync) > > > > Executing cluster transition: > > * Resource action: NFS_global_clst stop on ventsi-clst2-sync > > * Resource action: BIND_global_clst stop on ventsi-clst1-sync > > * Resource action: NFSServer stop on ventsi-clst1-sync > > * Resource action: 
IPaddrNFS stop on ventsi-clst1-sync > > * Resource action: DRBD_global_clst stop on ventsi-clst1-sync > > * Pseudo action: all_stopped <=== no demote > > Using the original execution date of: 2016-11-09 17:54:10Z > > > > Revised cluster status: > > Online: [ ventsi-clst1-sync ventsi-clst2-sync ] > > > > ipmi-fence-clst1 (stonith:fence_ipmilan): Started > ventsi-clst2-sync > > ipmi-fence-clst2 (stonith:fence_ipmilan): Started > ventsi-clst1-sync > > IPaddrNFS (ocf::heartbeat:IPaddr2): Stopped > > NFSServer (ocf::heartbeat:nfsserver): Stopped > > Master/Slave Set: DRBDClone [DRBD] > > Masters: [ ventsi-clst1-sync ] > > Slaves: [ ventsi-clst2-sync ] > > DRBD_global_clst (ocf::heartbeat:Filesystem): Stopped > > NFS_global_clst (ocf::heartbeat:Filesystem): Stopped > > BIND_global_clst (ocf::heartbeat:Filesystem): Stopped > > > > And finally here the updated config: > > ================================================================== > > [root@ventsi-clst1 ~]# pcs config > > Cluster Name: clst1 > > Corosync Nodes: > > ventsi-clst1-sync ventsi-clst2-sync > > Pacemaker Nodes: > > ventsi-clst1-sync ventsi-clst2-sync > > > > Resources: > > Resource: IPaddrNFS (class=ocf provider=heartbeat type=IPaddr2) > > Attributes: ip=xxx.xxx.xxx.xxx cidr_netmask=24 > > Operations: start interval=0 timeout=20 (IPaddrNFS-start-interval-0) > > stop interval=0 timeout=20 (IPaddrNFS-stop-interval-0) > > monitor interval=10 timeout=20 (IPaddrNFS-monitor-interval-10) > > Resource: NFSServer (class=ocf provider=heartbeat type=nfsserver) > > Attributes: > nfs_shared_infodir=/drbdmnts/global_clst/nfsserversettings/ > nfs_ip=xxx.xxx.xxx.xxx nfsd_args="-H xxx.xxx.xxx.xxx" > > Operations: start interval=0 timeout=40 (NFSServer-start-interval-0) > > stop interval=0 timeout=20 (NFSServer-stop-interval-0) > > monitor interval=10 timeout=20 (NFSServer-monitor-interval-10) > > Master: DRBDClone > > Meta Attrs: master-max=1 master-node-max=1 clone-max=2 > clone-node-max=1 notify=true > > Resource: DRBD 
(class=ocf provider=linbit type=drbd) > > Attributes: drbd_resource=nfsdata > > Operations: start interval=0 timeout=240 (DRBD-start-interval-0) > > promote interval=0 timeout=90 (DRBD-promote-interval-0) > > demote interval=0 timeout=90 (DRBD-demote-interval-0) > > stop interval=0 timeout=100 (DRBD-stop-interval-0) > > monitor interval=9 role=Master timeout=5 > (DRBD-monitor-interval-9) > > monitor interval=10 role=Slave timeout=5 > (DRBD-monitor-interval-10) > > Resource: DRBD_global_clst (class=ocf provider=heartbeat type=Filesystem) > > Attributes: device=/dev/drbd1 directory=/drbdmnts/global_clst fstype=ext4 > > Operations: start interval=0 timeout=60 > (DRBD_global_clst-start-interval-0) > > stop interval=0 timeout=60 (DRBD_global_clst-stop-interval-0) > > monitor interval=20 timeout=40 > (DRBD_global_clst-monitor-interval-20) > > Resource: NFS_global_clst (class=ocf provider=heartbeat type=Filesystem) > > Attributes: device=xxx.xxx.xxx.xxx:/drbdmnts/global_clst/nfs > directory=/global/nfs fstype=nfs > > Operations: start interval=0 timeout=60 (NFS_global_clst-start-interval-0) > > stop interval=0 timeout=60 (NFS_global_clst-stop-interval-0) > > monitor interval=20 timeout=40 > (NFS_global_clst-monitor-interval-20) > > Resource: BIND_global_clst (class=ocf provider=heartbeat type=Filesystem) > > Attributes: device=/drbdmnts/global_clst/nfs directory=/global/nfs > fstype=none options=bind > > Operations: start interval=0 timeout=60 > (BIND_global_clst-start-interval-0) > > stop interval=0 timeout=60 (BIND_global_clst-stop-interval-0) > > monitor interval=20 timeout=40 > (BIND_global_clst-monitor-interval-20) > > > > Stonith Devices: > > Resource: ipmi-fence-clst1 (class=stonith type=fence_ipmilan) > > Attributes: lanplus=1 login=foo passwd=bar action=reboot > ipaddr=yyy.yyy.yyy.yyy pcmk_host_check=static-list > pcmk_host_list=ventsi-clst1-sync auth=password timeout=30 cipher=1 > > Operations: monitor interval=60 (ipmi-fence-clst1-monitor-interval-60) > > 
Resource: ipmi-fence-clst2 (class=stonith type=fence_ipmilan) > > Attributes: lanplus=1 login=foo passwd=bar action=reboot > ipaddr=zzz.zzz.zzz.zzz pcmk_host_check=static-list > pcmk_host_list=ventsi-clst2-sync auth=password timeout=30 cipher=1 > > Operations: monitor interval=60 (ipmi-fence-clst2-monitor-interval-60) > > Fencing Levels: > > > > Location Constraints: > > Resource: DRBD_global_clst > > Disabled on: ventsi-clst1-sync (score:-INFINITY) (role: Started) > (id:cli-ban-DRBD_global_clst-on-ventsi-clst1-sync) > > Resource: ipmi-fence-clst1 > > Disabled on: ventsi-clst1-sync (score:-INFINITY) > (id:location-ipmi-fence-clst1-ventsi-clst1-sync--INFINITY) > > Resource: ipmi-fence-clst2 > > Disabled on: ventsi-clst2-sync (score:-INFINITY) > (id:location-ipmi-fence-clst2-ventsi-clst2-sync--INFINITY) > > Ordering Constraints: > > start IPaddrNFS then start NFSServer (kind:Mandatory) > (id:order-IPaddrNFS-NFSServer-mandatory) > > promote DRBDClone then start DRBD_global_clst (kind:Mandatory) > (id:order-DRBDClone-DRBD_global_clst-mandatory) > > start DRBD_global_clst then start IPaddrNFS (kind:Mandatory) > (id:order-DRBD_global_clst-IPaddrNFS-mandatory) > > start NFSServer then start NFS_global_clst (kind:Mandatory) > (id:order-NFSServer-NFS_global_clst-mandatory) > > start NFSServer then start BIND_global_clst (kind:Mandatory) > (id:order-NFSServer-BIND_global_clst-mandatory) > > Colocation Constraints: > > NFSServer with IPaddrNFS (score:INFINITY) > (id:colocation-NFSServer-IPaddrNFS-INFINITY) > > IPaddrNFS with DRBD_global_clst (score:INFINITY) > (id:colocation-IPaddrNFS-DRBD_global_clst-INFINITY) > > NFS_global_clst with NFSServer (score:-INFINITY) > (id:colocation-NFS_global_clst-NFSServer--INFINITY) > > BIND_global_clst with NFSServer (score:INFINITY) > (id:colocation-BIND_global_clst-NFSServer-INFINITY) > > DRBD_global_clst with DRBDClone (score:INFINITY) (rsc-role:Started) > (with-rsc-role:Master) (id:colocation-DRBD_global_clst-DRBDClone-INFINITY) > > > > 
Resources Defaults: > > resource-stickiness: INFINITY > > Operations Defaults: > > timeout: 10s > > > > Cluster Properties: > > cluster-infrastructure: cman > > dc-version: 1.1.14-8.el6-70404b0 > > have-watchdog: false > > last-lrm-refresh: 1478703150 > > no-quorum-policy: ignore > > stonith-enabled: true > > symmetric-cluster: true > > Node Attributes: > > ventsi-clst1-sync: PostgresSon-data-status=DISCONNECT > > ventsi-clst2-sync: PostgresSon-data-status=DISCONNECT > > > > > > Kind regards > > Andi > > > > -----Original Message----- > From: Ken Gaillot [mailto:kgail...@redhat.com] > Sent: Dienstag, 8. November 2016 22:29 > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] DRBD demote/promote not called - Why? How to fix? > > > > On 11/04/2016 01:57 PM, CART Andreas wrote: > >> Hi > >> > >> I have a basic 2 node active/passive cluster with Pacemaker (1.1.14 , > >> pcs: 0.9.148) / CMAN (3.0.12.1) / Corosync (1.4.7) on RHEL 6.8. > >> This cluster runs NFS on top of DRBD (8.4.4). > >> > >> Basically the system is working on both nodes and I can switch the > >> resources from one node to the other. > >> But switching resources to the other node does not work, if I try to > >> move just one resource and have the others follow due to the location > >> constraints. > >> > >> From the logged messages I see that in this “failure case” there is NO > >> attempt to demote/promote the DRBD clone resource. 
> >> > >> Here is my setup: > >> ================================================================== > >> Cluster Name: clst1 > >> Corosync Nodes: > >> ventsi-clst1-sync ventsi-clst2-sync > >> Pacemaker Nodes: > >> ventsi-clst1-sync ventsi-clst2-sync > >> > >> Resources: > >> Resource: IPaddrNFS (class=ocf provider=heartbeat type=IPaddr2) > >> Attributes: ip=xxx.xxx.xxx.xxx cidr_netmask=24 > >> Operations: start interval=0s timeout=20s (IPaddrNFS-start-interval-0s) > >> stop interval=0s timeout=20s (IPaddrNFS-stop-interval-0s) > >> monitor interval=5s (IPaddrNFS-monitor-interval-5s) > >> Resource: NFSServer (class=ocf provider=heartbeat type=nfsserver) > >> Attributes: nfs_shared_infodir=/var/lib/nfsserversettings/ > >> nfs_ip=xxx.xxx.xxx.xxx nfsd_args="-H xxx.xxx.xxx.xxx" > >> Operations: start interval=0s timeout=40 (NFSServer-start-interval-0s) > >> stop interval=0s timeout=20s (NFSServer-stop-interval-0s) > >> monitor interval=10s timeout=20s > >> (NFSServer-monitor-interval-10s) > >> Master: DRBDClone > >> Meta Attrs: master-max=1 master-node-max=1 clone-max=2 > >> clone-node-max=1 notify=true > >> Resource: DRBD (class=ocf provider=linbit type=drbd) > >> Attributes: drbd_resource=nfsdata > >> Operations: start interval=0s timeout=240 (DRBD-start-interval-0s) > >> promote interval=0s timeout=90 (DRBD-promote-interval-0s) > >> demote interval=0s timeout=90 (DRBD-demote-interval-0s) > >> stop interval=0s timeout=100 (DRBD-stop-interval-0s) > >> monitor interval=1s timeout=5 (DRBD-monitor-interval-1s) > >> Resource: DRBD_global_clst (class=ocf provider=heartbeat type=Filesystem) > >> Attributes: device=/dev/drbd1 directory=/drbdmnts/global_clst > fstype=ext4 > >> Operations: start interval=0s timeout=60 > >> (DRBD_global_clst-start-interval-0s) > >> stop interval=0s timeout=60 > >> (DRBD_global_clst-stop-interval-0s) > >> monitor interval=20 timeout=40 > >> (DRBD_global_clst-monitor-interval-20) > >> > >> Stonith Devices: > >> Resource: ipmi-fence-clst1 
(class=stonith type=fence_ipmilan) > >> Attributes: lanplus=1 login=foo passwd=bar action=reboot > >> ipaddr=yyy.yyy.yyy.yyy pcmk_host_check=static-list > >> pcmk_host_list=ventsi-clst1-sync auth=password timeout=30 cipher=1 > >> Operations: monitor interval=60s (ipmi-fence-clst1-monitor-interval-60s) > >> Resource: ipmi-fence-clst2 (class=stonith type=fence_ipmilan) > >> Attributes: lanplus=1 login=foo passwd=bar action=reboot > >> ipaddr=zzz.zzz.zzz.zzz pcmk_host_check=static-list > >> pcmk_host_list=ventsi-clst2-sync auth=password timeout=30 cipher=1 > >> Operations: monitor interval=60s (ipmi-fence-clst2-monitor-interval-60s) > >> Fencing Levels: > >> > >> Location Constraints: > >> Resource: ipmi-fence-clst1 > >> Disabled on: ventsi-clst1-sync (score:-INFINITY) > >> (id:location-ipmi-fence-clst1-ventsi-clst1-sync--INFINITY) > >> Resource: ipmi-fence-clst2 > >> Disabled on: ventsi-clst2-sync (score:-INFINITY) > >> (id:location-ipmi-fence-clst2-ventsi-clst2-sync--INFINITY) > >> Ordering Constraints: > >> start IPaddrNFS then start NFSServer (kind:Mandatory) > >> (id:order-IPaddrNFS-NFSServer-mandatory) > >> promote DRBDClone then start DRBD_global_clst (kind:Mandatory) > >> (id:order-DRBDClone-DRBD_global_clst-mandatory) > >> start DRBD_global_clst then start IPaddrNFS (kind:Mandatory) > >> (id:order-DRBD_global_clst-IPaddrNFS-mandatory) > >> Colocation Constraints: > >> NFSServer with IPaddrNFS (score:INFINITY) > >> (id:colocation-NFSServer-IPaddrNFS-INFINITY) > >> DRBD_global_clst with DRBDClone (score:INFINITY) > >> (id:colocation-DRBD_global_clst-DRBDClone-INFINITY) > > > > It took me a while to notice it, it's easily overlooked, but the above > > constraint is the problem. It says DRBD_global_clst must be located > > where DRBDClone is running ... not necessarily where DRBDClone is > > master. 
This constraint should be created like this: > > > > pcs constraint colocation add DRBD_global_clst with master DRBDClone > > > >> IPaddrNFS with DRBD_global_clst (score:INFINITY) > >> (id:colocation-IPaddrNFS-DRBD_global_clst-INFINITY) > >> > >> Resources Defaults: > >> resource-stickiness: INFINITY > >> Operations Defaults: > >> timeout: 10s > >> > >> Cluster Properties: > >> cluster-infrastructure: cman > >> dc-version: 1.1.14-8.el6-70404b0 > >> have-watchdog: false > >> last-lrm-refresh: 1478277432 > >> no-quorum-policy: ignore > >> stonith-enabled: true > >> symmetric-cluster: true > >> ================================================================== > >> > >> Initial state is e.g. this (all resources at node1): > >> > >> Online: [ ventsi-clst1-sync ventsi-clst2-sync ] > >> > >> Full list of resources: > >> > >> ipmi-fence-clst1 (stonith:fence_ipmilan): Started > >> ventsi-clst2-sync > >> ipmi-fence-clst2 (stonith:fence_ipmilan): Started > >> ventsi-clst1-sync > >> IPaddrNFS (ocf::heartbeat:IPaddr2): Started ventsi-clst1-sync > >> NFSServer (ocf::heartbeat:nfsserver): Started ventsi-clst1-sync > >> Master/Slave Set: DRBDClone [DRBD] > >> Masters: [ ventsi-clst1-sync ] > >> Slaves: [ ventsi-clst2-sync ] > >> DRBD_global_clst (ocf::heartbeat:Filesystem): Started > >> ventsi-clst1-sync > >> ================================================================== > >> > >> If I shutdown the cluster at node 1 (‘pcs cluster stop’) or if I move > >> the DRBD clone resource (‘pcs resource move DRBDClone’) all resources > >> switch successfully to node2. > >> I.e. the demote/promote of the DRBD clone resource is working in these > >> cases. > >> > >> But if I try to move any other resource (e.g. ‘pcs resource move > >> NFSServer’) the resources NFSServer, IPaddrNFS and DRBD_global_clst are > >> stopped at node 1, but then already follows starting of the > >> DRBD_global_clst resource at node2, which fails due to the missing > >> demote/promote.
> >> As far as I can see there is some follow-up attempt to repair things > >> partially as the resources are started again at node1 exclusive the > >> resource which I moved due to my move command. > >> > >> Final state is like this: > >> > >> Online: [ ventsi-clst1-sync ventsi-clst2-sync ] > >> > >> Full list of resources: > >> > >> ipmi-fence-clst1 (stonith:fence_ipmilan): Started > >> ventsi-clst2-sync > >> ipmi-fence-clst2 (stonith:fence_ipmilan): Started > >> ventsi-clst1-sync > >> IPaddrNFS (ocf::heartbeat:IPaddr2): Started ventsi-clst1-sync > >> NFSServer (ocf::heartbeat:nfsserver): Stopped > >> Master/Slave Set: DRBDClone [DRBD] > >> Masters: [ ventsi-clst1-sync ] > >> Slaves: [ ventsi-clst2-sync ] > >> DRBD_global_clst (ocf::heartbeat:Filesystem): Started > >> ventsi-clst1-sync > >> > >> Failed Actions: > >> * DRBD_global_clst_start_0 on ventsi-clst2-sync 'unknown error' (1): > >> call=778, status=complete, exitreason='none', > >> last-rc-change='Fri Nov 4 19:32:56 2016', queued=0ms, exec=43ms > >> ================================================================== > >> > >> Here are the logged messages for this “failure case”: > >> > >> 2016-11-04T19:32:55.163982+01:00 ventsi-clst1 crmd[6116]: notice: > >> State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC > >> cause=C_FSA_INTERNAL origin=abort_transition_graph ] > >> 2016-11-04T19:32:55.168100+01:00 ventsi-clst1 pengine[6115]: notice: > >> On loss of CCM Quorum: Ignore > >> 2016-11-04T19:32:55.181252+01:00 ventsi-clst1 pengine[6115]: notice: > >> Move IPaddrNFS#011(Started ventsi-clst1-sync -> ventsi-clst2-sync) > >> 2016-11-04T19:32:55.181260+01:00 ventsi-clst1 pengine[6115]: notice: > >> Move NFSServer#011(Started ventsi-clst1-sync -> ventsi-clst2-sync) > >> 2016-11-04T19:32:55.181278+01:00 ventsi-clst1 pengine[6115]: notice: > >> Move DRBD_global_clst#011(Started ventsi-clst1-sync -> > >> ventsi-clst2-sync) <=== here no demote/promote is listed > >> 2016-11-04T19:32:55.182385+01:00 
ventsi-clst1 pengine[6115]: notice: > >> Calculated Transition 202: /var/lib/pacemaker/pengine/pe-input-766.bz2 > >> 2016-11-04T19:32:55.182998+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 15: stop NFSServer_stop_0 on ventsi-clst1-sync (local) > >> 2016-11-04T19:32:55.196265+01:00 ventsi-clst1 > >> nfsserver(NFSServer)[15978]: INFO: Stopping NFS server ... > >> 2016-11-04T19:32:55.249137+01:00 ventsi-clst1 kernel: nfsd: last server > >> has exited, flushing export cache > >> 2016-11-04T19:32:55.252241+01:00 ventsi-clst1 rpc.mountd[15282]: Caught > >> signal 15, un-registering and exiting. > >> 2016-11-04T19:32:55.632708+01:00 ventsi-clst1 > >> nfsserver(NFSServer)[15978]: INFO: Stopping sm-notify > >> 2016-11-04T19:32:55.650552+01:00 ventsi-clst1 > >> nfsserver(NFSServer)[15978]: INFO: Stopping rpc.statd > >> 2016-11-04T19:32:55.666777+01:00 ventsi-clst1 rpc.statd[15243]: Caught > >> signal 15, un-registering and exiting > >> 2016-11-04T19:32:56.692819+01:00 ventsi-clst1 > >> nfsserver(NFSServer)[15978]: INFO: NFS server stopped > >> 2016-11-04T19:32:56.695523+01:00 ventsi-clst1 crmd[6116]: notice: > >> Operation NFSServer_stop_0: ok (node=ventsi-clst1-sync, call=1220, rc=0, > >> cib-update=1695, confirmed=true) > >> 2016-11-04T19:32:56.696243+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 12: stop IPaddrNFS_stop_0 on ventsi-clst1-sync (local) > >> 2016-11-04T19:32:56.727882+01:00 ventsi-clst1 IPaddr2(IPaddrNFS)[16108]: > >> INFO: IP status = ok, IP_CIP= > >> 2016-11-04T19:32:56.733383+01:00 ventsi-clst1 crmd[6116]: notice: > >> Operation IPaddrNFS_stop_0: ok (node=ventsi-clst1-sync, call=1222, rc=0, > >> cib-update=1696, confirmed=true) > >> 2016-11-04T19:32:56.733917+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 48: stop DRBD_global_clst_stop_0 on ventsi-clst1-sync > >> (local) > >> 2016-11-04T19:32:56.757181+01:00 ventsi-clst1 > >> Filesystem(DRBD_global_clst)[16163]: INFO: Running stop for /dev/drbd1 > >> on 
/drbdmnts/global_clst > >> 2016-11-04T19:32:56.764684+01:00 ventsi-clst1 > >> Filesystem(DRBD_global_clst)[16163]: INFO: Trying to unmount > >> /drbdmnts/global_clst > >> 2016-11-04T19:32:56.771260+01:00 ventsi-clst1 > >> Filesystem(DRBD_global_clst)[16163]: INFO: unmounted > >> /drbdmnts/global_clst successfully > >> 2016-11-04T19:32:56.776640+01:00 ventsi-clst1 crmd[6116]: notice: > >> Operation DRBD_global_clst_stop_0: ok (node=ventsi-clst1-sync, > >> call=1224, rc=0, cib-update=1697, confirmed=true) > >> 2016-11-04T19:32:56.777140+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 49: start DRBD_global_clst_start_0 on > >> ventsi-clst2-sync <=== here is the attempt to start the filesystem at > >> the other node, although DRBD has not yet been promoted > >> 2016-11-04T19:32:56.840137+01:00 ventsi-clst1 crmd[6116]: warning: > >> Action 49 (DRBD_global_clst_start_0) on ventsi-clst2-sync failed > >> (target: 0 vs. rc: 1): Error > >> 2016-11-04T19:32:56.840158+01:00 ventsi-clst1 crmd[6116]: notice: > >> Transition aborted by DRBD_global_clst_start_0 'modify' on > >> ventsi-clst2-sync: Event failed > >> (magic=0:1;49:202:0:b7941532-c74b-40cc-a8ad-27b5502b8fba, cib=0.649.4, > >> source=match_graph_event:381, 0) > >> 2016-11-04T19:32:56.840232+01:00 ventsi-clst1 crmd[6116]: warning: > >> Action 49 (DRBD_global_clst_start_0) on ventsi-clst2-sync failed > >> (target: 0 vs.
rc: 1): Error > >> 2016-11-04T19:32:56.840328+01:00 ventsi-clst1 crmd[6116]: notice: > >> Transition 202 (Complete=5, Pending=0, Fired=0, Skipped=0, Incomplete=5, > >> Source=/var/lib/pacemaker/pengine/pe-input-766.bz2): Complete > >> 2016-11-04T19:32:56.843693+01:00 ventsi-clst1 pengine[6115]: notice: > >> On loss of CCM Quorum: Ignore > >> 2016-11-04T19:32:56.844072+01:00 ventsi-clst1 pengine[6115]: warning: > >> Processing failed op start for DRBD_global_clst on ventsi-clst2-sync: > >> unknown error (1) > >> 2016-11-04T19:32:56.844102+01:00 ventsi-clst1 pengine[6115]: warning: > >> Processing failed op start for DRBD_global_clst on ventsi-clst2-sync: > >> unknown error (1) > >> 2016-11-04T19:32:56.845071+01:00 ventsi-clst1 pengine[6115]: notice: > >> Start IPaddrNFS#011(ventsi-clst2-sync) > >> 2016-11-04T19:32:56.845078+01:00 ventsi-clst1 pengine[6115]: notice: > >> Start NFSServer#011(ventsi-clst2-sync) > >> 2016-11-04T19:32:56.845081+01:00 ventsi-clst1 pengine[6115]: notice: > >> Demote DRBD:0#011(Master -> Slave ventsi-clst1-sync) <=== here there > >> would be the necessary demote/promote … but it’s too late; the start of > >> the filesystem already failed… > >> 2016-11-04T19:32:56.845083+01:00 ventsi-clst1 pengine[6115]: notice: > >> Promote DRBD:1#011(Slave -> Master ventsi-clst2-sync) > >> 2016-11-04T19:32:56.845084+01:00 ventsi-clst1 pengine[6115]: notice: > >> Recover DRBD_global_clst#011(Started ventsi-clst2-sync) > >> 2016-11-04T19:32:56.847986+01:00 ventsi-clst1 pengine[6115]: notice: > >> Calculated Transition 203: /var/lib/pacemaker/pengine/pe-input-767.bz2 > >> <=== … so the above transition gets caught by the following attempt to > >> repair things partially > >> 2016-11-04T19:32:56.867679+01:00 ventsi-clst1 pengine[6115]: notice: > >> On loss of CCM Quorum: Ignore > >> 2016-11-04T19:32:56.868074+01:00 ventsi-clst1 pengine[6115]: warning: > >> Processing failed op start for DRBD_global_clst on ventsi-clst2-sync: > >> unknown error (1) > >>
2016-11-04T19:32:56.868101+01:00 ventsi-clst1 pengine[6115]: warning: > >> Processing failed op start for DRBD_global_clst on ventsi-clst2-sync: > >> unknown error (1) > >> 2016-11-04T19:32:56.868287+01:00 ventsi-clst1 pengine[6115]: warning: > >> Forcing DRBD_global_clst away from ventsi-clst2-sync after 1000000 > >> failures (max=1000000) > >> 2016-11-04T19:32:56.869011+01:00 ventsi-clst1 pengine[6115]: notice: > >> Start IPaddrNFS#011(ventsi-clst1-sync) > >> 2016-11-04T19:32:56.869023+01:00 ventsi-clst1 pengine[6115]: notice: > >> Recover DRBD_global_clst#011(Started ventsi-clst2-sync -> > ventsi-clst1-sync) > >> 2016-11-04T19:32:56.869770+01:00 ventsi-clst1 pengine[6115]: notice: > >> Calculated Transition 204: /var/lib/pacemaker/pengine/pe-input-768.bz2 > >> 2016-11-04T19:32:56.870065+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 3: stop DRBD_global_clst_stop_0 on ventsi-clst2-sync > >> 2016-11-04T19:32:56.908075+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 42: start DRBD_global_clst_start_0 on > >> ventsi-clst1-sync (local) > >> 2016-11-04T19:32:56.931072+01:00 ventsi-clst1 > >> Filesystem(DRBD_global_clst)[16242]: INFO: Running start for /dev/drbd1 > >> on /drbdmnts/global_clst > >> 2016-11-04T19:32:56.943250+01:00 ventsi-clst1 kernel: EXT4-fs (drbd1): > >> warning: maximal mount count reached, running e2fsck is recommended > >> 2016-11-04T19:32:56.953253+01:00 ventsi-clst1 kernel: EXT4-fs (drbd1): > >> mounted filesystem with ordered data mode. 
Opts: > >> 2016-11-04T19:32:56.964284+01:00 ventsi-clst1 crmd[6116]: notice: > >> Operation DRBD_global_clst_start_0: ok (node=ventsi-clst1-sync, > >> call=1225, rc=0, cib-update=1701, confirmed=true) > >> 2016-11-04T19:32:56.965104+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 10: start IPaddrNFS_start_0 on ventsi-clst1-sync (local) > >> 2016-11-04T19:32:56.965325+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 43: monitor DRBD_global_clst_monitor_20000 on > >> ventsi-clst1-sync (local) > >> 2016-11-04T19:32:56.996235+01:00 ventsi-clst1 IPaddr2(IPaddrNFS)[16308]: > >> INFO: Adding inet address xxx.xxx.xxx.xxx/24 with broadcast address > >> xxx.xxx.xxx.255 to device bond0 > >> 2016-11-04T19:32:57.002059+01:00 ventsi-clst1 IPaddr2(IPaddrNFS)[16308]: > >> INFO: Bringing device bond0 up > >> 2016-11-04T19:32:57.008128+01:00 ventsi-clst1 IPaddr2(IPaddrNFS)[16308]: > >> INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p > >> /var/run/resource-agents/send_arp-xxx.xxx.xxx.xxx bond0 xxx.xxx.xxx.xxx > >> auto not_used not_used > >> 2016-11-04T19:32:57.020159+01:00 ventsi-clst1 crmd[6116]: notice: > >> Operation IPaddrNFS_start_0: ok (node=ventsi-clst1-sync, call=1226, > >> rc=0, cib-update=1703, confirmed=true) > >> 2016-11-04T19:32:57.020901+01:00 ventsi-clst1 crmd[6116]: notice: > >> Initiating action 11: monitor IPaddrNFS_monitor_5000 on > >> ventsi-clst1-sync (local) > >> 2016-11-04T19:32:57.052231+01:00 ventsi-clst1 crmd[6116]: notice: > >> Transition 204 (Complete=6, Pending=0, Fired=0, Skipped=0, Incomplete=0, > >> Source=/var/lib/pacemaker/pengine/pe-input-768.bz2): Complete > >> 2016-11-04T19:32:57.052251+01:00 ventsi-clst1 crmd[6116]: notice: > >> State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS > >> cause=C_FSA_INTERNAL origin=notify_crmd ] > >> ================================================================== > >> > >> Any ideas what could be the reason for this behavior? > >> And how could this be fixed? 
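For the colocation part of that question, here is a minimal sketch of the change described in the inline comment further up. It assumes the constraint id shown in the first configuration listing of this thread and an explicit INFINITY score. The wrapper function name is made up for illustration and is not part of pcs.

```shell
# Hypothetical helper (name invented for this example); run as root on one node.
# Step 1 drops the colocation that only requires DRBDClone to be running; the id
# is copied from the first 'pcs config' listing in this thread.
# Step 2 re-adds it so the filesystem must be placed where DRBDClone is master.
recreate_master_colocation() {
    pcs constraint remove colocation-DRBD_global_clst-DRBDClone-INFINITY &&
    pcs constraint colocation add DRBD_global_clst with master DRBDClone INFINITY
}
```

After calling `recreate_master_colocation`, `pcs config` should list the constraint with `with-rsc-role:Master`, as in the updated configuration quoted at the top of this message.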
> >> > >> > >> (I already found several articles on the internet with the > >> recommendation to have two separately configured monitor operations for > >> the DRBD resource configured one for the master role and another one for > >> the slave role. > >> Already tried this to no avail.) > >> > >> Regards > >> Andi
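The ordering test suggested earlier in this reply could be tried roughly as follows. This is only a sketch: the constraint id is taken from the 'pcs config' output quoted above, both function names are invented for the example, and the grep pattern assumes the "* Demote ..." / "* Promote ..." line format seen in the Transition Summary.

```shell
# Print only the Demote/Promote lines from crm_simulate output read on stdin.
# (Hypothetical helper; the pattern matches the Transition Summary format above.)
show_role_changes() {
    grep -E '\* (Demote|Promote)[[:space:]]'
}

# Replace "start DRBD_global_clst then start IPaddrNFS" with
# "promote DRBDClone then start IPaddrNFS", then show which role changes
# the policy engine would schedule against the live cluster.
try_promote_ordering() {
    pcs constraint remove order-DRBD_global_clst-IPaddrNFS-mandatory &&
    pcs constraint order promote DRBDClone then start IPaddrNFS &&
    crm_simulate -Ls | show_role_changes
}
```

Run `try_promote_ordering` as root on one node, then repeat the 'pcs resource move DRBD_global_clst' test. Before repeating it, the cli-ban location constraint left behind by the previous move probably needs to be removed first, for example with 'pcs resource clear DRBD_global_clst'.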