On 09/20/2016 07:15 AM, Auer, Jens wrote:
> Hi,
> 
> I did some more tests after updating DRBD to the latest version. The behavior 
> does not change, but I found out that
> - everything works fine when I physically unplug the network cables instead 
> of ifdown'ing the device

BTW that's a more accurate simulation of a network failure.
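
If pulling cables isn't convenient, a rough alternative (untested sketch;
bond0 is the data interface from your logs) is to drop traffic with
iptables instead of ifdown'ing the device. Unlike ifdown, the address
stays configured, so it behaves more like a real cable/switch failure:

  # drop all traffic on the data interface; the IP stays configured,
  # so this looks like a link failure rather than an administrative "down"
  iptables -A INPUT  -i bond0 -j DROP
  iptables -A OUTPUT -o bond0 -j DROP
  # ... run the failover test ...
  # restore connectivity afterwards
  iptables -D INPUT  -i bond0 -j DROP
  iptables -D OUTPUT -o bond0 -j DROP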

> - I can see in the log files that the device gets promoted after stopping the 
> initial master node, but then gets immediately demoted. I don't understand 
> why this happens:
> Sep 20 12:08:03 MDA1PFP-S02 crmd[2354]:  notice: Operation ACTIVE_start_0: ok 
> (node=MDA1PFP-PCS02, call=29, rc=0, cib-update=21, confirmed=true)
> Sep 20 12:08:03 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok 
> (node=MDA1PFP-PCS02, call=28, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: block drbd1: peer( Primary -> Secondary ) 
> Sep 20 12:08:04 MDA1PFP-S02 IPaddr2(mda-ip)[3528]: INFO: Adding inet address 
> 192.168.120.20/32 with broadcast address 192.168.120.255 to device bond0
> Sep 20 12:08:04 MDA1PFP-S02 avahi-daemon[1084]: Registering new address 
> record for 192.168.120.20 on bond0.IPv4.
> Sep 20 12:08:04 MDA1PFP-S02 IPaddr2(mda-ip)[3528]: INFO: Bringing device 
> bond0 up
> Sep 20 12:08:04 MDA1PFP-S02 IPaddr2(mda-ip)[3528]: INFO: 
> /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p 
> /var/run/resource-agents/send_arp-192.168.120.20 bond0 192.168.120.20 auto 
> not_used not_used
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation mda-ip_start_0: ok 
> (node=MDA1PFP-PCS02, call=31, rc=0, cib-update=23, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok 
> (node=MDA1PFP-PCS02, call=32, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok 
> (node=MDA1PFP-PCS02, call=34, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: peer( Secondary -> 
> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown ) 
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: ack_receiver terminated
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: Terminating 
> drbd_a_shared_f
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: Connection closed
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: conn( TearDown -> 
> Unconnected ) 
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: receiver terminated
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: Restarting receiver thread
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: receiver (re)started
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: conn( Unconnected -> 
> WFConnection ) 
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok 
> (node=MDA1PFP-PCS02, call=35, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok 
> (node=MDA1PFP-PCS02, call=36, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: helper command: 
> /sbin/drbdadm fence-peer shared_fs
> Sep 20 12:08:04 MDA1PFP-S02 crm-fence-peer.sh[3779]: invoked for shared_fs
> Sep 20 12:08:04 MDA1PFP-S02 crm-fence-peer.sh[3779]: INFO peer is not 
> reachable, my disk is UpToDate: placed constraint 
> 'drbd-fence-by-handler-shared_fs-drbd1_sync'
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: helper command: 
> /sbin/drbdadm fence-peer shared_fs exit code 5 (0x500)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: fence-peer helper 
> returned 5 (peer is unreachable, assumed to be dead)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: pdsk( DUnknown -> 
> Outdated ) 
> Sep 20 12:08:04 MDA1PFP-S02 kernel: block drbd1: role( Secondary -> Primary ) 

From these logs, I don't see any request by Pacemaker for DRBD to be
promoted, so I'm wondering if DRBD decided to promote itself here.
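
One rough way to check (based on the crmd log format visible above): the
crmd logs an "Initiating action ..." line for every action it requests, so
if no such line appears for the promote before the role change, the
promotion did not come from Pacemaker:

  # did the crmd/DC ever request the promotion?
  grep -E 'Initiating action .*promote' /var/log/messages
  # when did DRBD itself switch roles?
  grep 'role( Secondary -> Primary )' /var/log/messages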

> Sep 20 12:08:04 MDA1PFP-S02 kernel: block drbd1: new current UUID 
> 098EF9936C4F4D27:5157BB476E60F5AA:6BC19D97CF96E5D2:6BC09D97CF96E5D2
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:   error: pcmkRegisterNode: Triggered 
> assert at xml.c:594 : node->type == XML_ELEMENT_NODE
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_promote_0: 
> ok (node=MDA1PFP-PCS02, call=37, rc=0, cib-update=25, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok 
> (node=MDA1PFP-PCS02, call=38, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Our peer on the DC 
> (MDA1PFP-PCS01) is dead

Here, Pacemaker lost corosync connectivity to its peer. Isn't corosync
traffic on a separate interface? Or is this a different test than before?
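
To double-check which interface corosync is really using, something along
these lines should do (standard corosync tooling, nothing cluster-specific
assumed):

  # address and health of each corosync ring on this node
  corosync-cfgtool -s
  # ring addresses configured for each node
  grep 'ring[01]_addr' /etc/corosync/corosync.conf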

> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: State transition S_NOT_DC -> 
> S_ELECTION [ input=I_ELECTION cause=C_CRMD_STATUS_CALLBACK 
> origin=peer_update_callback ]
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: State transition S_ELECTION 
> -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_TIMER_POPPED 
> origin=election_timeout_popped ]
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: crm_update_peer_proc: Node 
> MDA1PFP-PCS01[1] - state is now lost (was member)
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: Removing all MDA1PFP-PCS01 
> attributes for attrd_peer_change_cb
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: Lost attribute writer 
> MDA1PFP-PCS01
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: Removing MDA1PFP-PCS01/1 
> from the membership list
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: Purged 1 peers with id=1 
> and/or uname=MDA1PFP-PCS01 from the membership cache
> Sep 20 12:08:04 MDA1PFP-S02 stonith-ng[2349]:  notice: crm_update_peer_proc: 
> Node MDA1PFP-PCS01[1] - state is now lost (was member)
> Sep 20 12:08:04 MDA1PFP-S02 stonith-ng[2349]:  notice: Removing 
> MDA1PFP-PCS01/1 from the membership list
> Sep 20 12:08:04 MDA1PFP-S02 stonith-ng[2349]:  notice: Purged 1 peers with 
> id=1 and/or uname=MDA1PFP-PCS01 from the membership cache
> Sep 20 12:08:04 MDA1PFP-S02 cib[2348]:  notice: crm_update_peer_proc: Node 
> MDA1PFP-PCS01[1] - state is now lost (was member)
> Sep 20 12:08:04 MDA1PFP-S02 cib[2348]:  notice: Removing MDA1PFP-PCS01/1 from 
> the membership list
> Sep 20 12:08:04 MDA1PFP-S02 cib[2348]:  notice: Purged 1 peers with id=1 
> and/or uname=MDA1PFP-PCS01 from the membership cache
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]: warning: FSA: Input I_ELECTION_DC 
> from do_election_check() received in state S_INTEGRATION
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Notifications disabled
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:   error: pcmkRegisterNode: Triggered 
> assert at xml.c:594 : node->type == XML_ELEMENT_NODE
> Sep 20 12:08:04 MDA1PFP-S02 pengine[2353]:  notice: On loss of CCM Quorum: 
> Ignore
> Sep 20 12:08:04 MDA1PFP-S02 pengine[2353]:  notice: Demote  drbd1:0   (Master 
> -> Slave MDA1PFP-PCS02)
> Sep 20 12:08:04 MDA1PFP-S02 pengine[2353]:  notice: Calculated Transition 0: 
> /var/lib/pacemaker/pengine/pe-input-1813.bz2
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Initiating action 55: notify 
> drbd1_pre_notify_demote_0 on MDA1PFP-PCS02 (local)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok 
> (node=MDA1PFP-PCS02, call=39, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Initiating action 18: demote 
> drbd1_demote_0 on MDA1PFP-PCS02 (local)

The demote is requested by Pacemaker.

You can get more info from the pe-input-1813.bz2 file referenced above,
e.g. "crm_simulate -Ssx /var/lib/pacemaker/pengine/pe-input-1813.bz2"
should show the scores and planned actions. It's not the easiest to read
but it has some good info.
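
For what it's worth, the scores section of that output is usually the
interesting part here; something like (same flags as above):

  # replay the recorded transition; -s shows allocation and promotion scores
  crm_simulate -Ssx /var/lib/pacemaker/pengine/pe-input-1813.bz2 | less

Look for the promotion scores of drbd1:0/drbd1:1 and for any
drbd-fence-by-handler-* location constraint affecting them.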

> 
> Best wishes,
>   Jens
> 
> --
> Jens Auer | CGI | Software-Engineer
> CGI (Germany) GmbH & Co. KG
> Rheinstraße 95 | 64295 Darmstadt | Germany
> T: +49 6151 36860 154
> jens.a...@cgi.com
> Our mandatory disclosures pursuant to § 35a GmbHG / §§ 161, 125a HGB can be 
> found at de.cgi.com/pflichtangaben.
> 
> CONFIDENTIALITY NOTICE: Proprietary/Confidential information belonging to CGI 
> Group Inc. and its affiliates may be contained in this message. If you are 
> not a recipient indicated or intended in this message (or responsible for 
> delivery of this message to such person), or you think for any reason that 
> this message may have been addressed to you in error, you may not use or copy 
> or deliver this message to anyone else. In such case, you should destroy this 
> message and are asked to notify the sender by reply e-mail.
> 
> ________________________________________
> From: Ken Gaillot [kgail...@redhat.com]
> Sent: Monday, 19 September 2016 17:27
> To: Auer, Jens; Cluster Labs - All topics related to open-source clustering 
> welcomed
> Subject: Re: AW: [ClusterLabs] No DRBD resource promoted to master in 
> Active/Passive setup
> 
> On 09/19/2016 09:48 AM, Auer, Jens wrote:
>> Hi,
>>
>>> Is the network interface being taken down here used for corosync
>>> communication? If so, that is a node-level failure, and pacemaker will
>>> fence.
>>
>> We have different connections on each server:
>> - A bonded 10 GbE network card for data traffic, accessed via a virtual IP 
>> managed by pacemaker in 192.168.120.1/24. In this subnet, cluster nodes 
>> MDA1PFP-S01 and MDA1PFP-S02 are assigned 192.168.120.10 and 
>> 192.168.120.11.
>>
>> - A dedicated back-to-back connection for corosync heartbeats in 
>> 192.168.121.1/24. MDA1PFP-PCS01 and MDA1PFP-PCS02 are assigned 
>> 192.168.121.10 and 192.168.121.11. When the cluster is created, we use these 
>> as the primary node names and use the 10 GbE device as a second, backup 
>> connection for increased reliability: pcs cluster setup --name MDA1PFP 
>> MDA1PFP-PCS01,MDA1PFP-S01 MDA1PFP-PCS02,MDA1PFP-S02
>>
>> - A dedicated back-to-back connection for drbd in 192.168.123.1/24. Hosts 
>> MDA1PFP-DRBD01 and MDA1PFP-DRBD02 are assigned 192.168.123.10 and 
>> 192.168.123.11.
> 
> Ah, nice.
> 
>> Given that, I think it is not a node-level failure. pcs status also reports 
>> the nodes as online, so this should not trigger fencing from pacemaker.
>>
>>> When DRBD is configured with 'fencing resource-only' and 'fence-peer
>>> "/usr/lib/drbd/crm-fence-peer.sh";', and DRBD detects a network outage,
>>> it will try to add a constraint that prevents the other node from
>>> becoming master. It removes the constraint when connectivity is restored.
>>
>>> I am not familiar with all the under-the-hood details, but IIUC, if
>>> pacemaker actually fences the node, then the other node can still take
>>> over the DRBD. But if there is a network outage and no pacemaker
>>> fencing, then you'll see the behavior you describe -- DRBD prevents
>>> master takeover, to avoid stale data being used.
>>
>> This is my understanding as well, but there should be no network outage for 
>> DRBD. I can reproduce the behavior by stopping a cluster node, which DRBD 
>> seems to interpret as a network outage since it can no longer communicate 
>> with the stopped node. Maybe I should ask on the DRBD mailing list?
> 
> OK, I think I follow you now: you're ifdown'ing the data traffic
> interface, but the interfaces for both corosync and DRBD traffic are
> still up. So, pacemaker detects the virtual IP failure on the traffic
> interface, and correctly recovers the IP on the other node, but the DRBD
> master role is not recovered.
> 
> If the behavior goes away when you remove the DRBD fencing config, then
> it sounds like DRBD is seeing it as a network outage, and is adding the
> constraint to prevent a stale master. Yes, I think that would be worth
> bringing up on the DRBD list, though there might be some DRBD users here
> who can chime in, too.
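
On the "sticky" part Jens mentioned: crm-fence-peer.sh writes that
constraint into the CIB, and crm-unfence-peer.sh normally only removes it
after a successful resync, so it survives cluster restarts as long as the
peer never reconnects. A rough way to check DRBD's view and, only if you
are certain the peer's data is expendable, clear the constraint by hand
(constraint id taken from the log above):

  # what does DRBD itself think about the peer right now?
  drbdadm cstate shared_fs    # e.g. WFConnection = still waiting for the peer
  drbdadm role shared_fs      # local/peer roles
  # is the fencing constraint still present?
  pcs constraint --full | grep drbd-fence-by-handler
  # only if you are sure the peer is gone / its data is not needed:
  pcs constraint remove drbd-fence-by-handler-shared_fs-drbd1_sync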
> 
>> Cheers,
>>   Jens
>>
>> ________________________________________
>> From: Ken Gaillot [kgail...@redhat.com]
>> Sent: Monday, 19 September 2016 16:28
>> To: Auer, Jens; Cluster Labs - All topics related to open-source clustering 
>> welcomed
>> Subject: Re: [ClusterLabs] No DRBD resource promoted to master in 
>> Active/Passive setup
>>
>> On 09/19/2016 02:31 AM, Auer, Jens wrote:
>>> Hi,
>>>
>>> I am not sure that pacemaker should do any fencing here. In my setup, 
>>> corosync is configured to use a back-to-back connection for heartbeats. 
>>> This is a different subnet than the one used by the ping resource that 
>>> checks network connectivity and detects a failure. In my test, I bring down 
>>> the network device used by ping, and this triggers the failover. The node 
>>> status is still known to pacemaker since it receives heartbeats, so it is 
>>> only a resource failure. I asked about fencing conditions a few days ago and 
>>> was basically assured that a resource failure should not trigger STONITH 
>>> actions unless explicitly configured.
>>
>> Is the network interface being taken down here used for corosync
>> communication? If so, that is a node-level failure, and pacemaker will
>> fence.
>>
>> There is a bit of a distinction between DRBD fencing and pacemaker
>> fencing. The DRBD configuration is designed so that DRBD's fencing
>> method is to go through pacemaker.
>>
>> When DRBD is configured with 'fencing resource-only' and 'fence-peer
>> "/usr/lib/drbd/crm-fence-peer.sh";', and DRBD detects a network outage,
>> it will try to add a constraint that prevents the other node from
>> becoming master. It removes the constraint when connectivity is restored.
>>
>> I am not familiar with all the under-the-hood details, but IIUC, if
>> pacemaker actually fences the node, then the other node can still take
>> over the DRBD. But if there is a network outage and no pacemaker
>> fencing, then you'll see the behavior you describe -- DRBD prevents
>> master takeover, to avoid stale data being used.
>>
>>
>>> I am also wondering why this is "sticky". After a failover test the DRBD 
>>> resources are not working even if I restart the cluster on all nodes.
>>>
>>> Best wishes,
>>>   Jens
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ken Gaillot [mailto:kgail...@redhat.com]
>>>> Sent: 16 September 2016 17:56
>>>> To: users@clusterlabs.org
>>>> Subject: Re: [ClusterLabs] No DRBD resource promoted to master in 
>>>> Active/Passive
>>>> setup
>>>>
>>>> On 09/16/2016 10:02 AM, Auer, Jens wrote:
>>>>> Hi,
>>>>>
>>>>> I have an Active/Passive configuration with a drbd mast/slave resource:
>>>>>
>>>>> MDA1PFP-S01 14:40:27 1803 0 ~ # pcs status Cluster name: MDA1PFP
>>>>> Last updated: Fri Sep 16 14:41:18 2016        Last change: Fri Sep 16
>>>>> 14:39:49 2016 by root via cibadmin on MDA1PFP-PCS01
>>>>> Stack: corosync
>>>>> Current DC: MDA1PFP-PCS02 (version 1.1.13-10.el7-44eb2dd) - partition
>>>>> with quorum
>>>>> 2 nodes and 7 resources configured
>>>>>
>>>>> Online: [ MDA1PFP-PCS01 MDA1PFP-PCS02 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>>  Master/Slave Set: drbd1_sync [drbd1]
>>>>>      Masters: [ MDA1PFP-PCS02 ]
>>>>>      Slaves: [ MDA1PFP-PCS01 ]
>>>>>  mda-ip    (ocf::heartbeat:IPaddr2):    Started MDA1PFP-PCS02
>>>>>  Clone Set: ping-clone [ping]
>>>>>      Started: [ MDA1PFP-PCS01 MDA1PFP-PCS02 ]
>>>>>  ACTIVE    (ocf::heartbeat:Dummy):    Started MDA1PFP-PCS02
>>>>>  shared_fs    (ocf::heartbeat:Filesystem):    Started MDA1PFP-PCS02
>>>>>
>>>>> PCSD Status:
>>>>>   MDA1PFP-PCS01: Online
>>>>>   MDA1PFP-PCS02: Online
>>>>>
>>>>> Daemon Status:
>>>>>   corosync: active/disabled
>>>>>   pacemaker: active/disabled
>>>>>   pcsd: active/enabled
>>>>>
>>>>> MDA1PFP-S01 14:41:19 1804 0 ~ # pcs resource --full
>>>>>  Master: drbd1_sync
>>>>>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>>>>> clone-node-max=1 notify=true
>>>>>   Resource: drbd1 (class=ocf provider=linbit type=drbd)
>>>>>    Attributes: drbd_resource=shared_fs
>>>>>    Operations: start interval=0s timeout=240 (drbd1-start-interval-0s)
>>>>>                promote interval=0s timeout=90 (drbd1-promote-interval-0s)
>>>>>                demote interval=0s timeout=90 (drbd1-demote-interval-0s)
>>>>>                stop interval=0s timeout=100 (drbd1-stop-interval-0s)
>>>>>                monitor interval=60s (drbd1-monitor-interval-60s)
>>>>>  Resource: mda-ip (class=ocf provider=heartbeat type=IPaddr2)
>>>>>   Attributes: ip=192.168.120.20 cidr_netmask=32 nic=bond0
>>>>>   Operations: start interval=0s timeout=20s (mda-ip-start-interval-0s)
>>>>>               stop interval=0s timeout=20s (mda-ip-stop-interval-0s)
>>>>>               monitor interval=1s (mda-ip-monitor-interval-1s)
>>>>>  Clone: ping-clone
>>>>>   Resource: ping (class=ocf provider=pacemaker type=ping)
>>>>>    Attributes: dampen=5s multiplier=1000 host_list=pf-pep-dev-1
>>>>> timeout=1 attempts=3
>>>>>    Operations: start interval=0s timeout=60 (ping-start-interval-0s)
>>>>>                stop interval=0s timeout=20 (ping-stop-interval-0s)
>>>>>                monitor interval=1 (ping-monitor-interval-1)
>>>>>  Resource: ACTIVE (class=ocf provider=heartbeat type=Dummy)
>>>>>   Operations: start interval=0s timeout=20 (ACTIVE-start-interval-0s)
>>>>>               stop interval=0s timeout=20 (ACTIVE-stop-interval-0s)
>>>>>               monitor interval=10 timeout=20
>>>>> (ACTIVE-monitor-interval-10)
>>>>>  Resource: shared_fs (class=ocf provider=heartbeat type=Filesystem)
>>>>>   Attributes: device=/dev/drbd1 directory=/shared_fs fstype=xfs
>>>>>   Operations: start interval=0s timeout=60 (shared_fs-start-interval-0s)
>>>>>               stop interval=0s timeout=60 (shared_fs-stop-interval-0s)
>>>>>               monitor interval=20 timeout=40
>>>>> (shared_fs-monitor-interval-20)
>>>>>
>>>>> MDA1PFP-S01 14:41:35 1805 0 ~ # pcs constraint --full Location
>>>>> Constraints:
>>>>>   Resource: mda-ip
>>>>>     Enabled on: MDA1PFP-PCS01 (score:50)
>>>>> (id:location-mda-ip-MDA1PFP-PCS01-50)
>>>>>     Constraint: location-mda-ip
>>>>>       Rule: score=-INFINITY boolean-op=or  (id:location-mda-ip-rule)
>>>>>         Expression: pingd lt 1  (id:location-mda-ip-rule-expr)
>>>>>         Expression: not_defined pingd
>>>>> (id:location-mda-ip-rule-expr-1) Ordering Constraints:
>>>>>   start ping-clone then start mda-ip (kind:Optional)
>>>>> (id:order-ping-clone-mda-ip-Optional)
>>>>>   promote drbd1_sync then start shared_fs (kind:Mandatory)
>>>>> (id:order-drbd1_sync-shared_fs-mandatory)
>>>>> Colocation Constraints:
>>>>>   ACTIVE with mda-ip (score:INFINITY) 
>>>>> (id:colocation-ACTIVE-mda-ip-INFINITY)
>>>>>   drbd1_sync with mda-ip (score:INFINITY) (rsc-role:Master)
>>>>> (with-rsc-role:Started) (id:colocation-drbd1_sync-mda-ip-INFINITY)
>>>>>   shared_fs with drbd1_sync (score:INFINITY) (rsc-role:Started)
>>>>> (with-rsc-role:Master) (id:colocation-shared_fs-drbd1_sync-INFINITY)
>>>>>
>>>>> The cluster starts fine, except that resources do not start on the
>>>>> preferred host. I asked about this in a separate question to keep things separated.
>>>>> separated.
>>>>> The status after starting is:
>>>>> Last updated: Fri Sep 16 14:39:57 2016          Last change: Fri Sep 16
>>>>> 14:39:49 2016 by root via cibadmin on MDA1PFP-PCS01
>>>>> Stack: corosync
>>>>> Current DC: MDA1PFP-PCS02 (version 1.1.13-10.el7-44eb2dd) - partition
>>>>> with quorum
>>>>> 2 nodes and 7 resources configured
>>>>>
>>>>> Online: [ MDA1PFP-PCS01 MDA1PFP-PCS02 ]
>>>>>
>>>>>  Master/Slave Set: drbd1_sync [drbd1]
>>>>>      Masters: [ MDA1PFP-PCS02 ]
>>>>>      Slaves: [ MDA1PFP-PCS01 ]
>>>>> mda-ip  (ocf::heartbeat:IPaddr2):    Started MDA1PFP-PCS02
>>>>>  Clone Set: ping-clone [ping]
>>>>>      Started: [ MDA1PFP-PCS01 MDA1PFP-PCS02 ] ACTIVE
>>>>> (ocf::heartbeat:Dummy): Started MDA1PFP-PCS02
>>>>> shared_fs    (ocf::heartbeat:Filesystem):    Started MDA1PFP-PCS02
>>>>>
>>>>> From this state, I did two tests to simulate a cluster failover:
>>>>> 1. Shut down the cluster node hosting the master with pcs cluster stop.
>>>>> 2. Disable the network device for the virtual IP with ifdown and wait
>>>>> until ping detects it.
>>>>>
>>>>> In both cases, the failover is executed but the drbd is not promoted
>>>>> to master on the new active node:
>>>>> Last updated: Fri Sep 16 14:43:33 2016          Last change: Fri Sep 16
>>>>> 14:43:31 2016 by root via cibadmin on MDA1PFP-PCS01
>>>>> Stack: corosync
>>>>> Current DC: MDA1PFP-PCS01 (version 1.1.13-10.el7-44eb2dd) - partition
>>>>> with quorum
>>>>> 2 nodes and 7 resources configured
>>>>>
>>>>> Online: [ MDA1PFP-PCS01 ]
>>>>> OFFLINE: [ MDA1PFP-PCS02 ]
>>>>>
>>>>>  Master/Slave Set: drbd1_sync [drbd1]
>>>>>      Slaves: [ MDA1PFP-PCS01 ]
>>>>> mda-ip  (ocf::heartbeat:IPaddr2):    Started MDA1PFP-PCS01
>>>>>  Clone Set: ping-clone [ping]
>>>>>      Started: [ MDA1PFP-PCS01 ]
>>>>> ACTIVE  (ocf::heartbeat:Dummy): Started MDA1PFP-PCS01
>>>>>
>>>>> I was able to trace this to the fencing in the drbd configuration:
>>>>> MDA1PFP-S01 14:41:44 1806 0 ~ # cat /etc/drbd.d/shared_fs.res
>>>>> resource shared_fs {
>>>>> disk    /dev/mapper/rhel_mdaf--pf--pep--1-drbd;
>>>>>   disk {
>>>>>     fencing resource-only;
>>>>>   }
>>>>>   handlers {
>>>>>     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>>>>>     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>>>>>   }
>>>>>     device    /dev/drbd1;
>>>>>     meta-disk internal;
>>>>>     on MDA1PFP-S01 {
>>>>>         address 192.168.123.10:7789;
>>>>>     }
>>>>>     on MDA1PFP-S02 {
>>>>>         address 192.168.123.11:7789;
>>>>>     }
>>>>> }
>>>>
>>>> This coordinates fencing between DRBD and pacemaker. You still have to
>>>> configure fencing in pacemaker. If pacemaker can't fence the unseen node,
>>>> it can't be sure it's safe to bring up master.
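
For completeness, a minimal sketch of what that could look like with pcs
(fence_ipmilan is just an example agent; the addresses and credentials
below are placeholders for whatever your hardware actually provides):

  # one fence device per node
  pcs stonith create fence-pcs01 fence_ipmilan pcmk_host_list=MDA1PFP-PCS01 \
      ipaddr=<ipmi-of-node1> login=<user> passwd=<secret> lanplus=1 \
      op monitor interval=60s
  pcs stonith create fence-pcs02 fence_ipmilan pcmk_host_list=MDA1PFP-PCS02 \
      ipaddr=<ipmi-of-node2> login=<user> passwd=<secret> lanplus=1 \
      op monitor interval=60s
  # keep each device off the node it is meant to fence
  pcs constraint location fence-pcs01 avoids MDA1PFP-PCS01
  pcs constraint location fence-pcs02 avoids MDA1PFP-PCS02
  # make sure fencing is enabled cluster-wide
  pcs property set stonith-enabled=true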
>>>>
>>>>> I am using drbd 8.4.7, drbd-utils 8.9.5, pacemaker 1.1.13-10.el7, corosync
>>>>> 2.3.4-7.el7, and pcs 0.9.143-15.el7 from the CentOS 7 repositories.
>>>>>
>>>>> MDA1PFP-S01 15:00:20 1841 0 ~ # drbdadm --version
>>>>> DRBDADM_BUILDTAG=GIT-hash:\
>>>> 5d50d9fb2a967d21c0f5746370ccc066d3a67f7d\
>>>>> build\ by\ mockbuild@\,\ 2016-01-12\ 12:46:45
>>>>> DRBDADM_API_VERSION=1
>>>>> DRBD_KERNEL_VERSION_CODE=0x080407
>>>>> DRBDADM_VERSION_CODE=0x080905
>>>>> DRBDADM_VERSION=8.9.5
>>>>>
>>>>> If I disable the fencing scripts, everything works as expected. If they are
>>>>> enabled, no node is promoted to master after failover. It seems to be
>>>>> a sticky modification, because once a failover is simulated with the fencing
>>>>> scripts activated I cannot get the cluster to work anymore. Even
>>>>> removing the setting from the DRBD configuration does not help.
>>>>>
>>>>> I captured the complete log from /var/log/messages from cluster start
>>>>> to failover if that helps:
>>>>> MDA1PFP-S01 14:48:37 1807 0 ~ # cat /var/log/messages Sep 16 14:40:16
>>>>> MDA1PFP-S01 rsyslogd: [origin software="rsyslogd"
>>>>> swVersion="7.4.7" x-pid="13857" x-info="http://www.rsyslog.com";] start
>>>>> Sep 16 14:40:16 MDA1PFP-S01 rsyslogd-2221: module 'imuxsock' already
>>>>> in this config, cannot be added  [try http://www.rsyslog.com/e/2221 ]
>>>>> Sep 16 14:40:16 MDA1PFP-S01 systemd: Stopping System Logging Service...
>>>>> Sep 16 14:40:16 MDA1PFP-S01 systemd: Starting System Logging Service...
>>>>> Sep 16 14:40:16 MDA1PFP-S01 systemd: Started System Logging Service.
>>>>> Sep 16 14:40:27 MDA1PFP-S01 systemd: Started Corosync Cluster Engine.
>>>>> Sep 16 14:40:27 MDA1PFP-S01 systemd: Started Pacemaker High
>>>>> Availability Cluster Manager.
>>>>> Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> ACTIVE_start_0: ok (node=MDA1PFP-PCS01, call=33, rc=0, cib-update=22,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=32, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 IPaddr2(mda-ip)[15321]: INFO: Adding inet
>>>>> address 192.168.120.20/32 with broadcast address 192.168.120.255 to
>>>>> device bond0 Sep 16 14:43:30 MDA1PFP-S01 avahi-daemon[912]:
>>>>> Registering new address record for 192.168.120.20 on bond0.IPv4.
>>>>> Sep 16 14:43:30 MDA1PFP-S01 IPaddr2(mda-ip)[15321]: INFO: Bringing
>>>>> device bond0 up Sep 16 14:43:30 MDA1PFP-S01 kernel: block drbd1: peer(
>>>>> Primary -> Secondary ) Sep 16 14:43:30 MDA1PFP-S01
>>>>> IPaddr2(mda-ip)[15321]: INFO:
>>>>> /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
>>>>> /var/run/resource-agents/send_arp-192.168.120.20 bond0 192.168.120.20
>>>>> auto not_used not_used Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:
>>>>> notice: Operation
>>>>> mda-ip_start_0: ok (node=MDA1PFP-PCS01, call=35, rc=0, cib-update=24,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=36, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=38, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 kernel: drbd shared_fs: peer( Secondary ->
>>>>> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
>>>>> Sep 16 14:43:30 MDA1PFP-S01 kernel: drbd shared_fs: ack_receiver
>>>>> terminated Sep 16 14:43:30 MDA1PFP-S01 kernel: drbd shared_fs:
>>>>> Terminating drbd_a_shared_f Sep 16 14:43:31 MDA1PFP-S01 kernel: drbd
>>>>> shared_fs: Connection closed Sep 16 14:43:31 MDA1PFP-S01 kernel: drbd
>>>>> shared_fs: conn( TearDown -> Unconnected ) Sep 16 14:43:31 MDA1PFP-S01
>>>>> kernel: drbd shared_fs: receiver terminated Sep 16 14:43:31
>>>>> MDA1PFP-S01 kernel: drbd shared_fs: Restarting receiver thread Sep 16
>>>>> 14:43:31 MDA1PFP-S01 kernel: drbd shared_fs: receiver (re)started Sep
>>>>> 16 14:43:31 MDA1PFP-S01 kernel: drbd shared_fs: conn( Unconnected ->
>>>>> WFConnection ) Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice:
>>>>> Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=39, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=40, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 kernel: drbd shared_fs: helper command:
>>>>> /sbin/drbdadm fence-peer shared_fs
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crm-fence-peer.sh[15569]: invoked for
>>>>> shared_fs Sep 16 14:43:31 MDA1PFP-S01 crm-fence-peer.sh[15569]: INFO
>>>>> peer is not reachable, my disk is UpToDate: placed constraint
>>>>> 'drbd-fence-by-handler-shared_fs-drbd1_sync'
>>>>> Sep 16 14:43:31 MDA1PFP-S01 kernel: drbd shared_fs: helper command:
>>>>> /sbin/drbdadm fence-peer shared_fs exit code 5 (0x500) Sep 16 14:43:31
>>>>> MDA1PFP-S01 kernel: drbd shared_fs: fence-peer helper returned 5 (peer
>>>>> is unreachable, assumed to be dead) Sep 16 14:43:31 MDA1PFP-S01
>>>>> kernel: drbd shared_fs: pdsk( DUnknown -> Outdated ) Sep 16 14:43:31
>>>>> MDA1PFP-S01 kernel: block drbd1: role( Secondary -> Primary ) Sep 16
>>>>> 14:43:31 MDA1PFP-S01 kernel: block drbd1: new current UUID
>>>>>
>>>> B1FC3E9C008711DD:C02542C7B26F9B28:BCC6102B1FD69768:BCC5102B1FD697
>>>> 68
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:   error: pcmkRegisterNode:
>>>>> Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_promote_0: ok (node=MDA1PFP-PCS01, call=41, rc=0, cib-update=26,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=42, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Our peer on the DC
>>>>> (MDA1PFP-PCS02) is dead
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: State transition
>>>>> S_NOT_DC -> S_ELECTION [ input=I_ELECTION
>>>> cause=C_CRMD_STATUS_CALLBACK
>>>>> origin=peer_update_callback ] Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:
>>>>> notice: State transition S_ELECTION -> S_INTEGRATION [
>>>>> input=I_ELECTION_DC cause=C_TIMER_POPPED
>>>>> origin=election_timeout_popped ] Sep 16 14:43:31 MDA1PFP-S01
>>>>> attrd[13128]:  notice: crm_update_peer_proc:
>>>>> Node MDA1PFP-PCS02[2] - state is now lost (was member) Sep 16 14:43:31
>>>>> MDA1PFP-S01 attrd[13128]:  notice: Removing all
>>>>> MDA1PFP-PCS02 attributes for attrd_peer_change_cb Sep 16 14:43:31
>>>>> MDA1PFP-S01 attrd[13128]:  notice: Lost attribute writer
>>>>> MDA1PFP-PCS02
>>>>> Sep 16 14:43:31 MDA1PFP-S01 attrd[13128]:  notice: Removing
>>>>> MDA1PFP-PCS02/2 from the membership list Sep 16 14:43:31 MDA1PFP-S01
>>>>> attrd[13128]:  notice: Purged 1 peers with
>>>>> id=2 and/or uname=MDA1PFP-PCS02 from the membership cache Sep 16
>>>>> 14:43:31 MDA1PFP-S01 stonith-ng[13125]:  notice:
>>>>> crm_update_peer_proc: Node MDA1PFP-PCS02[2] - state is now lost (was
>>>>> member) Sep 16 14:43:31 MDA1PFP-S01 stonith-ng[13125]:  notice:
>>>>> Removing
>>>>> MDA1PFP-PCS02/2 from the membership list Sep 16 14:43:31 MDA1PFP-S01
>>>>> stonith-ng[13125]:  notice: Purged 1 peers with id=2 and/or
>>>>> uname=MDA1PFP-PCS02 from the membership cache Sep 16 14:43:31
>>>>> MDA1PFP-S01 cib[13124]:  notice: crm_update_peer_proc:
>>>>> Node MDA1PFP-PCS02[2] - state is now lost (was member) Sep 16 14:43:31
>>>>> MDA1PFP-S01 cib[13124]:  notice: Removing
>>>>> MDA1PFP-PCS02/2 from the membership list Sep 16 14:43:31 MDA1PFP-S01
>>>>> cib[13124]:  notice: Purged 1 peers with
>>>>> id=2 and/or uname=MDA1PFP-PCS02 from the membership cache Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]: warning: FSA: Input I_ELECTION_DC
>>>>> from do_election_check() received in state S_INTEGRATION Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Notifications disabled
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:   error: pcmkRegisterNode:
>>>>> Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE Sep 16
>>>>> 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: On loss of CCM
>>>>> Quorum: Ignore
>>>>> Sep 16 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: Demote  drbd1:0
>>>>> (Master -> Slave MDA1PFP-PCS01)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: Calculated
>>>>> Transition 0: /var/lib/pacemaker/pengine/pe-input-414.bz2
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Initiating action 55:
>>>>> notify drbd1_pre_notify_demote_0 on MDA1PFP-PCS01 (local) Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=43, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Initiating action 8:
>>>>> demote drbd1_demote_0 on MDA1PFP-PCS01 (local) Sep 16 14:43:31
>>>>> MDA1PFP-S01 systemd-udevd: error: /dev/drbd1: Wrong medium type Sep 16
>>>>> 14:43:31 MDA1PFP-S01 kernel: block drbd1: role( Primary -> Secondary )
>>>>> Sep 16 14:43:31 MDA1PFP-S01 kernel: block drbd1: bitmap WRITE of 0
>>>>> pages took 0 jiffies Sep 16 14:43:31 MDA1PFP-S01 kernel: block drbd1:
>>>>> 0 KB (0 bits) marked out-of-sync by on disk bit-map.
>>>>> Sep 16 14:43:31 MDA1PFP-S01 systemd-udevd: error: /dev/drbd1: Wrong
>>>>> medium type
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:   error: pcmkRegisterNode:
>>>>> Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_demote_0: ok (node=MDA1PFP-PCS01, call=44, rc=0, cib-update=49,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Initiating action 56:
>>>>> notify drbd1_post_notify_demote_0 on MDA1PFP-PCS01 (local) Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=45, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Initiating action 10:
>>>>> monitor drbd1_monitor_60000 on MDA1PFP-PCS01 (local) Sep 16 14:43:31
>>>>> MDA1PFP-S01 corosync[13019]: [TOTEM ] A new membership
>>>>> (192.168.121.10:988) was formed. Members left: 2 Sep 16 14:43:31
>>>>> MDA1PFP-S01 corosync[13019]: [QUORUM] Members[1]: 1 Sep 16 14:43:31
>>>>> MDA1PFP-S01 corosync[13019]: [MAIN  ] Completed service
>>>>> synchronization, ready to provide service.
>>>>> Sep 16 14:43:31 MDA1PFP-S01 pacemakerd[13113]:  notice:
>>>>> crm_reap_unseen_nodes: Node MDA1PFP-PCS02[2] - state is now lost (was
>>>>> member)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: crm_reap_unseen_nodes:
>>>>> Node MDA1PFP-PCS02[2] - state is now lost (was member) Sep 16 14:43:31
>>>>> MDA1PFP-S01 crmd[13130]: warning: No match for shutdown action on 2
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Stonith/shutdown of
>>>>> MDA1PFP-PCS02 not matched
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Transition aborted:
>>>>> Node failure (source=peer_update_callback:252, 0)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:   error: pcmkRegisterNode:
>>>>> Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Transition 0 (Complete=10,
>>>>> Pending=0, Fired=0, Skipped=0, Incomplete=0,
>>>>> Source=/var/lib/pacemaker/pengine/pe-input-414.bz2): Complete Sep 16
>>>>> 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: On loss of CCM
>>>>> Quorum: Ignore
>>>>> Sep 16 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: Calculated
>>>>> Transition 1: /var/lib/pacemaker/pengine/pe-input-415.bz2
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Transition 1
>>>>> (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
>>>>> Source=/var/lib/pacemaker/pengine/pe-input-415.bz2): Complete Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: State transition
>>>>> S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
>>>>> cause=C_FSA_INTERNAL origin=notify_crmd ] Sep 16 14:48:48 MDA1PFP-S01
>>>>> chronyd[909]: Source 62.116.162.126 replaced with 46.182.19.75
>>>>>
>>>>> Any help appreciated,
>>>>>   Jens
>>>>>
>>>>>

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
