Re: [ClusterLabs] reproducible split brain
On Wed, Mar 16, 2016, at 04:00 PM, Digimer wrote:
> On 16/03/16 03:59 PM, Christopher Harvey wrote:
> > I am able to create a split brain situation in corosync 1.1.13 using
> > iptables in a 3 node cluster.
> >
> > I have 3 nodes: vmr-132-3, vmr-132-4, and vmr-132-5.
> >
> > All nodes are operational and form a 3 node cluster, with all nodes
> > members of that ring.
> > vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > vmr-132-4 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > vmr-132-5 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > So far so good.
> >
> > Running the following on vmr-132-4 drops all incoming (but not outgoing)
> > packets from vmr-132-3:
> > # iptables -I INPUT -s 192.168.132.3 -j DROP
> > # iptables -L
> > Chain INPUT (policy ACCEPT)
> > target     prot opt source           destination
> > DROP       all  --  192.168.132.3    anywhere
> >
> > Chain FORWARD (policy ACCEPT)
> > target     prot opt source           destination
> >
> > Chain OUTPUT (policy ACCEPT)
> > target     prot opt source           destination
> >
> > vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > vmr-132-4 ---> Online: [ vmr-132-4 vmr-132-5 ]
> > vmr-132-5 ---> Online: [ vmr-132-4 vmr-132-5 ]
> >
> > vmr-132-3 thinks everything is normal and continues to provide service;
> > vmr-132-4 and 5 form a new ring, achieve quorum, and provide the same
> > service. Splitting the link between 3 and 4 in both directions isolates
> > vmr 3 from the rest of the cluster and everything fails over normally,
> > so only a unidirectional failure causes problems.
> >
> > I don't have stonith enabled right now, and I looked over the
> > pacemaker.log file closely to see if 4 and 5 would normally have fenced
> > 3, but I didn't see any fencing or stonith logs.
> >
> > Would stonith solve this problem, or does this look like a bug?
>
> It should; that is its job.

Is there some log I can enable that would say "ERROR: hey, I would use
stonith here, but you have it disabled! Your warranty is void past this
point! Do not pass go, do not file a bug"?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
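There is no such dedicated log message, but a rough sketch of what can be checked today (assuming a pcs-managed cluster; `crm_verify` ships with pacemaker itself):

```shell
# See whether stonith is enabled at all (defaults included):
pcs property list --all | grep stonith-enabled

# With stonith-enabled=true but no fence devices configured, a live
# CIB check does complain and resource start-up is disabled:
crm_verify --live-check -V
```

With stonith-enabled=false, however, the checks above stay silent, which is exactly the gap being complained about here.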
Re: [ClusterLabs] reproducible split brain
On 16/03/16 04:04 PM, Christopher Harvey wrote:
> On Wed, Mar 16, 2016, at 04:00 PM, Digimer wrote:
>> On 16/03/16 03:59 PM, Christopher Harvey wrote:
>>> [...]
>>> Would stonith solve this problem, or does this look like a bug?
>>
>> It should, that is its job.
>
> Is there some log I can enable that would say "ERROR: hey, I would use
> stonith here, but you have it disabled! your warranty is void past this
> point! do not pass go, do not file a bug"?

If I had it my way, that would be printed to STDOUT when you start
pacemaker without stonith...

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
Re: [ClusterLabs] PCS, Corosync, Pacemaker, and Bind (Ken Gaillot)
I guess I have to say "never mind!" I don't know what the problem was
yesterday, but it loads just fine today, even when the named config and
the virtual IP don't match! But for your edamacation, ifconfig does NOT
show the address, although ip addr does:

ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:8b:d0:f7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.30.36/26 brd 192.168.30.63 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.30.38/26 brd 192.168.30.63 scope global secondary eth0
       valid_lft forever preferred_lft forever

ifconfig -a eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.30.36  netmask 255.255.255.192  broadcast 192.168.30.63
        ether 00:50:56:8b:d0:f7  txqueuelen 1000  (Ethernet)
        RX packets 1169357  bytes 163514247 (155.9 MiB)
        RX errors 0  dropped 4639  overruns 0  frame 0
        TX packets 1351613  bytes 210957790 (201.1 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
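This is expected net-tools behavior, not a cluster problem: ifconfig only lists IPv4 addresses that carry an interface label (alias), while `ip addr` shows everything. A sketch, re-using the thread's address for illustration (must be run as root):

```shell
# Added without a label: visible to `ip addr`, invisible to ifconfig.
ip addr add 192.168.30.38/26 dev eth0

# Added with a label: ifconfig will show it as alias interface eth0:0.
ip addr add 192.168.30.38/26 dev eth0 label eth0:0
```

The IPaddr2 resource agent adds addresses the first way by default, which is why `ip addr` is the reliable tool for checking what the cluster actually configured.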
Re: [ClusterLabs] Cluster failover failure with Unresolved dependency
On 03/16/2016 05:49 AM, Lorand Kelemen wrote:
> Dear Ken,
>
> Thanks for the reply! I lowered migration-threshold to 1 and rearranged
> constraints like you suggested:
>
> Location Constraints:
> Ordering Constraints:
>   promote mail-clone then start fs-services (kind:Mandatory)
>   promote spool-clone then start fs-services (kind:Mandatory)
>   start fs-services then start network-services (kind:Mandatory)

Certainly not a big deal, but I would change the above constraint to
start fs-services then start mail-services. The IP doesn't care whether
the filesystems are up yet or not, but postfix does.

>   start network-services then start mail-services (kind:Mandatory)
> Colocation Constraints:
>   fs-services with spool-clone (score:INFINITY) (rsc-role:Started)
>     (with-rsc-role:Master)
>   fs-services with mail-clone (score:INFINITY) (rsc-role:Started)
>     (with-rsc-role:Master)
>   network-services with mail-services (score:INFINITY)
>   mail-services with fs-services (score:INFINITY)
>
> Now virtualip and postfix become stopped. I guess these are the relevant
> lines, but I also attach the full logs:
>
> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>   native_color: Resource postfix cannot run anywhere
> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info:
>   native_color: Resource virtualip-1 cannot run anywhere
>
> Interesting, will try to play around with ordering - colocation, the
> solution must be in these settings...
>
> Best regards,
> Lorand
>
> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>   cib_perform_op: Diff: --- 0.215.7 2
> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>   cib_perform_op: Diff: +++ 0.215.8 (null)
> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>   cib_perform_op: + /cib: @num_updates=8
> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>   cib_perform_op: ++
>   /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']:
>   operation_key="postfix_monitor_45000" operation="monitor"
>   crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>   transition-key="86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>   transition-magic="0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>   on_node="mail1" call-id="1333" rc-code="7"
> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>   abort_transition_graph: Transition aborted by postfix_monitor_45000
>   'create' on mail1: Inactive graph
>   (magic=0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, cib=0.215.8,
>   source=process_graph_event:598, 1)
> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>   update_failcount: Updating failcount for postfix on mail1 after failed
>   monitor: rc=7 (update=value++, time=1458124686)

I don't think your constraints are causing problems now; the above message
indicates that the postfix resource failed. Postfix may not be able to run
anywhere because it's already failed on both nodes, and the IP would be
down because it has to be colocated with postfix, and postfix can't run.

The rc=7 above indicates that the postfix agent's monitor operation
returned 7, which is "not running". I'd check the logs for postfix errors.
> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info:
>   process_graph_event: Detected action (2962.86)
>   postfix_monitor_45000.1333=not running: failed
> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>   attrd_client_update: Expanded fail-count-postfix=value++ to 1
> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>   cib_process_request: Completed cib_modify operation for section status:
>   OK (rc=0, origin=mail1/crmd/253, version=0.215.8)
> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>   attrd_peer_update: Setting fail-count-postfix[mail1]: (null) -> 1 from
>   mail2
> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: notice:
>   do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [
>   input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>   write_attribute: Sent update 406 with 2 changes for
>   fail-count-postfix, id=, set=(null)
> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>   attrd_peer_update: Setting last-failure-postfix[mail1]: 1458124291 ->
>   1458124686 from mail2
> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info:
>   write_attribute: Sent update 407 with 2 changes for
>   last-failure-postfix, id=, set=(null)
> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>   cib_process_request: Forwarding cib_modify operation for section status
>   to master (origin=local/attrd/406)
> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info:
>   cib_process_request: Forwarding cib_modify operation for section status
>   to master (origin=local/attrd/407)
> Mar 16
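Ken's diagnosis and suggestion can both be acted on from the command line. A sketch using pcs (resource and constraint names are taken from the thread; the constraint ID shown is hypothetical — list the real IDs with `pcs constraint --full` first):

```shell
# Inspect per-node failure counters for postfix, and clear them once
# the underlying postfix error has been fixed:
pcs resource failcount show postfix
pcs resource cleanup postfix

# Ken's suggested ordering change: remove the old constraint by its
# real ID, then order postfix (mail-services) after the filesystems:
pcs constraint remove order-fs-services-network-services-mandatory
pcs constraint order start fs-services then start mail-services
```

With migration-threshold=1, a single monitor failure (rc=7) is enough to ban the resource from a node, so after failures on both nodes "cannot run anywhere" follows until the counters are cleared.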
Re: [ClusterLabs] Antw: Re: reproducible split brain
On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote:
> On 03/17/2016 05:10 PM, Christopher Harvey wrote:
>> If I ignore pacemaker's existence and just run corosync, corosync
>> disagrees about node membership in the situation presented in the first
>> email. While it's true that stonith just happens to quickly correct the
>> situation after it occurs, it still smells like a bug in the case where
>> corosync is used in isolation. Corosync is, after all, a membership and
>> total ordering protocol, and the nodes in the cluster are unable to
>> agree on membership.
>>
>> The Totem protocol specifies a ring_id in the token passed in a ring.
>> Since all of the 3 nodes but one have formed a new ring with a new id,
>> how is it that the single node can survive in a ring with no other
>> members, passing a token with the old ring_id?
>>
>> Are there network failure situations that can fool the Totem membership
>> protocol, or is this an implementation problem? I don't see how it could
>> not be one or the other, and it's bad either way.
>
> Neither, really. In a split brain situation, there simply is not enough
> information for any protocol or implementation to reliably decide what
> to do. That's what fencing is meant to solve -- it provides the
> information that certain nodes are definitely not active.
>
> There's no way for either side of the split to know whether the opposite
> side is down, or merely unable to communicate properly. If the latter,
> it's possible that they are still accessing shared resources, which,
> without proper communication, can lead to serious problems (e.g. data
> corruption of a shared volume).

The totem protocol is silent on the topic of fencing and resources, much
the way TCP is. Please explain to me what needs to be fenced in a cluster
without resources, where membership and total message ordering are the
only concern. If fencing were a requirement for membership and ordering,
wouldn't stonith be part of corosync and not pacemaker?
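The 2-vs-1 outcome in the original report is just simple-majority quorum arithmetic; a toy shell sketch of it (illustration only, not corosync/votequorum code):

```shell
#!/bin/sh
# Toy illustration of simple-majority quorum in a 3-node cluster.
total=3
quorum=$(( total / 2 + 1 ))   # strictly more than half: 2 of 3

for seen in 1 2 3; do
    if [ "$seen" -ge "$quorum" ]; then
        echo "partition seeing $seen node(s): quorate"
    else
        echo "partition seeing $seen node(s): inquorate"
    fi
done
```

Under the unidirectional drop, vmr-132-4 and vmr-132-5 see two members (quorate), while vmr-132-3 still believes it sees all three, so both sides consider themselves quorate — which is why fencing, not quorum alone, is needed to break the tie.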
Re: [ClusterLabs] Pacemaker startup-fencing
Andrei Borzenkov writes:

> On Wed, Mar 16, 2016 at 4:18 PM, Lars Ellenberg wrote:
>> On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote:
>>>>> And some more about fencing:
>>>>> 3. What's the difference in cluster behavior between
>>>>>    - stonith-enabled=FALSE (9.3.2: how often will the stop operation
>>>>>      be retried?)
>>>>>    - having no configured STONITH devices (resources won't be
>>>>>      started, right?)
>>>>>    - failing to STONITH with some error (on every node)
>>>>>    - timing out the STONITH operation
>>>>>    - manual fencing
>>>>
>>>> I do not think there is much difference. Without fencing pacemaker
>>>> cannot make the decision to relocate resources, so the cluster will
>>>> be stuck.
>>>
>>> Then I wonder why I hear the "must have working fencing if you value
>>> your data" mantra so often (and always without explanation). After all,
>>> it does not risk the data, only the automatic cluster recovery, right?
>>
>> stonith-enabled=false
>> means:
>> if some node becomes unresponsive,
>> it is immediately *assumed* it was "clean" dead.
>> no fencing takes place,
>> resource takeover happens without further protection.
>
> Oh! Actually it is not quite clear from documentation; documentation
> does not explain what happens in case of stonith-enabled=false at all.

Yes, this is a crucially important piece of information, which should be
prominently announced in the documentation. Thanks for spelling it out,
Lars. Hope you don't mind that I turned your text into
https://github.com/ClusterLabs/pacemaker/pull/960.
-- 
Feri
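For reference, the behavior Lars describes hangs on a single cluster property. A minimal sketch of toggling it (pcs and crmsh syntax, assuming one of those shells manages the cluster):

```shell
# Unresponsive nodes are *assumed* cleanly dead; takeover happens
# without protection (the dangerous setting discussed above):
pcs property set stonith-enabled=false

# The safe default: recovery waits for successful fencing.
pcs property set stonith-enabled=true

# crmsh equivalent:
crm configure property stonith-enabled=false
```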
Re: [ClusterLabs] booth release v1.0
On 18/03/16 09:25 AM, Dejan Muhamedagic wrote:
> Hello everybody,
>
> I'm happy to announce that the booth repository was yesterday
> tagged as v1.0:
>
> https://github.com/ClusterLabs/booth/releases/tag/v1.0
>
> There were very few patches since the v1.0 rc1. The complete
> list of changes is available in the ChangeLog:
>
> https://github.com/ClusterLabs/booth/blob/v1.0/ChangeLog
>
> Binaries are provided for some Linux distributions. Currently,
> there are packages for CentOS 7 and various openSUSE versions:
>
> http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/
>
> If you don't know what booth is and what it is good for, please
> check the README at the bottom of the git repository home page:
>
> https://github.com/ClusterLabs/booth
>
> Cheers,
>
> Dejan

Hey hey, congratulations!!

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?