Re: [ClusterLabs] reproducible split brain

2016-03-18 Thread Christopher Harvey
On Wed, Mar 16, 2016, at 04:00 PM, Digimer wrote:
> On 16/03/16 03:59 PM, Christopher Harvey wrote:
> > I am able to create a split brain situation in corosync 1.1.13 using
> > iptables in a 3 node cluster.
> > 
> > I have 3 nodes, vmr-132-3, vmr-132-4, and vmr-132-5
> > 
> > All nodes are operational and form a 3-node cluster, with all nodes
> > members of that ring.
> > vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > vmr-132-4 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > vmr-132-5 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > so far so good.
> > 
> > running the following on vmr-132-4 drops all incoming (but not outgoing)
> > packets from vmr-132-3:
> > # iptables -I INPUT -s 192.168.132.3 -j DROP
> > # iptables -L
> > Chain INPUT (policy ACCEPT)
> > target prot opt source   destination
> > DROP   all  --  192.168.132.3anywhere
> > 
> > Chain FORWARD (policy ACCEPT)
> > target prot opt source   destination
> > 
> > Chain OUTPUT (policy ACCEPT)
> > target prot opt source   destination
> > 
> > vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> > vmr-132-4 ---> Online: [ vmr-132-4 vmr-132-5 ]
> > vmr-132-5 ---> Online: [ vmr-132-4 vmr-132-5 ]
> > 
> > vmr-132-3 thinks everything is normal and continues to provide service,
> > while vmr-132-4 and 5 form a new ring, achieve quorum, and provide the
> > same service. Splitting the link between 3 and 4 in both directions
> > isolates vmr 3 from the rest of the cluster and everything fails over
> > normally, so only a unidirectional failure causes problems.
> > 
> > I don't have stonith enabled right now. I looked over the pacemaker.log
> > file closely to see if 4 and 5 would normally have fenced 3, but I didn't
> > see any fencing or stonith logs.
> > 
> > Would stonith solve this problem, or does this look like a bug?
> 
> It should, that is its job.

is there some log I can enable that would say
"ERROR: hey, I would use stonith here, but you have it disabled! your
warranty is void past this point! do not pass go, do not file a bug"?
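
In case anyone wants to reproduce this, the whole setup is just the one
iptables rule plus a membership check on each node; roughly the following
(the corosync-quorumtool call assumes votequorum, skip it otherwise):

on vmr-132-4, drop traffic coming from vmr-132-3 only:
# iptables -I INPUT -s 192.168.132.3 -j DROP

on every node, compare the membership views:
# crm_mon -1 | grep Online
# corosync-quorumtool -s

to undo the fault on vmr-132-4:
# iptables -D INPUT -s 192.168.132.3 -j DROP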

> -- 
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
> 

Re: [ClusterLabs] reproducible split brain

2016-03-18 Thread Digimer
On 16/03/16 04:04 PM, Christopher Harvey wrote:
> On Wed, Mar 16, 2016, at 04:00 PM, Digimer wrote:
>> On 16/03/16 03:59 PM, Christopher Harvey wrote:
>>> I am able to create a split brain situation in corosync 1.1.13 using
>>> iptables in a 3 node cluster.
>>>
>>> I have 3 nodes, vmr-132-3, vmr-132-4, and vmr-132-5
>>>
>>> All nodes are operational and form a 3-node cluster, with all nodes
>>> members of that ring.
>>> vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>>> vmr-132-4 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>>> vmr-132-5 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>>> so far so good.
>>>
>>> running the following on vmr-132-4 drops all incoming (but not outgoing)
>>> packets from vmr-132-3:
>>> # iptables -I INPUT -s 192.168.132.3 -j DROP
>>> # iptables -L
>>> Chain INPUT (policy ACCEPT)
>>> target prot opt source   destination
>>> DROP   all  --  192.168.132.3anywhere
>>>
>>> Chain FORWARD (policy ACCEPT)
>>> target prot opt source   destination
>>>
>>> Chain OUTPUT (policy ACCEPT)
>>> target prot opt source   destination
>>>
>>> vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>>> vmr-132-4 ---> Online: [ vmr-132-4 vmr-132-5 ]
>>> vmr-132-5 ---> Online: [ vmr-132-4 vmr-132-5 ]
>>>
>>> vmr-132-3 thinks everything is normal and continues to provide service,
>>> while vmr-132-4 and 5 form a new ring, achieve quorum, and provide the
>>> same service. Splitting the link between 3 and 4 in both directions
>>> isolates vmr 3 from the rest of the cluster and everything fails over
>>> normally, so only a unidirectional failure causes problems.
>>>
>>> I don't have stonith enabled right now. I looked over the pacemaker.log
>>> file closely to see if 4 and 5 would normally have fenced 3, but I didn't
>>> see any fencing or stonith logs.
>>>
>>> Would stonith solve this problem, or does this look like a bug?
>>
>> It should, that is its job.
> 
> is there some log I can enable that would say
> "ERROR: hey, I would use stonith here, but you have it disabled! your
> warranty is void past this point! do not pass go, do not file a bug"?

If I had it my way, that would be printed to STDOUT when you start
pacemaker without stonith...
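
In the meantime, the closest you can get is to check the property yourself
and switch it on; with pcs that is roughly (crmsh and the low-level tools
have equivalents):

# crm_attribute --type crm_config --name stonith-enabled --query
# pcs property set stonith-enabled=true

plus, of course, an actual stonith resource configured for your hardware.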

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?


Re: [ClusterLabs] PCS, Corosync, Pacemaker, and Bind (Ken Gaillot)

2016-03-18 Thread Mike Bernhardt
I guess I have to say "never mind!" I don't know what the problem was
yesterday, but it loads just fine today, even when the named config and the
virtual IP don't match! But for your education, ifconfig does NOT show the
address although ip addr does:

ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
   valid_lft forever preferred_lft forever
inet6 ::1/128 scope host 
   valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:8b:d0:f7 brd ff:ff:ff:ff:ff:ff
inet 192.168.30.36/26 brd 192.168.30.63 scope global eth0
   valid_lft forever preferred_lft forever
inet 192.168.30.38/26 brd 192.168.30.63 scope global secondary eth0
   valid_lft forever preferred_lft forever

ifconfig -a eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
inet 192.168.30.36  netmask 255.255.255.192  broadcast 192.168.30.63
ether 00:50:56:8b:d0:f7  txqueuelen 1000  (Ethernet)
RX packets 1169357  bytes 163514247 (155.9 MiB)
RX errors 0  dropped 4639  overruns 0  frame 0
TX packets 1351613  bytes 210957790 (201.1 MiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
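
From what I can tell, ifconfig only lists secondary addresses that carry an
interface label (the old eth0:0 style), and IPaddr2 doesn't add a label
unless you ask for one. If you want the VIP to show up in ifconfig as well,
the agent has an iflabel parameter; something like the following should do
it (the resource name here is just a placeholder for whatever you called
your VIP):

# pcs resource update my-virtual-ip iflabel=0

ip addr remains the authoritative view either way.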


Re: [ClusterLabs] Cluster failover failure with Unresolved dependency

2016-03-18 Thread Ken Gaillot
On 03/16/2016 05:49 AM, Lorand Kelemen wrote:
> Dear Ken,
> 
> Thanks for the reply! I lowered migration-threshold to 1 and rearranged the
> constraints as you suggested:
> 
> Location Constraints:
> Ordering Constraints:
>   promote mail-clone then start fs-services (kind:Mandatory)
>   promote spool-clone then start fs-services (kind:Mandatory)
>   start fs-services then start network-services (kind:Mandatory)

Certainly not a big deal, but I would change the above constraint to
start fs-services then start mail-services. The IP doesn't care whether
the filesystems are up yet or not, but postfix does. (Example pcs commands
are below, after your constraint list.)

>   start network-services then start mail-services (kind:Mandatory)
> Colocation Constraints:
>   fs-services with spool-clone (score:INFINITY) (rsc-role:Started)
> (with-rsc-role:Master)
>   fs-services with mail-clone (score:INFINITY) (rsc-role:Started)
> (with-rsc-role:Master)
>   network-services with mail-services (score:INFINITY)
>   mail-services with fs-services (score:INFINITY)
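
With pcs, the change suggested above would look something like this (the
constraint ID is whatever pcs constraint --full reports for the existing
fs-services -> network-services order; I'm using a placeholder for it here):

# pcs constraint --full
# pcs constraint remove <id-of-fs-services-then-network-services>
# pcs constraint order start fs-services then start mail-services
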
> 
> Now virtualip and postfix end up stopped. I guess these are the relevant
> lines, but I'm also attaching the full logs:
> 
> Mar 16 11:38:06 [7419] HWJ-626.domain.localpengine: info:
> native_color: Resource postfix cannot run anywhere
> Mar 16 11:38:06 [7419] HWJ-626.domain.localpengine: info:
> native_color: Resource virtualip-1 cannot run anywhere
> 
> Interesting, will try to play around with ordering - colocation, the
> solution must be in these settings...
> 
> Best regards,
> Lorand
> 
> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
> cib_perform_op:   Diff: --- 0.215.7 2
> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
> cib_perform_op:   Diff: +++ 0.215.8 (null)
> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
> cib_perform_op:   +  /cib:  @num_updates=8
> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
> cib_perform_op:   ++
> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']:
>   operation_key="postfix_monitor_45000" operation="monitor"
> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
> transition-key="86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
> transition-magic="0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
> on_node="mail1" call-id="1333" rc-code="7"
> Mar 16 11:38:06 [7420] HWJ-626.domain.local   crmd: info:
> abort_transition_graph:   Transition aborted by postfix_monitor_45000
> 'create' on mail1: Inactive graph
> (magic=0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, cib=0.215.8,
> source=process_graph_event:598, 1)
> Mar 16 11:38:06 [7420] HWJ-626.domain.local   crmd: info:
> update_failcount: Updating failcount for postfix on mail1 after failed
> monitor: rc=7 (update=value++, time=1458124686)

I don't think your constraints are causing problems now; the above
message indicates that the postfix resource failed. Postfix may not be
able to run anywhere because it's already failed on both nodes, and the
IP would be down because it has to be colocated with postfix, and
postfix can't run.

The rc=7 above indicates that the postfix agent's monitor operation
returned 7, which is "not running". I'd check the logs for postfix errors.
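
Once you've found and fixed the underlying postfix problem, something along
these lines will show the failure state and clear it so the resource can be
retried (pcs syntax; resource name as in your configuration):

# crm_mon -1 -f                        (current status, including fail counts)
# pcs resource failcount show postfix
# grep postfix /var/log/maillog        (or wherever your mail log lives)
# pcs resource cleanup postfix         (clears the failcount and re-probes)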

> Mar 16 11:38:06 [7420] HWJ-626.domain.local   crmd: info:
> process_graph_event:  Detected action (2962.86)
> postfix_monitor_45000.1333=not running: failed
> Mar 16 11:38:06 [7418] HWJ-626.domain.local  attrd: info:
> attrd_client_update:  Expanded fail-count-postfix=value++ to 1
> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
> cib_process_request:  Completed cib_modify operation for section status: OK
> (rc=0, origin=mail1/crmd/253, version=0.215.8)
> Mar 16 11:38:06 [7418] HWJ-626.domain.local  attrd: info:
> attrd_peer_update:Setting fail-count-postfix[mail1]: (null) -> 1 from
> mail2
> Mar 16 11:38:06 [7420] HWJ-626.domain.local   crmd:   notice:
> do_state_transition:  State transition S_IDLE -> S_POLICY_ENGINE [
> input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Mar 16 11:38:06 [7418] HWJ-626.domain.local  attrd: info:
> write_attribute:  Sent update 406 with 2 changes for
> fail-count-postfix, id=, set=(null)
> Mar 16 11:38:06 [7418] HWJ-626.domain.local  attrd: info:
> attrd_peer_update:Setting last-failure-postfix[mail1]: 1458124291 ->
> 1458124686 from mail2
> Mar 16 11:38:06 [7418] HWJ-626.domain.local  attrd: info:
> write_attribute:  Sent update 407 with 2 changes for
> last-failure-postfix, id=, set=(null)
> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
> cib_process_request:  Forwarding cib_modify operation for section status to
> master (origin=local/attrd/406)
> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
> cib_process_request:  Forwarding cib_modify operation for section status to
> master (origin=local/attrd/407)
> Mar 16 

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-18 Thread Christopher Harvey
On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote:
> On 03/17/2016 05:10 PM, Christopher Harvey wrote:
> > If I ignore pacemaker's existence, and just run corosync, corosync
> > disagrees about node membership in the situation presented in the first
> > email. While it's true that stonith just happens to quickly correct the
> > situation after it occurs it still smells like a bug in the case where
> > corosync in used in isolation. Corosync is after all a membership and
> > total ordering protocol, and the nodes in the cluster are unable to
> > agree on membership.
> > 
> > The Totem protocol specifies a ring_id in the token passed in a ring.
> > Since all but one of the 3 nodes have formed a new ring with a new id,
> > how is it that the single node can survive in a ring with no other
> > members, passing a token with the old ring_id?
> > 
> > Are there network failure situations that can fool the Totem membership
> > protocol or is this an implementation problem? I don't see how it could
> > not be one or the other, and it's bad either way.
> 
> Neither, really. In a split brain situation, there simply is not enough
> information for any protocol or implementation to reliably decide what
> to do. That's what fencing is meant to solve -- it provides the
> information that certain nodes are definitely not active.
> 
> There's no way for either side of the split to know whether the opposite
> side is down, or merely unable to communicate properly. If the latter,
> it's possible that they are still accessing shared resources, which
> without proper communication, can lead to serious problems (e.g. data
> corruption of a shared volume).

The totem protocol is silent on the topic of fencing and resources, much
the way TCP is.

Please explain to me what needs to be fenced in a cluster without
resources where membership and total message ordering are the only
concern. If fencing were a requirement for membership and ordering,
wouldn't stonith be part of corosync and not pacemaker?
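
For the record, this is how I'm comparing the membership view on each node
(corosync 2.x tooling; adjust if your tools differ):

# corosync-cfgtool -s                 (ring status as seen by this node)
# corosync-quorumtool -s              (quorum state and member list)
# corosync-cmapctl | grep members     (raw membership entries)

and the two sides report different member lists, exactly as described in my
first mail.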


Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-18 Thread Ferenc Wágner
Andrei Borzenkov  writes:

> On Wed, Mar 16, 2016 at 4:18 PM, Lars Ellenberg  
> wrote:
>
>> On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote:
>>
> And some more about fencing:
>
> 3. What's the difference in cluster behavior between
>- stonith-enabled=FALSE (9.3.2: how often will the stop operation be 
> retried?)
>- having no configured STONITH devices (resources won't be started, 
> right?)
>- failing to STONITH with some error (on every node)
>- timing out the STONITH operation
>- manual fencing

>>>> I do not think there is much difference. Without fencing, pacemaker
>>>> cannot make the decision to relocate resources, so the cluster will be stuck.
>>>
>>> Then I wonder why I hear the "must have working fencing if you value
>>> your data" mantra so often (and always without explanation).  After all,
>>> it does not risk the data, only the automatic cluster recovery, right?
>>
>> stonith-enabled=false
>> means:
>> if some node becomes unresponsive,
>> it is immediately *assumed* it was "clean" dead.
>> no fencing takes place,
>> resource takeover happens without further protection.
>
> Oh! Actually it is not quite clear from the documentation; the documentation
> does not explain what happens in case of stonith-enabled=false at all.

Yes, this is a crucially important piece of information, which should be
prominently announced in the documentation.  Thanks for spelling it out,
Lars.  Hope you don't mind that I turned your text into
https://github.com/ClusterLabs/pacemaker/pull/960.
-- 
Feri


Re: [ClusterLabs] booth release v1.0

2016-03-18 Thread Digimer
On 18/03/16 09:25 AM, Dejan Muhamedagic wrote:
> Hello everybody,
> 
> I'm happy to announce that the booth repository was yesterday
> tagged as v1.0:
> 
> https://github.com/ClusterLabs/booth/releases/tag/v1.0
> 
> There were very few patches since the v1.0 rc1. The complete
> list of changes is available in the ChangeLog:
> 
> https://github.com/ClusterLabs/booth/blob/v1.0/ChangeLog
> 
> The binaries are provided for some Linux distributions. Currently,
> there are packages for CentOS7 and various openSUSE versions:
> 
> http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/
> 
> If you don't know what booth is and what it is good for, please
> check the README at the bottom of the git repository home page:
> 
> https://github.com/ClusterLabs/booth
> 
> Cheers,
> 
> Dejan

Hey hey, congratulations!!
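
For anyone who hasn't played with it yet, the configuration is pleasantly
small; a minimal /etc/booth/booth.conf for two sites plus an arbitrator
looks roughly like this (the addresses and ticket name are placeholders):

transport = UDP
port = 9929
arbitrator = 192.168.100.200
site = 192.168.100.10
site = 192.168.100.20
ticket = "service-ticket"

See the README linked above for the ticket grant/revoke workflow from there.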

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org