Re: [ClusterLabs] STONITH not communicated back to initiator until token expires

2017-04-26 Thread Chris Walker
Just to close the loop on this issue, discussions with Red Hat have
confirmed that this behavior is as designed: all membership changes
must first be realized by the Corosync layer.  So the full trajectory
of a STONITH action in response to, for example, a failed stop
operation looks like:

1. crmd requests STONITH
2. stonith-ng successfully STONITHs the node
3. the Corosync token expires and Corosync declares the node lost
4. Corosync communicates the membership change to stonith-ng
5. stonith-ng communicates the successful STONITH back to crmd
6. the cluster reacts to the down node
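
Step 3 above is where the delay in the subject line comes from: nothing
moves until Corosync's token timeout expires.  For anyone tuning this,
the relevant knob is the token value in the totem section of
corosync.conf; the numbers below are purely illustrative, not a
recommendation:

    totem {
        version: 2
        # milliseconds to wait for the token before declaring a node lost;
        # a large value here directly lengthens the gap between a
        # successful STONITH and crmd hearing the result
        token: 10000
    }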

Thanks,
Chris

On Wed, Apr 5, 2017 at 5:07 PM, Chris Walker wrote:
> Thanks very much for your reply Ken.  Unfortunately, the same delay
> happens when the DC is not the node that's being STONITHed. In either
> case, the failure looks the same to me: the stonithd instance that
> does the STONITH operation does not pass back the result to the
> original stonithd, so remote_op_done can't be invoked to send the
> result to the original initiator (in this case, crmd).
>
> Sorry, this problem is probably too in-depth for the mailing list.
> I've created RH ticket 01812422 for this issue (seems stuck in L1/L2
> support at the moment :( )
>
> Thanks again,
> Chris
>
>
>
>> On Tue, Apr 4, 2017 at 12:47 PM, Ken Gaillot wrote:
>> On 03/13/2017 10:43 PM, Chris Walker wrote:
>>> Thanks for your reply Digimer.
>>>
>>> On Mon, Mar 13, 2017 at 1:35 PM, Digimer wrote:
>>>
>>> On 13/03/17 12:07 PM, Chris Walker wrote:
>>> > Hello,
>>> >
>>> > On our two-node EL7 cluster (pacemaker: 1.1.15-11.el7_3.4; corosync:
>>> > 2.4.0-4; libqb: 1.0-1),
>>> > it looks like successful STONITH operations are not communicated from
>>> > stonith-ng back to the initiator (in this case, crmd) until the
>>> > STONITHed node is removed from the cluster when Corosync notices
>>> > that it's gone (i.e., after the token timeout).
>>>
>>> Others might have more useful info, but my understanding of a lost node
>>> sequence is this:
>>>
>>> 1. Node stops responding, corosync declares it lost after token timeout
>>> 2. Corosync reforms the cluster with remaining node(s), checks if it is
>>> quorate (always true in 2-node)
>>> 3. Corosync informs Pacemaker of the membership change.
>>> 4. Pacemaker invokes stonith, waits for the fence agent to return
>>> "success" (exit code of the agent as per the FenceAgentAPI
>>> [https://docs.pagure.org/ClusterLabs.fence-agents/FenceAgentAPI.md]).
>>> If the method fails, it moves on to the next method. If all methods
>>> fail, it goes back to the first method and tries again, looping
>>> indefinitely.
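
As a concrete reference for the exit-code contract in step 4, here is a
minimal, purely illustrative sketch of a fence agent following the
FenceAgentAPI: stonithd passes options as name=value lines on stdin
(including action=...), and success is reported solely through exit
status 0.  Real agents such as fence_ipmilan also implement metadata,
monitor, list and timeouts; the power_control() helper below is
hypothetical.

#!/usr/bin/env python3
import sys


def read_stdin_options(stream):
    """Parse the name=value option lines the fencing daemon writes to stdin."""
    options = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        options[name] = value
    return options


def power_control(address, action):
    """Placeholder for contacting a real power device at 'address' (hypothetical)."""
    return True


def main():
    options = read_stdin_options(sys.stdin)
    action = options.get("action", "reboot")

    if action in ("on", "off", "reboot"):
        ok = power_control(options.get("ip"), action)
        sys.exit(0 if ok else 1)   # 0 = success, anything else = failure
    elif action in ("status", "monitor"):
        sys.exit(0)                # 0 = on/healthy; 2 conventionally means off
    else:
        sys.exit(1)                # unknown action


if __name__ == "__main__":
    main()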
>>>
>>>
>>> That's roughly my understanding as well for the case when a node
>>> suddenly leaves the cluster (e.g., poweroff), and this case is working
>>> as expected for me.  I'm seeing delays when a node is marked for STONITH
>>> while it's still up (e.g., after a stop operation fails).  In this case,
>>> what I expect to see is something like:
>>> 1.  crmd requests that stonith-ng fence the node
>>> 2.  stonith-ng (might be a different stonith-ng) fences the node and
>>> sends a message that it has succeeded
>>> 3.  stonith-ng (the original from step 1) receives this message and
>>> communicates back to crmd that the node has been fenced
>>>
>>> but what I'm seeing is
>>> 1.  crmd requests that stonith-ng fence the node
>>> 2.  stonith-ng fences the node and sends a message saying that it has
>>> succeeded
>>> 3.  nobody hears this message
>>> 4.  Corosync eventually realizes that the fenced node is no longer part
>>> of the config and broadcasts a config change
>>> 5.  stonith-ng finishes the STONITH operation that was started earlier
>>> and communicates back to crmd that the node has been STONITHed
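
For what it's worth, one way to see when stonith-ng itself considers the
operation finished, independent of what crmd has heard, is the fencing
history kept by stonithd; assuming the stock Pacemaker CLI tools,
something like the following lists the fence operations recorded for a
target (bug1 here is one of the nodes in this cluster):

    stonith_admin --history bug1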
>>
>> In your attached log, bug1 was DC at the time of the fencing, and bug0
>> takes over DC after the fencing. This is what I expect is happening
>> (logs from bug1 would help confirm):
>>
>> 1. crmd on the DC (bug1) runs pengine which sees the stop failure and
>> schedules fencing (of bug1)
>>
>> 2. stonithd on bug1 sends a query to all nodes asking who can fence bug1
>>
>> 3. Each node replies, and stonithd on bug1 chooses bug0 to execute the
>> fencing
>>
>> 4. stonithd on bug0 fences bug1. At this point, it would normally report
>> the result to the DC ... but that happens to be bug1.
>>
>> 5. Once crmd on bug0 takes over DC, it can decide that the fencing
>> succeeded, but it can't take over DC until it sees that the old DC is
>> gone, which takes a while because of your long token timeout. So, this
>> is where the delay is coming in.
>>
>> I'll have to think about whether we can improve this, but I don't think
>> it would be easy. There are complications if for example a fencing
>> topology is used, such that the result being reported in step 4 might
>> not be the entire result.
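
Fencing topology, for readers who haven't used it, registers multiple
fencing levels per target (for example IPMI first, with a switched PDU
as a fallback), so the result from a single device is not necessarily
the final verdict.  A hedged sketch of such a setup with pcs, with
made-up device names:

    pcs stonith level add 1 bug1 ipmi-bug1
    pcs stonith level add 2 bug1 pdu-bug1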
>>
>>> I'm less convinced that the sending of the STON

[ClusterLabs] IPaddr2 cloning inside containers

2017-04-26 Thread Ken Gaillot
FYI, I stumbled across a report of a suspected kernel issue breaking
iptables clusterip inside containers:

https://github.com/lxc/lxd/issues/2773

ocf:heartbeat:IPaddr2 uses clusterip when cloned. I'm guessing no one's
tried something like that yet, but this is a note of caution to anyone
thinking about it.
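
For context, the configuration that would exercise this code path is a
globally-unique IPaddr2 clone, roughly as in the Clusters from Scratch
walkthrough; the address, netmask and clone counts below are
placeholders only:

    pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
        ip=192.168.122.120 cidr_netmask=24 clusterip_hash=sourceip \
        op monitor interval=30s
    pcs resource clone ClusterIP clone-max=2 clone-node-max=2 \
        globally-unique=true

As far as I know, it is the globally-unique=true clone that makes the
agent fall back to the iptables CLUSTERIP target.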

Pacemaker's new bundle feature doesn't support cloning the IPs it
creates, but that might be an interesting future feature if this issue
is resolved.
-- 
Ken Gaillot 

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Problem with clone ClusterIP

2017-04-26 Thread Ken Gaillot
On 04/26/2017 02:45 AM, Bratislav Petkovic wrote:
> Thank you,
>
> We use Cisco Nexus 7000 switches, which support multicast MAC.
> It is possible that something is not configured correctly.
> In this environment, IBM PowerHA SystemMirror 7.1 (which uses multicast)
> works without problems.
>
> Regards,
>
> Bratislav

I believe SystemMirror uses multicast IP, which is at a higher level
than multicast Ethernet. Multicast Ethernet is much less commonly seen,
so it's often disabled.



Re: [ClusterLabs] Problem with clone ClusterIP

2017-04-26 Thread Bratislav Petkovic
Thank you,

We use Cisco Nexus 7000 switches, which support multicast MAC.
It is possible that something is not configured correctly.
In this environment, IBM PowerHA SystemMirror 7.1 (which uses multicast)
works without problems.

Regards,

Bratislav

DISCLAIMER

Information in this email is intended for the recipients addressed in the 
email. It may contain confidential information. If you receive this email in 
error, please delete it. If you are not addressed as the recipient, please have 
in mind that copying and disclosure of the email or using the information 
contained in the email is prohibited and may be unlawful. AIK BANKA AD BEOGRAD 
neither endorses nor is responsible for any opinion, recommendation, 
conclusion, demand, offer and/or agreement contained in this email.
