Re: [ClusterLabs] Pacemaker crash and fencing failure

Brian Campbell Sat, 21 Nov 2015 08:34:42 -0800

On Sat, Nov 21, 2015 at 1:50 AM, Andrei Borzenkov <arvidj...@gmail.com> wrote:
> 21.11.2015 03:38, Brian Campbell wrote:
>>
>>
>> What I'm concerned about is the initial failure of crmd on master1
>> that led to master2 deciding to fence it, and then master2's failure
>> to fence master1 and thus getting stuck and not being able to manage
>> resources. It seems to have simply stopped doing anything, with no
>> logs indicating why it did so.
>>
>
> That's actually normal. If fencing is required but could not be performed,
> the cluster is stuck - no further actions can be completed in this state. So
> the root cause here seems to be the unsuccessful fencing.


Yes, that part I expect. The problem I'm having is that there's no
indication of why fencing was unsuccessful. We had previously tested
fencing and it was working; in fact, we see fencing working later on
in the logs: after someone manually reboots master1, it is seen as
unclean and successfully fenced.

So, the problem is that fencing failed to work without anything logged
about why, so it's hard to figure out what needs to be fixed to make
it more reliable in the future.

-- Brian

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
