>
>
> So if this is really the reason it would probably be worth
> finding out what is really happening.
>
Thanks. Yes, I think this is really the reason. I fixed it one week ago and it
hasn't happened again.
On 07/12/2017 05:16 PM, Cesar Hernandez wrote:
>
>> On 6 Jul 2017, at 17:34, Ken Gaillot wrote:
>>
>> On 07/06/2017 10:27 AM, Cesar Hernandez wrote:
It looks like a bug when the fenced node rejoins quickly enough that it
is a member again before its fencing confirmation has been sent.
> On 6 Jul 2017, at 17:34, Ken Gaillot wrote:
>
> On 07/06/2017 10:27 AM, Cesar Hernandez wrote:
>>
>>>
>>> It looks like a bug when the fenced node rejoins quickly enough that it
>>> is a member again before its fencing confirmation has been sent. I know
>>> there
>>>
>>
>> Could it be caused by node 2 being rebooted and back up before the stonith
>> script has finished?
>
> That *shouldn't* cause any problems, but I'm not sure what's happening
> in this case.
Maybe that is the cause of it... My other server installations had a slow stonith
device and
On 07/06/2017 04:48 PM, Ken Gaillot wrote:
> On 07/06/2017 09:26 AM, Klaus Wenninger wrote:
>> On 07/06/2017 04:20 PM, Cesar Hernandez wrote:
If node2 is getting the notification of its own fencing, it wasn't
successfully fenced. Successful fencing would render it incapacitated
On 07/06/2017 09:26 AM, Klaus Wenninger wrote:
> On 07/06/2017 04:20 PM, Cesar Hernandez wrote:
>>> If node2 is getting the notification of its own fencing, it wasn't
>>> successfully fenced. Successful fencing would render it incapacitated
>>> (powered down, or at least cut off from the network
On 07/06/2017 04:20 PM, Cesar Hernandez wrote:
>> If node2 is getting the notification of its own fencing, it wasn't
>> successfully fenced. Successful fencing would render it incapacitated
>> (powered down, or at least cut off from the network and any shared
>> resources).
>
> Maybe I don't
>
> If node2 is getting the notification of its own fencing, it wasn't
> successfully fenced. Successful fencing would render it incapacitated
> (powered down, or at least cut off from the network and any shared
> resources).
Maybe I don't understand you, or maybe you don't understand me... ;)
On 07/06/2017 08:54 AM, Cesar Hernandez wrote:
>
>>
>> So, the above log means that node1 decided that node2 needed to be
>> fenced, requested fencing of node2, and received a successful result for
>> the fencing, and yet node2 was not killed.
>>
>> Your fence agent should not return success
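To make the point above concrete, here is a minimal, hypothetical sketch of the
exit-code discipline a reboot-style fence agent is expected to follow: report
success to the cluster only after the target has verifiably gone down (the node
name, timeout and verification method are placeholders, not taken from this thread):

    #!/bin/sh
    # Hypothetical sketch only: exit 0 (success) solely when the target is
    # confirmed down; any doubt must become a non-zero exit.
    NODE="$1"

    # ... issue the actual power-off / reboot here; exit non-zero if that step fails ...

    # Give the node a moment to go down, then verify it is really unreachable.
    sleep 10
    if ping -c 3 -W 2 "$NODE" >/dev/null 2>&1; then
        exit 1      # node still answers: do NOT claim it was fenced
    fi
    exit 0          # only now may the cluster treat the node as safely fenced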
On 07/04/2017 08:28 AM, Cesar Hernandez wrote:
>
>>
>> Agreed, I don't think it's multicast vs unicast.
>>
>> I can't see from this what's going wrong. Possibly node1 is trying to
>> re-fence node2 when it comes back. Check that the fencing resources are
>> configured correctly, and check whether
>
>
>>>
>>> But you definitely shouldn't have a fencing-agent that claims to have fenced
>>> a node if it is not sure - rather the other way round if in doubt.
>>
>>
>
> True! Which is why I mentioned it to be dangerous.
> But your fencing-agent is even more dangerous ;-)
>
>
Well.. my
On 07/05/2017 04:50 PM, Cesar Hernandez wrote:
>> Not a good idea probably - and the reason for what you are experiencing ;-)
>> If you have problems starting the nodes within a certain time-window
>> disabling startup-fencing might be an option to consider although dangerous.
>> But you
> Not a good idea probably - and the reason for what you are experiencing ;-)
> If you have problems starting the nodes within a certain time-window
> disabling startup-fencing might be an option to consider although dangerous.
> But you definitely shouldn't have a fencing-agent that claims to have fenced
> a node if it is not sure - rather the other way round if in doubt.
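For reference, the startup-fencing option Klaus mentions is a cluster property;
a hedged example of how it would typically be inspected or (dangerously) disabled
with pacemaker's crm_attribute tool, shown only to illustrate the option, not to
recommend it:

    # Check the current value (the default is true).
    crm_attribute --type crm_config --name startup-fencing --query

    # Disable fencing of nodes whose state is unknown at cluster startup.
    crm_attribute --type crm_config --name startup-fencing --update false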
On 07/05/2017 04:22 PM, Cesar Hernandez wrote:
>
>> Are you logging which ones went OK and which failed.
>> The script returns negatively if both go wrong?
> The script always returns OK
Not a good idea probably - and the reason for what you are experiencing ;-)
If you have problems starting the nodes within a certain time-window
disabling startup-fencing might be an option to consider although dangerous.
> Are you logging which ones went OK and which failed.
> The script returns negatively if both go wrong?
The script always returns OK
On 07/05/2017 08:50 AM, Cesar Hernandez wrote:
>> Might be kind of a strange race as well ... but without knowing what the
>> script actually does ...
>>
> The script first tries to reboot the node using ssh, something like ssh $NODE
> reboot -f, then runs a remote reboot using the AWS API
Are you logging which ones went OK and which failed.
The script returns negatively if both go wrong?
> Might be kind of a strange race as well ... but without knowing what the
> script actually does ...
>
The script first tries to reboot the node using ssh, something like ssh $NODE
reboot -f, then runs a remote reboot using the AWS API
Thanks
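Below is a hedged sketch of the approach described above (ssh reboot first, then a
reboot through the AWS API), reworked so that failures propagate instead of the
script always returning OK; the node name, instance id and timeouts are
placeholders, since the real script was not posted:

    #!/bin/sh
    # Hypothetical fence script: ssh reboot first, AWS API as fallback,
    # success reported only once the node actually stops responding.
    NODE="$1"           # cluster node to fence (placeholder)
    INSTANCE_ID="$2"    # matching EC2 instance id (placeholder)

    # 1. Try a soft reboot over ssh; a short timeout keeps a dead node
    #    from hanging the agent.
    ssh -o ConnectTimeout=5 "$NODE" 'reboot -f' >/dev/null 2>&1

    # 2. Independently ask AWS to reboot the instance as well.
    aws ec2 reboot-instances --instance-ids "$INSTANCE_ID" >/dev/null 2>&1

    # 3. Report success only once the node has really gone away.
    for i in 1 2 3 4 5 6; do
        if ! ping -c 1 -W 2 "$NODE" >/dev/null 2>&1; then
            exit 0      # node unreachable: fencing can be confirmed
        fi
        sleep 5
    done
    exit 1              # still reachable: let pacemaker treat the fencing as failed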
On 07/04/2017 04:52 PM, Cesar Hernandez wrote:
>> The first line is the consequence of the 2nd.
>> And the 1st says that node2 just has seen some fencing-resource
>> positively reporting to have fenced himself - which
>> is why crmd is exiting in a way that it is not respawned
>> by pacemakerd.
>
> The first line is the consequence of the 2nd.
> And the 1st says that node2 just has seen some fencing-resource
> positively reporting to have fenced himself - which
> is why crmd is exiting in a way that it is not respawned
> by pacemakerd.
Thanks. But my script has a logfile; I've checked
On 07/04/2017 03:28 PM, Cesar Hernandez wrote:
>> Agreed, I don't think it's multicast vs unicast.
>>
>> I can't see from this what's going wrong. Possibly node1 is trying to
>> re-fence node2 when it comes back. Check that the fencing resources are
>> configured correctly, and check whether node1
>
> Agreed, I don't think it's multicast vs unicast.
>
> I can't see from this what's going wrong. Possibly node1 is trying to
> re-fence node2 when it comes back. Check that the fencing resources are
> configured correctly, and check whether node1 sees the first fencing
> succeed.
Thanks.
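For the check suggested above, pacemaker's stonith_admin can show what result
node1 recorded for the fencing of node2; two hedged examples (the log location
varies by distribution):

    # Fencing events pacemaker knows about for node2, as seen from node1.
    stonith_admin --history node2 --verbose

    # Cross-check the same events in the cluster log (the path may instead be
    # /var/log/syslog or /var/log/messages on other setups).
    grep -iE 'stonith|fence' /var/log/corosync/corosync.log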
On 07/03/2017 02:34 AM, Cesar Hernandez wrote:
> Hi
>
> I have installed a pacemaker cluster with two nodes. The same type of
> installation has been done many times and the following error never
> appeared before. The situation is the following:
>
> both nodes running cluster services
>
Hi
I have installed a pacemaker cluster with two nodes. The same type of
installation has been done many times and the following error never appeared
before. The situation is the following:
both nodes running cluster services
stop pacemaker on node 1
stop pacemaker on node 2
start corosync
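The same sequence as shell commands, assuming pacemaker and corosync run as
systemd services (the original message is cut off after "start corosync"):

    systemctl stop pacemaker     # on node 1
    systemctl stop pacemaker     # on node 2
    systemctl start corosync     # remainder of the sequence is not shown in the snippet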