On 5/29/21 12:21 AM, Strahil Nikolov wrote:
I agree -> fencing is mandatory.
Agreed that with a proper fencing setup the cluster
wouldn't have run into that state.
Still, it would be interesting to find out what
happened. I'm not seeing anything in the log snippet either.
Assuming you are running something systemd-based,
did you check the journal for pacemaker to see what
systemd is thinking?
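A minimal sketch of that journal check, assuming the unit is named
pacemaker (as with the standard unit file) and Python 3.7+ is at hand.
The function names are made up for illustration, and any times you pass
in should be the window around the event on your node:

```python
import subprocess
from typing import List, Optional


def journal_command(unit: str = "pacemaker",
                    since: Optional[str] = None,
                    until: Optional[str] = None) -> List[str]:
    """Build a journalctl call for one systemd unit, optionally time-bounded."""
    cmd = ["journalctl", "--no-pager", "-u", unit]
    if since:
        cmd += ["--since", since]
    if until:
        cmd += ["--until", until]
    return cmd


def read_journal(**kwargs) -> str:
    """Run journalctl and return its output (needs read access to the journal)."""
    result = subprocess.run(journal_command(**kwargs),
                            capture_output=True, text=True, check=False)
    return result.stdout
```

Calling read_journal(since=..., until=...) with the time the node
disappeared and the time it came back should show whether systemd saw
pacemaker exit, and with which status.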
With the standard unit file, systemd should monitor
pacemakerd and restart it if it goes away ungracefully.
You should be able to test this behavior by sending a
SIGKILL to pacemakerd.
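A rough sketch of such a test, to be run as root on a test node only.
It assumes pgrep and systemctl are available and that the unit is named
pacemaker; the helper names and the 5 second wait are arbitrary choices
for illustration, not anything pacemaker prescribes:

```python
import os
import signal
import subprocess
import time
from typing import Callable, List, Optional


def run(cmd: List[str]) -> str:
    """Run a command and return its stripped stdout ("" if there is none)."""
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=False).stdout.strip()


def pacemakerd_pid(runner: Callable[[List[str]], str] = run) -> Optional[int]:
    """Return the PID of pacemakerd, or None if it is not running."""
    out = runner(["pgrep", "-x", "pacemakerd"])
    return int(out.split()[0]) if out else None


def kill_and_watch(wait_s: float = 5.0, runner=run,
                   kill=os.kill, sleep=time.sleep) -> dict:
    """SIGKILL pacemakerd, wait, then report whether a new one came up."""
    old_pid = pacemakerd_pid(runner)
    if old_pid is None:
        # Nothing to kill; just report what systemd thinks of the unit.
        return {"killed": False, "old_pid": None, "new_pid": None,
                "restarted": False,
                "unit_state": runner(["systemctl", "is-active", "pacemaker"])}
    kill(old_pid, signal.SIGKILL)
    sleep(wait_s)  # give systemd time to notice and restart the unit
    new_pid = pacemakerd_pid(runner)
    return {"killed": True, "old_pid": old_pid, "new_pid": new_pid,
            "restarted": new_pid is not None and new_pid != old_pid,
            "unit_state": runner(["systemctl", "is-active", "pacemaker"])}
```

If kill_and_watch() reports restarted as False, the journal for the
unit is the next place to look.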
pacemakerd in turn watches for signals from the
sub-daemons it has spawned (I'm currently working
on more in-depth observation here).
So a daemon just disappearing shouldn't happen that easily.
Did you find any core dumps?
Regards,
Klaus
You can enable the debug logs by editing corosync.conf or
/etc/sysconfig/pacemaker.
In case a simple reload doesn't work, you can put the cluster into global
maintenance mode, then stop and start the stack.
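As a sketch only, the maintenance-then-restart sequence could look like
the plan below for a single node. It assumes the stack is managed through
the systemd units pacemaker and corosync and that crm_attribute is used to
set the maintenance-mode cluster property; if you use pcs or crmsh, their
equivalents apply instead. The function names are made up for illustration:

```python
import subprocess
from typing import Callable, List


def restart_plan() -> List[List[str]]:
    """Ordered commands to bounce the stack on one node while resources are unmanaged."""
    maintenance = ["crm_attribute", "--type", "crm_config",
                   "--name", "maintenance-mode", "--update"]
    return [
        maintenance + ["true"],              # cluster stops managing resources
        ["systemctl", "stop", "pacemaker"],  # pacemaker goes down first
        ["systemctl", "stop", "corosync"],
        ["systemctl", "start", "corosync"],  # corosync comes up first
        ["systemctl", "start", "pacemaker"],
        # Assumption: only leave maintenance once the cluster status looks sane.
        maintenance + ["false"],
    ]


def execute(plan: List[List[str]],
            runner: Callable[..., object] = subprocess.run) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in plan:
        runner(cmd, check=True)
```

In practice you would probably run the last step by hand after checking
the cluster status, rather than letting a script do it blindly.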
Best Regards,
Strahil Nikolov
On Fri, May 28, 2021 at 22:13, Digimer <li...@alteeve.ca> wrote:
On 2021-05-28 3:08 p.m., Eric Robinson wrote:
>
>> -----Original Message-----
>> From: Digimer <li...@alteeve.ca>
>> Sent: Friday, May 28, 2021 12:43 PM
>> To: Cluster Labs - All topics related to open-source clustering welcomed
>> <users@clusterlabs.org>; Eric Robinson <eric.robin...@psmnv.com>; Strahil
>> Nikolov <hunter86...@yahoo.com>
>> Subject: Re: [ClusterLabs] Cluster Stopped, No Messages?
>>
>> Shared storage is not what triggers the need for fencing. Coordinating actions
>> is what triggers the need. Specifically: if you can run a resource on both/all
>> nodes at the same time, you don't need HA. If you can't, you need fencing.
>>
>> Digimer
>
> Thanks. That said, there is no fencing, so any thoughts on why
> the node behaved the way it did?
Without fencing, when a communication or membership issue arises, it's
hard to predict what will happen.
I don't see anything in the short log snippet to indicate what happened.
What was logged on the other node during the event? When did the node
disappear and when did it rejoin? That would help find the relevant log entries.
Going forward, if you want predictable and reliable operation, implement
fencing asap. Fencing is required.
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/