On 03/09/19 20:15 +0300, Andrei Borzenkov wrote:
> 03.09.2019 11:09, Marco Marino wrote:
>> Hi, I have a problem with fencing on a two node cluster. It seems that
>> randomly the cluster cannot complete monitor operation for fence devices.
>> In log I see:
>> crmd[8206]:   error: Result of monitor operation for fence-node2 on
>> ld2.mydomain.it: Timed Out
> 
> Can you actually access IP addresses of your IPMI ports?

[
Tangentially, there is an aspect that goes beyond that question and
applies whenever hosts refer to each other by name rather than by IP
address, and which I haven't seen mentioned anywhere so far: the risk
of DNS resolution (needed when /etc/hosts falls short) running into
trouble (stale records, blocked port, DNS server overload [DNSSEC,
etc.], parallel IPv4/IPv6 records that the SW cannot handle
gracefully, etc.).  In any case, a single DNS server would apparently
be an undesirable SPOF, and it would be unfortunate to be unable to
fence a node because of that.

I think the most robust approach is to use IP addresses whenever
possible, and unambiguous records in /etc/hosts when practical.
]
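
To make the point above a bit more concrete, here is a minimal Python
sketch of how one might audit the address configured for a fence device.
It only tells an IP literal from a name that needs resolution, and flags
names that resolve to both IPv4 and IPv6.  The function names, the report
layout and the example addresses are made up for illustration; nothing
here is part of Pacemaker or the fence agents.  Note that
socket.getaddrinfo() goes through the system resolver, so it cannot tell
whether an answer came from /etc/hosts or from DNS.

```python
import ipaddress
import socket


def is_ip_literal(address):
    """Return True if address is an IPv4/IPv6 literal (no resolver needed)."""
    try:
        ipaddress.ip_address(address)
        return True
    except ValueError:
        return False


def audit_fence_address(address, resolver=socket.getaddrinfo):
    """Describe how a (hypothetical) fence device address would be reached.

    The resolver argument is injectable so the check can be exercised
    without touching the network; by default the system resolver is used.
    """
    report = {"address": address, "kind": None,
              "families": set(), "warnings": []}

    if is_ip_literal(address):
        # Nothing to resolve, hence no dependency on DNS or /etc/hosts.
        report["kind"] = "ip-literal"
        return report

    report["kind"] = "hostname"
    try:
        results = resolver(address, None)
    except socket.gaierror as exc:
        report["warnings"].append("resolution failed: %s" % exc)
        return report

    # Each result is (family, type, proto, canonname, sockaddr).
    for family, _type, _proto, _canon, _sockaddr in results:
        if family == socket.AF_INET:
            report["families"].add("IPv4")
        elif family == socket.AF_INET6:
            report["families"].add("IPv6")

    if len(report["families"]) > 1:
        report["warnings"].append("both IPv4 and IPv6 records present")
    return report
```

One would call audit_fence_address() with whatever address the fence
device is configured with and review the "warnings" list; an empty list
for a hostname only means this particular lookup succeeded at that moment,
not that the resolver will be reachable when fencing is actually needed.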

>> Attached are:
>> - /var/log/messages for node1 (only the important part)
>> - /var/log/messages for node2 (only the important part) <-- Problem starts
>> here
>> - pcs status
>> - pcs stonith show (for both fence devices)
>> 
>> I think it could be a timeout problem, so how can I see the timeout value
>> for the monitor operation of the stonith devices?
>> Can someone please help me with this problem?
>> Furthermore, how can I fix the state of the fence devices without downtime?

-- 
Jan (Poki)
