On 03/09/19 20:15 +0300, Andrei Borzenkov wrote:
> 03.09.2019 11:09, Marco Marino wrote:
>> Hi, I have a problem with fencing on a two node cluster. It seems that
>> randomly the cluster cannot complete monitor operation for fence devices.
>> In log I see:
>> crmd[8206]: error: Result of monitor operation for fence-node2 on
>> ld2.mydomain.it: Timed Out
>
> Can you actually access IP addresses of your IPMI ports?

[ Tangentially, an interesting aspect beyond that, applicable to any
  cross-host reference that is not a plain IP address and one I haven't
  seen mentioned anywhere so far, is the risk of DNS resolution (when
  /etc/hosts falls short) running into trouble: stale records, a blocked
  port, DNS server overload (DNSSEC, etc.), parallel IPv4/IPv6 records
  that the software cannot handle gracefully, and so on. In any case,
  a single DNS server would be an undesirable SPOF, and it would be
  unfortunate to be unable to fence a node because of that. I think the
  most robust approach is to use IP addresses whenever possible, and
  unambiguous records in /etc/hosts when practical. ]

>> As attachment there is
>> - /var/log/messages for node1 (only the important part)
>> - /var/log/messages for node2 (only the important part) <-- Problem starts
>> here
>> - pcs status
>> - pcs stonith show (for both fence devices)
>>
>> I think it could be a timeout problem, so how can I see timeout value for
>> monitor operation in stonith devices?
>> Please, someone can help me with this problem?
>> Furthermore, how can I fix the state of fence devices without downtime?

-- 
Jan (Poki)
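
To illustrate the DNS aside above, here is a minimal sketch (Python, standard library only) of how one might audit the addresses configured for fence devices and flag those that depend on name resolution. The function names and the sample addresses are hypothetical and not taken from the original poster's configuration; reading the addresses out of the cluster configuration is deliberately left out.

```python
import ipaddress
import socket


def is_ip_literal(address):
    """Return True if `address` is a literal IPv4/IPv6 address (no lookup needed)."""
    try:
        ipaddress.ip_address(address)
        return True
    except ValueError:
        return False


def resolved_families(hostname):
    """Return the set of address families a hostname resolves to.

    This performs a real lookup through the system resolver
    (/etc/hosts, DNS, ... depending on local configuration).
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return set()
    return {getattr(info[0], "name", str(info[0])) for info in infos}


def audit(addresses):
    """Classify each address; only non-literal entries depend on resolution."""
    report = {}
    for addr in addresses:
        if is_ip_literal(addr):
            report[addr] = "ip-literal"
            continue
        families = resolved_families(addr)
        if not families:
            report[addr] = "unresolvable"
        else:
            # Both AF_INET and AF_INET6 here means parallel records exist.
            report[addr] = "hostname:" + ",".join(sorted(families))
    return report
```

Calling `audit(["192.0.2.10", "ipmi-node2.example.com"])` (both placeholder values) would mark the first entry as an IP literal and attempt a lookup only for the second; any entry not reported as `ip-literal` is one whose fencing depends on the resolver working at the moment it is needed.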
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/