On Oct 4, 2016, at 3:03 PM, Digimer <li...@alteeve.ca> wrote:
>
> On 04/10/16 06:50 PM, Israel Brewster wrote:
>> On Oct 4, 2016, at 2:26 PM, Ken Gaillot <kgail...@redhat.com
>> <mailto:kgail...@redhat.com>> wrote:
>>>
>>> On 10/04/2016 11:31 AM, Israel Brewster wrote:
>>>> I sent this a week ago, but never got a response, so I'm sending it
>>>> again in the hopes that it just slipped through the cracks. It seems to
>>>> me that this should just be a simple mis-configuration on my part
>>>> causing the issue, but I suppose it could be a bug as well.
>>>>
>>>> I have two two-node clusters set up using corosync/pacemaker on CentOS
>>>> 6.8. One cluster is simply sharing an IP, while the other one has
>>>> numerous services and IP's set up between the two machines in the
>>>> cluster. Both appear to be working fine. However, I was poking around
>>>> today, and I noticed that on the single IP cluster, corosync, stonithd,
>>>> and fenced were using "significant" amounts of processing power - 25%
>>>> for corosync on the current primary node, with fenced and stonithd often
>>>> showing 1-2% (not horrible, but more than any other process). In looking
>>>> at my logs, I see that they are dumping messages like the following to
>>>> the messages log every second or two:
>>>>
>>>> Sep 27 08:51:50 fai-dbs1 stonith-ng[4851]: warning: get_xpath_object:
>>>> No match for //@st_delegate in /st-reply
>>>> Sep 27 08:51:50 fai-dbs1 stonith-ng[4851]: notice: remote_op_done:
>>>> Operation reboot of fai-dbs1 by fai-dbs2 for
>>>> stonith_admin.cman.15835@fai-dbs2.c5161517: No such device
>>>> Sep 27 08:51:50 fai-dbs1 crmd[4855]: notice: tengine_stonith_notify:
>>>> Peer fai-dbs1 was not terminated (reboot) by fai-dbs2 for fai-dbs2: No
>>>> such device (ref=c5161517-c0cc-42e5-ac11-1d55f7749b05) by client
>>>> stonith_admin.cman.15835
>>>> Sep 27 08:51:50 fai-dbs1 fence_pcmk[15393]: Requesting Pacemaker fence
>>>> fai-dbs2 (reset)
>>>
>>> The above shows that CMAN is asking pacemaker to fence a node. Even
>>> though fencing is disabled in pacemaker itself, CMAN is configured to
>>> use pacemaker for fencing (fence_pcmk).
>>
>> I never did any specific configuring of CMAN, Perhaps that's the
>> problem? I missed some configuration steps on setup? I just followed the
>> directions
>> here: 
>> http://jensd.be/156/linux/building-a-high-available-failover-cluster-with-pacemaker-corosync-pcs,
>> which disabled stonith in pacemaker via the
>> "pcs property set stonith-enabled=false" command. Is there separate CMAN
>> configs I need to do to get everything copacetic? If so, can you point
>> me to some sort of guide/tutorial for that?
>
> Disabling stonith is not possible in cman, and very ill advised in
> pacemaker. This is a mistake a lot of "tutorials" make when the author
> doesn't understand the role of fencing.
>
> In your case, pcs setup cman to use the fence_pcmk "passthrough" fence
> agent, as it should. So when something went wrong, corosync detected it,
> informed cman which then requested pacemaker to fence the peer. With
> pacemaker not having stonith configured and enabled, it could do
> nothing. So pacemaker returned that the fence failed and cman went into
> an infinite loop trying again and again to fence (as it should have).
>
> You must configure stonith (exactly how depends on your hardware), then
> enable stonith in pacemaker.
>
Gotcha. There is nothing special about the hardware, it's just two
physical boxes connected to the network. So I guess I've got a choice of
either a) live with the logging/load situation (since the system does
work perfectly as-is other than the excessive logging), or b) spend some
time researching stonith to figure out what it does and how to configure
it. Thanks for the pointers.

> -- 
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
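
For anyone weighing option (a), it may help to measure how often the failed fence loop described earlier in the thread actually fires. Below is a minimal Python sketch that counts the stonith-ng "remote_op_done" entries from the log excerpt. It is an illustration, not part of pacemaker or cman: the function name count_fence_results and the regular expression are my own, and the sketch assumes each syslog entry sits on a single line in the log file (the mail client wrapped them in the excerpt).

```python
import re
from collections import Counter

# Pattern for the stonith-ng "remote_op_done" notice quoted in the thread.
# Assumption: the field layout matches the excerpt; other pacemaker versions
# may log this differently.
REMOTE_OP_DONE = re.compile(
    r"stonith-ng\[\d+\]:\s+notice:\s+remote_op_done:\s+"
    r"Operation (?P<action>\S+) of (?P<target>\S+) by (?P<executor>\S+) "
    r"for \S+: (?P<result>.+)$"
)


def count_fence_results(lines):
    """Count remote_op_done entries, keyed by (target, action, result)."""
    counts = Counter()
    for line in lines:
        match = REMOTE_OP_DONE.search(line.rstrip("\n"))
        if match:
            key = (
                match.group("target"),
                match.group("action"),
                match.group("result").strip(),
            )
            counts[key] += 1
    return counts


# Log entries from the thread, rejoined onto single lines.
SAMPLE = [
    "Sep 27 08:51:50 fai-dbs1 stonith-ng[4851]: warning: get_xpath_object: "
    "No match for //@st_delegate in /st-reply",
    "Sep 27 08:51:50 fai-dbs1 stonith-ng[4851]: notice: remote_op_done: "
    "Operation reboot of fai-dbs1 by fai-dbs2 for "
    "stonith_admin.cman.15835@fai-dbs2.c5161517: No such device",
    "Sep 27 08:51:50 fai-dbs1 fence_pcmk[15393]: Requesting Pacemaker fence "
    "fai-dbs2 (reset)",
]

if __name__ == "__main__":
    for (target, action, result), n in count_fence_results(SAMPLE).items():
        print(f"{n} x {action} of {target}: {result}")
```

To run it against a real log, pass an open file instead of SAMPLE, for example count_fence_results(open("/var/log/messages")), adjusting the path if syslog writes elsewhere on your system.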