On Fri, 2018-09-21 at 13:34 +0530, Prasad Nagaraj wrote:
> Hi -
> 
> Yesterday I noticed that when I tried to execute the 'crm node standby'
> command on one of my cluster nodes, it failed with:
> 
> "Error performing operation: Communication error on send. Return code
> is 70"
> 
> My corosync logs had these entries during that time:
> 
> Sep 20 22:14:54 [4454] vm5c336912f1 crmd: notice: throttle_handle_load: High CPU load detected: 1.850000
> Sep 20 22:14:57 [4449] vm5c336912f1 cib: info: cib_process_ping: Reporting our current digest to vmb546073338: 8fe67fcfcd20515c246c225a124a8902 for 0.481.2 (0x2742230 0)
> Sep 20 22:15:09 [4449] vm5c336912f1 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to master (origin=local/crm_attribute/4)
> Sep 20 22:15:24 [4454] vm5c336912f1 crmd: notice: throttle_handle_load: High CPU load detected: 1.640000
> Sep 20 22:15:54 [4454] vm5c336912f1 crmd: info: throttle_handle_load: Moderate CPU load detected: 0.990000
> Sep 20 22:15:54 [4454] vm5c336912f1 crmd: info: throttle_send_command: New throttle mode: 0010 (was 0100)
> Sep 20 22:16:24 [4454] vm5c336912f1 crmd: info: throttle_send_command: New throttle mode: 0001 (was 0010)
> Sep 20 22:16:54 [4454] vm5c336912f1 crmd: info: throttle_send_command: New throttle mode: 0000 (was 0001)
> Sep 20 22:17:09 [4449] vm5c336912f1 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to master (origin=local/crm_attribute/4)
> Sep 20 22:19:10 [4449] vm5c336912f1 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to master (origin=local/crm_attribute/4)
> Sep 20 22:23:08 [4449] vm5c336912f1 cib: info: cib_perform_op: Diff: --- 0.481.2 2
> Sep 20 22:23:08 [4449] vm5c336912f1 cib: info: cib_perform_op: Diff: +++ 0.482.0 9bacc862b8713430c81ea91694942a41
> Sep 20 22:23:08 [4449] vm5c336912f1 cib: info: cib_perform_op: + /cib: @epoch=482, @num_updates=0
> 
> Is the above behavior due to Pacemaker thinking the cluster is highly
> loaded and trying to throttle the execution of commands? What is the
> best way to resolve or work around such problems? We do have high I/O
> load on our cluster, which hosts a MySQL database.
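As far as I know, the throttle messages above come from the crmd reading
/proc/loadavg and comparing it, scaled by the node's core count and the
load-threshold option, against a few internal thresholds. A minimal way
to look at the same numbers yourself (the log path below is an
assumption; adjust it to wherever corosync logs on your system):

    cat /proc/loadavg   # load averages the crmd evaluates
    nproc               # core count the thresholds are scaled by
    # recent throttle decisions, if corosync logs to this file
    grep throttle_ /var/log/cluster/corosync.log | tail -n 20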
Throttling is a natural way to handle occasional high load and is not a
problem in itself. I wouldn't expect a load of 1.85 to make a big
difference, so I wouldn't worry about that unless other load-related
problems emerge.

The error message you reported sounds more like a networking issue than
a load issue. Are you seeing any network issues around that time? In
particular, corosync retransmits or token timeouts would be significant.

> Also, from the thread
> https://lists.clusterlabs.org/pipermail/users/2017-May/005702.html
> it was asked:
> 
> > There is not much detail about "load-threshold".
> 
> Can someone please share the steps or commands to modify
> "load-threshold"? Could someone advise whether this is the way to
> control the throttling of cluster operations and how to set this
> parameter?
> 
> Thanks in advance,
> Prasad
-- 
Ken Gaillot <kgail...@redhat.com>
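Regarding the two follow-ups above (checking for corosync retransmits or
token timeouts, and changing load-threshold), a rough sketch of the
commands involved follows. The log path is an assumption, and whether
you use crm_attribute or pcs depends on what your distribution ships:

    # look for retransmit or token warnings around the failure time
    # (path is an assumption; some setups log via syslog/journald instead)
    grep -Ei 'retransmit|token' /var/log/cluster/corosync.log

    # dump corosync's runtime totem settings, including the token timeout
    corosync-cmapctl | grep totem.token

    # load-threshold is a cluster-wide property (default 80%); raising it
    # makes the crmd tolerate more load before throttling, e.g.:
    crm_attribute --type crm_config --name load-threshold --update 90%
    # or, equivalently, with pcs:
    pcs property set load-threshold=90%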