Hi all,

Over the past few days, I noticed that the pcsd and ruby processes are pegged at 99% CPU, and commands such as pcs status pcsd take up to 5 minutes to complete. On all active cluster nodes, top shows:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
27225 haclust+  20   0  116324  91600  23136 R  99.3  0.1   1943:40 cib
23277 root      20   0 12.868g 8.176g   8460 S  99.7 13.0 407:44.18 ruby

The system log shows a high volume of "High CIB load detected" messages over the past 2 days:

[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep "Feb 3" |wc -l
1655
[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep "Feb 2" |wc -l
1658
[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep "Feb 1" |wc -l
147
[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep "Jan 31" |wc -l
444
[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep "Jan 30" |wc -l
352

The first entries logged on Feb 2 started around 8:42am ...

Feb 2 08:42:12 zs95kj crmd[27233]: notice: High CIB load detected: 0.974333

This coincides with the time I caused a node fence (off) action by creating an iface-bridge resource that specified a non-existent vlan slave interface (reported to the group yesterday in a separate email thread). It also caused me to lose quorum in the cluster, because 2 of my 5 cluster nodes were already offline.

My cluster currently has just over 200 VirtualDomain resources to manage, plus one iface-bridge resource and one iface-vlan resource. Both of these are currently configured properly and operational.

I would appreciate some guidance on how to proceed with debugging this issue. I have not taken any recovery actions yet. I considered stopping the cluster, recycling pcsd.service on all nodes, and restarting the cluster, and also rebooting the nodes if necessary. But I didn't want to clear the condition yet, in case there is anything I can capture while the cluster is in this state.

Thanks,

Scott Greenlese ... KVM on System Z - Solutions Test, Poughkeepsie, N.Y.
INTERNET: swgre...@us.ibm.com
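
P.S. For anyone who wants the same per-day counts in one pass rather than one grep per date, here is a minimal sketch. The function name count_high_cib_load is made up for illustration, and it assumes the traditional syslog timestamp layout ("Mon DD HH:MM:SS host ...") and a log that is in chronological order.

```shell
# Hypothetical helper: tally "High CIB load detected" messages per day.
# Assumes the first two whitespace-separated fields are month and day.
count_high_cib_load() {
    logfile="${1:-/var/log/messages}"
    awk '/High CIB load detected/ {
             day = $1 " " $2
             # Remember each day the first time it appears, to keep log order.
             if (!(day in count)) order[++n] = day
             count[day]++
         }
         END {
             for (i = 1; i <= n; i++) print order[i], count[order[i]]
         }' "$logfile"
}

# Usage: count_high_cib_load /var/log/messages
```

Because awk splits on runs of whitespace, this also counts days that syslog pads with an extra space (for example "Feb  2"), which a literal grep "Feb 2" may not match.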