Stefan,
Hello everyone!
I am using Pacemaker (1.1.12), Corosync (2.3.0) and libqb (0.16.0) in 2-node
clusters (virtualized in VMware infrastructure, OS: RHEL 6.7).
I noticed that if only one node is present, the CPU usage of Corosync (as seen
with top) is slowly but steadily increasing (over days; in my setting about 1%
per day). The node is basically idle, some Pacemaker managed resources are
running but they are not contacted by any clients.
I upgraded a test stand-alone node to Corosync 2.4.2 and libqb 1.0.1 (which
at least made the memory leak go away), but the CPU usage is still increasing
on that node.
When I add a second node to the cluster, the CPU load drops back down to a
normal (low) CPU usage.
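To make the slow climb measurable rather than eyeballing top, something like the following could log a sample per minute (a minimal sketch; `sample_cpu` is a hypothetical helper, assuming a procps-style `ps`):

```shell
#!/bin/sh
# sample_cpu PID -> print "<epoch-seconds> <cpu%>" for that process.
# The %cpu column ps reports is a lifetime average, which is enough to
# show a slow, steady increase over days.
sample_cpu() {
    printf '%s %s\n' "$(date +%s)" "$(ps -o %cpu= -p "$1" | tr -d ' ')"
}

# Example usage: append one sample per minute for corosync
# while sample_cpu "$(pidof corosync)"; do sleep 60; done >> corosync-cpu.log
```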
I haven't yet observed the increasing CPU load when two nodes were present in
a cluster.
Even if running Pacemaker/Corosync as a massive-overkill Monit replacement is
questionable, the observed CPU load is not what I would expect. What could be
the reason for this CPU-load increase? Is there a rationale behind it?
This is a really interesting observation. I can speak for corosync, and I
must say no, there is no rationale behind it. It simply shouldn't be
happening. I also don't see any reason why connecting another node (or nodes)
would bring the CPU load back down.
Is this a config thing or something in the binaries?
It's certainly not in corosync. Also, your config file looks just fine.
Could you test with a single ring only, and with udpu, to see whether the
behavior stays the same?
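A minimal sketch of such a test configuration for corosync 2.x (single ring, unicast transport; the node addresses are placeholders, not taken from the poster's setup):

```
totem {
    version: 2
    secauth: on
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 10.20.30.0
        mcastport: 5510
    }
}
nodelist {
    node { ring0_addr: 10.20.30.1 }
    node { ring0_addr: 10.20.30.2 }
}
```

With udpu the members come from the nodelist, so no mcastaddr is needed, and dropping the second interface block disables RRP for the test.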
Regards,
Honza
BR, Stefan
My corosync.conf:
# Please read the corosync.conf.5 manual page
compatibility: whitetank
aisexec {
user: root
group: root
}
totem {
version: 2
# Security configuration
secauth: on
threads: 0
# Timeout for token
token: 1000
token_retransmits_before_loss_const: 4
# Number of messages that may be sent by one processor on receipt of
# the token
max_messages: 20
# How long to wait for join messages in the membership protocol (ms)
join: 50
consensus: 1200
# Turn off the virtual synchrony filter
vsftype: none
# Stagger sending the node join messages by 1..send_join ms
send_join: 50
# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes
# Interface configuration
rrp_mode: passive
interface {
ringnumber: 0
bindnetaddr: 10.20.30.0
mcastaddr: 226.95.30.100
mcastport: 5510
}
interface {
ringnumber: 1
bindnetaddr: 10.20.31.0
mcastaddr: 226.95.31.100
mcastport: 5510
}
}
logging {
fileline: off
to_stderr: no
to_logfile: no
to_syslog: yes
syslog_facility: local3
debug: off
}
amf {
mode: disabled
}
quorum {
provider: corosync_votequorum
expected_votes: 1
}
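As an aside, a 2-node cluster is normally declared to votequorum explicitly via the two_node option rather than expected_votes: 1; a hedged sketch of the usual stanza (a suggestion, not the poster's configuration):

```
quorum {
    provider: corosync_votequorum
    two_node: 1
}
```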
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org