Your message dated Sun, 12 Jan 2020 11:04:29 +0100
with message-id <[email protected]>
and subject line Re: crmd: number of connections to pengine socket increasing,
exhausting max_open_files after some time
has caused the Debian Bug report #722339,
regarding crmd: number of connections to pengine socket increasing, exhausting
max_open_files after some time
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case, it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)
--
722339: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=722339
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: pacemaker
Version: 1.1.7-1
Severity: important
Tags: upstream
Dear Maintainer,
we are using pacemaker/corosync to build our clusters. In addition,
we use Puppet to manage our systems.
Since we switched from squeeze to wheezy, we have found a serious problem
in the crmd process.
The problem can be triggered like this:
* requirement: a working pacemaker/corosync cluster
1. Login to all nodes of the cluster
2. run 'watch -n1 "lsof -p `pidof crmd` | grep socket | wc -l"' on all
nodes to see the number of open sockets of the crmd process
3. choose one node of the cluster and save the configuration with 'crm
configure save /tmp/config.bak'
4. update the configuration with the saved file: 'crm configure load
update /tmp/config.bak'
5. The number of open sockets on at least one node should now increase
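For reference, the per-process file-descriptor count watched in step 2 can also be read straight from /proc without lsof. A minimal sketch; it uses $$ (this shell's own PID) purely as a stand-in for `pidof crmd` so it runs anywhere on Linux:

```shell
#!/bin/sh
# Count open file descriptors of a process via /proc (Linux only).
# Substitute $(pidof crmd) for $$ on a real cluster node.
pid=$$
fd_count=$(ls "/proc/$pid/fd" | wc -l)
echo "process $pid has $fd_count open fds"
```

Wrapped in `watch -n1`, this gives the same growing-count view as the lsof invocation in step 2, with a narrower filter possible via `ls -l /proc/$pid/fd | grep -c socket`.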
The problem is that Puppet triggers the last command on every run (in
our case every 10 minutes). The number of sockets keeps increasing
until max_open_files is reached (usually 1024). After that, the cluster
behaves unexpectedly; in our case, it lost all of its resources until
the next Puppet run. I tested the above on a squeeze installation and
the problem did not appear. We found the same problem on Red Hat systems
using the same pacemaker version (1.1.7).
As a workaround, we disabled the pacemaker module in Puppet. But I think
this is a critical problem, since a pacemaker cluster that should
provide high availability can itself cause serious downtime.
Regards,
Christian
-- System Information:
Debian Release: 7.1
APT prefers stable
APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.2.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages pacemaker depends on:
ii adduser 3.113+nmu3
ii corosync 1.4.2-3
ii libbz2-1.0 1.0.6-4
ii libc6 2.13-38
ii libcfg4 1.4.2-3
ii libcib1 1.1.7-1
ii libconfdb4 1.4.2-3
ii libcoroipcc4 1.4.2-3
ii libcpg4 1.4.2-3
ii libcrmcluster1 1.1.7-1
ii libcrmcommon2 1.1.7-1
ii libesmtp6 1.0.6-1+b1
ii libglib2.0-0 2.33.12+really2.32.4-5
ii libgnutls26 2.12.20-7
ii liblrm2 1.0.9+hg2665-1
ii libltdl7 2.4.2-1.1
ii libncurses5 5.9-10
ii libpam0g 1.1.3-7.1
ii libpe-rules2 1.1.7-1
ii libpe-status3 1.1.7-1
ii libpengine3 1.1.7-1
ii libpils2 1.0.9+hg2665-1
ii libplumb2 1.0.9+hg2665-1
ii libsnmp15 5.4.3~dfsg-2.7
ii libssl1.0.0 1.0.1e-2
ii libstonithd1 1.1.7-1
ii libtinfo5 5.9-10
ii libtransitioner1 1.1.7-1
ii libuuid1 2.20.1-5.3
ii libxml2 2.8.0+dfsg1-7+nmu1
ii libxslt1.1 1.1.26-14.1
ii python 2.7.3-4
ii python2.7 2.7.3-6
ii resource-agents 1:3.9.2-5+deb7u1
pacemaker recommends no packages.
pacemaker suggests no packages.
-- no debconf information
--- End Message ---
--- Begin Message ---
Hi Christian,
I failed to reproduce this with Pacemaker 1.1.16 (from jessie
backports). Please note that the crm command isn't part of the
pacemaker package; it comes from the crmsh package, so I tested with
cibadmin --query and cibadmin --replace instead. Since this is a very
old report (sorry for neglecting it this long), I'm closing it in the
hope that it has been fixed in the meantime, wherever the bug was. If it
still affects you on stretch or buster, please open a new report with
current information.
--
Thanks,
Feri
--- End Message ---