On Thu, May 16, 2013 at 08:05:39PM +0000, Wilson, Christopher (IT) wrote:
> I have a heartbeat 2.1.3-1 cluster and it was running fine until a recent
> network outage. Since then one node has been getting errors such as

You do realize that there is heartbeat 3 and pacemaker?

> heartbeat: [3824]: ERROR: Message hist queue is filling up (500 messages in
> queue)

I don't think this ^^^ message has anything to do with those
"missing sockets" below.

> I have looked through other mailing lists on the internet and have found that
> it most likely stems from missing sockets in /var/run/heartbeat (notably
> /var/run/heartbeat/register)
> I have uninstalled the rpm and re-installed it, rebooted the machine and run
> an strace on the heartbeat process to no avail.
> It appears that heartbeat does not try to create the socket files if they are
> missing.
>
> Could someone help me understand which component of heartbeat is responsible
> for creating socket files?

Heartbeat (the core process itself) is creating those sockets.
It does not (in that version, anyways) create the *directory*
/var/run/heartbeat.

So you need to put a mkdir in your init script,
if you have /var/run on tmpfs or similar.

heartbeat 3 has that covered, btw.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
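
For reference, the mkdir suggested in the reply could look roughly like the sketch below. It is only an illustration: the function name ensure_ha_rundir and the HA_RUNDIR variable are made up here, and the thread does not say which owner or mode the directory needs, so check that against your package before relying on it.

```shell
#!/bin/sh
# Sketch of a helper for the heartbeat init script (hypothetical names).
# The path is the one named in the thread; override HA_RUNDIR to test elsewhere.
HA_RUNDIR="${HA_RUNDIR:-/var/run/heartbeat}"

ensure_ha_rundir() {
    # With /var/run on tmpfs the directory is gone after every reboot,
    # so recreate it before heartbeat starts. mkdir -p succeeds if the
    # directory already exists.
    # Assumption: default ownership and mode are acceptable; the thread
    # does not specify them.
    mkdir -p "$HA_RUNDIR"
}

# In the init script's start action, before launching heartbeat:
#   ensure_ha_rundir || exit 1
```

The call belongs in the start action, ahead of the line that launches heartbeat, so that the core process finds the directory in place when it creates its sockets.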