On Thu, May 16, 2013 at 08:05:39PM +0000, Wilson, Christopher (IT) wrote:
> I have a heartbeat 2.1.3-1 cluster and it was running fine until a recent 
> network outage. Since then one node has been getting errors such as

You do realize that there is heartbeat 3 and pacemaker?

> heartbeat: [3824]: ERROR: Message hist queue is filling up (500 messages in 
> queue)

I don't think this ^^^ message has anything to do with
those "missing sockets" below.

> I have looked through other mailing lists on the internet and have found that 
> it most likely stems from missing sockets in /var/run/heartbeat (notably 
> /var/run/heartbeat/register)
> I have uninstalled the rpm and re-installed it, rebooted the machine and run 
> an strace on the heartbeat process to no avail.
> It appears that heartbeat does not try to create the socket files if they are 
> missing.
> 
> Could someone help me understand which component of heartbeat is responsible 
> for creating socket files?

Heartbeat (the core process itself) creates those sockets.
It does not (in that version, anyway) create the *directory*
/var/run/heartbeat.
So if you have /var/run on tmpfs or similar, the directory is gone after
every reboot, and you need to put a mkdir in your init script.
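
A minimal sketch of what that could look like near the top of the init
script's start action. The function name ensure_rundir and the 0755 mode are
assumptions for illustration, not taken from the heartbeat package; check
which ownership and permissions your installation expects.

```shell
# Hypothetical helper: create the runtime directory if it is missing.
# Name and mode are assumptions, not part of the heartbeat package.
ensure_rundir() {
    dir="$1"
    # Nothing to do if the directory is already there.
    [ -d "$dir" ] && return 0
    # -p also creates missing parents; -m sets the mode of the final directory.
    mkdir -p -m 0755 "$dir"
}

# In the init script, before heartbeat is started, one would call:
#   ensure_rundir /var/run/heartbeat || exit 1
```

The existence check keeps the call harmless on systems where /var/run is
persistent and the directory already exists.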

heartbeat 3 has that covered, btw.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
