On Thu, Jan 19, 2012 at 02:18:53PM +0000, Efrat Lefeber wrote:
> Hi,
>
> I am using linux-ha heartbeat on a two simple nodes cluster.
> For some reason which I can't figure out, the socket
> /var/run/heartbeat/register is not created though the directory
> /var/run/heartbeat/ exist:
>
> ll /var/run/heartbeat/
> total 24
> drwxr-x---  6 hacluster haclient 4096 2012-01-19 14:30 .
> drwxr-xr-x 16 root      root     4096 2012-01-19 14:30 ..
> drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 ccm
> drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 crm
> drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 dopd
> drwxr-xr-t  2 root      root     4096 2012-01-19 14:30 rsctmp
>
> /etc/init.d/heartbeat status
> heartbeat OK [pid 14685 et al] is running on vs-158 [vs-158]...
>
> cl_status hbstatus
> Heartbeat is stopped on this machine.
>
> I ran cl_status with strace and I saw this error:
> connect(3, {sa_family=AF_FILE, path="/var/run/heartbeat/register"...}, 110) = -1 ENOENT (No such file or directory)
>
> Who created this socket?

That's one of the first things the heartbeat binary does when it starts.
If it cannot create that socket, heartbeat will not even start up.

Of course, in theory someone may remove that socket after it was created.
If so, make sure that does not happen again ;)

> How can I find out why isn't the socket created?

Where did you get your packages/binaries?
Double check your build?
lsof -n -p your heartbeat master control process?

> Is there a workaround I can do to create the socket?

Fix your installation.

> This problem doesn't happen all the time. I have another node with the
> same configuration and the socket was created there.

Same packages and build?

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
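
A rough sketch of the checks suggested above, assuming a POSIX shell and that lsof is installed. The function names socket_state and unix_sockets_of are made up for illustration and are not part of heartbeat; the pid in the usage comment is the one from the quoted status output and may not be the master control process.

```shell
#!/bin/sh
# Sketch only, not part of heartbeat.

# Classify the path where the heartbeat API socket is expected.
# Prints one of: socket, not-a-socket, missing.
socket_state() {
    path=$1
    if [ -S "$path" ]; then
        echo "socket"
    elif [ -e "$path" ]; then
        echo "not-a-socket"
    else
        echo "missing"
    fi
}

# List the unix-domain sockets a given process holds open.
# -n skips host name lookups, -a ANDs the selections,
# -p selects the pid, -U selects unix-domain sockets.
unix_sockets_of() {
    lsof -n -a -p "$1" -U
}

# Example use:
#   socket_state /var/run/heartbeat/register
#   unix_sockets_of 14685
```

If socket_state reports "missing" while the running process still lists a unix socket for that path, that would be consistent with the file having been removed after heartbeat created it. Treat that as a hint to look for whatever cleans /var/run, not as proof.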