On Wed, Jan 23, 2013 at 12:53 AM, E-Blokos <in...@e-blokos.com> wrote:
>
> HI,
>
> on Fedora 17 corosync pacemaker version 1.1.7 (fedora update)
> all nodes quit corosync pacemaker after a while
>
> [root@node140 ~]# systemctl status corosync
> corosync.service - Corosync Cluster Engine
>   Loaded: loaded (/usr/lib/systemd/system/corosync.service; enabled)
>   Active: failed (Result: exit-code) since Tue, 22 Jan 2013 08:42:47 -0500; 5min ago
>   Process: 13152 ExecStop=/usr/share/corosync/corosync stop (code=exited, status=0/SUCCESS)
>   Process: 26754 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)
>   Main PID: 1442 (code=dumped, signal=BUS)

Looks like corosync crashed (the main PID shows code=dumped, signal=BUS) and took pacemaker with it. Hard to say why without the backtrace :(

>   CGroup: name=systemd:/system/corosync.service
>
> Jan 22 08:42:47 node140 corosync[26754]: [62B blob data]
> Jan 22 08:42:47 node140 corosync[26761]: [SERV ] Unloading all Corosync service engines.
> Jan 22 08:42:47 node140 corosync[26761]: [QB ] withdrawing server sockets
> Jan 22 08:42:47 node140 corosync[26761]: [SERV ] Service engine unloaded: corosync vote quorum service v1.0
> Jan 22 08:42:47 node140 corosync[26761]: [QB ] withdrawing server sockets
> Jan 22 08:42:47 node140 corosync[26761]: [SERV ] Service engine unloaded: corosync configuration map access
> Jan 22 08:42:47 node140 corosync[26761]: [QB ] withdrawing server sockets
> Jan 22 08:42:47 node140 corosync[26761]: [SERV ] Service engine unloaded: corosync configuration service
> Jan 22 08:42:47 node140 corosync[26761]: [QB ] withdrawing server sockets
> Jan 22 08:42:47 node140 corosync[26761]: [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01
>
> in log:
>
> Jan 22 08:42:23 node140 pacemakerd[26540]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
> Jan 22 08:42:23 node140 pacemakerd[26540]: Could not initialize Cluster Configuration Database API instance, error 2
> Jan 22 08:42:23 node140 systemd[1]: pacemaker.service: main process exited, code=exited, status=1
> Jan 22 08:42:23 node140 systemd[1]: Unit pacemaker.service entered failed state.
> Jan 22 08:42:23 node140 systemd[1]: pacemaker.service holdoff time over, scheduling restart.
>
> permission problems ? if yes is cores/root must be other than hacluster.root ?
>
> Thanks
>
> Franck
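
For anyone checking several nodes for the same symptom: the crash diagnosis rests on the "Main PID: 1442 (code=dumped, signal=BUS)" line in the quoted status output. Below is a minimal sketch of pulling that line out of saved `systemctl status` text. The helper name `main_pid_exit` and the regular expression are assumptions made for illustration; they are not part of corosync, pacemaker, or systemd, and the status format may differ between systemd versions.

```python
import re

# Hypothetical helper, not part of corosync, pacemaker, or systemd.
# Matches lines such as "Main PID: 1442 (code=dumped, signal=BUS)".
MAIN_PID_RE = re.compile(
    r"Main PID:\s*(?P<pid>\d+)\s*"
    r"\(code=(?P<code>\w+),\s*(?P<kind>signal|status)=(?P<value>[^)]+)\)"
)


def main_pid_exit(status_text):
    """Describe how the main process ended, or return None if no such line exists."""
    m = MAIN_PID_RE.search(status_text)
    if m is None:
        return None
    return {
        "pid": int(m.group("pid")),
        "code": m.group("code"),          # e.g. "exited", "killed", "dumped"
        "kind": m.group("kind"),          # "signal" or "status"
        "value": m.group("value"),        # e.g. "BUS" or "1/FAILURE"
        "dumped_core": m.group("code") == "dumped",
    }


# Sample text copied from the status output quoted in this thread.
sample = """\
Process: 26754 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)
Main PID: 1442 (code=dumped, signal=BUS)
"""

info = main_pid_exit(sample)
print(info)
```

When `dumped_core` is true, a core file may exist to take a backtrace from; where it ends up depends on the node's core dump configuration, so that step is not sketched here.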
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org