Hi Dejan, and thanks for the super quick response.

> Please upgrade to 1.0.3. Not sure, but those versions you have
> may have a bad bug.

I didn't want to make too many changes before having some people look at it. I agree there are some new pacemaker/cluster-glue revisions, though.

> > corosync.x86_64 1.2.0-1.el5
> > corosynclib.x86_64 1.2.0-1.el5
> > heartbeat.x86_64 3.0.1-1.el5
> > heartbeat-libs.x86_64 3.0.1-1.el5
>
> You don't need both heartbeat and corosync.

I think this comes from the RPM dependencies. If I try to remove heartbeat using 'yum', it also wants to remove pacemaker. I am making sure that heartbeat doesn't start, though. Only corosync is configured to start at system boot.

> Anything in logs? Or is that the log attached?

The attached logs (messages.2) show what happened just before and right after the freeze. The last log entry is at 17:26:59. The freeze lasts until 17:41:46. During that time, we should at a minimum have logs for drbd monitoring (crm_attribute...).

> Feb 4 17:41:54 nfs2a lrmd: [3072]: info: RA output: (res_drbd:1:start:stderr) 0 : Failure: (124) Device
> is attached to a disk (use detach first)
> Feb 4 17:41:54 nfs2a lrmd: [3072]: info: RA output: (res_drbd:1:start:stderr) Command 'drbdsetup 0 disk
> /dev/sdb /dev/sdb internal --set-defaults --create-device --fencing=resource-only
> --on-io-error=detach' terminated with exit code 10
> Feb 4 17:41:54 nfs2a drbd[3243]: ERROR: nfs: Called drbdadm -c /etc/drbd.conf --peer nfs2b.test.local
> up nfs
> Feb 4 17:41:54 nfs2a drbd[3243]: ERROR: nfs: Exit code 1
>
> That's what I could find in the logs.

This happens after the freeze and the manual reboot. I am not sure why I get this error, but once the other node came back up, everything worked fine again.

Thanks again,

- Patrick -

_______________________________________________
Pacemaker mailing list
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
