Hi,

There is a flag you can use to force a reboot if the unmount is not successful.
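For reference, a minimal sketch of what such a resource definition can look like in /etc/cluster/cluster.conf (this assumes the KB article refers to rgmanager's fs resource agent; the device, mountpoint, and name values below are illustrative placeholders, not taken from the original poster's configuration):

```xml
<!-- Sketch of an fs resource entry in cluster.conf.
     self_fence="1": the node reboots itself if the unmount fails,
     so the service can relocate instead of going to "failed".
     force_unmount="1": kill processes holding the mount before
     unmounting. Device/mountpoint values are placeholders. -->
<fs name="shared_fs" device="/dev/md0" mountpoint="/export/data"
    fstype="ext3" force_unmount="1" self_fence="1"/>
```

With self_fence enabled, a failed unmount should cause the node to reboot rather than leaving the service in the "failed" state awaiting manual intervention.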
See: http://kbase.redhat.com/faq/FAQ_51_11753.shtm

Kær kveðja / Best regards,
Finnur Ö. Guðmundsson
MCP - RHCA - Linux+
System Engineer - System Operations
[EMAIL PROTECTED]
TM Software - Skyggnir
Urðarhvarf 6, IS-203 Kópavogur, Iceland
tel: +354 545 3000 - fax: +354 545 3001
www.t.is

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Jonas Helgi Palsson
Sent: Mon 7/21/2008 21:35
To: 'linux clustering'
Subject: [Linux-cluster] Node with failed service does not get fenced.

Hi

Running CentOS 5.2 with all current updates on the x86_64 platform. I have set
up a two-node cluster with the following resources in one service:

* one shared MD device (the resource is a script that assembles and stops the
  device and checks its status),
* one shared filesystem,
* one shared NFS startup script,
* one shared IP,

which are started in that order. The cluster works normally, and I can move
the service between the two nodes.

But I have observed one problematic behavior. Once, when trying to move the
service from one node to another, the cluster manager could not "umount" the
filesystem. Although "lsof | grep <mountpoint>" did not show anything,
"umount -f <mountpoint>" did not work ("umount -l <mountpoint>" did the job).
When the cluster manager failed on that, it also failed on the MD script and
went into "failed" status, with a message that "manual intervention is needed".

Why does the node not get fenced? After "reboot -f" the service does not
start until the faulty node is back online.

Are there any magical things one can put in cluster.conf to get the behavior
I want? That is: if a service does not stop cleanly, fence the node and start
the service on another node?

regards
Jonas

--
Jonas Helgi Palsson

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster