Dear all,

I have three hosts set up in my test environment.
Each has two connections to the SAN, which holds a GFS2 filesystem.

Everything works like a charm, except when I reboot a host.
When the host tries to stop the gfs2-utils service, it just hangs.

I managed to trace the hang to the umount command in the service's script.

However, I tried starting and stopping the service manually many
times and had zero failures.
Further investigation shows a difference between stopping the service
manually and having it stopped by the reboot.

The first lists the reason for removing the node as "leave", while the
latter's reason is "nodedown".
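
For anyone who wants to compare the two cases in the attached dlm logs,
here is a minimal sketch of how the removal reasons could be tallied. It
assumes the relevant log lines contain the text "reason <word>" (for
example "reason leave" or "reason nodedown"); the exact line format is
an assumption on my part, and count_remove_reasons is just a
hypothetical helper name.

```shell
#!/bin/sh
# Tally membership-removal reasons found in a log file.
# Assumption: each removal line contains the text "reason <word>",
# e.g. "... reason leave" or "... reason nodedown".
# count_remove_reasons is a hypothetical helper, not part of any tool.
count_remove_reasons() {
    # $1: path to the log file to scan
    # grep -o prints only the matching part of each line,
    # sort | uniq -c counts identical matches,
    # awk reshapes "<count> reason <word>" into "<word>=<count>".
    grep -o 'reason [a-z]*' "$1" | sort | uniq -c | awk '{print $3 "=" $1}'
}
```

Running "count_remove_reasons itiafgfsxen02-dlm" after a manual stop and
again after a reboot should show whether the reason really differs
between the two.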

I don't know why we would get different behaviour from the same
command (umount -a -t gfs2).

If the command did not hang, I could suspect the next service in the
shutdown order of being the culprit, but that isn't the case. You can
see the host's output in itiafgfsxen02-reboot.

I am attaching logs from my machines, along with corosync.conf.
dlm_controld is started with these options: "dlm_controld -D -f 0 -q 0
-s 0 -K -L"
If I have left out any information, please don't hesitate to ask for it :)

Any help would be highly appreciated.

Kind regards,
Momcilo "Momo" Medic.
(fedorauser)

Attachment: itiafgfsxen01-corosync
Description: Binary data

Attachment: itiafgfsxen01-dlm
Description: Binary data

Attachment: itiafgfsxen02-corosync
Description: Binary data

Attachment: itiafgfsxen02-dlm
Description: Binary data

Attachment: itiafgfsxen02-reboot
Description: Binary data

Attachment: itiafgfsxen03-corosync
Description: Binary data

Attachment: itiafgfsxen03-dlm
Description: Binary data

Attachment: corosync.conf
Description: Binary data

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
