On 29/08/11 13:24, Tim Serong wrote:
On 28/08/11 21:43, Sebastian Kaps wrote:
Hi,
On our two-node cluster (SLES11-SP1+HAE; corosync 1.3.1, pacemaker
1.1.5) we have defined the following filesystem resource and its
corresponding clone:
primitive p_fs_wwwdata ocf:heartbeat:Filesystem \
params device="/dev/drbd1" \
directory="/mnt/wwwdata" fstype="ocfs2" \
options="rw,noatime,noacl,nouser_xattr,commit=30,data=writeback" \
op start interval="0" timeout="90s" \
op stop interval="0" timeout="300s"
clone c_fs_wwwdata p_fs_wwwdata \
meta clone-max="2" target-role="Started" is-managed="true"
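For context, a filesystem clone like this normally sits on top of a
dual-primary DRBD master/slave resource (plus the DLM/O2CB stack).
A minimal sketch in crm shell syntax follows; the resource and
constraint IDs here are illustrative assumptions, not taken from the
original post, so adapt them to the actual setup:

```
# Hypothetical companion configuration (names are illustrative).
primitive p_drbd_wwwdata ocf:linbit:drbd \
    params drbd_resource="wwwdata" \
    op monitor interval="30s" role="Master" \
    op monitor interval="31s" role="Slave"
ms ms_drbd_wwwdata p_drbd_wwwdata \
    meta master-max="2" clone-max="2" notify="true" interleave="true"
# The filesystem clone may only start on a node after DRBD has been
# promoted there, and must run where DRBD is primary.
order o_drbd_before_fs inf: ms_drbd_wwwdata:promote c_fs_wwwdata:start
colocation col_fs_on_drbd inf: c_fs_wwwdata ms_drbd_wwwdata:Master
```

Note that master-max="2" belongs on the master/slave (ms) resource,
where it makes the DRBD device dual-primary; it has no meaning on a
plain clone such as the filesystem resource above.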
One of the nodes (node01) went down last night, and I started it again
with the cluster put into maintenance mode.
After checking everything else, I mounted the OCFS2 resource manually,
ran "crm resource reprobe" and "crm resource cleanup" to make the
cluster aware of this, and finally turned maintenance mode off.
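The manual recovery described above can be sketched roughly as follows
(an illustrative outline, not the exact commands from the original
post; the device, mount options and resource ID are taken from the
configuration quoted above):

```
# Put the cluster into maintenance mode so Pacemaker stops managing
# resources while the node is worked on.
crm configure property maintenance-mode=true

# After bringing node01 back up, mount the OCFS2 filesystem by hand:
mount -t ocfs2 -o rw,noatime /dev/drbd1 /mnt/wwwdata

# Make Pacemaker re-detect the actual resource state:
crm resource reprobe
crm resource cleanup c_fs_wwwdata

# Hand control back to the cluster:
crm configure property maintenance-mode=false
```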
Looking at the output of crm_mon, everything looks good again:
Clone Set: c_fs_wwwdata [p_fs_wwwdata]
Started: [ node01 node02 ]
Alternatively, looking at "crm_mon -n":
Node node02: online
p_fs_wwwdata:1 (ocf::heartbeat:Filesystem) Started
Node node01: online
p_fs_wwwdata:0 (ocf::heartbeat:Filesystem) Started
But the HAWK web interface (version 0.3.6, as shipped with SLES11-SP1
HAE) displays this:
Clone Set: c_fs_wwwdata
- p_fs_wwwdata:0: Started: node01, node02
- p_fs_wwwdata:1: Stopped
Does anybody know why there is a difference?
Did I make a mistake when manually mounting the filesystem while it
was unmanaged?
Or is this only a cosmetic issue with HAWK?
When these resources are started by Pacemaker, HAWK shows exactly
what's expected: two started resources, one per node.
Thanks in advance!
It's almost certainly a cosmetic issue in Hawk. I have fixed one or two
bugs along these lines since version 0.3.6. If you'd like to try a newer
(not-officially-supported-by-SUSE-but-best-effort-support-by-me) build,
you can try hawk-0.4.1 from:
http://software.opensuse.org/search?q=Hawk&baseproject=SUSE%3ASLE-11%3ASP1&lang=en
Alternatively, if you can reproduce the issue, send me the output of
"cibadmin -Q" (off-list is fine) and I can verify/fix it.
Just for the record, it was a cosmetic issue in Hawk, now fixed in hg:
http://hg.clusterlabs.org/pacemaker/hawk/rev/3266874ef3fe
Regards,
Tim
--
Tim Serong
Senior Clustering Engineer
SUSE
tser...@suse.com
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker