I'm having an issue with heartbeat: it still reports resources as running
on a server that is offline. The crm_mon status is:

============
Last updated: Tue Mar 25 16:43:38 2008
Current DC: front001(7c5cc43a-0601-4924-afa5-1bf1f29efcfb)
5 Nodes configured.
13 Resources configured.
============

Node: front001 (7c5cc43a-0601-4924-afa5-1bf1f29efcfb): online
Node: front005 (7bbcb330-7d41-445f-ada0-a61d1046863e): online
Node: front004 (77e65606-f308-4c59-8493-65b1b571a2ab): online
Node: front003 (3d8b08af-160c-4268-9446-20321e8803aa): OFFLINE
Node: front002 (607f465e-3fef-4dd3-8afc-35dadec069ec): online

Full list of resources:

xxxxxx001       (heartbeat::ocf:xen):   Started front001
yyyyyy001       (heartbeat::ocf:xen):   Started front001
yyyyyy002       (heartbeat::ocf:xen):   Started front002
xxx001  (heartbeat::ocf:xen):   Started front003
xxx002  (heartbeat::ocf:xen):   Started front004
xx001   (heartbeat::ocf:xen):   Started front003
xx002   (heartbeat::ocf:xen):   Started front004
xxxxxx002       (heartbeat::ocf:xen):   Started front002
stonith_front001        (stonith:external/ipmitool)[    front005     front003 ]
stonith_front002        (stonith:external/ipmitool):    Started front001
stonith_front003        (stonith:external/ipmitool):    Started front005 FAILED
stonith_front004        (stonith:external/ipmitool):    Started front001
stonith_front005        (stonith:external/ipmitool):    Started front001

Failed actions:
    stonith_front003_start_0 (node=front005, call=2523, rc=1): Error
    stonith_front003_start_0 (node=front001, call=74, rc=1): Error
    stonith_front003_start_0 (node=front004, call=22, rc=1): Error
    stonith_front003_monitor_0 (node=front002, call=20, rc=14): Error
    stonith_front003_start_0 (node=front002, call=22, rc=1): Error


front003 has a hardware failure, so the stonith action is expected to
fail. (This is a custom stonith script, so there might be some bugs left
in it. The xen OCF script is also a custom one.)

The real problem is that crm_mon shows 2 resources (xxx001 and xx001)
running on front003, while this server is obviously offline. Heartbeat
should move these resources to one of the other servers, but for some
reason it doesn't.

The OS is CentOS 5.1, with the following packages:

heartbeat-common-2.1.3-15.1
heartbeat-resources-2.1.3-15.1
pacemaker-heartbeat-0.6.2-11.1
heartbeat-2.1.3-15.1

I can provide more details if necessary.

Niels

