Hi!

I have some strange problems with the current update of the cluster software in 
SLES11 SP2 (I didn't see such problems before the update):

sbd monitoring went crazy (reporting running sbds when there were none, 
complaining about being unable to stop sbd when there was none), so I 
stopped it.

Now that I re-activated it, the cluster talks about resources that had been 
deleted days ago, like:
---
Apr 19 08:56:19 h05 attrd: [13083]: notice: attrd_local_callback: Sending full 
refresh (origin=crmd)
Apr 19 08:56:19 h05 attrd: [13083]: notice: attrd_trigger_update: Sending flush 
op to all hosts for: last-failure-prm_stonith_sbd (1365148953)
Apr 19 08:56:19 h05 cib: [13080]: info: cib_process_request: Operation 
complete: op cib_delete for section //node_state[@uname='h05']/lrm 
(origin=local/crmd/6835, version=0.744.19): ok (rc=0)
Apr 19 08:56:19 h05 crmd: [13085]: info: abort_transition_graph: 
te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, 
id=prm_v06_v06_raid1_last_0, 
magic=0:7;117:15:7:de539cd3-5895-4bcd-a388-ebad29a7b63d, cib=0.744.19) : 
Resource op removal
---

The resource prm_v06_v06_raid1 had been removed several days earlier, at:
Apr 15 10:08:16 h05 cib: [13080]: info: cib_replace_notify: Replaced: 0.733.19 
-> 0.734.1 from <null>

Interestingly, a CIB dump taken minutes before the SBD change showed that the 
deleted resource still had an "lrm_resource" entry in the CIB:
---
          <lrm_resource id="prm_v06_v06_raid1" type="Raid1" class="ocf" 
provider="heartbeat">
            <lrm_rsc_op id="prm_v06_v06_raid1_last_0" 
operation_key="prm_v06_v06_raid1_monitor_0" operation="monitor" 
crm-debug-origin="build_active_RAs" crm_feature_set="3.0.6" 
transition-key="117:15:7:de539cd3-5895-4bcd-a388-ebad29a7b63d" 
transition-magic="0:7;117:15:7:de539cd3-5895-4bcd-a388-ebad29a7b63d" 
call-id="76" rc-code="7" op-status="0" interval="0" 
op-digest="0e6b2558abfd3cee98ee60cb7b03e6b0"/>
---
And that entry should already have been removed earlier:
Apr 15 13:14:00 h05 crmd: [13085]: info: abort_transition_graph: 
te_update_diff:320 - Triggered transition abort (complete=1, tag=lrm_rsc_op, 
id=prm_v06_v06_raid1_last_0, 
magic=0:7;117:15:7:de539cd3-5895-4bcd-a388-ebad29a7b63d, cib=0.735.35) : 
Resource op removal
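
For anyone who wants to check their own CIB dump for such leftovers, here is 
a minimal sketch in Python. It assumes the usual layout (primitives under 
configuration/resources, operation history under 
status/node_state/lrm/lrm_resources) and that clone instances show up with a 
":N" suffix. The function name stale_lrm_resources is made up for this 
example; it is not part of any cluster tool.

```python
import xml.etree.ElementTree as ET


def stale_lrm_resources(cib_xml):
    """Return {node uname: [lrm_resource ids without a configured primitive]}.

    Hypothetical helper for illustration; assumes the CIB layout described
    in the text above.
    """
    root = ET.fromstring(cib_xml)

    # IDs of all primitives in the configuration section, including those
    # nested inside groups, clones, or master/slave resources.
    configured = {
        p.get("id")
        for p in root.iterfind("./configuration/resources//primitive")
    }

    stale = {}
    for node in root.iterfind("./status/node_state"):
        for rsc in node.iterfind("./lrm/lrm_resources/lrm_resource"):
            rsc_id = rsc.get("id")
            # Assumption: clone instances appear as "name:0", "name:1", ...
            # so only the part before the colon is compared.
            base = rsc_id.split(":")[0]
            if base not in configured:
                stale.setdefault(node.get("uname"), []).append(rsc_id)
    return stale
```

It can be fed a dump saved with "cibadmin -Q > cib.xml", for example 
stale_lrm_resources(open("cib.xml").read()). It only lists candidates and 
does not explain why the entries are still there.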

Isn't this very strange, or is there a reasonable explanation?

Regards,
Ulrich


_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
