hi, i have a cluster with several resources.

i issued crm_resource -P and now the cluster is in a strange state which it cannot resolve by itself:

> Node: wc01 (31de4ab3-2d05-476e-8f9a-627ad6cd94ca): standby
> Node: wc02 (f36760d8-d84a-46b2-b452-4c8cac8b3396): standby
...
> Master/Slave Set: ms_drbd_www
>     drbd_www:0 (ocf::heartbeat:drbd) Master [ wc01 wc02 ]
>     drbd_www:1 (ocf::heartbeat:drbd) Master [ wc01 wc02 ]
...
> Master/Slave Set: ms_drbd_mysql
>     drbd_mysql:0 (ocf::heartbeat:drbd) Master [ wc01 wc02 ]
>     drbd_mysql:1 (ocf::heartbeat:drbd) Master [ wc01 wc02 ]

failed actions:

> Failed actions:
>     drbd_www:1_monitor_0 (node=wc02, call=13666, rc=0): complete
>     drbd_www:0_monitor_0 (node=wc02, call=13665, rc=0): complete
>     drbd_mysql:1_monitor_0 (node=wc02, call=13672, rc=0): complete
>     drbd_mysql:0_monitor_0 (node=wc02, call=13671, rc=0): complete

those monitoring failures repeat continuously. in the logfiles i find:

...
> crmd[14105]: 2008/11/12_13:14:19 WARN: status_from_rc: Action 16 (drbd_www:0_monitor_0) on wc02 failed (target: 8 vs. rc: 0): Error
> crmd[14105]: 2008/11/12_13:14:19 info: abort_transition_graph: __FUNCTION__:385 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=drbd_www:0_monitor_0, magic=0:0;16:670:8:d3f15030-d3f0-421d-a477-ce19a2cae321) : Event failed
> crmd[14105]: 2008/11/12_13:14:19 info: update_abort_priority: Abort priority upgraded from 0 to 1
> crmd[14105]: 2008/11/12_13:14:19 info: update_abort_priority: Abort action done superceeded by restart
> crmd[14105]: 2008/11/12_13:14:19 info: match_graph_event: Action drbd_www:0_monitor_0 (16) confirmed on wc02 (rc=4)
> crmd[14105]: 2008/11/12_13:14:19 WARN: status_from_rc: Action 17 (drbd_www:1_monitor_0) on wc02 failed (target: 8 vs. rc: 0): Error
> crmd[14105]: 2008/11/12_13:14:19 info: abort_transition_graph: __FUNCTION__:385 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=drbd_www:1_monitor_0, magic=0:0;17:670:8:d3f15030-d3f0-421d-a477-ce19a2cae321) : Event failed
> crmd[14105]: 2008/11/12_13:14:19 info: match_graph_event: Action drbd_www:1_monitor_0 (17) confirmed on wc02 (rc=4)
...

i put some debug output at the top of the drbd ocf ra:

> #!/bin/sh
> echo "----" >> /tmp/lalala

but /tmp/lalala stays empty, so the ra does not seem to be called at all for these monitor actions. if i manually call the drbd ra with all parameters, i get the expected rc 8.

hb_report: http://ip52.ipax.at/~raoul/cluster/no_monitor_action.tar.gz
(it's kinda big as a lot of actions failed)

cheers,
raoul

ps: i already tried to revoke the crm_standby, but this does not resolve the error messages and does not call the drbd ocf ra.

-- 
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc.          email.  [EMAIL PROTECTED]
Technischer Leiter
IPAX - Aloy Bhatia Hava OEG         web.    http://www.ipax.at
Barawitzkagasse 10/2/2/11           email.  [EMAIL PROTECTED]
1190 Wien                           tel.    +43 1 3670030
FN 277995t HG Wien                  fax.    +43 1 3670030 15
____________________________________________________________________

_______________________________________________
Pacemaker mailing list
[email protected]
http://list.clusterlabs.org/mailman/listinfo/pacemaker
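
for reading the status_from_rc lines above: a minimal sketch that decodes the "target: X vs. rc: Y" pair into symbolic names, assuming the standard ocf return code numbering. the function names ocf_rc_name and explain_rc_mismatch are made up for illustration; they are not part of pacemaker, heartbeat or the drbd ra.

```sh
#!/bin/sh
# Map an OCF resource agent exit code to its symbolic name
# (numbering as in the OCF resource agent conventions).
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        8) echo "OCF_RUNNING_MASTER" ;;
        9) echo "OCF_FAILED_MASTER" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Hypothetical helper: extract "target: X vs. rc: Y" from a crmd
# status_from_rc log line and print both codes with their names.
explain_rc_mismatch() {
    target=$(printf '%s\n' "$1" |
        sed -n 's/.*target: \([0-9][0-9]*\) vs\. rc: \([0-9][0-9]*\).*/\1/p')
    actual=$(printf '%s\n' "$1" |
        sed -n 's/.*target: \([0-9][0-9]*\) vs\. rc: \([0-9][0-9]*\).*/\2/p')
    # Bail out if the line does not contain a target/rc pair.
    [ -n "$target" ] || return 1
    echo "expected $target ($(ocf_rc_name "$target")), got $actual ($(ocf_rc_name "$actual"))"
}

line='crmd[14105]: 2008/11/12_13:14:19 WARN: status_from_rc: Action 16 (drbd_www:0_monitor_0) on wc02 failed (target: 8 vs. rc: 0): Error'
explain_rc_mismatch "$line"
```

for the log line used here, the probe expected 8 (running as master) and the recorded result was 0 (running, but not as master).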
