I have a customer running SLE 11 SP4 HAE who is seeing the following STONITH behavior with the external/ipmi stonith plugin.

Dec 15 14:21:43 test4 pengine[24002]: warning: pe_fence_node: Node test3 will be fenced because termination was requested
Dec 15 14:21:43 test4 pengine[24002]: warning: determine_online_status: Node test3 is unclean
Dec 15 14:21:43 test4 pengine[24002]: warning: stage6: Scheduling Node test3 for STONITH

... it issues the reset and it is noted ...
Dec 15 14:21:45 test4 external/ipmi(STONITH-test3)[177184]: [177197]: debug: ipmitool output: Chassis Power Control: Reset
Dec 15 14:21:46 test4 stonith-ng[23999]: notice: log_operation: Operation 'reboot' [177179] (call 2 from crmd.24003) for host 'test3' with device 'STONITH-test3' returned: 0 (OK)

... test3 does go down ...
Dec 15 14:22:21 test4 kernel: [90153.906461] Cell 2 (test3) left the membership

... but the stonith operation times out (it said OK earlier) ...
Dec 15 14:22:56 test4 stonith-ng[23999]: notice: remote_op_timeout: Action reboot (a399a8cb-541a-455e-8d7c-9072d48667d1) for test3 (crmd.24003) timed out
Dec 15 14:23:05 test4 external/ipmi(STONITH-test3)[177667]: [177678]: debug: ipmitool output: Chassis Power is on

Dec 15 14:23:56 test4 crmd[24003]: error: stonith_async_timeout_handler: Async call 2 timed out after 132000ms
Dec 15 14:23:56 test4 crmd[24003]: notice: tengine_stonith_callback: Stonith operation 2/51:100:0:f43dc87c-faf0-4034-8b51-be0c13c95656: Timer expired (-62)
Dec 15 14:23:56 test4 crmd[24003]: notice: tengine_stonith_callback: Stonith operation 2 for test3 failed (Timer expired): aborting transition.
Dec 15 14:23:56 test4 crmd[24003]: notice: abort_transition_graph: Transition aborted: Stonith failed (source=tengine_stonith_callback:697, 0)
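
To make the timing easier to follow, here is a small sketch that turns the timestamps from the excerpts above into offsets from the moment fencing was scheduled. The event labels and the helper names (`seconds_between`, `timeline`) are my own and purely illustrative; only the timestamps come from the logs.

```python
from datetime import datetime

# Timestamps copied from the log excerpts (all Dec 15, all logged on test4).
# The labels are informal descriptions, not Pacemaker terminology.
events = [
    ("fencing scheduled by pengine", "14:21:43"),
    ("reboot returned 0 (OK)", "14:21:46"),
    ("test3 left the membership", "14:22:21"),
    ("stonith-ng remote_op_timeout", "14:22:56"),
    ("crmd async call timed out", "14:23:56"),
]


def seconds_between(start, end):
    """Whole seconds from start to end, both given as HH:MM:SS on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


def timeline(entries):
    """Return (label, offset in seconds from the first entry) for each entry."""
    first = entries[0][1]
    return [(label, seconds_between(first, ts)) for label, ts in entries]


if __name__ == "__main__":
    for label, offset in timeline(events):
        print(f"+{offset:4d}s  {label}")
```

By this arithmetic the OK result arrives 3 seconds after scheduling, remote_op_timeout fires at 73 seconds, and the crmd timer expires at 133 seconds, which is in line with the 132000ms reported in the error.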

This looks like a bug but a quick search did not turn up anything. Does anyone recognize this problem?

--

Ron Kerry


_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
