hi,
the KVM guests are on different KVM hosts.
2014-03-24 0:30 GMT+01:00 Andrew Beekhof <and...@beekhof.net>:
On 21 Mar 2014, at 11:11 pm, Beo Banks <beo.ba...@googlemail.com> wrote:
yep, and that's my issue.
stonith is very powerful, but how can the cluster handle a hardware
failure?
by connecting to the switch that supplies power to said hardware
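for example, something along these lines (a rough sketch only; fence_apc
and the address/outlet numbers are placeholders for whatever power switch
you actually have):

primitive stonith-power stonith:fence_apc \
params ipaddr=10.0.0.50 login=apc passwd=apc \
pcmk_host_map="linux01:1;linux02:2" action=reboot \
op monitor interval=300s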
exactly the reason devices like fence_virsh and external/ssh are not
considered reliable.
are both these VMs running on the same physical hardware?
primitive stonith-linux01 stonith:fence_virsh \
params pcmk_host_list=linux01 pcmk_host_check=dynamic-list \
pcmk_host_map=linux01:linux01 action=reboot ipaddr=XX \
secure=true login=root identity_file=/root/.ssh/id_rsa \
debug=/var/log/stonith.log verbose=false \
you don't need the host map if the name and value (name:value) are the same (see the trimmed sketch after this primitive)
op monitor interval=300s \
op start interval=0 timeout=60s \
meta failure-timeout=180s
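i.e. roughly (a sketch of the same primitive with the redundant map dropped):

primitive stonith-linux01 stonith:fence_virsh \
params pcmk_host_list=linux01 pcmk_host_check=dynamic-list \
action=reboot ipaddr=XX secure=true login=root \
identity_file=/root/.ssh/id_rsa debug=/var/log/stonith.log \
verbose=false \
op monitor interval=300s \
op start interval=0 timeout=60s \
meta failure-timeout=180s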
primitive stonith-linux02 stonith:fence_virsh \
params pcmk_host_list=linux02 pcmk_host_check=dynamic-list \
pcmk_host_map=linux02:linux02 action=reboot ipaddr=X \
secure=true login=root identity_file=/root/.ssh/id_rsa delay=5 \
debug=/var/log/stonith.log verbose=false \
op monitor interval=60s \
op start interval=0 timeout=60s \
meta failure-timeout=180s
2014-03-18 13:54 GMT+01:00 emmanuel segura <emi2f...@gmail.com>:
do you have stonith configured?
2014-03-18 13:07 GMT+01:00 Alex Samad - Yieldbroker
<alex.sa...@yieldbroker.com>:
I'm no expert, but:
Current DC: linux02 - partition WITHOUT quorum
Version: 1.1.10-14.el6_5.2-368c726
2 Nodes configured, 2 expected votes
I think your 2nd node can't make quorum; there is some special config
for a 2-node cluster that allows a single node to keep quorum with 1 vote,
as in the sketch below.
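For example (a sketch; on the plugin stack shown above this is the usual
knob, while corosync 2.x would use two_node: 1 in the quorum section of
corosync.conf instead):

crm configure property no-quorum-policy=ignore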
A
From: Beo Banks [mailto:beo.ba...@googlemail.com]
Sent: Tuesday, 18 March 2014 10:06 PM
To: pacemaker@oss.clusterlabs.org
Subject: [Pacemaker] crm resource doesn't move after hardware crash
hi,
I have a hardware crash in a two-node DRBD cluster.
The active node has a hardware failure and is currently down.
I am wondering why my 2nd node doesn't migrate/move the resources.
The 2nd node wants to fence the failed node, but that's not possible (it's
down).
How can I enable the services on the last good node?
And how can I optimize my config to handle that kind of error?
crm status
Last updated: Tue Mar 18 12:01:07 2014
Last change: Tue Mar 18 11:28:22 2014 via crmd on linux02
Stack: classic openais (with plugin)
Current DC: linux02 - partition WITHOUT quorum
Version: 1.1.10-14.el6_5.2-368c726
2 Nodes configured, 2 expected votes
21 Resources configured
Node linux01: UNCLEAN (offline)
Online: [ linux02 ]
Resource Group: mysql
mysql_fs (ocf::heartbeat:Filesystem): Started linux01
mysql_ip (ocf::heartbeat:IPaddr2): Started linux01
and so on
cluster.log
Mar 18 11:54:43 [2234] linux02 crmd: notice: tengine_stonith_callback: Stonith operation 17 for linux01 failed (Timer expired): aborting transition.
Mar 18 11:54:43 [2234] linux02 crmd: info: abort_transition_graph: tengine_stonith_callback:463 - Triggered transition abort (complete=0) : Stonith failed
Mar 18 11:54:43 [2234] linux02 crmd: notice: run_graph: Transition 15 (Complete=9, Pending=0, Fired=0, Skipped=36, Incomplete=19, Source=/var/lib/pacemaker/pengine/pe-warn-63.bz2): Stopped
Mar 18 11:54:43 [2234] linux02 crmd: notice: too_many_st_failures: Too many failures to fence linux01 (16), giving up
Mar 18 11:54:43 [2234] linux02 crmd: info: do_log: FSA: Input I_TE_SUCCESS from notify_crmd() received in state S_TRANSITION_ENGINE
Mar 18 11:54:43 [2234] linux02 crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar 18 11:54:43 [2230] linux02 stonith-ng: info: stonith_command: Processed st_notify reply from linux02: OK (0)
Mar 18 11:54:43 [2234] linux02 crmd: notice: tengine_stonith_notify: Peer linux01 was not terminated (reboot) by linux02 for linux02: Timer expired (ref=7939b264-699c-4d00-a89c-07e7e0193a80) by client crmd.2234
Mar 18 11:54:44 [2229] linux02 cib: info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23360 id=b88b2690-0c3f-48ac-b8b4-3a47b7f9114a
Mar 18 11:54:44 [2229] linux02 cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_mon/2, version=0.125.2)
Mar 18 11:54:44 [2229] linux02 cib: info: crm_client_destroy: Destroying 0 events
Mar 18 11:55:03 [2229] linux02 cib: info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23415