Hi,

in my OpenNebula environments I used a combination of Pacemaker and Corosync to monitor the VMM hosts of a cluster. Proper checking of "libvirt" was crucial there, so that fencing and/or STONITH actions are performed in case of a host failure. OpenNebula (oned) then triggers a failover of the VMs via the HOST_HOOK on ERROR (ft/host_error.rb).

After several problems with Corosync/Pacemaker (e.g. monitoring timeouts of the fencing device, the IPMI/iLO module), I decided to implement fencing/STONITH directly in the host_error.rb hook, which triggers the failover (--delete --recreate). I think this is the "right" place to add those functions?

Therefore I added some attributes to the host templates (ILO_IP, ILO_USER, ILO_PASS; we use HP servers with iLO modules):

MONITORING INFORMATION
ARCH="x86_64"
CPUSPEED="1999"
CPUSPEED="1999"
HOSTNAME="lab-cloud-staging-node-03"
HYPERVISOR="kvm"
HYPERVISOR="kvm"
ILO_IP="IP.IP.IP.IP"
ILO_PASS="PASSWORD"
ILO_USER="USERNAME"
MODELNAME="Intel(R) Xeon(R) CPU E5335 @ 2.00GHz"
...

To access these attributes I changed the configuration of the hook in oned.conf:

HOST_HOOK = [
    name      = "error",
    on        = "ERROR",
    command   = "/var/lib/one/remotes/hooks/ft/host_error.rb",
    arguments = "$ID $TEMPLATE -d -r",
    remote    = "no" ]

In the next step I modified the host_error.rb hook to trigger the STONITH action in case of a host error.
To do this I included the "rubyipmi", "base64" and "nokogiri" gems in the hook and added some (primitive, I'm not a programmer :) lines of code:

<start>
require 'base64'
require 'nokogiri'
require 'rubyipmi'

# iLO/BMC data, taken from $TEMPLATE (passed to the hook as ARGV[1])
if !(host_template = ARGV[1])
    exit(-1)
end

host_template_decoded = Base64.decode64(host_template)
xml = Nokogiri::Slop(host_template_decoded)

ilo_ip   = xml.HOST.TEMPLATE.ILO_IP.content
ilo_user = xml.HOST.TEMPLATE.ILO_USER.content
ilo_pass = xml.HOST.TEMPLATE.ILO_PASS.content

# Method: activate the UID LED
def uidled(ilo_ip, ilo_pass, ilo_user)
    conn = Rubyipmi.connect(ilo_user, ilo_pass, ilo_ip, "ipmitool")
    # 86400 seconds = 1 day
    value = conn.chassis.identify(true, 86400)
    puts value
    sleep 2
end

# Method: hard reset via iLO/BMC
def stonith(ilo_ip, ilo_pass, ilo_user)
    conn = Rubyipmi.connect(ilo_user, ilo_pass, ilo_ip, "ipmitool")
    value = conn.chassis.power.cycle
    puts value
    sleep 10
end

# Trigger uidled and stonith
uidled(ilo_ip, ilo_pass, ilo_user)
stonith(ilo_ip, ilo_pass, ilo_user)
</stop>

Is this the "right" way to trigger fencing actions with OpenNebula, or are there better ways to implement fencing/STONITH? How do you implement it? Perhaps virtual machine disk locking (e.g. SANLOCK) could be a solution for some environments? I think OpenNebula currently lacks a proper fencing mechanism, doesn't it?

Best regards,
Sebastian

_______________________________________________
Users mailing list
Users@lists.opennebula.org
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
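
One point that may be worth sketching for the discussion: the failover should probably continue only if the power cycle really succeeded, otherwise a VM could end up running on two hosts at once. Below is a minimal sketch of that idea, not a tested hook. Assumptions: the helper names ilo_credentials and fence_host, the Fake* structs and the address 192.0.2.10 are hypothetical; REXML is used instead of Nokogiri only so that the example runs without extra gems; and power.cycle is assumed to return a truthy value on success. The BMC connection is injected so the sketch can be tried without real hardware.

```ruby
require "base64"
require "rexml/document"

# Extract ILO_IP / ILO_USER / ILO_PASS from the Base64-encoded host template
# that oned passes as $TEMPLATE. Raises ArgumentError if one is missing or empty.
# (Hypothetical helper, not part of OpenNebula.)
def ilo_credentials(encoded_template)
  doc = REXML::Document.new(Base64.decode64(encoded_template))
  %w[ILO_IP ILO_USER ILO_PASS].to_h do |name|
    node  = doc.elements["/HOST/TEMPLATE/#{name}"]
    value = node ? node.text.to_s.strip : ""
    raise ArgumentError, "host template has no #{name}" if value.empty?
    [name.downcase.to_sym, value]
  end
end

# Power-cycle the host through its BMC and return true only on success.
# `connect` is injected; in the real hook it could be
#   ->(user, pass, ip) { Rubyipmi.connect(user, pass, ip, "ipmitool") }
# Assumption: power.cycle returns a truthy value when the command succeeded.
def fence_host(creds, connect:)
  conn = connect.call(creds[:ilo_user], creds[:ilo_pass], creds[:ilo_ip])
  conn.chassis.power.cycle ? true : false
rescue StandardError => e
  warn "fencing failed: #{e.message}"
  false
end

# Stand-ins for a BMC connection, used only for the dry run below.
FakePower = Struct.new(:result) do
  def cycle
    result
  end
end
FakeChassis = Struct.new(:power)
FakeConn    = Struct.new(:chassis)

# Dry run with a made-up template (192.0.2.10 is a documentation address).
template = Base64.strict_encode64(<<~XML)
  <HOST><TEMPLATE>
    <ILO_IP>192.0.2.10</ILO_IP>
    <ILO_USER>fence</ILO_USER>
    <ILO_PASS>secret</ILO_PASS>
  </TEMPLATE></HOST>
XML

creds  = ilo_credentials(template)
fake   = ->(_user, _pass, _ip) { FakeConn.new(FakeChassis.new(FakePower.new(true))) }
fenced = fence_host(creds, connect: fake)

if fenced
  puts "fenced #{creds[:ilo_ip]}, safe to recreate VMs"
else
  puts "not fenced, leaving VMs alone"
end
```

In a real hook the lambda would be replaced by the Rubyipmi.connect call from the snippet above, and the script would exit with a non-zero status before the delete/recreate part whenever fence_host returns false.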