Charles,

Thanks for your reply. Indeed you may be right that the NUT fencing agent might 
be written with networked UPSes in mind, as healthy nodes could use the network 
to issue "fence" orders to remove unhealthy ones. I will post here if I find 
more info.

The problem with the resupply of services is that NUT doesn't restart on the 
node that comes back up. To recap: I pull the power on one UPS, and both nodes 
shut down. The remaining mains-connected UPS power cycles its outlets, which 
reboots its node. Because the node has just started, it wants all of its 
services to be healthy before providing them. This includes the fencing agent, 
which relies on NUT, which hasn't started. So the node doesn't start the rest 
of its services (Apache, MySQL, Samba).

Relevant log entries.
 
Feb 13 23:11:42 xinetd[1647] Reading included configuration file: 
/etc/xinetd.d/cups-lpd [file=/etc/xinetd.d/cups-lpd] [line=7]
Feb 13 23:11:42 systemd[1] Starting LSB: UPS monitoring software (deprecated, 
remote/local)...
Feb 13 23:11:43 usbhid-ups[2093] Startup successful
Feb 13 23:11:43 upsd[1932] Starting NUT UPS drivers ..done
Feb 13 23:11:43 upsd[2104] not listening on 192.168.1.22 port 3493
Feb 13 23:11:43 upsd[2104] listening on ::1 port 3493
Feb 13 23:11:43 upsd[2104] listening on 127.0.0.1 port 3493
Feb 13 23:11:43 upsd[2104] no listening interface available
Feb 13 23:11:43 startproc[2095] startproc: exit status of parent of 
/usr/sbin/upsd: 1
Feb 13 23:11:43 usbhid-ups[2093] Signal 15: exiting
Feb 13 23:11:43 upsd[1932] Starting NUT UPS server ..failed
Feb 13 23:11:43 systemd[1] upsd.service: Control process exited, code=exited 
status=7
Feb 13 23:11:43 systemd[1] Failed to start LSB: UPS monitoring software 
(deprecated, remote/local).
Feb 13 23:11:43 systemd[1] upsd.service: Unit entered failed state.
Feb 13 23:11:43 systemd[1] upsd.service: Failed with result 'exit-code'.
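
The log shows upsd failing to listen on 192.168.1.22 while both loopback 
addresses succeed. A quick way to check whether a given address can be bound 
at a particular moment is sketched below. This is only a diagnostic sketch, 
assuming Python 3 is available on the node; can_bind is a hypothetical helper 
and not part of NUT, and it binds to an arbitrary free port rather than 3493 
so it does not collide with a running upsd. It shows whether the address is 
usable, not why upsd's own bind failed.

```python
import socket


def can_bind(addr: str, port: int = 0) -> bool:
    """Return True if a TCP socket can be bound to addr.

    port=0 asks the kernel for any free port, so this only tests whether
    the address itself is currently usable on this host.
    """
    # Crude IPv4/IPv6 detection: IPv6 literals contain a colon.
    family = socket.AF_INET6 if ":" in addr else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as s:
            s.bind((addr, port))
        return True
    except OSError:
        # Covers "address not available", unresolvable names, and
        # hosts where the address family is disabled.
        return False


if __name__ == "__main__":
    # Addresses taken from the log above.
    for addr in ("192.168.1.22", "127.0.0.1", "::1"):
        state = "bindable" if can_bind(addr) else "NOT bindable"
        print(addr, state)
```

Running this on the surviving node while the second node is still down would 
show whether 192.168.1.22 is assigned to an interface at that point.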

I can manually bring the surviving node's services back up by removing the 
requirement that Stonith services are enabled. However, I cannot get NUT to 
restart until I restart the second node. 

Regards,
Tim.




-----Original Message-----
From: Charles Lepple [mailto:clep...@gmail.com] 
Sent: Sunday, 12 February 2017 11:57 AM
To: Tim Richards
Cc: nut-upsuser Mailing List
Subject: Re: [Nut-upsuser] NUT configuration complicated by Stonith/Fencing 
cabling

On Feb 10, 2017, at 5:48 PM, Tim Richards <tims_t...@hotmail.com> wrote:
> 
> I am trying to kill two birds with one stone, that is UPS protection from 
> power failure and cluster node fencing (Stonith) with the UPS ability to cut 
> power to a node. Somebody has done this, as there exists a fencing agent 
> using NUT in the Pacemaker/Corosync (Linux-HA cluster software), I just don't 
> know the best way to go about it.

Some UPS models have more than one serial port, or have a network adapter which 
can support multiple monitoring systems (via SNMP or HTTP/XML). Is it possible 
that the NUT fencing agent was written with that case in mind? That would mean 
that neither node would depend on the other for UPS status.

Can you elaborate on the "resupply of services problem"? With cross-connected 
UPSes (and only a single comm port per UPS), I am not sure if you can achieve 
both goals when only one UPS loses power.

(I don't think this sort of setup has been discussed much on the NUT lists, 
although it certainly sounds like an interesting way to use NUT. If you do find 
out more about how the NUT fencing agent was intended to be configured, perhaps 
from the fencing software lists or forums, feel free to post that here as 
well.)

-- 
- Charles Lepple
https://ghz.cc/charles/



_______________________________________________
Nut-upsuser mailing list
Nut-upsuser@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/nut-upsuser
