Hello List,

Any suggestions to solve the following would be most appreciated.

Setup: Active/Passive two-node cluster. Two UPSes (APC Smart-UPS 1500 C) with 
USB communication cables cross-connected (i.e. UPS-webserver1 is monitored by 
webserver2, and vice versa) to allow for stonith/fencing.
OS OpenSuse Leap 42.2
NUT version 2.7.1-2.41-x86_64
Fencing agent: external/nut

Problem: When power fails to a single UPS, both nodes are shut down. The node 
with the still-powered UPS comes back up, but requires manual intervention to 
keep it providing services. I would like only the node with the "On Battery" 
UPS to shut down.
The cause of the service-restoration problem seems to be that NUT on the node 
that comes back up will not restart until the other node restarts.

Stonith and my upssched-cmd script both use

upscmd -u ups-webserver2-master -p mypassword ups-webserver2@webserver1 shutdown.reboot

or

upscmd -u ups-webserver1-master -p mypassword ups-webserver1@webserver2 shutdown.reboot

as appropriate. When the cluster software (Pacemaker/Corosync) uses one of the 
above commands as part of a fencing operation, only the target node is shut 
down, and its UPS's outlets are power-cycled. When NUT, via my upssched-cmd 
script, issues one of the above commands, both nodes shut down and both of 
their UPSes' outlets are power-cycled.

This problem should be very rare, but it would be better to cover it rather 
than not.
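
To compare what each UPS reports in the two cases, something like the 
following could be called from upssched-cmd just before the upscmd line. This 
is only a sketch: it assumes upsc is installed alongside upscmd and that both 
upsd instances are reachable, and the helper names status_on_battery and 
log_ups_status are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for upssched-cmd; the names are illustrative, not part of NUT.

# Succeeds (returns 0) if a NUT ups.status string contains the OB (on battery) token.
status_on_battery() {
    for token in $1; do
        if [ "$token" = "OB" ]; then
            return 0
        fi
    done
    return 1
}

# Logs the current ups.status of the given UPS (e.g. ups-webserver2@webserver1).
# Assumes upsc and logger are in PATH.
log_ups_status() {
    ups="$1"
    status=$(upsc "$ups" ups.status 2>/dev/null) || status="unavailable"
    if status_on_battery "$status"; then
        logger -t upssched-cmd "$ups is on battery (ups.status: $status)"
    else
        logger -t upssched-cmd "$ups is not on battery (ups.status: $status)"
    fi
}

# Possible use inside the case branch, before issuing shutdown.reboot:
#   log_ups_status ups-webserver1@webserver2
#   log_ups_status ups-webserver2@webserver1
```

Logging both UPSes at the moment the timer fires would at least show whether 
the second UPS is also reporting OB when the command is sent.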

Power failure and resupply to both UPSes (the most common problem for me) works 
well. I use upssched to set the same timers after power failure on each system. 
They receive simultaneous shutdown commands, which they obey. When power 
returns they both come back up.

Stonith/Fencing via the external/nut stonith resource agent works.

Thanks,
Tim.



My config files

ups.conf

On webserver1
[ups-webserver2]
        driver = usbhid-ups
        port = auto
        desc = "APC Smart-UPS C 1000/1500va"
        vendorid = 051d

On webserver2
[ups-webserver1]
        driver = usbhid-ups
        port = auto
        desc = "APC Smart-UPS C 1000/1500va"
        vendorid = 051d


nut.conf

MODE=netserver


upsd.conf

Webserver1
LISTEN 127.0.0.1 3493
LISTEN ::1 3493
LISTEN 192.168.1.21 3493

Webserver2
LISTEN 127.0.0.1 3493
LISTEN ::1 3493
LISTEN 192.168.1.22 3493



upsd.users

defines users (special settings required for stonith to work)

On webserver1
[ups-webserver2-slave]
        password = mypassword
        actions = SET
        instcmds = ALL
        upsmon slave

[ups-webserver2-master]
        password = mypassword
        actions = SET
        actions = FSD
        instcmds = ALL
        upsmon master


On webserver2
[ups-webserver1-slave]
        password = mypassword
        actions = SET
        instcmds = ALL
        upsmon slave

[ups-webserver1-master]
        password = mypassword
        actions = SET
        actions = FSD
        instcmds = ALL
        upsmon master

upsmon.conf

Webserver1
MONITOR ups-webserver1@webserver2 1 ups-webserver1-master mypassword master
MONITOR ups-webserver2@localhost 0 ups-webserver2-slave mypassword slave

Webserver2
MONITOR ups-webserver2@webserver1 1 ups-webserver2-master mypassword master
MONITOR ups-webserver1@localhost 0 ups-webserver1-slave mypassword slave



It needs the following
upsmon.conf

NOTIFYCMD            /usr/sbin/upssched
NOTIFYFLAG ONLINE    SYSLOG+WALL+EXEC
NOTIFYFLAG ONBATT    SYSLOG+WALL+EXEC


Configure 'upssched' by editing upssched.conf
upssched.conf

webserver1
CMDSCRIPT /bin/upssched-cmd
PIPEFN /var/lib/ups/upssched/upssched.pipe
LOCKFN /var/lib/ups/upssched/upssched.lock
AT ONBATT ups-webserver2@localhost START-TIMER onbatt-ups-webserver2 600
AT ONLINE ups-webserver2@localhost CANCEL-TIMER onbatt-ups-webserver2

webserver2
CMDSCRIPT /bin/upssched-cmd
PIPEFN /var/lib/ups/upssched/upssched.pipe
LOCKFN /var/lib/ups/upssched/upssched.lock
AT ONBATT ups-webserver1@localhost START-TIMER onbatt-ups-webserver1 600
AT ONLINE ups-webserver1@localhost CANCEL-TIMER onbatt-ups-webserver1



Edit /bin/upssched-cmd
/bin/upssched-cmd

webserver1
case "$1" in
        onbatt-ups-webserver1)
                logger -t upssched-cmd "UPS-Webserver1 has gone on battery."
                ;;
        onbatt-ups-webserver2)
                logger -t upssched-cmd "UPS-Webserver2 has gone on battery."
                /usr/bin/upscmd -u ups-webserver2-master -p mypassword ups-webserver2@webserver1 shutdown.reboot
                ;;
        *)
                logger -t upssched-cmd "Unrecognized command: $1"
                ;;
esac

Webserver2
case "$1" in
        onbatt-ups-webserver1)
                logger -t upssched-cmd "UPS-Webserver1 has gone on battery."
                /usr/bin/upscmd -u ups-webserver1-master -p mypassword ups-webserver1@webserver2 shutdown.reboot
                ;;
        onbatt-ups-webserver2)
                logger -t upssched-cmd "UPS-Webserver2 has gone on battery."
                ;;
        *)
                logger -t upssched-cmd "Unrecognized command: $1"
                ;;
esac
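
The case logic above can be exercised without power-cycling anything by 
putting a dry-run switch in front of the upscmd call. The following is a 
sketch of the webserver1 script only; DRY_RUN, log_msg, run_cmd and 
handle_timer are hypothetical names I have introduced for illustration and 
are not part of NUT or upssched.

```shell
#!/bin/sh
# Sketch of the webserver1 upssched-cmd with a hypothetical DRY_RUN switch.
# With DRY_RUN=1 the upscmd call is printed instead of executed, and log
# messages go to stderr instead of syslog.

log_msg() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "LOG: $*" >&2
    else
        logger -t upssched-cmd "$*"
    fi
}

run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "DRY RUN: $*"
    else
        "$@"
    fi
}

handle_timer() {
    case "$1" in
        onbatt-ups-webserver1)
            log_msg "UPS-Webserver1 has gone on battery."
            ;;
        onbatt-ups-webserver2)
            log_msg "UPS-Webserver2 has gone on battery."
            run_cmd /usr/bin/upscmd -u ups-webserver2-master -p mypassword \
                ups-webserver2@webserver1 shutdown.reboot
            ;;
        *)
            log_msg "Unrecognized command: $1"
            ;;
    esac
}

# upssched passes the timer name as the first argument.
if [ "$#" -gt 0 ]; then
    handle_timer "$1"
fi
```

Running it by hand as "DRY_RUN=1 /bin/upssched-cmd onbatt-ups-webserver2" 
would then show the exact command that the timer would issue, without 
sending it to the UPS.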





_______________________________________________
Nut-upsuser mailing list
Nut-upsuser@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/nut-upsuser
