Hi Jim,
Thanks for the prompt response.
The restart I referred to was exactly as you say: I restarted the
service using: systemctl restart nut-server. This was separate from
the reboot of the server machine, which is what resolves the issue.
The driver used was:
Network UPS Tools - UPS driver controller 2.8.0
Network UPS Tools - BCMXCP UPS driver 0.32 (2.8.0)
I simulated the fault again by putting the UPS in bypass and
disconnecting the battery, which raised the RB alert again. I then
reconnected the battery, restored the UPS to its normal operating
condition, and used upsdrvctl to stop and start the driver.
Generating alert condition for simulating RB:
Alert type: REPLBATT
.....................
ups.status: ALARM OL BYPASS RB
ups.test.result: Done and error
Alert cleared on UPS, and alert condition with RB persisting on NUT-SERVER:
Alert type: ONLINE
.................
ups.status: OL RB
ups.test.result: Done and passed
Restarting the driver with upsdrvctl stop/start clears RB:
Alert type: COMMOK
..................
ups.status: OL
ups.test.result: Done and passed
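For anyone scripting around this, the three captures above differ mainly in the ups.status tokens. A minimal Python sketch of pulling those tokens out of "key: value" text, such as upsc prints, is below. The function names are hypothetical and not part of NUT, and the code only parses text handed to it rather than talking to upsd.

```python
def parse_upsc(text: str) -> dict:
    """Turn 'key: value' lines into a dict; lines without ': ' are skipped."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            values[key.strip()] = value.strip()
    return values


def status_flags(values: dict) -> set:
    """Split the space-separated ups.status value into individual flags."""
    return set(values.get("ups.status", "").split())


def needs_battery_replacement(text: str) -> bool:
    """True when the RB flag is present in ups.status."""
    return "RB" in status_flags(parse_upsc(text))
```

Fed the second capture above (ups.status: OL RB), needs_battery_replacement returns True; fed the third (ups.status: OL), it returns False.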
So it seems that your suspicion and mine have been confirmed: bcmxcp
appears to "latch" the RB alarm until the driver is restarted or the
server is rebooted.
I think you are correct that this can cause issues in other real-life
cases; I am thinking here of automation, scripting and so forth.
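As a rough illustration of the "latching" behaviour described here, the following Python sketch contrasts a flag that is only ever set with one that is recomputed on every poll. It is not the actual bcmxcp code (which is C); the class and function names are hypothetical, and the status strings are simplified to the two values seen in this thread.

```python
class LatchingStatus:
    """Only has a 'set' path, so the flag can never go back to False."""

    def __init__(self):
        self.replace_battery = False

    def update(self, ups_reports_bad_battery: bool) -> None:
        if ups_reports_bad_battery:
            self.replace_battery = True


class RefreshingStatus:
    """Recomputes the flag from what the UPS reports on each poll."""

    def __init__(self):
        self.replace_battery = False

    def update(self, ups_reports_bad_battery: bool) -> None:
        self.replace_battery = ups_reports_bad_battery


def poll_sequence(status, readings):
    """Apply each reading in turn and record the resulting status string."""
    seen = []
    for reading in readings:
        status.update(reading)
        seen.append("OL RB" if status.replace_battery else "OL")
    return seen
```

With readings of good, bad, good, the latching variant keeps reporting "OL RB" after the battery is good again, while the refreshing variant returns to "OL". Creating a fresh object, which is roughly what a driver restart amounts to, also clears the latched flag.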
What would you suggest at this point? Can this be submitted as a bug?
Vyasa
On 6/30/25 14:18, Jim Klimov wrote:
Hello,
You mention that you've tried restarting the "nut-server" - I
suppose you mean literally, the service unit by such name - of the NUT
data server. Did you try restarting the unit for the NUT driver (e.g.
`systemctl restart nut-driver@upsname` with NUT v2.8.x and newer)?
You did not mention the driver used, but I wonder if that driver
program "latches" the RB value when it goes bad and never updates
it? This could make sense when UPS battery replacement means server
downtime, but that is just a subset of real-life cases, so in general
it may simply be an oversight. For example, the `bcmxcp` code seems to
only ever set `bcmxcp_status.alarm_replace_battery=1` and never clear
it (oddly, neither the field nor the struct is ever initialized to 0,
so it might hold garbage on some systems/compilers that do not
zero out aggregate types by default).
Jim
On Mon, Jun 30, 2025 at 7:53 PM Vyasa via Nut-upsuser
<[email protected]> wrote:
Hello,
CONFIGURATION:
I am using a Powerware PW9120 3000i, on a network configuration
with a server and a couple of slaves.
The nut-server OS is Debian 12 (6.1.0-37-amd64). NUT was
installed from the Debian repo, version 2.8.0-7 amd64, and the
client has the same version.
UPS is connected with a standard RS232 serial connection, and
works with all standard commands and functionality.
The command "upscmd -l upsname" lists the following; of these I have
successfully used test.battery.start and test.system.start:
beeper.disable - Disable the UPS beeper
beeper.enable - Enable the UPS beeper
beeper.mute - Temporarily mute the UPS beeper
load.on - Turn on the load immediately
outlet.1.load.off - Turn off the load on outlet 1 immediately
outlet.1.load.on - Turn on the load on outlet 1 immediately
outlet.1.shutdown.return - Turn off the outlet 1 and return when power is back
outlet.2.load.off - Turn off the load on outlet 2 immediately
outlet.2.load.on - Turn on the load on outlet 2 immediately
outlet.2.shutdown.return - Turn off the outlet 2 and return when power is back
shutdown.return - Turn off the load and return when power is back
shutdown.stayoff - Turn off the load and remain off
test.battery.start - Start a battery test
test.system.start - Start a system test
ISSUE:
Every couple of years, when I have to replace the batteries in the UPS,
I am unable to clear the REPLBATT alert until I reboot the server
running NUT-SERVER. This might not seem like a big deal, but it
becomes a hassle when the batteries haven't quite failed yet and
still pass a UPS battery test.
The UPS itself reports OK after a battery replacement or battery
test, and clears the alarm on its LCD. But when I poll the UPS data
using "upsc upsname" I still see RB (REPLBATT), and this will
not clear until I reboot the server. So without a reboot the alert
keeps being generated at the interval set by RBWARNTIME in
upsmon.conf, which is as per NUT design.
So without reboot I always get the RB flag with status:
Alert type: REPLBATT
............
ups.status: OL RB
ups.test.result: Done and passed
After reboot of server the alert is cleared:
Alert type: COMMOK
............
ups.status: OL
ups.test.result: Done and passed
So my question is: why is this reboot required? It doesn't seem
to make any sense. I can't understand why the polled data from a
UPS would change after a reboot while the UPS LCD is reporting
all OK. I tried restarting NUT-SERVER to see if it would make any
difference. Also, the command test.battery.start will clear the
alarm on the UPS if the battery test is good.
The only explanation that I have come up with is that the
persistent RB/REPLBATT is latched to this condition and is an
artifact of UPS to NUT handshaking.
Any feedback would be kindly appreciated, as I have searched and
searched.
Thank you!
Vyasa
_______________________________________________
Nut-upsuser mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser