Hi Jim,

Thanks for the prompt response.

The restart I referred to was exactly as you describe: I restarted the service using `systemctl restart nut-server`.  This was separate from the reboot of the server machine that I mentioned, which does resolve the issue.

The driver used was:
Network UPS Tools - UPS driver controller 2.8.0
Network UPS Tools - BCMXCP UPS driver 0.32 (2.8.0)

I simulated the fault again by putting the UPS in bypass and disconnecting the battery, which caused the RB alert again.  I then reconnected the battery, restored the UPS to normal operating condition, and used upsdrvctl to STOP and START the driver.

Generating alert condition for simulating RB:
Alert type: REPLBATT
.....................
ups.status: ALARM OL BYPASS RB
ups.test.result: Done and error

Alert cleared on UPS, and alert condition with RB persisting on NUT-SERVER:
Alert type: ONLINE
.................
ups.status: OL RB

ups.test.result: Done and passed

Restarting the driver with upsdrvctl stop/start clears RB:
Alert type: COMMOK
..................
ups.status: OL
ups.test.result: Done and passed

So it seems that both of our suspicions have been confirmed: bcmxcp seems to "latch" the alarm until the driver is restarted or the server is rebooted.
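
For anyone wanting to repeat the check, the stop/start sequence above can be sketched roughly as follows. This is only a sketch under assumptions: `upsname` is a placeholder for the name in ups.conf, the helper names `status_has_flag` and `restart_driver_and_check` are made up for illustration, and the 5-second pause is an arbitrary guess at how long the driver needs before upsd serves fresh data.

```shell
#!/bin/sh
# Placeholder UPS name (assumption); use the section name from ups.conf.
UPS_NAME="${UPS_NAME:-upsname}"

# Return 0 if the space-separated status string $1 contains the flag $2.
# Padding with spaces avoids matching a flag inside a longer token.
status_has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Stop and start only the driver (not upsd), then re-read ups.status.
# Illustrative helper; nothing runs until it is called explicitly.
restart_driver_and_check() {
    upsdrvctl stop "$UPS_NAME" || return 1
    upsdrvctl start "$UPS_NAME" || return 1
    sleep 5   # arbitrary pause (assumption) so the driver can poll again
    status=$(upsc "$UPS_NAME" ups.status 2>/dev/null) || return 1
    if status_has_flag "$status" RB; then
        echo "RB still set: $status"
        return 2
    fi
    echo "RB cleared: $status"
}
```

Calling `restart_driver_and_check` as root after the UPS itself has cleared the alarm should then report whether RB survived the driver restart.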

I think you are correct that this can cause issues in other real-life cases.  I am thinking here of automation, scripting and so forth.
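
As an example of the scripting side, a monitoring script could flag the suspicious combination seen above, where RB is still reported although the last self-test passed. The sketch below is a heuristic only, not anything NUT provides: the function name `rb_looks_latched` is hypothetical, and it assumes the test result string is exactly "Done and passed" as in the output from this driver.

```shell
#!/bin/sh
# Return 0 when ups.status ($1) still carries RB although the last
# self-test result ($2) reads "Done and passed"; return 1 otherwise.
# Hypothetical helper: this is a heuristic for a possibly stale flag.
rb_looks_latched() {
    status=$1
    test_result=$2
    case " $status " in
        *" RB "*) ;;          # RB flag present, keep checking
        *) return 1 ;;        # no RB flag, nothing to report
    esac
    [ "$test_result" = "Done and passed" ]
}

# Example use (commented out; "upsname" is a placeholder):
# if rb_looks_latched "$(upsc upsname ups.status)" \
#                     "$(upsc upsname ups.test.result)"; then
#     echo "RB reported although the battery test passed"
# fi
```

A script could use this to decide whether an RB notification deserves a second look before anyone orders new batteries.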

What would you suggest at this point?  Can this be submitted as a bug?

Vyasa



On 6/30/25 14:18, Jim Klimov wrote:
Hello,

  You mention that you've tried restarting the "nut-server" - I suppose you mean literally, the service unit by such name - of the NUT data server. Did you try restarting the unit for the NUT driver (e.g. `systemctl restart nut-driver@upsname` with NUT v2.8.x and newer)?

  You did not mention the driver used, but I wonder if that driver program "latches" the RB value when it goes bad and never updates it?.. This could make sense when UPS battery replacement means server downtime, but that is just a subset of real-life cases - so generally can be just an oversight. For example, `bcmxcp` code seems to only set `bcmxcp_status.alarm_replace_battery=1` (oddly neither the field nor struct is ever initialized to 0, so might be garbage on some systems/compilers that do not zero-out aggregate types by default).

Jim


On Mon, Jun 30, 2025 at 7:53 PM Vyasa via Nut-upsuser <[email protected]> wrote:

    Hello,

    CONFIGURATION:

    I am using a Powerware PW9120 3000i, on a network configuration
    with a server and a couple of slaves.

    The nut-server OS is /Debian 12 (6.1.0-37-amd64)/. Nut was
    installed from the Debian repo with version /2.8.0-7 amd64/, and
    client has the same version.

    UPS is connected with a standard RS232 serial connection, and
    works with all standard commands and functionality.

    Command "/upscmd -l upsname/" provides the following, where I have
    successfully used /test.battery.start/ and /test.system.start/:

    beeper.disable - Disable the UPS beeper
    beeper.enable - Enable the UPS beeper
    beeper.mute - Temporarily mute the UPS beeper
    load.on - Turn on the load immediately
    outlet.1.load.off - Turn off the load on outlet 1 immediately
    outlet.1.load.on - Turn on the load on outlet 1 immediately
    outlet.1.shutdown.return - Turn off the outlet 1 and return when
    power is back
    outlet.2.load.off - Turn off the load on outlet 2 immediately
    outlet.2.load.on - Turn on the load on outlet 2 immediately
    outlet.2.shutdown.return - Turn off the outlet 2 and return when
    power is back
    shutdown.return - Turn off the load and return when power is back
    shutdown.stayoff - Turn off the load and remain off
    test.battery.start - Start a battery test
    test.system.start - Start a system test

    ISSUE:

    Every couple of years, when I have to replace the batteries in the
    UPS, I am unable to clear the REPLBATT alert until I reboot the
    server running NUT-SERVER.  This might not seem like a big deal,
    but it becomes a hassle when the batteries haven't quite failed yet
    and still pass a UPS battery test.

    The UPS itself reports OK after battery replacement or battery
    test, and clears alarm on its LCD.  But when I poll the UPS data
    using "upsc upsname" I still see the RB or REPLBATT and this will
    not clear until I reboot the server.  So without reboot the alert
    will then be generated based on RBWARNTIME in upsmon.conf, which
    is as per nut design.

    So without reboot I always get the RB flag with status:

    /Alert type: REPLBATT/
    /............/
    /ups.status: OL RB/
    /ups.test.result: Done and passed/

    After reboot of server the alert is cleared:

    /Alert type: COMMOK
    ............
    ups.status: OL
    ups.test.result: Done and passed/

    So my question is: why is this reboot required?  It doesn't seem
    to make any sense.  I can't understand why the polled data from a
    UPS would change after a reboot, while the UPS LCD is reporting
    all OK.  I tried restarting NUT-SERVER to see if it would make any
    difference.  Also, the command test.battery.start will clear the
    alarm on the UPS if the battery test is good.

    The only explanation that I have come up with is that the
    persistent RB/REPLBATT is latched to this condition and is an
    artifact of UPS to NUT handshaking.

    Any feedback would be kindly appreciated, as I have searched and
    searched.

    Thank you!

    Vyasa
    _______________________________________________
    Nut-upsuser mailing list
    [email protected]
    https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser