[Nut-upsuser] Experiences With NUT Eaton & Cyberpower UPS

2024-01-20 Thread Jeff Rickman


I watched the recent discussions of "issues" with an Eaton Ellipse UPS with 
some interest.

I use the US version of that UPS, the Ellipse Pro 1500 in my case. I had not 
encountered many of the issues recounted in those discussions. The Login 
Failure message is not unique to that UPS as I have seen it with my Cyberpower 
units. The polling setting also had me wondering as I have never had to alter 
that variable. In short, the entire discussion left me wondering/puzzled.

In my own case I have found Eaton Powerware (from the MGE acquisition), Eaton 
Ellipse Pro and Cyberpower UPS to be practically "plug & play" in their 
'ups.conf' configurations under NUT 2.7.4 and 2.8.0. I attach these UPS to a 
typical PC USB port or to the USB ports of some Raspberry Pi 3 model; there 
are no apparent operational differences. Only the Powerware units have shown 
any quirky behavior: on a NUT server daemon 'start' or 'restart' they are a 
bit slower to respond to NUT polling than the Ellipse and Cyberpower units 
(for example a Cyberpower CP 1500C or 1500D UPS, or even an APC Smart-UPS 
1000). I accept that Eaton Powerware quirkiness by being more patient with 
those units.

As for my NUT server experiences, I personally prefer low power mini PC 
platforms as NUT server hosts over Raspberry Pi. I have run both platforms for 
years now, so this is not a 'blind knock', just a personal preference.

I have multiple NUT server systems with 2 or more UPS connected. In some cases 
both UPS use the 'usbhid-ups' driver, but those UPS come from 2 different 
manufacturers so the 'ups.conf' configuration is still very simple. I had one 
instance where 2 Eaton Ellipse Pro 1500 UPS were connected to the same NUT 
server and it worked just fine; the UPS were distinguished by their serial 
numbers, as seen in verbose 'lsusb' output, with corresponding entries made in 
the NUT server's 'ups.conf' file.
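
As a minimal sketch of what such entries could look like (the section names 
and serial values below are hypothetical placeholders, not taken from the 
setup described above; `serial` is the matching option of the `usbhid-ups` 
driver and is treated as a regular expression):

```ini
# ups.conf: two identical Eaton Ellipse Pro units on one NUT server.
# Section names and serial numbers are made-up placeholders; substitute
# the iSerial values reported by verbose 'lsusb' output for each unit.

[ellipse-a]
    driver = usbhid-ups
    port = auto
    serial = "SERIAL_OF_FIRST_UNIT"
    desc = "Eaton Ellipse Pro 1500, first unit"

[ellipse-b]
    driver = usbhid-ups
    port = auto
    serial = "SERIAL_OF_SECOND_UNIT"
    desc = "Eaton Ellipse Pro 1500, second unit"
```

With UPS from two different manufacturers, a `vendorid` match per section 
would typically serve the same purpose.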

So based on all of that personal experience I still regard the entire recent 
Ellipse Pro discussions as "puzzling".

In closing, and this is unrelated to NUT in any way, those Eaton Ellipse Pro 
1500 units must be partially disassembled (in a non-obvious way as screws are 
not the issue) if the batteries swell inside; the internal battery space 
tolerances are very close and even 2mm of swell will hold the batteries tight 
in the case.

Jeff Rickman

-- 


___
Nut-upsuser mailing list
Nut-upsuser@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser


Re: [Nut-upsuser] NUT and Eaton UPS produce a lot of error messages

2024-01-20 Thread Jim Klimov via Nut-upsuser
Thanks for the info!

First, regarding the later detail about starting `usbhid-ups`: the
"Resource busy" indicates that likely an earlier instance of the NUT driver
(in its own service unit) is still running and holding the device. In a
worse case, some other programs might consider this a HID device (same
class family as keyboards and mice) and grab it somehow, or it gets passed
through virtualization/containers and a different OS holds it in fact
(guest if you're running NUT on hypervisor, or vice versa). Make sure you
stop those too, whichever applies.

Second, looking at dmesg and other logs: it seems the device number gets
re-enumerated (lost #3, got #4) during reconnection - some OSes do that,
others do not. This might confuse a driver's attempts to re-attach, which
rely on details learned during the initial connection. Due to an issue in
NUT v2.8.0 and earlier, the drivers might not have tracked the "device"
number (seen as "unknown" here); that got fixed in 2.8.1, but since 2.8.2
the `nut-scanner` would again not suggest it as a config option by default,
because it is so unreliable across reconnections...

There are quite a few nuances that might differ between "vanilla" NUT
systemd integration (which also evolved over time in git sources) and that
packaged by distros; I haven't looked deep into Debian recipes lately.
Maybe there's some unfortunate interaction of service dependencies that
follows from the disconnections. Some educated guesswork below:

My current guess would be that a driver gives up reconnecting with the
remembered data that no longer matches the device, and perhaps exits to be
restarted by systemd (journal history of `nut-driver@Eaton.service`, if
that one got generated by `nut-driver-enumerator` on your system, might
confirm or dispel that guess; otherwise check for a plain old
`nut-driver.service` monolith). Or it exits/crashes due to some other
reason, such as manual runs of `usbhid-ups` (if the PID file is saved by
the daemon and is found and used by the command-line spawned instance to
kill off the "presumed-frozen" competitor; in the log above I don't see
direct indications of that though).

* Found an example of when a new driver process kills its older "self"
using the PID files - just in case you see such indications in your logs:
   Duplicate driver instance detected (PID file
/run/nut/nutdrv_qx-tecnowaremansarda.pid exists)! Terminating other driver!

Further, what I guess could follow up is that if your only
`nut-driver*.service` exits and restarts, the systemd dependency it
provides for `nut-server.service` (maybe via `nut-driver.target`) flickers
and causes the data server to restart. (Not sure at the moment if it is a
weak Wants or a harder Requires type of dependency there; technically
`upsd` should run well without drivers and report that the device is
unknown or data is stale).

Probably you can correlate the event trails from `dmesg` and `journalctl
-lx` as suggested earlier, with bumped `debug_min` in `ups.conf`,
`upsd.conf` and maybe `upsmon.conf`, to check if the real events and
service state changes seem to confirm this theorized chain of events:

* USB reconnection (HW/FW reasons can vary a lot);
* `nut-driver@Eaton.service` tries to reconnect but fails, or for older
versions just aborts due to loss of link (even without OS USB
re-enumeration, there can be a few seconds when the newly made devfs node
is owned by `root` and `udev` did not have time to hand it off to `nut`, so
the NUT driver can not re-attach);
* fault of `nut-driver*` causes `nut-server` to stop (shouldn't, but might,
happen... at least fits the symptoms you've posted earlier);
* the `nut-driver` is resuscitated by systemd after some RestartSec timeout;
* dependency for `nut-server` is healthy so it is started up again;
* thinking of it, maybe the last couple of steps are almost concurrent: as
soon as systemd launched the driver process, its unit is considered
healthy, so the data server starts; however the driver then takes some time
to do the initial walk of the device and only then begins talking to `upsd`
-- maybe this is when `upsmon` asks for login to the not-yet-recognized UPS
so the nut-server says it is denied per your original post. With NUT
v2.8.1+ the systemd integration is tighter, so daemons can notify the
service manager when they are actually ready to serve, and only then the
dependencies can start, if (packaging) build-time configuration options
enable this mode.
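
For the bumped `debug_min` mentioned before the list, a rough sketch of the
settings follows. It assumes the `[Eaton]` section name implied by
`nut-driver@Eaton.service` and a NUT build recent enough to support these
options; the level 3 is an arbitrary example value, not a recommendation.

```ini
# ups.conf: raise driver verbosity for one device
# (section name assumed from the nut-driver@Eaton.service unit name)
[Eaton]
    driver = usbhid-ups
    port = auto
    debug_min = 3

# upsd.conf and upsmon.conf are not INI-style; the equivalent directive
# there is assumed to be a single upper-case line (verify against the
# man pages of your installed version):
#   DEBUG_MIN 3
```

The affected services would need a restart for the new level to take
effect; the extra messages should then show up in `journalctl` next to the
USB events from `dmesg`.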

Given the revised integrations since NUT v2.8.0 release, you might have
better luck with current master-branch codebase - see
https://github.com/networkupstools/nut/wiki/Building-NUT-for-in%E2%80%90place-upgrades-or-non%E2%80%90disruptive-tests
- and by confirming that it works or by uncovering more edge cases that are
not well handled, help improve an upcoming NUT v2.8.2 release :)

Hope this helps,
Jim Klimov


On Sat, Jan 20, 2024 at 1:07 PM Stefan Schumacher
<stefanschumacheratw...@gmail.com> wrote:

Re: [Nut-upsuser] NUT and Eaton UPS produce a lot of error messages

2024-01-20 Thread Jim Klimov via Nut-upsuser
Well, at least per the issue tracker, Cyberpower units can be finicky,
especially regarding reconnects (in many models, the USB controller tends to
fall into power-saving at the wrong times, so polling rates should be
increased). More on the Wiki and in GitHub search/labels :)

As for Eatons, I've had experience with many enterprise and some
home-oriented models (worked there for a while), and can't point to any
particular "systemic" flaws of the brand. Individual devices of course
could be randomly damaged in shop/transport/storage/..., but that's about
it.

In this thread I haven't yet seen one clue or report that would be about
the actual UPS or its driver, or a dmesg dump about re-connections, if any
- so I just have no well-founded opinion on whether the power device is or is
not at fault. Likewise, I don't think I've seen any details about the fileserver's
hardware (e.g. Raspberry Pi's are often mentioned in issues nowadays, no
idea if that points to inherent issues of their hardware or just to their
popularity and higher chance-to-encounter in tinkering, though). All
bread-crumbs so far were about the data server (nut-server/upsd) and the
client (upsmon).

Jim


On Sat, Jan 20, 2024 at 2:03 AM Sam Varshavchik 
wrote:

> Stefan Schumacher via Nut-upsuser writes:
>
> > I am considering returning the device to the online shop I bought it
> > from. As it is, it's nothing but trouble. Can you recommend a small UPS
> > (1 Server) that is guaranteed to work flawlessly with NUT?
>
> I've had good experience with Cyberpower units, as long as they're
> included in NUT's hardware list.