On Jun 17, 2019, at 3:00 PM, David Zomaya <[email protected]> wrote:
> 
> Hi Network UPS Tools Support,
>  
> I’m not sure if this is a question for the “user group” or the “developer 
> group”.

The config files will be useful for -users, but I'd say the development list is 
probably better for discussing potential changes.
 
> My name is David Zomaya and I work at Tripp Lite in our technical support 
> department. Copied on this email are Eric Cobb from our Product Management 
> group & Jonathan Manzanilla tech support subject matter expert for our single 
> phase UPS product lines. 

I recognize Eric's name from a few years ago - he emailed some detailed test 
results with NUT connecting to various Tripp-Lite models. Hi, Eric!

> Recently, we received a complaint about our SMART1500LCDXL dropping and 
> reconnecting in different Linux Operating Systems. After some in-house 
> testing, the behavior seems to be reproducible on a number of different *nix 
> operating systems (Windows seems fine). Here’s an example of the drops in 
> /var/log/messages (I’ll use CentOS 7.6 as my reference point throughout this 
> email):
> May 29 19:25:27 localhost kernel: usb 2-2.1: new low-speed USB device number 
> 6 using uhci_hcd
> May 29 19:25:27 localhost kernel: usb 2-2.1: New USB device found, 
> idVendor=09ae, idProduct=2012
> May 29 19:25:27 localhost kernel: usb 2-2.1: New USB device strings: Mfr=1, 
> Product=2, SerialNumber=0

This is a little off-topic, but I would like to point out that not including a 
machine-readable serial number in the USB device descriptor makes it difficult 
for people to reliably use two or more UPSes on a single *nix system. Due to 
some complications with the way that USB devices are opened in libusb, there is 
no easy way to "open the next unused USB device", so we recommend that people 
match against the serial number.
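
To illustrate why the missing serial number matters, here is a minimal sketch 
in Python. It is not how libusb or usbhid-ups is implemented; pick_device() and 
the "EXAMPLE..." serial numbers are made up for illustration. The vendor and 
product IDs are the ones from the kernel log above.

```python
def pick_device(devices, vendorid, productid, serial=None):
    """Return the one device matching the criteria, or raise ValueError."""
    matches = [
        d for d in devices
        if d["vendorid"] == vendorid
        and d["productid"] == productid
        and (serial is None or d["serial"] == serial)
    ]
    if len(matches) != 1:
        raise ValueError(f"{len(matches)} devices match; cannot choose reliably")
    return matches[0]

# Two identical UPSes that report no serial string (SerialNumber=0 in the
# kernel log means there is no serial string descriptor).
no_serial = [
    {"vendorid": "09ae", "productid": "2012", "serial": None},
    {"vendorid": "09ae", "productid": "2012", "serial": None},
]

# The same pair, if each reported a distinct (made-up) serial number.
with_serial = [
    {"vendorid": "09ae", "productid": "2012", "serial": "EXAMPLE0001"},
    {"vendorid": "09ae", "productid": "2012", "serial": "EXAMPLE0002"},
]
```

With no_serial, vendor and product ID alone match both units, so the call 
raises ValueError. With with_serial, passing serial="EXAMPLE0002" selects 
exactly one unit.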

> May 29 19:25:27 localhost kernel: usb 2-2.1: Product: Tripp Lite UPS
> May 29 19:25:27 localhost kernel: usb 2-2.1: Manufacturer: Tripp Lite
> May 29 19:25:27 localhost kernel: hid-generic 0003:09AE:2012.0004: 
> hiddev0,hidraw1: USB HID v1.10 Device [Tripp Lite  Tripp Lite UPS ] on 
> usb-0000:02:02.0-2.1/input0
> May 29 19:25:29 localhost upsd[6287]: UPS [TrippLiteUPS] data is no longer 
> stale
> May 29 19:25:29 localhost upsd: UPS [TrippLiteUPS] data is no longer stale
> May 29 19:25:45 localhost kernel: usb 2-2.1: USB disconnect, device number 6

It would be interesting to see the debug log from usbhid-ups as well. It would 
give a little more context to the kernel errors. I haven't used a physical 
CentOS or RedHat system in a while, so I am not sure of the specifics needed to 
stop just the usbhid-ups driver, but once it is stopped, you can restart it with 
a few "-D" flags (3 should be sufficient for this kind of problem) and 
"-a TrippLiteUPS" 
to match this configuration. Please compress any log files (gzip preferred; zip 
works).
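
Alongside the driver debug log, it may help to measure how long each USB 
session lasts before the disconnect. The following is a rough Python sketch 
that pairs the kernel "new ... USB device number" and "USB disconnect" lines 
from /var/log/messages. The function name, the regular expression and the 
assumed year are illustrative, and the sample lines are the ones quoted above 
with the email line wrapping undone.

```python
import re
from datetime import datetime

# Matches kernel USB lines such as:
#   May 29 19:25:45 localhost kernel: usb 2-2.1: USB disconnect, device number 6
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"usb (?P<port>[\d.-]+):\s+(?P<msg>.*)$"
)

def usb_session_lengths(lines, year=2019):
    """Return (port, seconds) for each connect -> disconnect pair."""
    connected = {}  # port -> time of the most recent connect line
    sessions = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a kernel USB line (e.g. upsd messages)
        # Syslog timestamps omit the year, so one is assumed here.
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        port, msg = m["port"], m["msg"]
        if msg.startswith("new ") and "USB device number" in msg:
            connected[port] = ts
        elif msg.startswith("USB disconnect") and port in connected:
            elapsed = (ts - connected.pop(port)).total_seconds()
            sessions.append((port, elapsed))
    return sessions

sample = [
    "May 29 19:25:27 localhost kernel: usb 2-2.1: new low-speed USB device number 6 using uhci_hcd",
    "May 29 19:25:29 localhost upsd[6287]: UPS [TrippLiteUPS] data is no longer stale",
    "May 29 19:25:45 localhost kernel: usb 2-2.1: USB disconnect, device number 6",
]
print(usb_session_lengths(sample))
```

For the excerpt above this reports one session of 18 seconds on port 2-2.1. 
Running it over a longer log would show whether the drops follow a regular 
period.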

> As a result, this impacted the user’s ability to use NUT software on their Linux 
> hosts. After some trial and error (and a lot of search engine use), I was 
> able to find that the following configuration changes/settings stop the drops 
> and stabilize performance:
> 1)      This in the ups.conf file
> pollinterval = 1

pollinterval defaults to 2, and to be honest, for most other UPSes, we suggest 
that people raise the value (since many UPSes do not update their filtered 
values more frequently than that anyway). Do you know how frequently the 
Windows software polls the UPS? Should this be applied to other models as well, 
or just protocol 2012?

> [TrippLiteUPS]
>     driver = usbhid-ups
>     port = auto
>     desc = "SMART1500LCD"
> 2)      The attached 62-nut-usbups.rules file at /etc/udev/rules.d/
> 3)      The attached 42-usb-hid-pm.rules file at /usr/lib/udev/rules.d/

The 62-nut-usbups.rules file looks pretty standard. Do you know if the changes 
to 42-usb-hid-pm.rules are needed? It seems like none of the USB devices would 
have the right permissions if 62-nut-usbups.rules isn't sufficient (though this 
happened in Debian once).

>  
> Below is some other information that may be relevant regarding my testing.
>  
> ·         I installed using the command “yum install nut.x86_64”
>  
> ·         Operating system version:
> CentOS Linux release 7.6.1810 (Core)
>  
> ·         Network UPS Tools version
> Network UPS Tools upsd 2.7.2

Note that NUT 2.7.4 has been out for some time now.
>  
>   
> I’m not the most well-versed in Network UPS Tools, so I am not sure how 
> “good” of a solution this is. I can, however, get you more information on our 
> product and testing if that helps.
>  
> The questions I have are:
> 1)      Does the above seem like a “good” way to address this problem? (given 
> that the drops are something we need to look into on our end)

I can't argue with the results, though I would like to narrow it down a little 
(there may be other issues at play with the permissions in the udev files) and 
make sure that it is not coincidence.

> 2)      Is there a good way to get this fix implemented in the driver?

Each USB vendor ID generally gets their own source file, so we could add a 
special case to drivers/tripplite-hid.c. As mentioned earlier, if you know that 
this will be a problem across all protocol 2012 UPSes, we can check for that ID.

I will say that there is a bit of a logjam in the release pipeline, due to some 
(unnecessary, IMHO) deprecation of libusb-0.1: 
https://github.com/networkupstools/nut/issues/300

So it's unclear when a code change will get to users. The configuration file 
changes should work in the meantime.

> 3)      Have you had any reports of similar issues?

I thought we did, but maybe I am confusing it with protocol 3016 devices. We 
actually added a lot of the protocol 2012 devices to the hardware compatibility 
list based on the test results that Eric provided, so I assume they worked then 
(about six years ago).

The protocol 3016 devices (in particular, the SMART1500LCDT and OMNI1500LCDT) 
sometimes don't even stay on USB long enough to read a USB descriptor, and this 
does seem correlated with newer motherboards. Example: 
https://github.com/networkupstools/nut/issues/577

From where I stand, there really shouldn't be anything that a user-space 
program (like a NUT driver) can do to cause a USB device to disconnect during 
normal polling. (Aside from firmware updates, which we don't attempt.) That 
said, I recognize that USB PHY layer problems can be hard to diagnose, and 
power management can compound the issue.

> 4)      While we are communicating, are there any other open Tripp Lite items 
> I could help your team(s) with? No promises, but if I can help I’d like to.

Aside from those two USB issues, just a few other thoughts:

Some users prefer not to post the entire serial numbers from their UPS when 
reporting issues. Is there a convention for the serial number digits such that 
we can ask for just the first few digits, and get an idea as to whether the 
problem is limited to a given hardware or firmware revision? There seems to be 
a firmware revision buried in the HID descriptor for some models, but I don't 
know how to interpret it, and some of these connection problems present 
themselves before the UPS can return that HID report.

Although these models are not as common, we still hear from users with 
non-HID-PDC-based USB devices (and some serial UPSes as well). 
Publicly-available protocol documents would help us write better drivers for 
those devices. If not, a better way to identify models with proprietary 
protocols would be useful.

Another thing is considering how users get started with NUT. Sometimes a user 
inherits an UPS on a given system, and they want to set up NUT to monitor it. 
Ideally, we'd like to have a way for them to quickly triage whether a 
particular UPS model will work, and before they have NUT installed, they will 
likely have "lsusb" or similar tools to enumerate devices. Other times, a user 
is replacing another UPS, and they want to know which models are supported by 
NUT before purchasing one. In both of those cases, more information about how 
USB IDs map to models can help smooth out those processes. At the moment, we 
manually add each protocol number to the usbhid-ups driver when a user tries an 
UPS that isn't listed already. If there were a convention that all USB idProduct 
values in a certain range were going to be HID PDC compliant, we could change 
the default from opt-in to enabled-by-default (but we wouldn't want the UPS 
driver to try to control a USB hub).
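
As a sketch of what that quick triage could look like, the Python below scans 
"lsusb" output for vendor:product pairs found in a lookup table. The table is 
illustrative only: it holds the Tripp Lite vendor ID and the two protocol 
numbers mentioned in this thread, not NUT's actual device list. The triage() 
name and the sample lsusb text are likewise assumptions.

```python
import re

# Illustrative table only, built from the IDs mentioned in this thread.
KNOWN_HID_UPS = {
    ("09ae", "2012"): "Tripp Lite, protocol 2012",
    ("09ae", "3016"): "Tripp Lite, protocol 3016",
}

# lsusb prints one device per line, with the IDs as "ID vvvv:pppp".
LSUSB_RE = re.compile(r"ID (?P<vid>[0-9a-fA-F]{4}):(?P<pid>[0-9a-fA-F]{4})")

def triage(lsusb_output):
    """Return (vid:pid, description) for each device found in the table."""
    hits = []
    for line in lsusb_output.splitlines():
        m = LSUSB_RE.search(line)
        if not m:
            continue
        key = (m["vid"].lower(), m["pid"].lower())
        if key in KNOWN_HID_UPS:
            hits.append((f"{key[0]}:{key[1]}", KNOWN_HID_UPS[key]))
    return hits

# Hypothetical lsusb output for a host with one UPS attached.
sample = """\
Bus 002 Device 006: ID 09ae:2012 Tripp Lite
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
"""
print(triage(sample))
```

Here the root hub is ignored because its ID is not in the table. That is the 
opt-in behavior described above; a documented ID range would let the table be 
replaced by a range check.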

> Thanks for your time. 

No problem, thanks for reaching out!

> _______________________________________________
> Nut-upsdev mailing list
> [email protected]
> https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsdev

