I'm having an USB communication issue with ugen and the blazer_usb NUT
driver on the last snapshot 6.9 GENERIC.MP#473 amd64.  Notably the
'ugen_clear_iface_eps: clear endpoints failed!' message and
communication loss with the UPS.

Here is a link to my dmesg: https://clbin.com/pc1dk

- NUT configuration is working normally until the following points
- Sometimes upsd can't access the device at boot, I have to manually
  restart the service
- "randomly" The communication fails and I have to restart the service
  too
- The device can be totally unaccessible after some restart, needing to
  replug the USB

Other points that could be useful:
- This UPS USB interface was working without any issue on FreeBSD and
  ugen
- The UPS is supported by nut with blazer_usb:
 
https://networkupstools.org/ddl/PowerWalker/Line-Interactive_VI_1400.html

It's mentioned it should work with the NUT nutdrv_qx driver too so I'm
trying with this one too, see below.

Below are detailed information of the issue:

The tail of /var/log/messages with comments:
Apr 22 19:30:25 se-h1 /bsd: sd5: 457860MB, 512 bytes/sector, 937697393
sectors
Apr 22 19:30:25 se-h1 /bsd: root on sd5a (468563ff41793b32.a) swap on
sd5b dump on sd5b
Apr 22 19:30:25 se-h1 /bsd: inteldrm0: 1024x768, 32bpp
Apr 22 19:30:25 se-h1 /bsd: wsdisplay0 at inteldrm0 mux 1: console
(std, vt100 emulation), using wskbd0
Apr 22 19:30:25 se-h1 /bsd: wsdisplay0: screen 1-5 added (std, vt100
emulation)
Apr 22 19:30:25 se-h1 ntpd[96719]: constraints configured but none
available
Apr 22 19:30:41 se-h1 ntpd[43289]: no reply received in time, skipping
initial time setting
Apr 22 19:30:41 se-h1 savecore: no core dump
## The following message happens when upsd tries to access the device
##
Apr 22 19:30:47 se-h1 /bsd: ugen_clear_iface_eps: clear endpoints
failed! 
Apr 22 19:30:56 se-h1 reorder_kernel: kernel relinking done
## At this point upsd can communicate to my device without any issue,
everything works as expected ##
## But suddenly, ten minutes later: ##
Apr 22 19:40:53 se-h1 blazer_usb[31504]: Communications with UPS lost:
status read failed!
Apr 22 19:40:55 se-h1 upsd[80896]: Data for UPS [ups0] is stale - check
driver
Apr 22 19:40:59 se-h1 upsmon[82703]: Poll UPS [ups0@localhost] failed -
Data stale
Apr 22 19:40:59 se-h1 upsmon[82703]: Communications with UPS
ups0@localhost lost
Apr 22 19:41:04 se-h1 upsmon[82703]: Poll UPS [ups0@localhost] failed -
Data stale
Apr 22 19:41:39 se-h1 last message repeated 7 times
Apr 22 19:42:24 se-h1 last message repeated 9 times
## So what I do there is rcctl restart upsd (sometimes I need to do it
several times) ##
Apr 22 19:42:27 se-h1 upsd[80896]: mainloop: Interrupted system call
Apr 22 19:42:29 se-h1 upsmon[82703]: Poll UPS [ups0@localhost] failed -
Write error: Broken pipe
## Again: ##
Apr 22 19:42:32 se-h1 /bsd: ugen_clear_iface_eps: clear endpoints
failed!
Apr 22 19:42:34 se-h1 upsmon[82703]: Communications with UPS
ups0@localhost established
## Back to an operational state ##

My device is 0665:5161
$ doas usbdevs -v                                                    
Controller /dev/usb0:
addr 01: 8086:0000 Intel, xHCI root hub
         super speed, self powered, config 1, rev 1.00
         driver: uhub0
addr 02: 0665:5161 INNO TECH, USB to Serial # --- this one --- #
         low speed, power 100 mA, config 1, rev 0.02, iSerial 20100826
         driver: ugen0
addr 03: 1058:25a1 Western Digital, Elements 25A1
         super speed, power 224 mA, config 1, rev 10.14, iSerial
ommited
         driver: umass0
addr 04: 1058:1078 Western Digital, Elements 1078
         super speed, power 224 mA, config 1, rev 10.65, iSerial
ommited
         driver: umass1
addr 05: 1058:25a2 Western Digital, Elements 25A2
         super speed, power 224 mA, config 1, rev 10.04, iSerial
ommited
         driver: umass2
Controller /dev/usb1:
addr 01: 8086:0000 Intel, EHCI root hub
         high speed, self powered, config 1, rev 1.00
         driver: uhub1
addr 02: 8087:8008 Intel, Rate Matching Hub
         high speed, self powered, config 1, rev 0.04
         driver: uhub3
Controller /dev/usb2:
addr 01: 8086:0000 Intel, EHCI root hub
         high speed, self powered, config 1, rev 1.00
         driver: uhub2
addr 02: 8087:8000 Intel, Rate Matching Hub
         high speed, self powered, config 1, rev 0.04
         driver: uhub4

_ups permissions are set with /etc/rc.local
$ cat /etc/rc.local           
# ups0 ugen0 permissions for USB access
chgrp _ups /dev/usb0 /dev/ugen0*       
chmod g+rw /dev/usb0 /dev/ugen0*       

When access works:
$ upsc ups0
battery.charge: 100
battery.voltage: 13.70
battery.voltage.high: 13.00
battery.voltage.low: 10.40
battery.voltage.nominal: 12.0
device.type: ups
driver.flag.nolock: enabled
driver.name: blazer_usb
driver.parameter.pollinterval: 2
driver.parameter.port: /dev/ugen0
driver.parameter.productid: 5161
driver.parameter.protocol: megatec
driver.parameter.synchronous: yes
driver.parameter.vendorid: 0665
driver.version: 2.7.4
driver.version.internal: 0.12
input.current.nominal: 3.0
input.frequency: 50.3
input.frequency.nominal: 50
input.voltage: 241.4
input.voltage.fault: 241.4
input.voltage.nominal: 230
output.voltage: 241.4
ups.beeper.status: disabled
ups.delay.shutdown: 30
ups.delay.start: 180
ups.load: 14
ups.productid: 5161
ups.status: OL
ups.temperature: 25.0
ups.type: offline / line interactive
ups.vendorid: 0665

$ doas nut-scanner -U                        
SNMP library not found. SNMP search disabled.         
Neon library not found. XML search disabled.          
AVAHI client library not found. AVAHI search disabled.
Scanning USB bus.                                     
[nutdev1]                                             
        driver = "blazer_usb"                         
        port = "auto"                                 
        vendorid = "0665"                             
        productid = "5161"                            
        product = "USB to Serial"                     
        serial = "20100826"                           
        vendor = "INNO TECH"                          
        bus = "000"                                   

When access is broken:
$ upsc ups0
Error: Data stale
$ doas nut-scanner -U                                                 
SNMP library not found. SNMP search disabled.                         
Neon library not found. XML search disabled.                          
AVAHI client library not found. AVAHI search disabled.                
Scanning USB bus.
Failed to open device,
skipping. (Result too large) 

Note that the system still see the usb device when the access is
broken:
$ doas usbdevs -v
Controller /dev/usb0:                                                 
addr 01: 8086:0000 Intel, xHCI root hub                               
         super speed, self powered, config 1, rev 1.00                
         driver: uhub0                                                
addr 02: 0665:5161 INNO TECH, USB to Serial                           
         low speed, power 100 mA, config 1, rev 0.02, iSerial 20100826
         driver: ugen0                                                

What I do to reconnect the UPS:
$ doas rcctl restart upsd                    
upsd(failed)                                          
$ doas rcctl restart upsd                    
upsd(failed)                                          
$ doas rcctl restart upsd                    
upsd(ok)                                              

NUT configuration details in ups.conf:
user = _ups                       
[ups0]                            
        desc = "ups0" 
        driver = blazer_usb       
        protocol = megatec        
        nolock = true  # Seems the problems appens less frequently
        synchronous = yes  # Also tried adding this to solve the issue
        port = /dev/ugen0         
        vendorid = 0665           
        productid = 5161          

Communication works:
$ doas blazer_usb -DDDD -a ups0                                        
Network UPS Tools - Megatec/Q1 protocol USB driver 0.12 (2.7.4)       
   0.000000     debug level is '4'                                    
   0.006263     Checking device (8086/0000) (000/001)                 
   0.006600     - VendorID: 8086                                      
   0.006610     - ProductID: 0000                                     
   0.006628     - Manufacturer: Intel                                 
   0.006635     - Product: xHCI root hub                              
   0.006639     - Serial Number: unknown                              
   0.006643     - Bus: 000                                            
   0.006663     - Device release number: 0100                         
   0.006669     Trying to match device                                
   0.006688     Device does not match - skipping                      
   0.006732     Checking device (0665/5161) (000/002)                 
   0.011226     - VendorID: 0665                                      
   0.011234     - ProductID: 5161                                     
   0.011251     - Manufacturer: INNO TECH                             
   0.011257     - Product: USB to Serial                              
   0.011261     - Serial Number: 20100826                             
   0.011265     - Bus: 000                                            
   0.011268     - Device release number: 0002                         
   0.011275     Trying to match device                                
   0.011284     Device matches                                        
   0.011308     nut_usb_set_altinterface: skipped
usb_set_altinterface(udev, 0) 
   0.011342     Trying megatec protocol...                            
   0.011952     send: Q1                                              
   5.179636     read: (241.4 241.4 241.4 014 50.3 13.7 25.0 00001000  
   5.179872     Status read in 1 tries                                
   5.179877     Supported UPS detected with megatec protocol          
   5.180407     send: F                                               
   5.307636     read: #230.0 003 12.00 50.0                           
   5.307652     Ratings read in 1 tries                               
[...]

Communication don't work a few second after:
$ doas blazer_usb -DDDD -a ups0
Network UPS Tools - Megatec/Q1 protocol USB driver 0.12 (2.7.4)
   0.000000     debug level is '4'
  17.337417     Checking device (8086/0000) (000/001)
  17.337836     - VendorID: 8086
  17.337841     - ProductID: 0000
  17.337845     - Manufacturer: Intel
  17.337849     - Product: xHCI root hub
  17.337856     - Serial Number: unknown
  17.337866     - Bus: 000
  17.337876     - Device release number: 0100
  17.337879     Trying to match device
  17.337900     Device does not match - skipping
  17.337946     Checking device (0665/5161) (000/002)
  20.328959     - VendorID: 0665
  20.328969     - ProductID: 5161
  20.328974     - Manufacturer: unknown
  20.328977     - Product: unknown
  20.328981     - Serial Number: unknown
  20.328984     - Bus: 000
  20.328988     - Device release number: 0002
  20.328994     Trying to match device
  20.329008     Device matches
  20.329061     nut_usb_set_altinterface: skipped
usb_set_altinterface(udev, 0)
  20.329104     Trying megatec protocol...
  22.320937     send: Q1
  27.456483     read: .4 014 50.3 13.7 25.0 00001000
  27.456492     blazer_status: short reply
  27.456497     Status read 1 failed
  27.456914     send: Q1
  28.938774     read: timeout
  28.938797     blazer_status: short reply
  28.938801     Status read 2 failed
  33.939171     send: Input/output error
  33.939207     blazer_status: short reply
  33.939212     Status read 3 failed
  33.939216     Skipping mustek protocol...
  33.939219     Skipping megatec/old protocol...
  33.939223     Skipping zinto protocol...
  33.939227     No supported UPS detected

If I redo the same command one more time:
$ doas blazer_usb -DDDD -a ups0
Network UPS Tools - Megatec/Q1 protocol USB driver 0.12 (2.7.4)
   0.000000     debug level is '4'
   5.000757     Checking device (8086/0000) (000/001)
   5.001132     - VendorID: 8086
   5.001137     - ProductID: 0000
   5.001141     - Manufacturer: Intel
   5.001144     - Product: xHCI root hub
   5.001148     - Serial Number: unknown
   5.001152     - Bus: 000
   5.001155     - Device release number: 0100
   5.001159     Trying to match device
   5.001175     Device does not match - skipping
   5.001203     Checking device (1058/25A1) (000/003)
   5.001956     - VendorID: 1058
   5.001964     - ProductID: 25a1
   5.001981     - Manufacturer: Western Digital
   5.001987     - Product: Elements 25A1
   5.001990     - Serial Number: ommited
   5.001994     - Bus: 000
   5.001998     - Device release number: 1014
   5.002002     Trying to match device
   5.002008     Device does not match - skipping
   5.002018     Checking device (1058/1078) (000/004)
   5.002689     - VendorID: 1058
   5.002696     - ProductID: 1078
   5.002713     - Manufacturer: Western Digital
   5.002719     - Product: Elements 1078
   5.002723     - Serial Number: ommited
   5.002727     - Bus: 000
   5.002730     - Device release number: 1065
   5.002734     Trying to match device
   5.002738     Device does not match - skipping
   5.002746     Checking device (1058/25A2) (000/005)
   5.003438     - VendorID: 1058
   5.003444     - ProductID: 25a2
   5.003460     - Manufacturer: Western Digital
   5.003464     - Product: Elements 25A2
   5.003469     - Serial Number: ommited
   5.003473     - Bus: 000
   5.003477     - Device release number: 1004
   5.003483     Trying to match device
   5.003488     Device does not match - skipping
   5.003497     No appropriate HID device found
   5.003512     No supported devices found. Please check your device
   availability with 'lsusb' and make sure you have an up-to-date
   version of NUT. If this does not help, try running the driver with
at
   least 'subdriver', 'vendorid' and 'productid' options specified.
   Please refer to the man page for details about these options (man 8
   blazer_usb).

At this point in my tests, the device were totally inaccessible,
unplugged and
replugged it and nut-scanner immediately found it again:

ugen0 detached
ugen0 at uhub0 port 9 "INNO TECH USB to Serial" rev 1.10/0.02 addr 2
$ doas nut-scanner -U
SNMP library not found. SNMP search disabled.
Neon library not found. XML search disabled.
AVAHI client library not found. AVAHI search disabled.
Scanning USB bus.
[nutdev1]
        driver = "blazer_usb"
        port = "auto"
        vendorid = "0665"
        productid = "5161"
        product = "USB to Serial"
        serial = "20100826"
        vendor = "INNO TECH"
        bus = "000"

$ doas rcctl restart upsd
upsd(ok)

Broadcast Message from root@blah
        ((not a tty)) at 20:19 ...

Communications with UPS ups0@localhost established
Works again for a few minutes.

nutdrv_qx test: while this driver seems to recognize the device too, I
encounter the same problem.  It worked a first time with the following
configuration:
[ups0]
        desc = "ups0"
        driver = nutdrv_qx
        nolock = true
        synchronous = yes
        port = /dev/ugen0
        vendorid = 0665
        productid = 5161

I rebooted to see if upsd could connect automatically, but nop.
Nutdrv_qx can't match the device. I had to unplug and replug to make it
work again.

- I've seen mentions of usb_quirks.c for usbhid driver in
  /usr/local/share/doc/pkg-readmes/nut
```
The option with fewest side-effects is to add the following entries to
the table in /sys/dev/usb/usb_quirks.c and build a new kernel:

{ USB_VENDOR_APC, USB_PRODUCT_APC_UPS, ANY, { UQ_BAD_HID }},
{ USB_VENDOR_APC, USB_PRODUCT_APC_UPS5G, ANY, { UQ_BAD_HID }},
```

- Would it be useful ?
- What does 'ugen_clear_iface_eps: clear endpoints failed!' mean when
  the blazer_usb driver access the device ?
- Is this related:
  https://www.mail-archive.com/tech@openbsd.org/msg62221.html ?
- What should I try to get this USB communication reliable ?




Reply via email to