Alan Stern wrote:
On Sun, 12 Jun 2005, Martin Kessler wrote:Well here it is, 'control queue full' appeared around 22:48 I took asnapshot from my logfile and have attached it. its 5MB so I compressedit, hope thats ok ;-) If you need data prior to 21:00 as well, let me know.This helps. The log clearly shows rapidly repeated control URBs being sent. They arise because some program is doing an ioctl to the hiddev driver (hiddev_ioctl() gets called), resulting in a report being submitted via a control URB (hid_submit_ctrl() gets called).
Almost certainly apcupsd is that program.
The messages started at 21:01:46 and recurred at one-minute intervals.Every five minutes beginning at 21:05:46 the interval contained lots ofthese messages, whereas the other intervals contained only a few.Interestingly, the messages stopped at 22:00:46 after 61 repetitions, andat 22:48:53 (almost an hour later!) the first "control queue full" message appeared. So it's hard to say whether there's any definite connection. But maybe this pattern will provide a clue. At this point you should take out the patch I sent, and add somedebugging printks to hiddev_ioctl() and hid_submit_ctrl(). It might beparticularly interesting to see the value of current->comm. It also would be interesting to see why these calls didn't show up in Adam's debugging.
These calls did appear when we did hid-core debug, and they're normal. An UPS typically supports dozens of reports containing everything from simple boolean flags (on-battery) to line quality (line-voltage, line-frequency) to environmental data (humidity, temperature). Apcupsd's algorithm is the following:
(1) Poll the "interesting" usages. (In Martin's case there are 16 usages that qualify as interesting, which translates to 16 control transfers issued via hiddev. There is little processing to do on these reports, so it is not
surprising to see the control transfers issued back-to-back.) (2) Set internal flags based on current UPS state.(3) Wait for interrupt reports to arrive. When such a report arrives, read its value and go back to step #1. (Note that hid-core filters out interrupt reports when the value doesn't change, so apcupsd won't react until
a report value actually changes.)(4) If 60 seconds elapse with no interrupt reports, go back to step #1.
So the basic algorithm means we should expect to see bursts of 16 control transfers once every 60 seconds, plus additional bursts generated in response to interrupt transfers. What's happening in Martin's case (based on previous hid-core debug logs) is that the line-voltage and ups-load reports are fluctuating occasionally. These changes are delivered via interupt transfers which result in apcupsd polling the other interesting usages as well.
One thing I can't explain is occasionally I see bursts of control transfers repeated with no intervening interrupt transfer. That implies that hiddev is sending events to apcupsd with no reason. In some cases this happens as many as 20 times in succession, giving the appearance of a continuous stream of transfers. It's actually a set of 16 transfers repeated N times because apcupsd thinks an interrupt event has happened. I need to investigate that behavior.
One thing we could try is slowing down the back-to-back control transfers. It's possible they're causing the UPS to lock up. Martin, please try the attached apcupsd patch which enforces a 100 ms delay between control transfers.
--Adam
apcupsd-enforce-urb-delay.patch
Description: Binary data