Alan Stern wrote:
On Sun, 12 Jun 2005, Martin Kessler wrote:

Well here it is, 'control queue full' appeared around 22:48 I took a
snapshot from my logfile and have attached it. its 5MB so I compressed
it, hope thats ok ;-) If you need data prior to 21:00 as well, let me
know.

This helps.  The log clearly shows rapidly repeated control URBs being
sent.  They arise because some program is doing an ioctl to the hiddev
driver (hiddev_ioctl() gets called), resulting in a report being
submitted via a control URB (hid_submit_ctrl() gets called).

Almost certainly apcupsd is that program.

The messages started at 21:01:46 and recurred at one-minute intervals.
Every five minutes beginning at 21:05:46 the interval contained lots of
these messages, whereas the other intervals contained only a few.

Interestingly, the messages stopped at 22:00:46 after 61 repetitions, and
at 22:48:53 (almost an hour later!) the first "control queue full"
message appeared.  So it's hard to say whether there's any definite
connection.
But maybe this pattern will provide a clue.

At this point you should take out the patch I sent, and add some
debugging printks to hiddev_ioctl() and hid_submit_ctrl(). It might be
particularly interesting to see the value of current->comm.  It also
would be
interesting to see why these calls didn't show up in Adam's debugging.

These calls did appear when we did hid-core debug, and they're normal. An UPS typically supports dozens of reports containing everything from simple boolean flags (on-battery) to line quality (line-voltage, line-frequency) to environmental data (humidity, temperature). Apcupsd's algorithm is the following:

(1) Poll the "interesting" usages. (In Martin's case there are 16 usages that qualify as interesting, which translates to 16 control transfers issued via hiddev. There is little processing to do on these reports, so it is not
        surprising to see the control transfers issued back-to-back.)
   (2) Set internal flags based on current UPS state.
(3) Wait for interrupt reports to arrive. When such a report arrives, read its value and go back to step #1. (Note that hid-core filters out interrupt reports when the value doesn't change, so apcupsd won't react until
        a report value actually changes.)
(4) If 60 seconds elapse with no interrupt reports, go back to step #1.

So the basic algorithm means we should expect to see bursts of 16 control transfers once every 60 seconds, plus additional bursts generated in response to interrupt transfers. What's happening in Martin's case (based on previous hid-core debug logs) is that the line-voltage and ups-load reports are fluctuating occasionally. These changes are delivered via interupt transfers which result in apcupsd polling the other interesting usages as well.

One thing I can't explain is occasionally I see bursts of control transfers repeated with no intervening interrupt transfer. That implies that hiddev is sending events to apcupsd with no reason. In some cases this happens as many as 20 times in succession, giving the appearance of a continuous stream of transfers. It's actually a set of 16 transfers repeated N times because apcupsd thinks an interrupt event has happened. I need to investigate that behavior.

One thing we could try is slowing down the back-to-back control transfers. It's possible they're causing the UPS to lock up. Martin, please try the attached apcupsd patch which enforces a 100 ms delay between control transfers.

--Adam

Attachment: apcupsd-enforce-urb-delay.patch
Description: Binary data

Reply via email to