On Sat, 6 Jul 2013 09:53:21 -0400
Alan Stern <st...@rowland.harvard.edu> wrote:
> Please use Reply-To-All so that your messages get sent to the mailing 
> list as well as to me.

Dear Alan,

Sorry! As it happened I just noticed this and was about to re-send it
when my mailer crashed...

> 
> On Sat, 6 Jul 2013, Larry Keegan wrote:
> 
> > On Fri, 5 Jul 2013 17:41:25 -0400
> > Alan Stern <st...@rowland.harvard.edu> wrote:
> > > On Fri, 5 Jul 2013, Larry Keegan wrote:
> > > 
> > > > Dear Chaps,
> > > > 
> > > > I have a Microsoft Basic Optical mouse. When I used an earlier
> > > > kernel (2.6 series) it just worked. However, now that I'm using
> > > > 3.4.51 it suffers from USB disconnects and reconnects every 62
> > > > seconds precisely. These disconnects do not occur whilst the
> > > > mouse device is open. I have tried two different wired mice in
> > > > different USB slots of different computers with different
> > > > motherboard types, all to no avail.
> > > 
> > > Can you post a usbmon trace showing some of those disconnects and 
> > > reconnects?
> > > 
> > 
> > Dear Alan,
> > 
> > Thank you very much for helping me to diagnose this problem.
> > 
> > The following trace was made by waiting for a disconnect and
> > reconnect event to complete, starting cat, waiting for another event
> > and killing cat a few seconds afterwards.
> > 
> > The mouse was plugged directly into the socket on the motherboard.
> > 
> > # cat /sys/kernel/debug/usb/usbmon/4u
> > e9921c00 2437517512 C Ii:4:001:1 0:128 1 = 04
> > e9921c00 2437517523 S Ii:4:001:1 -115:128 2 <
> > e7454480 2437517533 S Ci:4:001:0 s a3 00 0000 0002 0004 4 <
> > e7454480 2437517543 C Ci:4:001:0 0 4 = 00010300
> > e7454480 2437517545 S Co:4:001:0 s 23 01 0010 0002 0000 0
> > e7454480 2437517548 C Co:4:001:0 0 0
> > e7454480 2437517549 S Co:4:001:0 s 23 01 0011 0002 0000 0
> > e7454480 2437517551 C Co:4:001:0 0 0
> > e7454900 2437520222 S Ci:4:001:0 s a3 00 0000 0002 0004 4 <
> > e7454900 2437520229 C Ci:4:001:0 0 4 = 00010000
> > e7454480 2437545511 S Ci:4:001:0 s a3 00 0000 0002 0004 4 <
> > e7454480 2437545518 C Ci:4:001:0 0 4 = 00010000
> > e4411780 2437571511 S Ci:4:001:0 s a3 00 0000 0002 0004 4 <
> > e4411780 2437571517 C Ci:4:001:0 0 4 = 00010000
> > e7454480 2437597511 S Ci:4:001:0 s a3 00 0000 0002 0004 4 <
> > e7454480 2437597517 C Ci:4:001:0 0 4 = 00010000
> > e7454480 2437623509 S Ci:4:001:0 s a3 00 0000 0002 0004 4 <
> > e7454480 2437623515 C Ci:4:001:0 0 4 = 00010000
> > e9921c00 2439017513 C Ii:4:001:1 0:128 1 = 04
> > e9921c00 2439017518 S Ii:4:001:1 -115:128 2 <
> > e7454480 2439017523 S Ci:4:001:0 s a3 00 0000 0002 0004 4 <
> > e7454480 2439017528 C Ci:4:001:0 0 4 = 01030100
> ...
> 
> This at least confirms that the disconnects do take place at the 
> hardware level.  They aren't merely a software artifact.
> 
> > I can supply more if you tell me what configurations are most
> > interesting.
> 
> No, I don't think usbmon will reveal anything more interesting than 
> this.
> 

Just a thought, but has the way the kernel 'sets-up' a USB mouse
changed over the months? I mean in a way which would mean different
numbers in the above usbmon trace? What I'm wondering is if the newer
kernel could be eeking out different bugs in the mouse's software, and
that is provoking the disconnect. Is that possible? Or is it genuinely an 
electrical thing? I'm afraid I know zip about USB.

> > > > What is interesting is that if I connect either mouse to the
> > > > computer via a hub, the timing of the disconnects becomes
> > > > irregular and unpredictable. I often find all the devices on
> > > > the bus (ie. everything but the root hub) disconnect themselves
> > > > at much the same time as the mouse does. Often the mouse
> > > > doesn't reconnect to the bus. In any case plugging in either
> > > > mouse makes that entire bus distinctly unreliable. This problem
> > > > is not peculiar to any particular hub model.
> 
> It certainly sounds like something in the mice is causing all this 
> disturbance.
> 
> > > > I am using a virgin kernel and am not using a well-known
> > > > user-space distribution. I use udev to set up my device nodes,
> > > > but that's all. To me, this has a distinct whiff of USB suspend
> > > > problems. I have attempted to turn off USB suspend by
> > > > repeatedly writing 'on' to all the 'power/control' files
> > > > under /sys/bus/usb/devices. It did not help. I also re-compiled
> > > > the kernel with power management and usb suspend support
> > > > removed. This didn't help either.
> > > 
> > > That's a pretty clear indication that USB suspend is _not_
> > > involved.
> > > 
> > > > I'd appreciate some advice.
> > > 
> > > It's possible this is an electrical problem -- the bus may not
> > > provide enough power for all the devices on it.
> > 
> > I doubt it's a power problem because:
> > 
> > - I have tried 3 computers now, each with different motherboards.
> > - One of the computers has had no hardware changes. It worked
> > before I updated the kernel, but fails now.
> > - The computers consist of nothing more than motherboard with
> >   integrated graphics, 35W or 65W Celeron CPU, RAM and single HDD.
> > They are powered by 250, 450 and 750W power supplies respectively.
> > One of the computers has a slim-line Radeon graphics card.
> > - I've also tried connecting the mouse via a Startech USB2EXT4 CAT5
> > USB extender. This device is mains-powered on both the local and
> > remote end.
> 
> Agreed, lack of power seems unlikely.
> 
> > EMI is also a possibility - both mains-borne and pick up from the
> > unscreened mouse leads. Although I've not done any spectrum
> > analysis, I have tried creating interference (mains and irradiated)
> > and they seem pretty immune to upset. The other thing is, prior to
> > doing the interference tests, the tangle of wires hadn't moved for
> > months, as the dust testified.
> > 
> > The other thing which makes me feel that power and EMI are unlikely
> > to be the cause of the problem is the precise nature of the events.
> > Here is a snippet from my syslog:
> > 
> > # tail -f /var/log/messages | grep 'USB disconnect'
> > 2013-07-06T11:27:19.518567Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 126
> > 2013-07-06T11:28:21.518556Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 127
> > 2013-07-06T11:29:23.518567Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 2
> > 2013-07-06T11:30:25.518562Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 3
> > 2013-07-06T11:31:27.518568Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 4
> > 2013-07-06T11:32:29.518567Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 5
> > 2013-07-06T11:33:31.518568Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 6
> > 
> > These events occur with surprising regularity. Too regular for it
> > to be an uncorrelated external event. Three mice, (two Microsoft,
> > and one other) but all three containing the same PixArt chip all
> > exhibit the same timing. If this were power droop I'd expect
> > capacitor component tolerances to cause slightly different timings
> > on the different mice, or different computers but they are
> > microsecond perfect.
> 
> Yes.  Obviously this is related to a hardware or software timer.  The 
> microsecond perfection may not mean anything, however.  Is this bus 
> connected to a UHCI controller?  The uhci-hcd driver uses a software 
> timer for detecting port-connect changes; the timer runs at a steady 
> rate of precisely 4 times per second.

Ah ah. I thought the numbers were suspiciously regular. I suppose the
mice, being 1.5Mbps devices will be connected to the UHCI. The ports
are wired to the motherboard's Intel ICH7. The port is definitely also
capable of speaking at 480Mbps.

However, I'm not sure this explains the essentially-random behaviour of
the most of the bus when the mouse is plugged into a USB 2.0 hub. In
these cases it is my understanding that the controller is EHCI and the
hub buffers these packets and sends them over the 480Mbps link as if
they were 'normal' USB 2.0 packets. That implies the mouse is also
electrically incompatible with my hubs. Umm. I'll look into this.

> 
> > Watch what happens when I physically unplug the mouse and plug it
> > back in:
> > 
> > 2013-07-06T11:42:49.518557Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 15
> > 2013-07-06T11:43:51.518568Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 16
> > 2013-07-06T11:44:28.713462Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 17
> > 2013-07-06T11:47:41.268567Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 18
> > 2013-07-06T11:48:43.268571Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 19
> > 2013-07-06T11:49:45.268566Z {0.6} ud1 kernel: usb 4-2: USB
> > disconnect, device number 20
> > 
> > This has to be a timer somewhere in the USB software in the mice or
> > the kernel, doesn't it?
> 
> Indeed.
> 
> You said the problems began on one of the computers when you upgraded
> the kernel.  Can you try using bisection to find the kernel change
> that first triggered the problem?

I'm recompiling as we speak.

Thank you very much for your help.

Yours,

Larry.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to