[linux-usb-devel] autosuspend IRQ trouble

2006-12-05 Thread Dominik Brodowski
Hi,

git bisect proved that the patch

commit 40f122f343797d02390c5a157372cac0c5b50bb7
Author: Alan Stern <[EMAIL PROTECTED]>
Date:   Thu Nov 9 14:44:33 2006 -0500

USB: Add autosuspend support to the hub driver

This patch (as742b) adds autosuspend/autoresume support to the USB hub
driver.  The largest aspect of the change is that we no longer need a
special flag for root hubs that want to be resumed.  Now every hub is
autoresumed whenever khubd needs to access it.

is the cause of IRQ #10 being disabled on my notebook:

[0.00] Linux version 2.6.19 ([EMAIL PROTECTED]) (gcc-Version 4.1.1 (Gentoo 4.1.1-r1)) #12 PREEMPT Tue Dec 5 07:05:28 EST 2006
...
[5.894386] Initializing CPU#0
[5.894557] CPU 0 irqstacks, hard=c0599000 soft=c0598000
...
[6.149445] ACPI: bus type pci registered
[6.152902] PCI: PCI BIOS revision 2.10 entry at 0xfd9b2, last bus=2
[6.152958] PCI: Using configuration type 1
[6.153010] Setting up standard PCI resources
...
[6.196235] PCI: Probing PCI hardware (bus 00)
[6.204078] Boot video device is :00:02.0
[6.204790] PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO
[6.204850] PCI quirk: region 1180-11bf claimed by ICH4 GPIO
[6.204990] PCI: Ignoring BAR0-3 of IDE controller :00:1f.1
[6.205959] PCI: Firmware left :02:08.0 e100 interrupts enabled, disabling
[6.206136] PCI: Transparent bridge - :00:1e.0
[6.206476] PCI: Bus #03 (-#06) is hidden behind transparent bridge #02 (-#02) (try 'pci=assign-busses')
[6.206555] Please report the result to linux-kernel to fix this permanently
[6.206851] PCI: Bus #07 (-#0a) is hidden behind transparent bridge #02 (-#02) (try 'pci=assign-busses')
[6.206928] Please report the result to linux-kernel to fix this permanently
[6.207017] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[6.236980] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIB._PRT]
[6.241593] ACPI: PCI Interrupt Link [LNKA] (IRQs *10)
[6.242806] ACPI: PCI Interrupt Link [LNKB] (IRQs *10)
[6.243914] ACPI: PCI Interrupt Link [LNKC] (IRQs *11)
[6.245044] ACPI: PCI Interrupt Link [LNKD] (IRQs *11)
[6.246163] ACPI: PCI Interrupt Link [LNKE] (IRQs *5)
[6.247256] ACPI: PCI Interrupt Link [LNKF] (IRQs 5) *0, disabled.
[6.248434] ACPI: PCI Interrupt Link [LNKG] (IRQs 10) *0, disabled.
[6.249591] ACPI: PCI Interrupt Link [LNKH] (IRQs *10)
[6.259908] ACPI: Power Resource [PFAN] (on)
[6.262406] SCSI subsystem initialized
[6.262639] libata version 2.00 loaded.
[6.263040] usbcore: registered new interface driver usbfs
[6.263355] usbcore: registered new interface driver hub
[6.263731] usbcore: registered new device driver usb
[6.264431] PCI: Using ACPI for IRQ routing
[6.264486] PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
...
[6.273486] PCI: Ignore bogus resource 6 [0:0] of :00:02.0
[6.273588] PCI: Bus 3, cardbus bridge: :02:03.0
[6.273642]   IO window: 3400-34ff
[6.273697]   IO window: 3800-38ff
[6.273753]   PREFETCH window: 4000-41ff
[6.273809]   MEM window: 4600-47ff
[6.273865] PCI: Bus 7, cardbus bridge: :02:03.1
[6.273918]   IO window: 3c00-3cff
[6.273973]   IO window: 1400-14ff
[6.274028]   PREFETCH window: 4200-43ff
[6.274115]   MEM window: 4800-49ff
[6.274170] PCI: Bridge: :00:1e.0
[6.274223]   IO window: 3000-3fff
[6.274279]   MEM window: e020-e02f
[6.274335]   PREFETCH window: 4000-43ff
[6.274407] PCI: Setting latency timer of device :00:1e.0 to 64
[6.275135] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
[6.275194] PCI: setting IRQ 10 as level-triggered
[6.275198] ACPI: PCI Interrupt :02:03.0[A] -> Link [LNKA] -> GSI 10 (level, low) -> IRQ 10
[6.275337] PCI: Setting latency timer of device :02:03.0 to 64
[6.275357] PCI: Enabling device :02:03.1 ( -> 0003)
[6.276018] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
[6.276075] ACPI: PCI Interrupt :02:03.1[B] -> Link [LNKB] -> GSI 10 (level, low) -> IRQ 10
[6.276225] PCI: Setting latency timer of device :02:03.1 to 64
...
[8.748036] Advanced Linux Sound Architecture Driver Version 1.0.13 (Tue Nov 28 14:07:24 2006 UTC).
[8.750710] ACPI: PCI Interrupt :00:1f.5[B] -> Link [LNKB] -> GSI 10 (level, low) -> IRQ 10
[8.750883] PCI: Setting latency timer of device :00:1f.5 to 64
...
[9.564993] intel8x0_measure_ac97_clock: measured 50992 usecs
[9.565051] intel8x0: clocking to 48000
[9.570402] ACPI: PCI Interrupt :00:1f.6[B] -> Link [LNKB] -> GSI 10 (level, low) -> IRQ 10
[9.570556] PCI: Setting latency timer of device :00:1f.6 to 64
[9.671887] MC'97 1 converters and GPIO not ready (0xf000)
[9.674535] usbcore: registered new interface driver snd-usb-audio

Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-05 Thread Dominik Brodowski
Hi,

On Tue, Dec 05, 2006 at 07:19:01AM -0500, Dominik Brodowski wrote:
> Hi,
> 
> git bisect proved that the patch
> 
> commit 40f122f343797d02390c5a157372cac0c5b50bb7
> Author: Alan Stern <[EMAIL PROTECTED]>
> Date:   Thu Nov 9 14:44:33 2006 -0500
> 
> USB: Add autosuspend support to the hub driver
> 
> This patch (as742b) adds autosuspend/autoresume support to the USB hub
> driver.  The largest aspect of the change is that we no longer need a
> special flag for root hubs that want to be resumed.  Now every hub is
> autoresumed whenever khubd needs to access it.
> 
> is the cause for IRQ #10 being disabled on my notebook:


I have now tested this a bit further, and strangely, if I modprobe the USB
modules at some later point, there is no problem at all. Running many possible
combinations of modprobe'ing yenta_socket, ehci_hcd, uhci_hcd and the other
modules that my init scripts load at the same time, or even using
/etc/init.d/modules, does not reproduce the error. Very strange.

Dominik

-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
___
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-05 Thread Alan Stern
On Tue, 5 Dec 2006, Dominik Brodowski wrote:

> Hi,
> 
> On Tue, Dec 05, 2006 at 07:19:01AM -0500, Dominik Brodowski wrote:
> > Hi,
> > 
> > git bisect proved that the patch
> > 
> > commit 40f122f343797d02390c5a157372cac0c5b50bb7
> > Author: Alan Stern <[EMAIL PROTECTED]>
> > Date:   Thu Nov 9 14:44:33 2006 -0500
> > 
> > USB: Add autosuspend support to the hub driver
> > 
> > This patch (as742b) adds autosuspend/autoresume support to the USB hub
> > driver.  The largest aspect of the change is that we no longer need a
> > special flag for root hubs that want to be resumed.  Now every hub is
> > autoresumed whenever khubd needs to access it.
> > 
> > is the cause for IRQ #10 being disabled on my notebook:
> 
> 
> Now I tested this a bit further... and strangely, if I modprobe the USB
> modules somewhen later, it is no problem at all. Running many many possible
> combinations of modprobe'ing yenta_socket, ehci_hcd, uhci_hcd and other
> modules loaded at the same time in the init scripts I use, or even using
> /etc/init.d/modules, does not reproduce the error... Very strange.

I don't entirely trust git-bisect.  Can you try reverting just this one 
patch by hand, to verify that the problem really does go away?

Also, can you post a system log showing the problem with CONFIG_USB_DEBUG 
turned on?

If you prevent ehci-hcd.ko from being loaded in the usual way (say by 
renaming it), does the problem still occur?  What about uhci-hcd.ko?

Alan Stern




Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-06 Thread Dominik Brodowski
Hi,

On Tue, Dec 05, 2006 at 04:39:34PM -0500, Alan Stern wrote:
> > On Tue, Dec 05, 2006 at 07:19:01AM -0500, Dominik Brodowski wrote:
> > > Hi,
> > > 
> > > git bisect proved that the patch
> > > 
> > > commit 40f122f343797d02390c5a157372cac0c5b50bb7
> > > Author: Alan Stern <[EMAIL PROTECTED]>
> > > Date:   Thu Nov 9 14:44:33 2006 -0500
> > > 
> > > USB: Add autosuspend support to the hub driver
> > > 
> > > This patch (as742b) adds autosuspend/autoresume support to the USB hub
> > > driver.  The largest aspect of the change is that we no longer need a
> > > special flag for root hubs that want to be resumed.  Now every hub is
> > > autoresumed whenever khubd needs to access it.
> > > 
> > > is the cause for IRQ #10 being disabled on my notebook:
> > 
> > 
> > Now I tested this a bit further... and strangely, if I modprobe the USB
> > modules somewhen later, it is no problem at all. Running many many possible
> > combinations of modprobe'ing yenta_socket, ehci_hcd, uhci_hcd and other
> > modules loaded at the same time in the init scripts I use, or even using
> > /etc/init.d/modules, does not reproduce the error... Very strange.
> 
> I don't entirely trust git-bisect.  Can you try reverting just this one 
> patch by hand, to verify that the problem really does go away?

Verified by hand, well, by "git reset HEAD^ && git checkout -f", and yes,
it's caused by autosuspend... BTW, why don't you entirely trust git-bisect?

> Also, can you post a system log showing the problem with CONFIG_USB_DEBUG 
> turned on?

It's attached.

> If you prevent ehci-hcd.ko from being loaded in the usual way (say by 
> renaming it), does the problem still occur?

No, it doesn't occur.

>  What about uhci-hcd.ko?

If I rename ehci-hcd, uhci-hcd seems to be bound to that hub, and all works
well there. Also, ehci-hcd works fine
a) if my Matrox USB HD isn't connected to that port [haven't tested other 
   USB devices yet] or
b) if the modules uhci-hcd, ehci-hcd and yenta_socket[*] are modprobed in
   any order after bootup. I can only reproduce it at the stage where udev
   events are created through uevent at boot time -- it's right then when
   udev is processing events that this error occurs.

Any ideas?

Dominik


[0.00] Linux version 2.6.19 ([EMAIL PROTECTED]) (gcc-Version 4.1.1 (Gentoo 4.1.1-r1)) #16 PREEMPT Wed Dec 6 21:25:29 EST 2006
...
[   10.282625] ACPI: bus type pci registered
[   10.286080] PCI: PCI BIOS revision 2.10 entry at 0xfd9b2, last bus=2
[   10.286137] PCI: Using configuration type 1
[   10.286190] Setting up standard PCI resources
...
[   10.328881] PCI: Probing PCI hardware (bus 00)
[   10.336758] Boot video device is :00:02.0
[   10.337473] PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO
[   10.337535] PCI quirk: region 1180-11bf claimed by ICH4 GPIO
[   10.337675] PCI: Ignoring BAR0-3 of IDE controller :00:1f.1
[   10.338637] PCI: Firmware left :02:08.0 e100 interrupts enabled, disabling
[   10.338813] PCI: Transparent bridge - :00:1e.0
[   10.339136] PCI: Bus #03 (-#06) is hidden behind transparent bridge #02 (-#02) (try 'pci=assign-busses')
[   10.339213] Please report the result to linux-kernel to fix this permanently
[   10.339527] PCI: Bus #07 (-#0a) is hidden behind transparent bridge #02 (-#02) (try 'pci=assign-busses')
[   10.339605] Please report the result to linux-kernel to fix this permanently
[   10.339694] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[   10.369585] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIB._PRT]
[   10.374175] ACPI: PCI Interrupt Link [LNKA] (IRQs *10)
[   10.375301] ACPI: PCI Interrupt Link [LNKB] (IRQs *10)
[   10.376400] ACPI: PCI Interrupt Link [LNKC] (IRQs *11)
[   10.377546] ACPI: PCI Interrupt Link [LNKD] (IRQs *11)
[   10.378651] ACPI: PCI Interrupt Link [LNKE] (IRQs *5)
[   10.379740] ACPI: PCI Interrupt Link [LNKF] (IRQs 5) *0, disabled.
[   10.380933] ACPI: PCI Interrupt Link [LNKG] (IRQs 10) *0, disabled.
[   10.382172] ACPI: PCI Interrupt Link [LNKH] (IRQs *10)
[   10.392529] ACPI: Power Resource [PFAN] (on)
[   10.394992] SCSI subsystem initialized
[   10.395223] libata version 2.00 loaded.
[   10.395617] usbcore: registered new interface driver usbfs
[   10.395940] usbcore: registered new interface driver hub
[   10.396326] usbcore: registered new device driver usb
[   10.397025] PCI: Using ACPI for IRQ routing
[   10.397079] PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
...
[   10.406006] PCI: Ignore bogus resource 6 [0:0] of :00:02.0
[   10.406103] PCI: Bus 3, cardbus bridge: :02:03.0
[   10.406158]   IO window: 3400-34ff
[   10.406214]   IO window: 3800-38ff
[   10.406269]   PREFETCH window: 4000-41ff
[   10.406326]   MEM window: 4600-47ff
[   10.406402] PCI: Bus 7, cardbus bridge: :02:03.1
[   10.406456]   IO window: 3c00-3cff
[   10.406512]   IO window: 1400-14ff
[   10.406567]   

Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-07 Thread Dominik Brodowski
Hi,

On Thu, Dec 07, 2006 at 05:29:32PM -0500, Alan Stern wrote:
> > Any ideas?
> 
> No doubt it's caused by the peculiar timing of events at startup.  Lots of 
> things are happening all at once, and the computer can't keep up with 
> everything.
> 
> For instance, your log shows the Matrox HD was detected at timestamp 23.0 
> roughly, but the usb-storage driver for it wasn't loaded until 30.97, by 
> which time the HD had been autosuspended and the EHCI root hub along with 
> it (at 25.2).
> 
> Here's an experiment to try.  Boot without the Matrox HD, and after
> everything has settled down, plug it in.  Wait about 10 seconds for
> usb-storage to load and initialize.  Then do "rmmod usb-storage" and wait 
> another 10 seconds; the HD and EHCI should autosuspend.  Perhaps at that 
> point the "nobody cared" problem will occur again.

No, it doesn't -- the IRQ continues to work fine.

> Or perhaps not.  I can't think of any reason why the EHCI controller
> should have generated the unhandled IRQ, and it seems very suspicious that
> it occurred just as the cs port probing was going on.  So maybe
> yenta_socket is at fault, and the USB stuff just sets up the right
> timing conditions for the problem to show up.

Unfortunately, that's not the case, and USB isn't off the hook that easily:
if no USB device is connected, no interrupt is received on that line at all;
and even if I block yenta_socket from being loaded on startup, the problem
persists. So it's not PCMCIA.

Oh, and it doesn't seem to be usb-storage-specific. By accident, I once had my
usb-audio device connected to the other ehci port (usb4-2), and the
same message appeared, though with 2 interrupts instead of 1.

Dominik



Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-08 Thread Alan Stern
On Wed, 6 Dec 2006, Dominik Brodowski wrote:

> > > Now I tested this a bit further... and strangely, if I modprobe the USB
> > > modules somewhen later, it is no problem at all. Running many many possible
> > > combinations of modprobe'ing yenta_socket, ehci_hcd, uhci_hcd and other
> > > modules loaded at the same time in the init scripts I use, or even using
> > > /etc/init.d/modules, does not reproduce the error... Very strange.
> > 
> > I don't entirely trust git-bisect.  Can you try reverting just this one 
> > patch by hand, to verify that the problem really does go away?
> 
> Verified by hand, well, by "git reset HEAD^ && git checkout -f", and yes,
> it's caused by autosuspend... BTW, why don't you entirely trust git-bisect?

Because in the past there have been occasions where people would say 
"git-bisect identified this as the offending patch" and it turned out they 
were wrong.

> > Also, can you post a system log showing the problem with CONFIG_USB_DEBUG 
> > turned on?
> 
> It's attached.
> 
> > If you prevent ehci-hcd.ko from being loaded in the usual way (say by 
> > renaming it), does the problem still occur?
> 
> No, it doesn't occur.
> 
> >  What about uhci-hcd.ko?
> 
> If I rename ehci-hcd, uhci-hcd seems to be bound to that hub, and all works
> well there. Also, ehci-hcd works fine
> a) if my Matrox USB HD isn't connected to that port [haven't tested other 
>    USB devices yet] or
> b) if the modules uhci-hcd, ehci-hcd and yenta_socket[*] are modprobed in
>    any order after bootup. I can only reproduce it at the stage where udev
>    events are created through uevent at boot time -- it's right then when
>    udev is processing events that this error occurs.
> 
> Any ideas?

No doubt it's caused by the peculiar timing of events at startup.  Lots of 
things are happening all at once, and the computer can't keep up with 
everything.

For instance, your log shows the Matrox HD was detected at timestamp 23.0 
roughly, but the usb-storage driver for it wasn't loaded until 30.97, by 
which time the HD had been autosuspended and the EHCI root hub along with 
it (at 25.2).

Here's an experiment to try.  Boot without the Matrox HD, and after
everything has settled down, plug it in.  Wait about 10 seconds for
usb-storage to load and initialize.  Then do "rmmod usb-storage" and wait 
another 10 seconds; the HD and EHCI should autosuspend.  Perhaps at that 
point the "nobody cared" problem will occur again.

Or perhaps not.  I can't think of any reason why the EHCI controller
should have generated the unhandled IRQ, and it seems very suspicious that
it occurred just as the cs port probing was going on.  So maybe
yenta_socket is at fault, and the USB stuff just sets up the right
timing conditions for the problem to show up.

Alan Stern




Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-08 Thread Alan Stern
On Thu, 7 Dec 2006, Dominik Brodowski wrote:

> > Or perhaps not.  I can't think of any reason why the EHCI controller
> > should have generated the unhandled IRQ, and it seems very suspicious that
> > it occurred just as the cs port probing was going on.  So maybe
> > yenta_socket is at fault, and the USB stuff just sets up the right
> > timing conditions for the problem to show up.
> 
> Unfortunately, that's not the case; USB isn't that soon off the hook: if no
> USB device is connected, no interrupt is received on that line at all; and
> even if I block yenta_socket from being loaded on startup, the problem
> persists. So it's not PCMCIA.
> 
> Oh, and it doesn't seem to be usb-storage-specific. Accidentally, I had my
> usb-audio device connected to the other ehci port (usb4-2) once, and the
> same message appeared. With 2 interrupts instead of 1, though, but
> anyways...

Okay.  Here's a patch that will print out some information for each of the 
first 100 interrupts received by ehci-hcd.  Block yenta-socket from being 
loaded, so as to reduce the number of extraneous interrupts, and see what 
you get.

By the way, what happens if you also block snd-intel8x0 (you might have 
to rebuild it as a module)?

Alan Stern



Index: usb-2.6/drivers/usb/host/ehci-hcd.c
===
--- usb-2.6.orig/drivers/usb/host/ehci-hcd.c
+++ usb-2.6/drivers/usb/host/ehci-hcd.c
@@ -574,6 +574,14 @@ static irqreturn_t ehci_irq (struct usb_
 	spin_lock (&ehci->lock);
 
 	status = readl (&ehci->regs->status);
+	{
+		static int cnt;
+
+		if (cnt < 100) {
+			++cnt;
+			ehci_info(ehci, "IRQ status %x\n", status);
+		}
+	}
 
 	/* e.g. cardbus physical eject */
 	if (status == ~(u32) 0) {




Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-08 Thread Dominik Brodowski
Hi,

On Fri, Dec 08, 2006 at 11:00:15AM -0500, Alan Stern wrote:
> Okay.  Here's a patch that will print out some information for each of the 
> first 100 interrupts received by ehci-hcd.  Block yenta-socket from being 
> loaded, so as to reduce the number of extraneous interrupts, and see what 
> you get.

I blocked not only yenta-socket (which does not cause any interrupts
during initialization) but also ohci1394 (which is on the other IRQ line
anyway) and snd-intel8x0, but that did not help:

http://userweb.kernel.org/~brodo/dmesg-autosuspend.txt

The "offending" IRQ status seems to be 2008; since INTR_MASK includes neither
STS_FLR nor STS_RECL (if I did the math correctly), IRQ_NONE is
returned.
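
The arithmetic can be checked with a small userspace sketch. It assumes the
USBSTS bit positions from the EHCI specification and an INTR_MASK composed as
in ehci-hcd.c of this period; the two helper functions are hypothetical names
introduced only for illustration and are not part of the driver.

```c
#include <stdint.h>

/* USBSTS bits per the EHCI specification (names as in the kernel's ehci.h). */
#define STS_INT   (1u << 0)   /* transfer completion */
#define STS_ERR   (1u << 1)   /* transaction error */
#define STS_PCD   (1u << 2)   /* port change detect */
#define STS_FLR   (1u << 3)   /* frame list rolled over */
#define STS_FATAL (1u << 4)   /* host system error */
#define STS_IAA   (1u << 5)   /* interrupt on async advance */
#define STS_RECL  (1u << 13)  /* reclamation */

/* Assumption: the set of events the driver enables and handles. */
#define INTR_MASK (STS_IAA | STS_FATAL | STS_PCD | STS_ERR | STS_INT)

/* Hypothetical helper: nonzero if the handler would claim this status. */
static int ehci_would_handle(uint32_t status)
{
	return (status & INTR_MASK) != 0;
}

/* Hypothetical helper: the status bits that fall outside INTR_MASK. */
static uint32_t ehci_unhandled_bits(uint32_t status)
{
	return status & ~INTR_MASK;
}
```

With these definitions, 0x2008 decomposes into STS_RECL | STS_FLR, so
ehci_would_handle(0x2008) is zero, which corresponds to the IRQ_NONE path.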


Thanks,
Dominik



Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-09 Thread Alan Stern
On Fri, 8 Dec 2006, Dominik Brodowski wrote:

> Hi,
> 
> On Fri, Dec 08, 2006 at 11:00:15AM -0500, Alan Stern wrote:
> > Okay.  Here's a patch that will print out some information for each of the 
> > first 100 interrupts received by ehci-hcd.  Block yenta-socket from being 
> > loaded, so as to reduce the number of extraneous interrupts, and see what 
> > you get.
> 
> Now I did not only block yenta-socket (which does not cause any interrupts
> during initialization) but also ohci1394 (which is on the other IRQ line,
> but anyways) and snd-intel8x0;

Good.

>  but that did not help:
> 
> http://userweb.kernel.org/~brodo/dmesg-autosuspend.txt
> 
> The "offending" IRQ status seems to be 2008; as INTR_MASK does neither
> include STS_FLR nor STS_RECL (if I got the math correctly), IRQ_NONE is
> returned.

Yes, that's right.  In fact the controller isn't supposed to send an IRQ
when only those two bits are on.  I suspect the STS_FLR bit is somehow
getting set in the intr_enable register (don't ask me how -- there doesn't
seem to be any code that could do it).  Can you modify the patch to print
out the value of that register as well as the value of the status 
register?

Also, does the problem occur if you block uhci-hcd from loading at startup
too?  Then ehci-hcd would be the only remaining user of IRQ 10.

Alan Stern




Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-11 Thread Dominik Brodowski
On Sat, Dec 09, 2006 at 04:03:48PM -0500, Alan Stern wrote:
> >  but that did not help:
> > 
> > http://userweb.kernel.org/~brodo/dmesg-autosuspend.txt
> > 
> > The "offending" IRQ status seems to be 2008; as INTR_MASK does neither
> > include STS_FLR nor STS_RECL (if I got the math correctly), IRQ_NONE is
> > returned.
> 
> Yes, that's right.  In fact the controller isn't supposed to send an IRQ
> when only those two bits are on.  I suspect the STS_FLR bit is somehow
> getting set in the intr_enable register (don't ask me how -- there doesn't
> seem to be any code that could do it).  Can you modify the patch to print
> out the value of that register as well as the value of the status 
> register?

Done (with ehci-hcd being the only IRQ 10 user).

case A) usb-storage device connected to "left" USB port:

The flip occurs on or after IRQ status 8028

http://userweb.kernel.org/~brodo/dmesg-autosuspend-3.txt


case B) snd-usb-audio device connected to "right" USB port:

The flip occurs on or after IRQ status 0004.

http://userweb.kernel.org/~brodo/dmesg-autosuspend-2.txt
(unfortunately, without extended USB debug messages)
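
A minimal sketch of how such a flip could be spotted mechanically, assuming
"flip" means that intr_enable picks up a bit outside INTR_MASK (STS_FLR being
the suspect named earlier in the thread). The helper name is hypothetical, and
the values in the usage note are illustrative, not taken from the logs.

```c
#include <stdint.h>

/* The enable register uses the same low bit positions as the status register. */
#define STS_INT   (1u << 0)
#define STS_ERR   (1u << 1)
#define STS_PCD   (1u << 2)
#define STS_FLR   (1u << 3)
#define STS_FATAL (1u << 4)
#define STS_IAA   (1u << 5)

/* Assumption: the only interrupt sources the driver ever enables. */
#define INTR_MASK (STS_IAA | STS_FATAL | STS_PCD | STS_ERR | STS_INT)

/* Hypothetical helper: enable bits that the driver never asked for. */
static uint32_t ehci_stray_enable_bits(uint32_t intr_enable)
{
	return intr_enable & ~INTR_MASK;
}
```

A register equal to INTR_MASK yields 0; a register of INTR_MASK | STS_FLR
yields STS_FLR, which would explain interrupts the handler does not claim.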


Thanks,
Dominik


PS: the patch I used:

--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -569,10 +569,22 @@ static irqreturn_t ehci_irq (struct usb_
 	struct ehci_hcd *ehci = hcd_to_ehci (hcd);
 	u32 status;
 	int bh;
+	u32 temp;
 
 	spin_lock (&ehci->lock);
 
 	status = readl (&ehci->regs->status);
+	temp = readl (&ehci->regs->intr_enable);
+
+	{
+		static int cnt;
+
+		if (cnt < 200) {
+			++cnt;
+			ehci_info(ehci, "IRQ status %x, intr_enable %x\n",
+				  status, temp);
+		}
+	}
 
 	/* e.g. cardbus physical eject */
 	if (status == ~(u32) 0) {



Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-11 Thread Alan Stern
On Mon, 11 Dec 2006, Dominik Brodowski wrote:

> > Yes, that's right.  In fact the controller isn't supposed to send an IRQ
> > when only those two bits are on.  I suspect the STS_FLR bit is somehow
> > getting set in the intr_enable register (don't ask me how -- there doesn't
> > seem to be any code that could do it).  Can you modify the patch to print
> > out the value of that register as well as the value of the status 
> > register?
> 
> done (with only ehci-hcd being the only IRQ 10 user).
> 
> case A) usb-storage device connected to "left" USB port:
> 
> The flip occurs on or after IRQ status 8028
> 
> http://userweb.kernel.org/~brodo/dmesg-autosuspend-3.txt

There's no question; that's the reason for your problem.  But I wonder how 
that bit ever managed to get turned on...

I'll have to study the code some more.  Which kernel did you say you are 
using?

> case B) snd-usb-audio device connected to "right" USB port:
> 
> The flip occurs on or after IRQ status 0004.
> 
> http://userweb.kernel.org/~brodo/dmesg-autosuspend-2.txt
> (unfortunately, without extended USB debug messages)

And in this example there wasn't even a suspend-resume pair to confuse 
the issue, which suggests that the same thing might end up happening even 
without the patch git-bisect identified.  Can you try running exactly the 
same test, but with that patch reverted?

> Thanks,
>   Dominik
> 
> 
> PS: the patch I used:

It's fine.

Alan Stern




Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-12 Thread Dominik Brodowski
On Mon, Dec 11, 2006 at 11:11:08PM -0500, Alan Stern wrote:
> On Mon, 11 Dec 2006, Dominik Brodowski wrote:
> 
> > > Yes, that's right.  In fact the controller isn't supposed to send an IRQ
> > > when only those two bits are on.  I suspect the STS_FLR bit is somehow
> > > getting set in the intr_enable register (don't ask me how -- there doesn't
> > > seem to be any code that could do it).  Can you modify the patch to print
> > > out the value of that register as well as the value of the status 
> > > register?
> > 
> > done (with only ehci-hcd being the only IRQ 10 user).
> > 
> > case A) usb-storage device connected to "left" USB port:
> > 
> > The flip occurs on or after IRQ status 8028
> > 
> > http://userweb.kernel.org/~brodo/dmesg-autosuspend-3.txt
> 
> There's no question; that's the reason for your problem.  But I wonder how 
> that bit ever managed to get turned on...
> 
> I'll have to study the code some more.  Which kernel did you say you are 
> using?

The latest git tree from Linus. Today's snd-usb-audio test was with head
4259cb25d436a79bf6b07d8075423573567c211d (plus a completely unrelated
pcmcia patch, but the related modules didn't even load).

> > case B) snd-usb-audio device connected to "right" USB port:
> > 
> > The flip occurs on or after IRQ status 0004.
> > 
> > 
> > (unfortunately, without extended USB debug messages)
> 
> And in this example there wasn't even a suspend-resume pair to confuse 
> the issue, which suggests that the same thing might end up happening even 
> without the patch git-bisect identified.  Can you try running exactly the 
> same test, but with that patch reverted?

Actually, there _was_ a suspend and resume going on; it just didn't show up
in the logs because I had accidentally disabled CONFIG_USB_DEBUG for that test.

http://userweb.kernel.org/~brodo/dmesg-autosuspend-2b.txt

from today confirms it. I also ran the pre-autosuspend kernel (its head
is 8c03356a559ced6fa78931f498193f776d67e445) to re-check that this is an issue
which appeared with the autosuspend patch.

Thanks,
Dominik



Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-12 Thread Alan Stern
On Tue, 12 Dec 2006, Dominik Brodowski wrote:

> > I'll have to study the code some more.  Which kernel did you say you are 
> > using?
> 
> Latest Linus' git. Today's snd-usb-audio test was with head
> 4259cb25d436a79bf6b07d8075423573567c211d (plus an completely unrelated
> pcmcia patch, but the related modules didn't even load).
> 
> > > case B) snd-usb-audio device connected to "right" USB port:
> > > 
> > > The flip occurs on or after IRQ status 0004.
> > > 
> > > 
> > > (unfortunately, without extended USB debug messages)
> > 
> > And in this example there wasn't even a suspend-resume pair to confuse 
> > the issue, which suggests that the same thing might end up happening even 
> > without the patch git-bisect identified.  Can you try running exactly the 
> > same test, but with that patch reverted?
> 
> Actually, there _was_ a suspend and resume going on, it just didn't show up
> in the logs for I had CONFIG_USB_DEBUG disabled for that test accidentally. 
> 
> http://userweb.kernel.org/~brodo/dmesg-autosuspend-2b.txt
> 
> from today confirms it, and I also ran the pre-autosuspend kernel ( its head
> is 8c03356a559ced6fa78931f498193f776d67e445 ) to re-check that it is an issue
> which appeared with the autosuspend patch.

Okay, I was fooled by the lack of debugging info.  And clearly the suspend 
or resume routine is implicated.

This suggests we check the value of the intr_enable register at the entry 
and exit of both ehci_bus_suspend() and ehci_bus_resume() in ehci-hub.c.  
You can add the appropriate printk statements easily.

Hmmm...  Perhaps the final writel() at the end of ehci_bus_resume() isn't
getting sent through.  The mere act of reading it back might be enough to
change the behavior.

On the other hand, your latest log suggests that the STS_FLR bit gets set 
during the ehci_bus_suspend() routine, not the resume routine.  So it will 
be best to check at the beginning and end of both routines.

Alan Stern




Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-12 Thread Dominik Brodowski
On Tue, Dec 12, 2006 at 11:27:42AM -0500, Alan Stern wrote:
> > from today confirms it, and I also ran the pre-autosuspend kernel ( its head
> > is 8c03356a559ced6fa78931f498193f776d67e445 ) to re-check that it is an issue
> > which appeared with the autosuspend patch.
> 
> Okay, I was fooled by the lack of debugging info.  And clearly the suspend 
> or resume routine is implicated.
> 
> This suggests we check the value of the intr_enable register at the entry 
> and exit of both ehci_bus_suspend() and ehci_bus_resume() in ehci-hub.c.  
> You can add the appropriate printk statements easily.
> 
> Hmmm...  Perhaps the final writel() at the end of ehci_bus_resume() isn't
> getting sent through.  The mere act of reading it back might be enough to
> change the behavior.
> 
> On the other hand, your latest log suggests that the STS_FLR bit gets set 
> during the ehci_bus_suspend() routine, not the resume routine.  So it will 
> be best to check at the beginning and end of both routines.

Unfortunately, it doesn't get set inside either routine, but somewhere outside them:

case A:
http://userweb.kernel.org/~brodo/dmesg-autosuspend-3-sr.txt

[   26.380849] ehci_bus_resume: intro 37
[   26.380852] ehci_hcd :00:1d.7: resume root hub
[   26.400866] ehci_bus_resume: exit 37 (mask was 37)
[   26.411590] hub 1-0:1.0: state 7 ports 6 chg  evt 
[   26.411679] usb 1-1: usb auto-resume
[   26.437439] ehci_hcd :00:1d.7: GetStatus port 1 status 001005 POWER sig=se0 PE CONNECT
[   26.448389] usb 1-1: finish resume
[   26.448681] ehci_hcd :00:1d.7: IRQ status 8009, intr_enable 37
...
[   26.453427] ehci_hcd :00:1d.7: IRQ status 8028, intr_enable 37
...
[   28.900407] ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
[   28.900419] ACPI: Processor [CPU0] (supports 8 throttling states)
[   17.451000] Time: acpi_pm clocksource has been installed.
[   17.545000] acpi_processor-0740 [00] processor_preregister_: Error while
parsing _PSD domain information. Assuming no coordination
[   17.546000] ehci_hcd :00:1d.7: IRQ status 2008, intr_enable 3f

All of a sudden, it seems...


Case B:
http://userweb.kernel.org/~brodo/dmesg-autosuspend-2-sr.txt

[   15.582559] ehci_hcd :00:1d.7: IRQ status 4, intr_enable 37
[   16.327713] libata version 2.00 loaded.
[   17.575888] hub 1-0:1.0: hub_suspend
[   17.575896] ehci_bus_suspend: intro 37
[   17.576013] ehci_bus_suspend: exit 37 (mask was 37)
[   17.576620] usb usb1: usb auto-suspend
...
[   22.312792] ehci_hcd :00:1d.7: IRQ status 8, intr_enable 3f


For Case B, it seems to happen while the hub is suspended; while for
Case A, it seems to happen while the hub is resumed. It just gets more and
more strange. Broken hardware?[*]

Thanks,
Dominik

[*] ... which otherwise seems to work fine...
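
A note on reading the register values above: the two intr_enable values that
show up in these logs (37 and 3f, both hexadecimal) differ by a single bit.
The sketch below is a user-space illustration, not driver code. It assumes the
bit layout from the EHCI specification and the macro names used in the
kernel's ehci.h; unexpected_intr_bits() and report_intr_enable() are
hypothetical helper names made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* EHCI USBSTS/USBINTR bit positions (assumed from the EHCI spec). */
#define STS_INT   (1u << 0)  /* transfer completed */
#define STS_ERR   (1u << 1)  /* transfer error */
#define STS_PCD   (1u << 2)  /* port change detect */
#define STS_FLR   (1u << 3)  /* frame list rollover */
#define STS_FATAL (1u << 4)  /* host system error */
#define STS_IAA   (1u << 5)  /* interrupt on async advance */

/* The interrupts the driver asks for; this evaluates to 0x37. */
#define INTR_MASK (STS_IAA | STS_FATAL | STS_PCD | STS_ERR | STS_INT)

/* Return the bits set in intr_enable that are not part of INTR_MASK. */
static uint32_t unexpected_intr_bits(uint32_t intr_enable)
{
    return intr_enable & ~INTR_MASK;
}

/* Hypothetical helper: print a one-line summary for a logged value. */
static void report_intr_enable(FILE *out, uint32_t intr_enable)
{
    assert(out != NULL);

    uint32_t extra = unexpected_intr_bits(intr_enable);

    fprintf(out, "intr_enable %02x: unexpected bits %02x%s\n",
            (unsigned)intr_enable, (unsigned)extra,
            (extra & STS_FLR) ? " (STS_FLR)" : "");
}
```

Called with 0x3f, unexpected_intr_bits() returns 0x08, which is the Frame
List Rollover enable (STS_FLR); called with 0x37 it returns 0.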



Re: [linux-usb-devel] autosuspend IRQ trouble

2006-12-12 Thread Alan Stern
On Tue, 12 Dec 2006, Dominik Brodowski wrote:

> > On the other hand, your latest log suggests that the STS_FLR bit gets set 
> > during the ehci_bus_suspend() routine, not the resume routine.  So it will 
> > be best to check at the beginning and end of both routines.
> 
> Unfortunately, it doesn't get set anywhere then, but outside:
> 
> case A:
> http://userweb.kernel.org/~brodo/dmesg-autosuspend-3-sr.txt
> 
> [   26.380849] ehci_bus_resume: intro 37
> [   26.380852] ehci_hcd :00:1d.7: resume root hub
> [   26.400866] ehci_bus_resume: exit 37 (mask was 37)
> [   26.411590] hub 1-0:1.0: state 7 ports 6 chg  evt 
> [   26.411679] usb 1-1: usb auto-resume
> [   26.437439] ehci_hcd :00:1d.7: GetStatus port 1 status 001005 POWER 
> sig=se0 PE CONNECT
> [   26.448389] usb 1-1: finish resume
> [   26.448681] ehci_hcd :00:1d.7: IRQ status 8009, intr_enable 37
> ...
> [   26.453427] ehci_hcd :00:1d.7: IRQ status 8028, intr_enable 37
> ...
> [   28.900407] ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
> [   28.900419] ACPI: Processor [CPU0] (supports 8 throttling states)
> [   17.451000] Time: acpi_pm clocksource has been installed.
> [   17.545000] acpi_processor-0740 [00] processor_preregister_: Error while
> parsing _PSD domain information. Assuming no coordination
> [   17.546000] ehci_hcd :00:1d.7: IRQ status 2008, intr_enable 3f
> 
> All of a sudden, it seems...
> 
> 
> Case B:
> http://userweb.kernel.org/~brodo/dmesg-autosuspend-2-sr.txt
> 
> [   15.582559] ehci_hcd :00:1d.7: IRQ status 4, intr_enable 37
> [   16.327713] libata version 2.00 loaded.
> [   17.575888] hub 1-0:1.0: hub_suspend
> [   17.575896] ehci_bus_suspend: intro 37
> [   17.576013] ehci_bus_suspend: exit 37 (mask was 37)
> [   17.576620] usb usb1: usb auto-suspend
> ...
> [   22.312792] ehci_hcd :00:1d.7: IRQ status 8, intr_enable 3f
> 
> 
> For Case B, it seems to happen while the hub is suspended; while for
> Case A, it seems to happen while the hub is resumed. It just gets more and
> more strange. Broken hardware?[*]

I can't find anything in the driver that would set the bit.  Certainly not 
while the controller is suspended and nothing is happening.  It's possible 
that some other driver is setting it by mistake, but that seems pretty 
unlikely.  However I haven't tried running any new kernels recently; I've 
been waiting for 2.6.20-rc1 to appear.  Perhaps the same thing will show 
up on my machine...

But if some other driver is responsible, why wouldn't it happen without 
the autosuspend patch?  More likely the suspend triggers the hardware into 
doing something really funky.

You could try another test: Let the IRQ be disabled, and then rmmod 
ehci-hcd and modprobe it back.  Perhaps then rmmod usb-storage to force 
another suspend, perhaps not.  Anyway, see what happens to intr_enable.  
You can always force interrupts to occur by turning the USB HD off or on.

It would be possible to patch the ehci_irq() routine to have it turn off
the STS_FLR bit whenever necessary.  But first I would like to know what
causes it to turn on at all.  Maybe it really is a hardware problem.

Alan Stern




Re: [linux-usb-devel] autosuspend IRQ trouble

2007-01-28 Thread Dominik Brodowski
Hi,

Sorry for not getting back to you earlier -- many things kept me distracted:

On Tue, Dec 12, 2006 at 04:08:39PM -0500, Alan Stern wrote:
> On Tue, 12 Dec 2006, Dominik Brodowski wrote:
> 
> > > On the other hand, your latest log suggests that the STS_FLR bit gets set 
> > > > during the ehci_bus_suspend() routine, not the resume routine.  So it will 
> > > > be best to check at the beginning and end of both routines.
> > 
> > Unfortunately, it doesn't get set anywhere then, but outside:
> > 
> > case A:
> > http://userweb.kernel.org/~brodo/dmesg-autosuspend-3-sr.txt
> > 
> > [   26.380849] ehci_bus_resume: intro 37
> > [   26.380852] ehci_hcd :00:1d.7: resume root hub
> > [   26.400866] ehci_bus_resume: exit 37 (mask was 37)
> > [   26.411590] hub 1-0:1.0: state 7 ports 6 chg  evt 
> > [   26.411679] usb 1-1: usb auto-resume
> > [   26.437439] ehci_hcd :00:1d.7: GetStatus port 1 status 001005 POWER 
> > sig=se0 PE CONNECT
> > [   26.448389] usb 1-1: finish resume
> > [   26.448681] ehci_hcd :00:1d.7: IRQ status 8009, intr_enable 37
> > ...
> > [   26.453427] ehci_hcd :00:1d.7: IRQ status 8028, intr_enable 37
> > ...
> > [   28.900407] ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
> > [   28.900419] ACPI: Processor [CPU0] (supports 8 throttling states)
> > [   17.451000] Time: acpi_pm clocksource has been installed.
> > [   17.545000] acpi_processor-0740 [00] processor_preregister_: Error while
> > parsing _PSD domain information. Assuming no coordination
> > [   17.546000] ehci_hcd :00:1d.7: IRQ status 2008, intr_enable 3f
> > 
> > All of a sudden, it seems...
> > 
> > 
> > Case B:
> > http://userweb.kernel.org/~brodo/dmesg-autosuspend-2-sr.txt
> > 
> > [   15.582559] ehci_hcd :00:1d.7: IRQ status 4, intr_enable 37
> > [   16.327713] libata version 2.00 loaded.
> > [   17.575888] hub 1-0:1.0: hub_suspend
> > [   17.575896] ehci_bus_suspend: intro 37
> > [   17.576013] ehci_bus_suspend: exit 37 (mask was 37)
> > [   17.576620] usb usb1: usb auto-suspend
> > ...
> > [   22.312792] ehci_hcd :00:1d.7: IRQ status 8, intr_enable 3f
> > 
> > 
> > For Case B, it seems to happen while the hub is suspended; while for
> > Case A, it seems to happen while the hub is resumed. It just gets more and
> > more strange. Broken hardware?[*]
> 
> I can't find anything in the driver that would set the bit.  Certainly not 
> while the controller is suspended and nothing is happening.  It's possible 
> that some other driver is setting it by mistake, but that seems pretty 
> unlikely.  However I haven't tried running any new kernels recently; I've 
> been waiting for 2.6.20-rc1 to appear.  Perhaps the same thing will show 
> up on my machine...
> 
> But if some other driver is responsible, why wouldn't it happen without 
> the autosuspend patch?  More likely the suspend triggers the hardware into 
> doing something really funky.
> 
> You could try another test: Let the IRQ be disabled, and then rmmod 
> ehci-hcd and modprobe it back.  Perhaps then rmmod usb-storage to force 
> another suspend, perhaps not.  Anyway, see what happens to intr_enable.  
> You can always force interrupts to occur by turning the USB HD off or on.

Unfortunately, the IRQ line is and stays disabled, and I don't know how I
could re-enable it.

> It would be possible to patch the ehci_irq() routine to have it turn off
> the STS_FLR bit whenever necessary.  But first I would like to know what
> causes it to turn on at all.  Maybe it really is a hardware problem.

I did that (see below), but then the IRQ subsystem continued to be "dead"
-- it seemed to me that the USB hub is in a completely broken state when
this codepath is entered...

To further complicate matters, I now also see IRQ10 being disabled whenever
I switch from the console (i810) to a terminal -- but that bug was
introduced _later_ than the autosuspend bug. And, 2.6.19 works perfectly
fine with regard to both issues...

Dominik



diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 025d333..d10cf42 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -569,6 +569,7 @@ static irqreturn_t ehci_irq (struct usb_hcd *hcd)
struct ehci_hcd *ehci = hcd_to_ehci (hcd);
u32 status;
int bh;
+   u32 intr_enable;
 
spin_lock (&ehci->lock);
 
@@ -580,6 +581,17 @@ static irqreturn_t ehci_irq (struct usb_hcd *hcd)
goto dead;
}
 
+   intr_enable = readl(&ehci->regs->intr_enable);
+
+   if (unlikely(intr_enable != INTR_MASK)) {
+   ehci_info (ehci, "STS_FLR - clearing. status: 0x%x intr 0x%x (0x%x\n", status, intr_enable, INTR_MASK);
+   writel(INTR_MASK, &ehci->regs->intr_enable);
+   if (!(status & INTR_MASK)) {
+   spin_unlock(&ehci->lock);
+   return IRQ_HANDLED;
+   }
+   }
+

Re: [linux-usb-devel] autosuspend IRQ trouble

2007-01-29 Thread Alan Stern
On Sun, 28 Jan 2007, Dominik Brodowski wrote:

> > You could try another test: Let the IRQ be disabled, and then rmmod 
> > ehci-hcd and modprobe it back.  Perhaps then rmmod usb-storage to force 
> > another suspend, perhaps not.  Anyway, see what happens to intr_enable.  
> > You can always force interrupts to occur by turning the USB HD off or on.
> 
> Unfortunately, the IRQ line is and stays disabled, and I don't know how I
> could re-enable it.

Loading a driver that uses the IRQ should re-enable it.

> > It would be possible to patch the ehci_irq() routine to have it turn off
> > the STS_FLR bit whenever necessary.  But first I would like to know what
> > causes it to turn on at all.  Maybe it really is a hardware problem.
> 
> I did that (see below), but then the IRQ subsystem continued to be "dead"
> -- it seemed to me that the USB hub is in a completely broken state when
> this codepath is entered...

What exactly do you mean?  Can you provide a dmesg log with
CONFIG_USB_DEBUG turned on?

> To further complicate matters, I now also see IRQ10 being disabled whenever
> I switch from the console (i810) to a terminal -- but that bug was
> introduced _later_ than the autosuspend bug. And, 2.6.19 works perfectly
> fine with regard to both issues...

Are you entirely certain that 2.6.19 works perfectly?  It does a lot less 
autosuspending than 2.6.20, true...  But if you force a suspend in 2.6.19 
do you then see the same sort of IRQ trouble?

Alan Stern

PS: Your patch did not include a PCI read to flush the interrupt-mask 
write.
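
The missing flush can be sketched as follows. On PCI, writes may be posted,
and a read from the same device forces them to complete; in the kernel this
would be a writel() followed by a readl() of the same register. The user-space
stand-in below only shows the pattern on an ordinary variable, so it cannot
demonstrate the bus behaviour itself. write_reg_flushed() is a hypothetical
name, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write a value to a register and read it back immediately.
 * On real PCI hardware the read-back is what pushes a posted
 * write out to the device; here both accesses hit plain memory.
 */
static void write_reg_flushed(volatile uint32_t *reg, uint32_t value)
{
    assert(reg != NULL);

    *reg = value;   /* stands in for writel(value, reg) */
    (void)*reg;     /* stands in for readl(reg), result discarded */
}
```

In the patch above, the equivalent change would be a read of intr_enable
right after the write of INTR_MASK.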




Re: [linux-usb-devel] autosuspend IRQ trouble

2007-01-29 Thread Dominik Brodowski
Hi,

On Mon, Jan 29, 2007 at 10:41:58AM -0500, Alan Stern wrote:
> On Sun, 28 Jan 2007, Dominik Brodowski wrote:
> 
> > > You could try another test: Let the IRQ be disabled, and then rmmod 
> > > ehci-hcd and modprobe it back.  Perhaps then rmmod usb-storage to force 
> > > another suspend, perhaps not.  Anyway, see what happens to intr_enable.  
> > > You can always force interrupts to occur by turning the USB HD off or on.
> > 
> > Unfortunately, the IRQ line is and stays disabled, and I don't know how I
> > could re-enable it.
> 
> Loading a driver that uses the IRQ should re-enable it.

Ah, okay, need to remember that. Thanks.

> > > It would be possible to patch the ehci_irq() routine to have it turn off
> > > the STS_FLR bit whenever necessary.  But first I would like to know what
> > > causes it to turn on at all.  Maybe it really is a hardware problem.
> > 
> > I did that (see below), but then the IRQ subsystem continued to be "dead"
> > -- it seemed to me that the USB hub is in a completely broken state when
> > this codepath is entered...
> 
> What exactly do you mean?

All right, done some more debugging:
- using my patch with your fix, devices which are already plugged in when
  the STS_FLR exception occurs continue to work
- however, new devices which are plugged in, or devices which are removed,
  (unless the hub driver is awakened by other means) do not get noticed
- the STS_FLR exception is easily reproducible for me:
  - plug in USB HD
  - rmmod usb_storage
  - wait between one and seven seconds

> Can you provide a dmesg log with CONFIG_USB_DEBUG turned on?

http://userweb.kernel.org/~brodo/dmesg-2.6.19.2 - 2.6.19.2 which really
seems to work fine (even when doing suspend to RAM and suspend
to disk)
http://userweb.kernel.org/~brodo/dmesg-2.6.20-rc6 - the first part with
all the initialization, and the first STS_FLR exception occurring
http://userweb.kernel.org/~brodo/dmesg-2.6.20-rc6-part2 - second part where
the USB devices are partly in use, partly suspending is forced,
and STS_FLR exceptions occur

> > To further complicate matters, I now also see IRQ10 being disabled whenever
> > I switch from the console (i810) to a terminal -- but that bug was
> > introduced _later_ than the autosuspend bug. And, 2.6.19 works perfectly
> > fine with regard to both issues...
> 
> Are you entirely certain that 2.6.19 works perfectly?  It does a lot less 
> autosuspending than 2.6.20, true...  But if you force a suspend in 2.6.19 
> do you then see the same sort of IRQ trouble?

How do I force a suspend? Is suspend to RAM / suspend to disk enough to
force it, or can you tell from something else in the dmesg?

Hope this helps,
Dominik



Re: [linux-usb-devel] autosuspend IRQ trouble

2007-01-30 Thread Alan Stern
On Mon, 29 Jan 2007, Dominik Brodowski wrote:

> All right, done some more debugging:
> - using my patch with your fix, devices which are already plugged in when
>   the STS_FLR exception occurs continue to work
> - however, new devices which are plugged in, or devices which are removed,
>   (unless the hub driver is awakened by other means) do not get noticed

This means that the port-change events don't generate interrupt requests.

Try running this test again, with CONFIG_USB_DEBUG turned on.  After 
plugging in a new device, make a copy of

/sys/class/usb_host/usb_host1/registers

and post it.  Ditto for unplugging an existing device.

> - the STS_FLR exception is easily reproducible for me:
>   - plug in USB HD
>   - rmmod usb_storage
>   - wait between one and seven seconds

Yep, it seems to happen every time the root hub is suspended.  Does it 
happen also if you simply unplug the USB HD?


> > Are you entirely certain that 2.6.19 works perfectly?  It does a lot less 
> > autosuspending than 2.6.20, true...  But if you force a suspend in 2.6.19 
> > do you then see the same sort of IRQ trouble?
> 
> How do I force a suspend? Is suspend to RAM / suspend to disk enough to
> force it, or from what you can else see in the dmesg?

Turn on CONFIG_PM_SYSFS_DEPRECATED in your kernel build of 2.6.19.  After 
booting and plugging in the USB HD, rmmod usb-storage.  Then do

echo -n 2 >/sys/bus/usb/devices/usb1/power/state

That will force the root hub to suspend.

Alan Stern




Re: [linux-usb-devel] autosuspend IRQ trouble

2007-01-30 Thread Dominik Brodowski
Hi,

On Tue, Jan 30, 2007 at 10:50:19AM -0500, Alan Stern wrote:
> On Mon, 29 Jan 2007, Dominik Brodowski wrote:
> 
> > All right, done some more debugging:
> > - using my patch with your fix, devices which are already plugged in when
> >   the STS_FLR exception occurs continue to work
> > - however, new devices which are plugged in, or devices which are removed,
> >   (unless the hub driver is awakened by other means) do not get noticed
> 
> This means that the port-change events don't generate interrupt requests.
> 
> Try running this test again, with CONFIG_USB_DEBUG turned on.  After 
> plugging in a new device, make a copy of
> 
>   /sys/class/usb_host/usb_host1/registers
> 
> and post it.  Ditto for unplugging an existing device.

http://userweb.kernel.org/~brodo/pre-removal
=> removed device 1
http://userweb.kernel.org/~brodo/post-removal
=> removed device 2
http://userweb.kernel.org/~brodo/post-removal2

next test:
http://userweb.kernel.org/~brodo/pre-insert
=> added device 2
http://userweb.kernel.org/~brodo/post-insert


The one thing which strikes me as odd is that all these tests are ONLY
reproducible if there is a USB device plugged in during boot. If it wasn't
plugged in there during boot, I can do whatever I want, everything works
perfectly fine.

Changing the only USB-related entry in the BIOS ("Legacy USB Support") does
not change anything.

> > - the STS_FLR exception is easily reproducible for me:
> >   - plug in USB HD
> >   - rmmod usb_storage
> >   - wait between one and seven seconds
> 
> Yep, it seems to happen every time the root hub is suspended.  Does it 
> happen also if you simply unplug the USB HD?

Yes.

> > > Are you entirely certain that 2.6.19 works perfectly?  It does a lot less 
> > > autosuspending than 2.6.20, true...  But if you force a suspend in 2.6.19 
> > > do you then see the same sort of IRQ trouble?
> > 
> > How do I force a suspend? Is suspend to RAM / suspend to disk enough to
> > force it, or from what you can else see in the dmesg?
> 
> Turn on CONFIG_PM_SYSFS_DEPRECATED in your kernel build of 2.6.19.  After 
> booting and plugging in the USB HD, rmmod usb-storage.  Then do
> 
>   echo -n 2 >/sys/bus/usb/devices/usb1/power/state
> 
> That will force the root hub to suspend.

Using this trick, I can get IRQ 10 to be disabled on 2.6.19, too.

So technically it's not a regression, but... ;)

Dominik



Re: [linux-usb-devel] autosuspend IRQ trouble

2007-01-30 Thread Alan Stern
On Tue, 30 Jan 2007, Dominik Brodowski wrote:

> > Try running this test again, with CONFIG_USB_DEBUG turned on.  After 
> > plugging in a new device, make a copy of
> > 
> > /sys/class/usb_host/usb_host1/registers
> > 
> > and post it.  Ditto for unplugging an existing device.
> 
> http://userweb.kernel.org/~brodo/pre-removal
> => removed device 1
> http://userweb.kernel.org/~brodo/post-removal
> => removed device 2
> http://userweb.kernel.org/~brodo/post-removal2
> 
> next test:
> http://userweb.kernel.org/~brodo/pre-insert
> => added device 2
> http://userweb.kernel.org/~brodo/post-insert

These were all done with the controller supposedly suspended, right?  But 
they all show that it is actually running!  (Which isn't too surprising 
when you think about it, because the FLR bit can't get set unless the 
controller is running.)

> The one thing which strikes me as odd is that all these tests are ONLY
> reproducible if there is a USB device plugged in during boot. If it wasn't
> plugged in there during boot, I can do whatever I want, everything works
> perfectly fine.

This combined with everything else suggests very strongly a bug in the 
BIOS.  Have you checked for any BIOS updates?

> > Turn on CONFIG_PM_SYSFS_DEPRECATED in your kernel build of 2.6.19.  After 
> > booting and plugging in the USB HD, rmmod usb-storage.  Then do
> > 
> > echo -n 2 >/sys/bus/usb/devices/usb1/power/state
> > 
> > That will force the root hub to suspend.
> 
> Using this trick, I can get IRQ 10 being disabled.
> 
> So technically it's not a regression, but... ;)

There may not be anything we can do about a rogue BIOS, other than to 
avoid suspending the controller at all.

Just to make sure, try this:  In ehci-hub.c, near the end of 
ehci_bus_suspend(), print out the value returned by ehci_halt().  If it is 
0 then we will know that the controller does get suspended correctly and 
something (the BIOS?) starts it up for no good reason.

The fact that suspend-to-RAM and -to-disk work okay might be explained by
other things happening during the suspend procedure.  Not only is the USB
controller suspended, but its interrupt mask is cleared and its upstream
PCI link gets suspended as well.  That might be enough to prevent the BIOS
from interfering.

Alan Stern




Re: [linux-usb-devel] autosuspend IRQ trouble

2007-01-30 Thread Dominik Brodowski
Hi,

On Tue, Jan 30, 2007 at 03:31:24PM -0500, Alan Stern wrote:
> > next test:
> > http://userweb.kernel.org/~brodo/pre-insert
> > => added device 2
> > http://userweb.kernel.org/~brodo/post-insert
> 
> These were all done with the controller supposedly suspended, right?

Yes.

> But 
> they all show that it is actually running!  (Which isn't too surprising 
> when you think about it, because the FLR bit can't get set unless the 
> controller is running.)

Ouch.

> > The one thing which strikes me as odd is that all these tests are ONLY
> > reproducible if there is a USB device plugged in during boot. If it wasn't
> > plugged in there during boot, I can do whatever I want, everything works
> > perfectly fine.
> 
> This combined with everything else suggests very strongly a bug in the 
> BIOS.  Have you checked for any BIOS updates?

Hmmm, not lately -- I'll check it out (but only later this week, I can't
risk breaking my notebook at the moment ;) )

> > > Turn on CONFIG_PM_SYSFS_DEPRECATED in your kernel build of 2.6.19.  After 
> > > booting and plugging in the USB HD, rmmod usb-storage.  Then do
> > > 
> > >   echo -n 2 >/sys/bus/usb/devices/usb1/power/state
> > > 
> > > That will force the root hub to suspend.
> > 
> > Using this trick, I can get IRQ 10 being disabled.
> > 
> > So technically it's not a regression, but... ;)
> 
> There may not be anything we can do about a rogue BIOS, other than to 
> avoid suspending the controller at all.
> 
> Just to make sure, try this:  In ehci-hub.c, near the end of 
> ehci_bus_suspend(), print out the value returned by ehci_halt().  If it is 
> 0 then we will know that the controller does get suspended correctly and 
> something (the BIOS?) starts it up for no good reason.

+   ret = ehci_halt (ehci);
+   printk(KERN_INFO "ehci_halt in ehci_bus_suspend returns %d\n", ret);

Jan 30 18:05:54 [kernel] [   23.918690] ehci_halt in ehci_bus_suspend returns 0
...
Jan 30 18:05:54 [kernel] [   24.570171] cs: IO port probe 0x800-0x8ff:<6>ehci_hcd :00:1d.7: STS_FLR - clearing. status: 0x2008 intr 0x3f (0x37

Well, if we notice it has been awakened by the BIOS could we call the
resume() routines as a workaround?

> The fact that suspend-to-RAM and -to-disk work okay might be explained by
> other things happening during the suspend procedure.  Not only is the USB
> controller suspended, but its interrupt mask is cleared and its upstream
> PCI link gets suspended as well.  That might be enough to prevent the BIOS
> from interfering.

Thanks,
Dominik



Re: [linux-usb-devel] autosuspend IRQ trouble

2007-01-31 Thread Alan Stern
On Tue, 30 Jan 2007, Dominik Brodowski wrote:

> > There may not be anything we can do about a rogue BIOS, other than to 
> > avoid suspending the controller at all.
> > 
> > Just to make sure, try this:  In ehci-hub.c, near the end of 
> > ehci_bus_suspend(), print out the value returned by ehci_halt().  If it is 
> > 0 then we will know that the controller does get suspended correctly and 
> > something (the BIOS?) starts it up for no good reason.
> 
> +   ret = ehci_halt (ehci);
> +   printk(KERN_INFO "ehci_halt in ehci_bus_suspend returns %d\n", ret);
> 
> Jan 30 18:05:54 [kernel] [   23.918690] ehci_halt in ehci_bus_suspend returns 0
> ...
> Jan 30 18:05:54 [kernel] [   24.570171] cs: IO port probe 
> 0x800-0x8ff:<6>ehci_hcd :00:1d.7: STS_FLR - clearing. status: 0x2008 intr 
> 0x3f (0x37

Yep, no question about it.  Something (probably the BIOS) is restarting 
the controller and turning on the FLR interrupt mask bit.
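
The two values in that log line (status 0x2008, intr 0x3f) can be decoded
with a few lines of user-space C. This is only a sketch: it assumes the EHCI
register layout, in which bit 3 is Frame List Rollover and bit 12 is
HCHalted, and struct ehci_snapshot and decode_snapshot() are hypothetical
names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STS_FLR  (1u << 3)   /* frame list rollover */
#define STS_HALT (1u << 12)  /* HCHalted: set while the controller is stopped */

/* What the status and interrupt-enable registers say about FLR. */
struct ehci_snapshot {
    bool running;      /* HCHalted bit is clear */
    bool flr_pending;  /* FLR bit set in the status register */
    bool flr_enabled;  /* FLR bit set in the interrupt-enable register */
};

/* Fill *out from one logged (status, intr_enable) pair. */
static void decode_snapshot(uint32_t status, uint32_t intr_enable,
                            struct ehci_snapshot *out)
{
    assert(out != NULL);

    out->running     = (status & STS_HALT) == 0;
    out->flr_pending = (status & STS_FLR) != 0;
    out->flr_enabled = (intr_enable & STS_FLR) != 0;
}
```

For status 0x2008 and intr_enable 0x3f this reports a running controller with
FLR both pending and enabled, even though ehci_halt() had returned 0 less than
a second earlier.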

> Well, if we notice it has been awakened by the BIOS could we call the
> resume() routines as a workaround?

We don't have a process context handy in which to do that.  Also I'm not 
sure it would work; the resume routine expects the controller to be 
suspended, not already running.  And what's the point of suspending, being 
woken up a few seconds later, suspending again, getting woken up again, 
... ad infinitum?

Better simply to avoid trying to suspend the controller in the first
place.  There are patches under discussion to make that sort of thing 
easier.  For now you can try doing this:

Edit drivers/usb/core/usb.h and change the definition of
USB_AUTOSUSPEND_DELAY to (HZ*60).

Add to your /etc/rc.d/rc.local (or someplace equivalent) a
line saying:

echo disabled >/sys/bus/usb/devices/usb1/power/wakeup

Increasing the initial autosuspend delay to 60 seconds will give the 
system a chance to run rc.local, and disabling remote wakeup on the 
controller will then prevent it from being autosuspended.

Alternatively, since you don't have any high-speed USB devices attached, 
you could simply prevent ehci-hcd from being loaded in the first place.  
That would certainly solve the problem. :-)

Alan Stern

