Re: PROBLEM: kernel hang in ohci init
Ok, opened up:

http://bugzilla.kernel.org/show_bug.cgi?id=9026

and brought it up to date with the discussion and David's comments on
this thread. Timo, please feel free to revisit this later and update us
when you find the time to do so.

[ BTW I think the "add CC:" thing in bugzilla is broken, I was simply
unable to add David Brownell, linux-acpi@ and Timo to the CC: for that
bug, if somebody knows how to do this, please add them ... ]

Thanks,
Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: PROBLEM: kernel hang in ohci init
On 9/16/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:
> Hi Timo,
>
> On 7/15/07, Timo Lindemann <[EMAIL PROTECTED]> wrote:
> > To sum this up:
> >
> > the userspace 2.6.20.6 (the "good" kernel) and 2.6.22 (the "bad" kernel)
> > were compiled in is exactly the same setup. I recompiled "good" to check
> > for that, earlier, but "good" also works then.
> >
> > "good" does not exhibit the printks I placed in the section (the same
> > ones I did for "bad"), making it plausible that the section is not
> > executed at all.
> >
> > dmesg is not captured to disk, netconsole and serial console also do not
> > work (they both did in the "good" kernel). Also, my keyboard does not
> > work with "bad" during that phase -- Magic SysRq is also not working then.
> >
> > I can try to hook up the laptop to an external monitor to capture some
> > more dmesg, and just shoot a photo, but I am right now trying to work
> > with git, as Satyam suggested.
>
> Any updates on this for us? Or did the kernel start booting magically again
> ca. 2.6.23-rc6? ;-)

Should again add that best would still be to simply git-bisect Linus'
(mainline) kernel tree between 2.6.20 (not 2.6.20.6) and 2.6.22 and just
find the commit after which your box stops booting ...

> Anyway, it appears the bug got introduced sometime between 2.6.20 and
> 2.6.22 so probably bugzilla becomes a better place to track this one. Could
> you open up a bug report (similar to your original post) there?
>
> Thanks,
> Satyam
Re: PROBLEM: kernel hang in ohci init
Hi Timo,

On 7/15/07, Timo Lindemann <[EMAIL PROTECTED]> wrote:
> To sum this up:
>
> the userspace 2.6.20.6 (the "good" kernel) and 2.6.22 (the "bad" kernel)
> were compiled in is exactly the same setup. I recompiled "good" to check
> for that, earlier, but "good" also works then.
>
> "good" does not exhibit the printks I placed in the section (the same
> ones I did for "bad"), making it plausible that the section is not
> executed at all.
>
> dmesg is not captured to disk, netconsole and serial console also do not
> work (they both did in the "good" kernel). Also, my keyboard does not
> work with "bad" during that phase -- Magic SysRq is also not working then.
>
> I can try to hook up the laptop to an external monitor to capture some
> more dmesg, and just shoot a photo, but I am right now trying to work
> with git, as Satyam suggested.

Any updates on this for us? Or did the kernel start booting magically
again ca. 2.6.23-rc6? ;-)

Anyway, it appears the bug got introduced sometime between 2.6.20 and
2.6.22 so probably bugzilla becomes a better place to track this one.
Could you open up a bug report (similar to your original post) there?

Thanks,
Satyam
Re: PROBLEM: kernel hang in ohci init
On Sunday 15 July 2007, Satyam Sharma wrote:
> On 7/15/07, Timo Lindemann <[EMAIL PROTECTED]> wrote:
> > It is just odd that up to (not including) the 2.6.21-series every kernel
> > boots, and after that, they just freeze.

On *your* system, note -- all my OHCI+PCI systems that have been
upgraded to 2.6.22 are behaving just peachy-keen-swell. And that's
true for most people, it seems...

> > I am kinda stumped here.

It gets that way sometimes. Thing is, pci-quirks.c runs early enough
in the boot process -- before the OHCI driver can even run!! -- that
you can probably rule out the USB stack as being the cause of this
regression. Disable the USB host controllers in your config, and
see what happens...

> Hey, just try git-bisect already :-)
>
> In fact, you can first try by just reverting / un-applying that patch that
> you initially had a suspicion on.

Extremely unlikely to matter, since it wouldn't have been able to
run that early. Plus, you were seeing problems even before that
recent change to pci-quirks ...

> Or, because you've already spent
> some time tracking down the issue, you could simply go through the
> git history of that file / subsystem in question

Where the subsystem in question is early PCI/ACPI initialization,
before the drivers start binding to PCI devices... it's always
annoying when changes in that area cause USB to break, since the
only involvement of USB is to display a "rude failure" symptom.
It took a long time to get the IRQ setup glitches fixed!

One thing you might do is enable all the ACPI debug messaging
and disable the usb/host/pci-quirks.c stuff (just comment it
all out), assuming you can boot without USB keyboard/mouse.
Then compare the relevant diagnostics between "good" and "bad"
kernels. It's likely something interesting will appear.

- Dave
Re: PROBLEM: kernel hang in ohci init
To sum this up:

the userspace 2.6.20.6 (the "good" kernel) and 2.6.22 (the "bad" kernel)
were compiled in is exactly the same setup. I recompiled "good" to check
for that, earlier, but "good" also works then.

"good" does not exhibit the printks I placed in the section (the same
ones I did for "bad"), making it plausible that the section is not
executed at all.

dmesg is not captured to disk, netconsole and serial console also do not
work (they both did in the "good" kernel). Also, my keyboard does not
work with "bad" during that phase -- Magic SysRq is also not working then.

I can try to hook up the laptop to an external monitor to capture some
more dmesg, and just shoot a photo, but I am right now trying to work
with git, as Satyam suggested.

Thanks very much for reading and helping :-)

Regards,
TL
--
Re: PROBLEM: kernel hang in ohci init
On 7/15/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:
> On 7/15/07, Timo Lindemann <[EMAIL PROTECTED]> wrote:
> > [...]
> > >>> after this, kernel apparently goes into busy waiting (fans gradually
> > >>> turn louder) and hangs indefinitely. I have also made sure that writel
> > >>> (in linux/include/asm/io.h) really is entered, but never returns.
> > >
> > > Does the current kernel.org GIT tree do the same thing? A bunch
> > > of USB patches were recently merged, including ISTR one in that
> > > area ...
> >
> > It does the same thing, git5, that is. Sorry I took so long, but I didn't
> > get to testing this earlier.
> >
> > It is just odd that up to (not including) the 2.6.21-series every kernel
> > boots, and after that, they just freeze.

There could be another thing, of course. The kernel sources (or .config)
needn't be the only variable here -- if you're using the "old" kernel
image for the 2.6.20 kernel that works, it could be the case that perhaps
you've upgraded userspace packages (compiler/toolchain) in the meanwhile
that's causing this breakage ... so to test, try compiling the 2.6.20 on
your system again (with same .config) and see if it works now ...
Re: PROBLEM: kernel hang in ohci init
On 7/15/07, Timo Lindemann <[EMAIL PROTECTED]> wrote:
> David Brownell wrote:
> > On Thursday 12 July 2007, Satyam Sharma wrote:
> >
> > Note that hangs in that file almost always mean "your BIOS is goofy".
> > Hunt for BIOS settings related to USB, and change them.
>
> This laptop's BIOS only offers "legacy support" enabled or disabled, both
> of which lead to frozen kernel. I will investigate whether the GIT tree
> freezes at the same point.

Perhaps you could try updating your BIOS, if possible / applicable (?)

> >>> after this, kernel apparently goes into busy waiting (fans gradually
> >>> turn louder) and hangs indefinitely. I have also made sure that writel
> >>> (in linux/include/asm/io.h) really is entered, but never returns.
> >
> > Does the current kernel.org GIT tree do the same thing? A bunch
> > of USB patches were recently merged, including ISTR one in that
> > area ...
>
> It does the same thing, git5, that is. Sorry I took so long, but I didn't
> get to testing this earlier.
>
> It is just odd that up to (not including) the 2.6.21-series every kernel
> boots, and after that, they just freeze.
>
> I am kinda stumped here.

Hey, just try git-bisect already :-)

In fact, you can first try by just reverting / un-applying that patch
that you initially had a suspicion on. Or, because you've already spent
some time tracking down the issue, you could simply go through the git
history of that file / subsystem in question and play around reverting
individual patches that you find suspicious -- but really, there's no
need to try and be cute with this: you could simply do a git-bisect (say
between 2.6.20 and 2.6.21) and find the offending patch (or at least the
one that un-hides the bug) that makes the boot fail ...

[ BTW you haven't sent your dmesg / boot-time output ... if it isn't
getting saved to disk, you could try serial / netconsole, copy it by
hand, or simply take a photo and post it here. ]

Cheers,
Satyam
Re: PROBLEM: kernel hang in ohci init
David Brownell wrote:
> On Thursday 12 July 2007, Satyam Sharma wrote:
>
> Note that hangs in that file almost always mean "your BIOS is goofy".
> Hunt for BIOS settings related to USB, and change them.

This laptop's BIOS only offers "legacy support" enabled or disabled, both
of which lead to frozen kernel. I will investigate whether the GIT tree
freezes at the same point.

>>> after this, kernel apparently goes into busy waiting (fans gradually
>>> turn louder) and hangs indefinitely. I have also made sure that writel
>>> (in linux/include/asm/io.h) really is entered, but never returns.
>
> Does the current kernel.org GIT tree do the same thing? A bunch
> of USB patches were recently merged, including ISTR one in that
> area ...

It does the same thing, git5, that is. Sorry I took so long, but I didn't
get to testing this earlier.

It is just odd that up to (not including) the 2.6.21-series every kernel
boots, and after that, they just freeze.

I am kinda stumped here.

Regards
TL
--
Re: PROBLEM: kernel hang in ohci init
On Thursday 12 July 2007, Satyam Sharma wrote:
> > [2.] The version 2.6.22 of the linux kernel hangs when initializing the
> > integrated ohci controller of the nvidia MCP51 chipset (pci device ids
> > vendor:product == 10de:26d). I have traced through various printks that
> > pci_init calls pci_fixup_device, later on in quirk_usb_ohci_handoff
> > (file linux/drivers/usb/host/pci-quirks.c) kernel freezes in this
> > section:

Note that hangs in that file almost always mean "your BIOS is goofy".
Hunt for BIOS settings related to USB, and change them. As a rule,
if you tell your BIOS to ignore USB devices (mostly keyboards and
disks), it will have even less of an excuse to break like that.

> > ...
> >         if (control & OHCI_CTRL_IR) {
> >                 int wait_time = 500;
> >                 writel(OHCI_INTR_OC, base + OHCI_INTRENABLE);
> >                 writel(OHCI_ORC, base + OHCI_CMDSTATUS); // this never returns
> > ...
> > after this, kernel apparently goes into busy waiting (fans gradually
> > turn louder) and hangs indefinitely. I have also made sure that writel
> > (in linux/include/asm/io.h) really is entered, but never returns.

Does the current kernel.org GIT tree do the same thing? A bunch
of USB patches were recently merged, including ISTR one in that
area ...

> > I can only guess that it might
> > have to do with the patch
> > "commit 4302a595cd9c6363b495460497ecbda49fa16858
> > Author: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> > Date: Fri Dec 15 06:53:55 2006 +1100
> > USB: Rework the OHCI quirk mecanism as suggested by David
> > "
> > but I don't really have a clue, so this might be groundless suspicion.

Should be unrelated. That patch related to how vendor-specific
implementation differences get detected and handled ... basically
just switching to a table-driven approach that can even handle
board-specific wiring braindamage, rather than the original scheme
which was just a big if/then/else looking only at chip vendors.

- Dave
Re: PROBLEM: kernel hang in ohci init
Hi Timo,

Thanks for your report!

On 7/12/07, Timo Lindemann <[EMAIL PROTECTED]> wrote:
> a problem report to something giving me a real headache:
>
> [2.] The version 2.6.22 of the linux kernel hangs when initializing the
> integrated ohci controller of the nvidia MCP51 chipset (pci device ids
> vendor:product == 10de:26d). I have traced through various printks that
> pci_init calls pci_fixup_device, later on in quirk_usb_ohci_handoff
> (file linux/drivers/usb/host/pci-quirks.c) kernel freezes in this
> section:
> ...
>         if (control & OHCI_CTRL_IR) {
>                 int wait_time = 500;
>                 writel(OHCI_INTR_OC, base + OHCI_INTRENABLE);
>                 writel(OHCI_ORC, base + OHCI_CMDSTATUS); // this never returns
> ...
> after this, kernel apparently goes into busy waiting (fans gradually
> turn louder) and hangs indefinitely. I have also made sure that writel
> (in linux/include/asm/io.h) really is entered, but never returns.

[ Added David Brownell to Cc: ]

> [6.] Reproducible by booting any version 2.6.21+ on that machine (nvidia
> MCP51-Chipset, see the lspci output)
> [...]
> [7.7] What is striking about that problem is that kernel 2.6.20.6 does
> not even enter the section mentioned in [2.].

2.6.20 works, 2.6.21 doesn't, right? You could try git-bisect on Linus'
tree (if you can use git) to find the offending commit that broke it:

http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html
and
http://www.reactivated.net/weblog/archives/2006/01/
using-git-bisect-to-find-buggy-kernel-patches/
(long URL broken in 2 lines)

You don't need to "make clean" between git-bisect builds, but be
prepared to lose a couple of hours on this still :-)

> [7.1] the ver_linux output under 2.6.20.6, in the directory of 2.6.22, says:
>
> Gnu C 4.2.1

Others have reported problems booting with gcc-4.2-compiled kernels too.
Could you try building with 4.1?

> Modules Loaded rt2500* nvidia* forcedeth
>
> * nvidia and rt2500 are most assuredly not involved in this. They are
> not loaded by that kernel.
> [...]
> [7.3] no modules have been configured (all in-kernel)

You're saying there are no modules, then how come those three are loaded?
Also try reproducing the problem without proprietary (nvidia) drivers,
please.

> [7.5] (I cannot run this with 2.6.22. In 2.6.20.6, the output can be
> retrieved from http://cip.uni-trier.de/~lindem/lspci.txt as this is
> really large)

Ok.

> [X.] I tried hard to understand what's going on, but ultimately, I could
> not yet write a fix, workaround, or anything like that, so I am asking
> for help/enlightenment, or even an already-done fix. Really very sorry.
> Also, different options like noapic, nolapic, acpi=off,
> pci=routeirq|biosirq|usepirqmask were already tried; I also tried
> disabling quirks for that particular vendor:device-combination, which
> leads to another freeze further along. Also, commenting the writel()
> will hang indefinitely in the following wait_time loop.
>
> I can only guess that it might have to do with the patch
> "commit 4302a595cd9c6363b495460497ecbda49fa16858
> Author: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> Date: Fri Dec 15 06:53:55 2006 +1100
> USB: Rework the OHCI quirk mecanism as suggested by David
> "
> but I don't really have a clue, so this might be groundless suspicion.
> If so, I apologize about that.

As mentioned earlier, git-bisect could help us narrow this down. It's
not a silver bullet, but often useful.

[ BTW, just-after a new kernel release is often an unlucky period to
report bugs, it appears ... everybody gets busy with not missing the
merge window to push in their shiny new stuff :-) ]

Thanks,
Satyam
PROBLEM: kernel hang in ohci init
Hi all, a problem report to something giving me a real headache: [1.] Kernel hangs when initializing ohci-controller [2.] The version 2.6.22 of the linux kernel hangs when initializing the integrated ohci controller of the nvidia MCP51 chipset (pci device ids vendor:product == 10de:26d). I have traced through various printks that pci_init calls pci_fixup_device, later on in quirk_usb_ohci_handoff (file linux/drivers/usb/host/pci-quirks.c) kernel freezes in this section: ... if (control & OHCI_CTRL_IR) { int wait_time = 500; writel(OHCI_INTR_OC, base + OHCI_INTRENABLE); writel(OHCI_ORC, base + OHCI_CMDSTATUS); // this never returns ... after this, kernel apparently goes into busy waiting (fans gradually turn louder) and hangs indefinitely. I have also made sure that writel (in linux/include/asm/io.h) really is entered, but never returns. [3.] keywords: pci ohci kernel [4.] /proc/version can not be read, as kernel freezes in startup [5.] No Oops, no panic [6.] Reproducible by booting any version 2.6.21+ on that machine (nvidia MCP51-Chipset, see the lspci output) [7.1] the ver_linux output under 2.6.20.6, in the directory of 2.6.22, says: Gnu C 4.2.1 Gnu make 3.81 binutils 2.17.50.0.17 util-linux 2.12r mount 2.12r module-init-tools 3.2.2 e2fsprogs 1.40 jfsutils 1.1.11 reiserfsprogs 3.6.20 xfsprogs 2.8.21 pcmciautils014 PPP2.4.4 Linux C Library> libc.2.6 Dynamic linker (ldd) 2.6 Linux C++ Library so.6.0 Procps 3.2.7 Net-tools 1.60 Kbd1.12 Sh-utils 6.9 udev 113 wireless-tools 29 Modules Loaded rt2500* nvidia* forcedeth * nvidia and rt2500 are most assuredly not involved in this. They are not loaded by that kernel. 
[7.2] Processor information: processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 36 model name : AMD Turion(tm) 64 Mobile Technology ML-37 stepping: 2 cpu MHz : 800.000 cache size : 1024 KB fdiv_bug: no hlt_bug : no f00f_bug: no coma_bug: no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm ts fid vid ttp tm stc bogomips: 1608.22 clflush size: 64 [7.3] no modules have been configured (all in-kernel) [7.4] n/a [7.5] (I cannot run this with 2.6.22. In 2.6.20.6, the output can be retrieved from http://cip.uni-trier.de/~lindem/lspci.txt as this is really large) [7.6] (I have SATA, but again, I don't reach /proc from within that kernel) [7.7] What is striking about that problem is that kernel 2.6.20.6 does not even enter the section mentioned in [2.]. If booted, serial console and netconsole do not work either, nor does magic sysrq key. Also, this is a 64bit cpu, running a 32bit linux distro, and it happens regardless whether 64bit resources are activated or not. [X.] I tried hard to understand what's going on, but ultimately, I could not yet write a fix, workaround, or anything like that, so I am asking for help/enlightenment, or even an already-done fix. Really very sorry. Also, different options like noapic, nolapic, acpi=off, pci=routeirq|biosirq|usepirqmask were already tried; I also tried disabling quirks for that particular vendor:device-combination, which leads to another freeze further along. Also, commenting the writel() will hang indefinitely in the following wait_time loop. 
I can only guess that it might have to do with the patch

"commit 4302a595cd9c6363b495460497ecbda49fa16858
Author: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Date: Fri Dec 15 06:53:55 2006 +1100

    USB: Rework the OHCI quirk mecanism as suggested by David
"

but I don't really have a clue, so this might be groundless suspicion. If so, I apologize for that.

Greetings and thanks for all the work with the kernel!

--
Timo Lindemann
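For readers without the source at hand, here is a minimal user-space sketch of the handoff sequence quoted in [2.], assuming a fake register block in place of the memory-mapped controller. The names fake_ohci, fake_readl, fake_writel and polls_until_release are invented for this illustration. The register offsets and bit values are written from memory of the OHCI register layout and should be checked against pci-quirks.c. The polling loop is only a reading of the "wait_time loop" mentioned in [X.], not a verbatim copy. The sketch cannot reproduce the reported hang: on real hardware the write to OHCI_CMDSTATUS is expected to raise an SMI so that the BIOS can release the controller, and that is presumably where a misbehaving BIOS can keep the CPU from ever coming back.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and bits named in the quoted section (values per the OHCI
 * register layout; verify against pci-quirks.c before relying on them). */
#define OHCI_CONTROL    0x04
#define OHCI_CMDSTATUS  0x08
#define OHCI_INTRENABLE 0x10
#define OHCI_OCR        (1u << 3)   /* ownership change request */
#define OHCI_CTRL_IR    (1u << 8)   /* interrupt routing: BIOS owns the HC */
#define OHCI_INTR_OC    (1u << 30)  /* ownership change interrupt */

#define FAKE_NREGS 8

/* Hypothetical stand-in for the controller plus its BIOS. */
struct fake_ohci {
	uint32_t regs[FAKE_NREGS];   /* indexed by offset / 4 */
	int polls_until_release;     /* < 0: the fake BIOS never lets go */
};

static uint32_t fake_readl(struct fake_ohci *hc, unsigned int off)
{
	assert(off / 4 < FAKE_NREGS);
	/* Fake BIOS: after an ownership change request, count down a few
	 * reads of the control register, then clear IR. */
	if (off == OHCI_CONTROL &&
	    (hc->regs[OHCI_CMDSTATUS / 4] & OHCI_OCR) &&
	    hc->polls_until_release >= 0) {
		if (hc->polls_until_release == 0)
			hc->regs[OHCI_CONTROL / 4] &= ~OHCI_CTRL_IR;
		else
			hc->polls_until_release--;
	}
	return hc->regs[off / 4];
}

static void fake_writel(struct fake_ohci *hc, uint32_t val, unsigned int off)
{
	assert(off / 4 < FAKE_NREGS);
	hc->regs[off / 4] = val;
}

/* Returns 0 if the BIOS does not own (or releases) the controller,
 * -1 if the bounded wait runs out. */
static int ohci_handoff(struct fake_ohci *hc)
{
	uint32_t control = fake_readl(hc, OHCI_CONTROL);
	int wait_time = 500;

	if (!(control & OHCI_CTRL_IR))
		return 0;

	fake_writel(hc, OHCI_INTR_OC, OHCI_INTRENABLE);
	fake_writel(hc, OHCI_OCR, OHCI_CMDSTATUS);
	while (wait_time > 0 &&
	       (fake_readl(hc, OHCI_CONTROL) & OHCI_CTRL_IR))
		wait_time -= 10;   /* the kernel would sleep between polls */

	return wait_time > 0 ? 0 : -1;
}
```

Setting OHCI_CTRL_IR in regs[OHCI_CONTROL / 4] and calling ohci_handoff() with polls_until_release = 3 models a BIOS that lets go after a few polls; a negative value models one that never does, and the bounded loop then gives up instead of spinning forever.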
Re: PROBLEM: kernel hang in ohci init
Hi Timo,

Thanks for your report!

On 7/12/07, Timo Lindemann <[EMAIL PROTECTED]> wrote:
> a problem report to something giving me a real headache:
>
> [2.] The version 2.6.22 of the linux kernel hangs when initializing the integrated ohci controller of the nvidia MCP51 chipset (pci device ids vendor:product == 10de:26d). I have traced through various printks that pci_init calls pci_fixup_device, later on in quirk_usb_ohci_handoff (file linux/drivers/usb/host/pci-quirks.c) kernel freezes in this section:
>
>     ...
>     if (control & OHCI_CTRL_IR) {
>             int wait_time = 500;
>             writel(OHCI_INTR_OC, base + OHCI_INTRENABLE);
>             writel(OHCI_OCR, base + OHCI_CMDSTATUS); // this never returns
>     ...
>
> after this, kernel apparently goes into busy waiting (fans gradually turn louder) and hangs indefinitely. I have also made sure that writel (in linux/include/asm/io.h) really is entered, but never returns.

[ Added David Brownell to Cc: ]

> [6.] Reproducible by booting any version 2.6.21+ on that machine (nvidia MCP51-Chipset, see the lspci output)
> [...]
> [7.7] What is striking about that problem is that kernel 2.6.20.6 does not even enter the section mentioned in [2.].

2.6.20 works, 2.6.21 doesn't, right? You could try git-bisect on Linus' tree (if you can use git) to find the offending commit that broke it:

http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html
and
http://www.reactivated.net/weblog/archives/2006/01/
using-git-bisect-to-find-buggy-kernel-patches/
(long URL broken in 2 lines)

You don't need to make clean between git-bisect builds, but be prepared to lose a couple of hours on this still :-)

> [7.1] the ver_linux output under 2.6.20.6, in the directory of 2.6.22, says:
> Gnu C 4.2.1

Others have reported problems booting with gcc-4.2-compiled kernels too. Could you try building with 4.1?

> Modules Loaded rt2500* nvidia* forcedeth
>
> * nvidia and rt2500 are most assuredly not involved in this. They are not loaded by that kernel.
> [...]
> [7.3] no modules have been configured (all in-kernel)

You're saying there are no modules, then how come those three are loaded? Also try reproducing the problem without proprietary (nvidia) drivers, please.

> [7.5] (I cannot run this with 2.6.22. In 2.6.20.6, the output can be retrieved from http://cip.uni-trier.de/~lindem/lspci.txt as this is really large)

Ok.

> [X.] I tried hard to understand what's going on, but ultimately, I could not yet write a fix, workaround, or anything like that, so I am asking for help/enlightenment, or even an already-done fix. Really very sorry. Also, different options like noapic, nolapic, acpi=off, pci=routeirq|biosirq|usepirqmask were already tried; I also tried disabling quirks for that particular vendor:device-combination, which leads to another freeze further along. Also, commenting the writel() will hang indefinitely in the following wait_time loop.
>
> I can only guess that it might have to do with the patch
>
> commit 4302a595cd9c6363b495460497ecbda49fa16858
> Author: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> Date: Fri Dec 15 06:53:55 2006 +1100
>
>     USB: Rework the OHCI quirk mecanism as suggested by David
>
> but I don't really have a clue, so this might be groundless suspicion. If so, I apologize about that.

As mentioned earlier, git-bisect could help us narrow this down. It's not a silver bullet, but often useful.

[ BTW, just-after a new kernel release is often an unlucky period to report bugs, it appears ... everybody gets busy with not missing the merge window to push in their shiny new stuff :-) ]

Thanks,
Satyam
Re: PROBLEM: kernel hang in ohci init
On Thursday 12 July 2007, Satyam Sharma wrote:
> > [2.] The version 2.6.22 of the linux kernel hangs when initializing the integrated ohci controller of the nvidia MCP51 chipset (pci device ids vendor:product == 10de:26d). I have traced through various printks that pci_init calls pci_fixup_device, later on in quirk_usb_ohci_handoff (file linux/drivers/usb/host/pci-quirks.c) kernel freezes in this section:

Note that hangs in that file almost always mean your BIOS is goofy. Hunt for BIOS settings related to USB, and change them. As a rule, if you tell your BIOS to ignore USB devices (mostly keyboards and disks), it will have even less of an excuse to break like that.

> >     ...
> >     if (control & OHCI_CTRL_IR) {
> >             int wait_time = 500;
> >             writel(OHCI_INTR_OC, base + OHCI_INTRENABLE);
> >             writel(OHCI_OCR, base + OHCI_CMDSTATUS); // this never returns
> >     ...
> >
> > after this, kernel apparently goes into busy waiting (fans gradually turn louder) and hangs indefinitely. I have also made sure that writel (in linux/include/asm/io.h) really is entered, but never returns.

Does the current kernel.org GIT tree do the same thing? A bunch of USB patches were recently merged, including ISTR one in that area ...

> > I can only guess that it might have to do with the patch
> >
> > commit 4302a595cd9c6363b495460497ecbda49fa16858
> > Author: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> > Date: Fri Dec 15 06:53:55 2006 +1100
> >
> >     USB: Rework the OHCI quirk mecanism as suggested by David
> >
> > but I don't really have a clue, so this might be groundless suspicion.

Should be unrelated. That patch related to how vendor-specific implementation differences get detected and handled ... basically just switching to a table-driven approach that can even handle board-specific wiring braindamage, rather than the original scheme which was just a big if/then/else looking only at chip vendors.
- Dave
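To illustrate the difference Dave describes between a vendor-only if/then/else chain and a table-driven lookup, here is a rough sketch in plain C. It is not the code from the commit in question: the structure, the flag names, the ANY_ID wildcard and every ID value in the table are made up for this example. The point is only that a table keyed on vendor, device and subsystem IDs can express a board-specific exception that a chain testing just the chip vendor cannot.

```c
#include <stddef.h>
#include <stdint.h>

#define ANY_ID 0xffffu              /* hypothetical wildcard value */

/* Hypothetical quirk flags, named only for this sketch. */
enum {
	QUIRK_VENDOR_WIDE = 1 << 0,
	QUIRK_BOARD_ONLY  = 1 << 1,
};

struct quirk_entry {
	uint16_t vendor, device;        /* chip identity */
	uint16_t subvendor, subdevice;  /* board identity */
	unsigned int flags;
};

/* First match wins, so board-specific entries come before broad ones.
 * All IDs below are invented. */
static const struct quirk_entry quirk_table[] = {
	{ 0x1111, 0x2222, 0x3333, 0x4444, QUIRK_BOARD_ONLY },
	{ 0x1111, ANY_ID, ANY_ID, ANY_ID, QUIRK_VENDOR_WIDE },
};

static int id_matches(uint16_t want, uint16_t have)
{
	return want == ANY_ID || want == have;
}

/* Returns the flags of the first matching entry, or 0 if none match. */
static unsigned int lookup_quirks(uint16_t vendor, uint16_t device,
				  uint16_t subvendor, uint16_t subdevice)
{
	size_t i;

	for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++) {
		const struct quirk_entry *q = &quirk_table[i];

		if (id_matches(q->vendor, vendor) &&
		    id_matches(q->device, device) &&
		    id_matches(q->subvendor, subvendor) &&
		    id_matches(q->subdevice, subdevice))
			return q->flags;
	}
	return 0;
}
```

With this table, the one board identified by all four IDs gets QUIRK_BOARD_ONLY, every other device from the same made-up vendor gets QUIRK_VENDOR_WIDE, and anything else gets no quirk at all. Adding a new exception is then a one-line table change rather than another branch.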