Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
Thomas Gleixner <[EMAIL PROTECTED]> writes:
> >
> > Interrupt 0 is stuck at 114 (the number is consistent across reboots). I
> > don't experience any problem, time is running fine. Still it's strange
> > that the timer is doing nothing; maybe something other than the PIT is
> > used for time keeping?
>
> Yes, we switch away from PIT and use the local APIC timer. (LOC)

Before this becomes a FAQ. Would anybody mind if I just renamed "timer" to
"pit" to make this clear?

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.6.21-rc1
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> * Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > I already bisected this on my old pIII, which has the same problem:
> > clockevents-i386-drivers.patch
>
> yes - we know what the problem is (and will fix it): the stopping of the
> PIT - nmi_watchdog=1 is a hack to use the IO-APIC's PIT pin to also signal
> NMIs.
>
> Just to clarify, this problem does not occur if HIGH_RES_TIMERS is
> off, correct?

s/does not occur/does occur/

we switch off the PIT whenever we can.

	Ingo
Re: Linux 2.6.21-rc1
On Fri, 2007-02-23 at 12:35 +0100, Ingo Molnar wrote:
> * Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > I already bisected this on my old pIII, which has the same problem:
> > clockevents-i386-drivers.patch
>
> yes - we know what the problem is (and will fix it): the stopping of the
> PIT - nmi_watchdog=1 is a hack to use the IO-APIC's PIT pin to also signal
> NMIs.
>
> Just to clarify, this problem does not occur if HIGH_RES_TIMERS is off,
> correct?

It does, as we switch off PIT when lapic is available in any case.

	tglx
Re: Linux 2.6.21-rc1
* Andrew Morton <[EMAIL PROTECTED]> wrote:

> I already bisected this on my old pIII, which has the same problem:
> clockevents-i386-drivers.patch

yes - we know what the problem is (and will fix it): the stopping of the
PIT - nmi_watchdog=1 is a hack to use the IO-APIC's PIT pin to also signal
NMIs.

Just to clarify, this problem does not occur if HIGH_RES_TIMERS is off,
correct?

	Ingo
Re: Linux 2.6.21-rc1
> On Wed, 21 Feb 2007 18:41:11 +0100 Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> On Wed, 2007-02-21 at 09:19 -0800, Daniel Walker wrote:
> > > At this point the PIT / HPET _is_ active and incrementing jiffies. The
> > > switch to local apic timers happens afterwards.
> >
> > Could be the switch over then which confuses the NMI.
>
> Why? The switch just stops the PIT/HPET. It does not fiddle with IO_APIC
> and friends at all.
>
> > ftp://source.mvista.com/pub/dwalker/tglx/
>
> Nothing obvious. Bisect time :(
>

I already bisected this on my old pIII, which has the same problem:
clockevents-i386-drivers.patch
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
Arjan van de Ven wrote:
> if it's something built in the last year or two you have the hw.
>

I have an ICH4-M, and from Intel's datasheets it looks like I got the short
straw..

--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
Re: Linux 2.6.21-rc1 -- suspend
Hi!

> Ok, the merge window for 2.6.21 has closed, and -rc1 is out there.
>
> There's a lot of changes, as is usual for an -rc1 thing, but at least so
> far it would seem that 2.6.20 has been a good base, and I don't think we
> have anything *really* scary here.

And a lot of acpi/suspend changes, which seem to break my machine in a
weird and not really reproducible way. I'm looking into that.

(Yep, that should teach me to test -mm a bit more).
								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
On Thu, 2007-02-22 at 22:07 +0100, Pierre Ossman wrote:
> Arjan van de Ven wrote:
> > no; c3 saves a TON more power.
> >
> > you can try enabling HPET in your BIOS...
> >
>
> Hah, I wish! This is a laptop, so the BIOS is as brain dead and broken
> as is humanly possible.
>
> Can I determine if I have the required hardware? So I can tell if I'm
> permanently screwed, or just temporarily.

if it's something built in the last year or two you have the hw.

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
Hi,

On Thu, Feb 22, 2007 at 10:07:19PM +0100, Pierre Ossman wrote:
> Arjan van de Ven wrote:
> > no; c3 saves a TON more power.
> >
> > you can try enabling HPET in your BIOS...
> >
>
> Hah, I wish! This is a laptop, so the BIOS is as brain dead and broken
> as is humanly possible.
>
> Can I determine if I have the required hardware? So I can tell if I'm
> permanently screwed, or just temporarily.

http://lkml.org/lkml/2006/11/14/153
and related posts in this older thread
("CONFIG_NO_HZ: missed ticks, stall (keyb IRQ required) [2.6.18-rc4-mm1]")
should help.

Andreas Mohr
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
Arjan van de Ven wrote:
> no; c3 saves a TON more power.
>
> you can try enabling HPET in your BIOS...
>

Hah, I wish! This is a laptop, so the BIOS is as brain dead and broken
as is humanly possible.

Can I determine if I have the required hardware? So I can tell if I'm
permanently screwed, or just temporarily.

Rgds

--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
RE: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Thomas Gleixner
>Sent: Thursday, February 22, 2007 8:00 AM
>To: Pierre Ossman
>Cc: Arjan van de Ven; Jan Engelhardt; Luca Tettamanti;
>linux-kernel@vger.kernel.org
>Subject: Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
>
>On Thu, 2007-02-22 at 16:13 +0100, Pierre Ossman wrote:
>> > Sure. My dmesg is full of mmc debug crud right now, but I'll just reboot
>> > and I'll have a clean one for you.
>> >
>>
>> Here we go.
>
>> [ 44.498253] ACPI: Lid Switch [C136]
>> [ 44.577672] No dock devices found.
>> [ 44.714156] ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
>
>-^
>
>Here is the reason. The local APIC stops working in C3 state and we fall
>back to the PIT in that case. Not really exciting for dynticks, but the
>only way to keep the system alive. There is a patch coming up from
>Intel, which finds out how to use HPET even if it is not enabled by the
>BIOS. This will still end up on IRQ#0, but will give way longer idle
>sleeps than the PIT.
>
>	tglx
>
>

Thomas,

I have the patchset for this HPET part ready to roll out. But, looks
like NO_HZ and lapic eventsource support is only in i386 and not in
x86-64 in 2.6.21-rc1. Do you know the state of NO_HZ x86-64 support?
Will it get into git soon? In which case I can rebase my patches against
git and send it out for review/testing. Otherwise, I will send out my
patches, which are against rt (and my initial implementation of this
HPET part is only for x86-64).

Thanks,
Venki
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
On Thu, 2007-02-22 at 17:27 +0100, Pierre Ossman wrote:
> Thomas Gleixner wrote:
> >
> > Here is the reason. The local APIC stops working in C3 state and we fall
> > back to the PIT in that case. Not really exciting for dynticks, but the
> > only way to keep the system alive. There is a patch coming up from
> > Intel, which finds out how to use HPET even if it is not enabled by the
> > BIOS. This will still end up on IRQ#0, but will give way longer idle
> > sleeps than the PIT.
> >
>
> So then the next two questions are; is it possible to disable C3

yeah there is a commandline thingy for it

> and is
> it a net power gain to get rid of the wakeups in favor of having C3.

no; c3 saves a TON more power.

you can try enabling HPET in your BIOS...

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
Thomas Gleixner wrote:
>
> Here is the reason. The local APIC stops working in C3 state and we fall
> back to the PIT in that case. Not really exciting for dynticks, but the
> only way to keep the system alive. There is a patch coming up from
> Intel, which finds out how to use HPET even if it is not enabled by the
> BIOS. This will still end up on IRQ#0, but will give way longer idle
> sleeps than the PIT.
>

So then the next two questions are; is it possible to disable C3 and is
it a net power gain to get rid of the wakeups in favor of having C3.

Rgds

--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
On Thu, 2007-02-22 at 16:13 +0100, Pierre Ossman wrote:
> > Sure. My dmesg is full of mmc debug crud right now, but I'll just reboot
> > and I'll have a clean one for you.
> >
>
> Here we go.

> [ 44.498253] ACPI: Lid Switch [C136]
> [ 44.577672] No dock devices found.
> [ 44.714156] ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])

-^

Here is the reason. The local APIC stops working in C3 state and we fall
back to the PIT in that case. Not really exciting for dynticks, but the
only way to keep the system alive. There is a patch coming up from
Intel, which finds out how to use HPET even if it is not enabled by the
BIOS. This will still end up on IRQ#0, but will give way longer idle
sleeps than the PIT.

	tglx
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
On Thu, 2007-02-22 at 13:36 +0100, Jan Engelhardt wrote:
> >
> >Yes, we switch away from PIT and use the local APIC timer. (LOC)
>
> What's the benefit of doing so - and why has not it been done before?
> I mean, I run a regular 2.6.18.6,
> /sys/devices/system/clocksource/clocksource0/current_clocksource (is this
> related?) shows "acpi_pm", but the IRQ0 counter increases at HZ. Maybe I
> am confusing things, but why the need for PIT when clocksource is acpi_pm
> anyway?

acpi_pm is only a readout device to keep track of current time.

PIT and local APIC timer are used to provide either periodic or one shot
programmable timer events.

Up to now the kernel started PIT and local APIC timer in parallel with
the same period where PIT incremented jiffies and local APIC timer
called update_process_times() and profile_tick.

We changed this to let the boot cpu increment jiffies inside the local
apic timer interrupt after PIT has been stopped. A whole interrupt for
jiffies64++ is waste.

Also when we switch to nohz / high resolution mode, we want to use the
local APIC, as it is much faster to access. The maximum delta to program
is 27ms for the pit and ~1sec for the local APIC. This is important for
dynticks, as we can achieve longer idle sleeps w/o reprogramming the
timer.

	tglx
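
A rough worked example of the 27 ms versus ~1 s figures above. This is only a sketch: 1193182 Hz is the standard i8254 input clock, but the 0x7FFF tick limit is an assumption about how the driver caps the programmable delta (the thread does not state it), and the 10 second idle period is made up for illustration.

```python
import math

PIT_TICK_RATE_HZ = 1_193_182    # standard i8253/i8254 input clock
PIT_MAX_DELTA_TICKS = 0x7FFF    # assumed driver limit, not stated in the thread
LAPIC_MAX_DELTA_S = 1.0         # the "~1sec" figure quoted above


def max_delta_ms(max_ticks, rate_hz):
    """Longest one-shot interval, in milliseconds, for a down-counter."""
    return max_ticks / rate_hz * 1000.0


def min_wakeups(idle_seconds, max_delta_seconds):
    """Fewest timer interrupts needed to get through an idle period."""
    return math.ceil(idle_seconds / max_delta_seconds)


pit_ms = max_delta_ms(PIT_MAX_DELTA_TICKS, PIT_TICK_RATE_HZ)
print(f"PIT max delta: {pit_ms:.2f} ms")
print("wakeups for 10 s idle, PIT:  ", min_wakeups(10.0, pit_ms / 1000.0))
print("wakeups for 10 s idle, LAPIC:", min_wakeups(10.0, LAPIC_MAX_DELTA_S))
```

Under these assumptions the PIT still has to fire a few hundred times to cover an idle period the local APIC covers with about ten events, which is the "longer idle sleeps" point made above.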
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
Arjan van de Ven wrote:
> On Thu, 2007-02-22 at 15:10 +0100, Pierre Ossman wrote:
>
>>
>> So with a local apic, and acpi_pm as clocksource, I shouldn't be getting timer
>> interrupts?
>>
>
> timer interrupts as in "irq0"?
>

Yes:

  0:    9786349    XT-PIC-XT        timer

> you shouldn't if you use the hrtimers/tickless stuff...
>

CONFIG_HIGH_RES_TIMERS=y
CONFIG_NO_HZ=y

> can you get us a dmesg somewhere? maybe the kernel mentions why ;)
>
>

Sure. My dmesg is full of mmc debug crud right now, but I'll just reboot
and I'll have a clean one for you.

Rgds

--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
On Thu, 2007-02-22 at 15:10 +0100, Pierre Ossman wrote:
> Arjan van de Ven wrote:
> >
> > some can be used for both (PIT), but on a concept level the uses are
> > independent. The advantage of local apic over PIT is that local apic is
> > cheap to do "one shot" future events with, while the PIT will tick
> > periodic at a fixed frequency. With tickless idle.. that's not what you
> > want.
> >
>
> So with a local apic, and acpi_pm as clocksource, I shouldn't be getting timer
> interrupts?

timer interrupts as in "irq0"?

you shouldn't if you use the hrtimers/tickless stuff...
can you get us a dmesg somewhere? maybe the kernel mentions why ;)

> Yet I do. Which I assume means that the kernel will still get woken
> up very often.

if irq0 keeps increasing at 100Hz or 1000Hz or so.. then yes

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
Arjan van de Ven wrote:
>
> some can be used for both (PIT), but on a concept level the uses are
> independent. The advantage of local apic over PIT is that local apic is
> cheap to do "one shot" future events with, while the PIT will tick
> periodic at a fixed frequency. With tickless idle.. that's not what you
> want.
>

So with a local apic, and acpi_pm as clocksource, I shouldn't be getting
timer interrupts? Yet I do. Which I assume means that the kernel will
still get woken up very often.

Rgds

--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
> What's the benefit of doing so - and why has not it been done before?
> I mean, I run a regular 2.6.18.6,
> /sys/devices/system/clocksource/clocksource0/current_clocksource (is this
> related?) shows "acpi_pm", but the IRQ0 counter increases at HZ. Maybe I
> am confusing things, but why the need for PIT when clocksource is acpi_pm
> anyway?

you're mixing up 2 concepts:
1) clocksource
2) eventsource

1) is for "what time is it now", and acpi_pm is useful for that, as are
several other things such as rdtsc

2) is for "I need THIS to happen X milliseconds from now". acpi_pm is not
useful for that, nor is rdtsc.

some can be used for both (PIT), but on a concept level the uses are
independent. The advantage of local apic over PIT is that local apic is
cheap to do "one shot" future events with, while the PIT will tick
periodic at a fixed frequency. With tickless idle.. that's not what you
want.

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
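
The clocksource half of the distinction above can be checked from user space through the sysfs file Jan quoted. A minimal sketch, assuming a kernel that exposes that path; the `sysfs_root` parameter is hypothetical and only there so the function can be pointed at a scratch directory. The thread names no equivalent file for event devices, so none is read here.

```python
from pathlib import Path

# Relative path as quoted in the thread (below the sysfs mount point).
CLOCKSOURCE_FILE = "devices/system/clocksource/clocksource0/current_clocksource"


def current_clocksource(sysfs_root="/sys"):
    """Return the name of the active clocksource, e.g. "acpi_pm"."""
    return (Path(sysfs_root) / CLOCKSOURCE_FILE).read_text().strip()


if __name__ == "__main__":
    try:
        print("clocksource:", current_clocksource())
    except OSError as exc:
        # Not every system exposes this file; report instead of crashing.
        print("could not read clocksource:", exc)
```

Whatever name this prints only answers "what time is it now"; it says nothing about which device delivers the timer interrupts.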
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
On Feb 22 2007 00:17, Thomas Gleixner wrote:
>On Thu, 2007-02-22 at 00:04 +0100, Luca Tettamanti wrote:
>>
>> Interrupt 0 is stuck at 114 (the number is consistent across reboots). I
>> don't experience any problem, time is running fine. Still it's strange
>> that the timer is doing nothing; maybe something other than the PIT is
>> used for time keeping?
>
>Yes, we switch away from PIT and use the local APIC timer. (LOC)

What's the benefit of doing so - and why has not it been done before?
I mean, I run a regular 2.6.18.6,
/sys/devices/system/clocksource/clocksource0/current_clocksource (is this
related?) shows "acpi_pm", but the IRQ0 counter increases at HZ. Maybe I
am confusing things, but why the need for PIT when clocksource is acpi_pm
anyway?

Jan
--
Re: request_module: runaway loop modprobe net-pf-1 (is Re: Linux 2.6.21-rc1)
From: Anders Larsen <[EMAIL PROTECTED]>
Date: Thu, 22 Feb 2007 10:57:47 +0100

> On 2007-02-22 01:18:09, Greg KH wrote:
> > On Thu, Feb 22, 2007 at 06:16:23AM +0900, OGAWA Hirofumi wrote:
> > > E.g. something calls the request_module(), and if hotplug is using
> > > socket(PF_UNIX) and af_unix is module, it also calls request_module()?
> > >
> > > Just my guess though...
> >
> > Ugh, why does anyone make af_unix a module these days. I thought only
> > Debian was that foolish... :)
>
> Then how about making CONFIG_UNIX bool instead of tristate?

Please see the archives, there have been discussions about
this kind of suggestion before.

We should not dis-allow AF_UNIX being modular just because
it now becomes inconvenient.
Re: request_module: runaway loop modprobe net-pf-1 (is Re: Linux 2.6.21-rc1)
On 2007-02-22 01:18:09, Greg KH wrote:
> On Thu, Feb 22, 2007 at 06:16:23AM +0900, OGAWA Hirofumi wrote:
> > E.g. something calls the request_module(), and if hotplug is using
> > socket(PF_UNIX) and af_unix is module, it also calls request_module()?
> >
> > Just my guess though...
>
> Ugh, why does anyone make af_unix a module these days. I thought only
> Debian was that foolish... :)

Then how about making CONFIG_UNIX bool instead of tristate?

Cheers
Anders

diff --git a/net/unix/Kconfig b/net/unix/Kconfig
index 5a69733..b589254 100644
--- a/net/unix/Kconfig
+++ b/net/unix/Kconfig
@@ -3,7 +3,7 @@
 #
 
 config UNIX
-	tristate "Unix domain sockets"
+	bool "Unix domain sockets"
 	---help---
 	  If you say Y here, you will include support for Unix domain sockets;
 	  sockets are the standard Unix mechanism for establishing and
@@ -13,9 +13,5 @@ config UNIX
 	  an embedded system or something similar, you therefore definitely
 	  want to say Y here.
 
-	  To compile this driver as a module, choose M here: the module will be
-	  called unix. Note that several important services won't work
-	  correctly if you say M here and then neglect to load the module.
-
 	  Say Y unless you know what you are doing.
Re: request_module: runaway loop modprobe net-pf-1 (is Re: Linux 2.6.21-rc1)
On Thu, Feb 22, 2007 at 12:34:17PM +0900, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article <[EMAIL PROTECTED]> (at Thu, 22 Feb 2007 11:04:40 +0900 (JST)),
> YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> says:
>
> > In article <[EMAIL PROTECTED]> (at Thu, 22 Feb 2007 04:12:04 +0900), OGAWA
> > Hirofumi <[EMAIL PROTECTED]> says:
> >
> > > YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:
> > >
> > > > In article <[EMAIL PROTECTED]> (at Tue, 20 Feb 2007 20:53:45 -0800
> > > > (PST)), Linus Torvalds <[EMAIL PROTECTED]> says:
> > > >
> > > >> But there's a ton of architecture updates (arm, mips, powerpc, x86, you
> > > >> name it), ACPI updates, and lots of driver work. And just a lot of
> > > >> cleanups.
> > > >
> > > > I cannot boot 2.6.21-rc1; it falls into OOM-Killer.
> > > >
> > > > Interesting error message I can see is:
> > > >    request_module: runaway loop modprobe net-pf-1
> > > >
> > > > After bisecting, the commit
> > > > Driver core: let request_module() send a /sys/modules/kmod/-uevent
> > > > (id c353c3fb0700a3c17ea2b0237710a184232ccd7f) is to blame.
> > > >
> > > > Reverting it fixes the issue to me.
> > >
> > > /sbin/hotplug needs some module, but request_module() call /sbin/hotplug
> > > loop?
> > > Hm.. does the patch fix the problem?
> >
> > Yes, it absolutely fixes the issue.
>
> Several options:
>
> - To revert the changeset to blame

No.

> - To apply Ogawa-san's (or other appropriate) patch

His patch just avoids the issue, but isn't correct.

> - To select UNIX in init/Kconfig:KMOD

I like this one, but some people might still want it as a module (I
really don't know why, does anyone else???)

> I think it would be a good idea to rate-limit frequency of requesting a
> single module, anyway.

Yes, if we can detect the loop, that would be best. Or, maybe the
easiest thing is to just not do the netlink call if it's asking for the
network module? :)

Any other suggestions or thoughts?

thanks,

greg k-h
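
The loop being guessed at in this subthread, and the idea of detecting it, can be illustrated with a toy model. This is a sketch for illustration only, not the kernel's code: `ModuleLoader`, both helper functions and the cap of 50 in-flight requests are all hypothetical names and values. It models a user-space helper that needs a PF_UNIX socket before it can load anything, while AF_UNIX itself is the module being requested.

```python
MAX_IN_FLIGHT = 50  # hypothetical cap on nested requests, chosen for illustration


class RunawayLoop(Exception):
    """Raised when module requests nest deeper than the cap allows."""


class ModuleLoader:
    def __init__(self, limit=MAX_IN_FLIGHT):
        self.limit = limit
        self.in_flight = 0
        self.loaded = set()

    def request_module(self, name, helper):
        if name in self.loaded:
            return
        self.in_flight += 1
        try:
            if self.in_flight > self.limit:
                raise RunawayLoop(f"request_module: runaway loop modprobe {name}")
            helper(self, name)       # the user-space helper does the real work
            self.loaded.add(name)
        finally:
            self.in_flight -= 1      # always release the slot, even on failure


def helper_needing_unix_socket(loader, name):
    # Opening a PF_UNIX socket while af_unix is a module requests net-pf-1
    # again, which re-enters the helper before the first request finishes.
    loader.request_module("net-pf-1", helper_needing_unix_socket)


def helper_with_builtin_unix(loader, name):
    # With AF_UNIX built in, the helper needs no further module.
    pass


loader = ModuleLoader()
try:
    loader.request_module("net-pf-1", helper_needing_unix_socket)
except RunawayLoop as exc:
    print(exc)
```

In this model the in-flight counter is what turns unbounded recursion into a reported error; making AF_UNIX built in (the second helper) removes the recursion altogether.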
Re: request_module: runaway loop modprobe net-pf-1 (is Re: Linux 2.6.21-rc1)
On Thu, Feb 22, 2007 at 06:16:23AM +0900, OGAWA Hirofumi wrote:
> Greg KH <[EMAIL PROTECTED]> writes:
>
> > On Thu, Feb 22, 2007 at 04:12:04AM +0900, OGAWA Hirofumi wrote:
> >> YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:
> >>
> >> > In article <[EMAIL PROTECTED]> (at Tue, 20 Feb 2007 20:53:45 -0800
> >> > (PST)), Linus Torvalds <[EMAIL PROTECTED]> says:
> >> >
> >> >> But there's a ton of architecture updates (arm, mips, powerpc, x86, you
> >> >> name it), ACPI updates, and lots of driver work. And just a lot of
> >> >> cleanups.
> >> >
> >> > I cannot boot 2.6.21-rc1; it falls into OOM-Killer.
> >> >
> >> > Interesting error message I can see is:
> >> >    request_module: runaway loop modprobe net-pf-1
> >> >
> >> > After bisecting, the commit
> >> > Driver core: let request_module() send a /sys/modules/kmod/-uevent
> >> > (id c353c3fb0700a3c17ea2b0237710a184232ccd7f) is to blame.
> >> >
> >> > Reverting it fixes the issue to me.
> >>
> >> /sbin/hotplug needs some module, but request_module() call /sbin/hotplug
> >> loop?
> >> Hm.. does the patch fix the problem?
> >
> > How does it loop?
>
> E.g. something calls the request_module(), and if hotplug is using
> socket(PF_UNIX) and af_unix is module, it also calls request_module()?
>
> Just my guess though...

Ugh, why does anyone make af_unix a module these days. I thought only
Debian was that foolish... :)

It will be interesting to see if this fixes the issue or not.

thanks,

greg k-h
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
On 2/22/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> On Thu, 2007-02-22 at 00:04 +0100, Luca Tettamanti wrote:
> > Hi Thomas,
> > I'm testing NO_HZ on my machines. On the laptop I see that the timer
> > interrupt counter is incremented (though slower than HZ). This machine
> > is running UP kernel.
> >
> > On my desktop I see this:
> >
> >            CPU0       CPU1
> >   0:        114          0   IO-APIC-edge      timer
> >   1:       1624      10771   IO-APIC-edge      i8042
> >   6:          3          0   IO-APIC-edge      floppy
> >   7:          0          0   IO-APIC-edge      parport0
> >   9:          0          0   IO-APIC-fasteoi   acpi
> >  12:      40111     184047   IO-APIC-edge      i8042
> >  16:      75624     998858   IO-APIC-fasteoi   [EMAIL PROTECTED]::01:00.0, uhci_hcd:usb1
> >  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
> >  18:        711       5487   IO-APIC-fasteoi   ide1, libata, ehci_hcd:usb7, uhci_hcd:usb3
> >  19:        617       2254   IO-APIC-fasteoi   libata, uhci_hcd:usb2
> >  20:          0          0   IO-APIC-fasteoi   ehci_hcd:usb6, uhci_hcd:usb5
> >  21:    2483869          0   IO-APIC-fasteoi   eth0
> >  22:          2          0   IO-APIC-fasteoi   ohci1394
> > 218:      28872     360643   PCI-MSI-edge      HDA Intel
> > 219:      32932     138196   PCI-MSI-edge      libata
> > NMI:          0          0
> > LOC:27611912827539
> > ERR:          0
> > MIS:          0
> >
> > Interrupt 0 is stuck at 114 (the number is consistent across reboots). I
> > don't experience any problem, time is running fine. Still it's strange
> > that the timer is doing nothing; maybe something other than the PIT is
> > used for time keeping?
>
> Yes, we switch away from PIT and use the local APIC timer. (LOC)

Ok, thanks for the clarification.

Luca
Re: NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
On Thu, 2007-02-22 at 00:04 +0100, Luca Tettamanti wrote:
> Hi Thomas,
> I'm testing NO_HZ on my machines. On the laptop I see that the timer
> interrupt counter is incremented (though slower than HZ). This machine
> is running UP kernel.
>
> On my desktop I see this:
>
>            CPU0       CPU1
>   0:        114          0   IO-APIC-edge      timer
>   1:       1624      10771   IO-APIC-edge      i8042
>   6:          3          0   IO-APIC-edge      floppy
>   7:          0          0   IO-APIC-edge      parport0
>   9:          0          0   IO-APIC-fasteoi   acpi
>  12:      40111     184047   IO-APIC-edge      i8042
>  16:      75624     998858   IO-APIC-fasteoi   [EMAIL PROTECTED]::01:00.0, uhci_hcd:usb1
>  17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>  18:        711       5487   IO-APIC-fasteoi   ide1, libata, ehci_hcd:usb7, uhci_hcd:usb3
>  19:        617       2254   IO-APIC-fasteoi   libata, uhci_hcd:usb2
>  20:          0          0   IO-APIC-fasteoi   ehci_hcd:usb6, uhci_hcd:usb5
>  21:    2483869          0   IO-APIC-fasteoi   eth0
>  22:          2          0   IO-APIC-fasteoi   ohci1394
> 218:      28872     360643   PCI-MSI-edge      HDA Intel
> 219:      32932     138196   PCI-MSI-edge      libata
> NMI:          0          0
> LOC:27611912827539
> ERR:          0
> MIS:          0
>
> Interrupt 0 is stuck at 114 (the number is consistent across reboots). I
> don't experience any problem, time is running fine. Still it's strange
> that the timer is doing nothing; maybe something other than the PIT is
> used for time keeping?

Yes, we switch away from PIT and use the local APIC timer. (LOC)

	tglx
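
To see the same effect without reading the table by eye, the row for interrupt 0 can be compared with the LOC row. A minimal sketch, assuming the /proc/interrupts layout shown above (a CPU header row, then `label:` followed by one count per CPU); the function names are made up, and the sample text below uses invented LOC values rather than the ones from the report.

```python
from typing import Dict, List


def parse_interrupts(text: str) -> Dict[str, List[int]]:
    """Map each row label ("0", "LOC", ...) to its per-CPU counts."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())            # header row: CPU0 CPU1 ...
    counts: Dict[str, List[int]] = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue                         # not a "label: ..." row
        values: List[int] = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break                        # reached the chip/device names
            values.append(int(field))
        counts[label.strip()] = values
    return counts


def timer_summary(text: str):
    """Return (total IRQ 0 count, total local APIC timer count)."""
    counts = parse_interrupts(text)
    return sum(counts.get("0", [])), sum(counts.get("LOC", []))


# Invented sample in the same layout; only the IRQ 0 row mirrors the report.
SAMPLE = """\
           CPU0       CPU1
  0:        114          0   IO-APIC-edge      timer
  1:       1624      10771   IO-APIC-edge      i8042
NMI:          0          0
LOC:       5000       6000
ERR:          0
"""

irq0, loc = timer_summary(SAMPLE)
print(f"IRQ0={irq0} LOC={loc}")
```

Running it twice a few seconds apart on a live /proc/interrupts and comparing the two results would show whether IRQ 0 is idle while LOC keeps counting.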
NO_HZ: timer interrupt stuck [Re: Linux 2.6.21-rc1]
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> Ok, the merge window for 2.6.21 has closed, and -rc1 is out there.
>
> There's a lot of changes, as is usual for an -rc1 thing, but at least so
> far it would seem that 2.6.20 has been a good base, and I don't think we
> have anything *really* scary here.
>
> The most interesting core change may be the dyntick/nohz one, where timer
> ticks will only happen when needed. It's been brewing for a _loong_ time,
> but it's in the standard kernel now as an option.

Hi Thomas,
I'm testing NO_HZ on my machines. On the laptop I see that the timer
interrupt counter is incremented (though slower than HZ). This machine
is running UP kernel.

On my desktop I see this:

           CPU0       CPU1
  0:        114          0   IO-APIC-edge      timer
  1:       1624      10771   IO-APIC-edge      i8042
  6:          3          0   IO-APIC-edge      floppy
  7:          0          0   IO-APIC-edge      parport0
  9:          0          0   IO-APIC-fasteoi   acpi
 12:      40111     184047   IO-APIC-edge      i8042
 16:      75624     998858   IO-APIC-fasteoi   [EMAIL PROTECTED]::01:00.0, uhci_hcd:usb1
 17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 18:        711       5487   IO-APIC-fasteoi   ide1, libata, ehci_hcd:usb7, uhci_hcd:usb3
 19:        617       2254   IO-APIC-fasteoi   libata, uhci_hcd:usb2
 20:          0          0   IO-APIC-fasteoi   ehci_hcd:usb6, uhci_hcd:usb5
 21:    2483869          0   IO-APIC-fasteoi   eth0
 22:          2          0   IO-APIC-fasteoi   ohci1394
218:      28872     360643   PCI-MSI-edge      HDA Intel
219:      32932     138196   PCI-MSI-edge      libata
NMI:          0          0
LOC:27611912827539
ERR:          0
MIS:          0

Interrupt 0 is stuck at 114 (the number is consistent across reboots). I
don't experience any problem, time is running fine. Still it's strange
that the timer is doing nothing; maybe something other than the PIT is
used for time keeping?

This is the dmesg of the "abnormal" machine (dual core, SMP kernel): Linux version 2.6.20-ge696268a-dirty ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #33 SMP PREEMPT Tue Feb 20 23:24:24 CET 2007 BIOS-provided physical RAM map: sanitize start sanitize end copy_e820_map() start: size: 0009c800 end: 0009c800 type: 1 copy_e820_map() type is E820_RAM copy_e820_map() start: 0009c800 size: 3800 end: 000a type: 2 copy_e820_map() start: 000e4000 size: 0001c000 end: 0010 type: 2 copy_e820_map() start: 0010 size: 3fe9 end: 3ff9 type: 1 copy_e820_map() type is E820_RAM copy_e820_map() start: 3ff9 size: e000 end: 3ff9e000 type: 3 copy_e820_map() start: 3ff9e000 size: 00042000 end: 3ffe type: 4 copy_e820_map() start: 3ffe size: 0002 end: 4000 type: 2 copy_e820_map() start: fee0 size: 1000 end: fee01000 type: 2 copy_e820_map() start: ffb0 size: 0050 end: 0001 type: 2 BIOS-e820: - 0009c800 (usable) BIOS-e820: 0009c800 - 000a (reserved) BIOS-e820: 000e4000 - 0010 (reserved) BIOS-e820: 0010 - 3ff9 (usable) BIOS-e820: 3ff9 - 3ff9e000 (ACPI data) BIOS-e820: 3ff9e000 - 3ffe (ACPI NVS) BIOS-e820: 3ffe - 4000 (reserved) BIOS-e820: fee0 - fee01000 (reserved) BIOS-e820: ffb0 - 0001 (reserved) 1023MB LOWMEM available. found SMP MP-table at 000ff780 Entering add_active_range(0, 0, 262032) 0 entries of 256 used Zone PFN ranges: DMA 0 -> 4096 Normal 4096 -> 262032 early_node_map[1] active PFN ranges 0:0 -> 262032 On node 0 totalpages: 262032 DMA zone: 32 pages used for memmap DMA zone: 0 pages reserved DMA zone: 4064 pages, LIFO batch:0 Normal zone: 2015 pages used for memmap Normal zone: 255921 pages, LIFO batch:31 DMI 2.4 present. 
ACPI: RSDP 000FA980, 0024 (r2 ACPIAM) ACPI: XSDT 3FF90100, 0054 (r1 KOZIRO FRONTIER 12000611 MSFT 97) ACPI: FACP 3FF90290, 00F4 (r3 MSTEST OEMFACP 12000611 MSFT 97) ACPI: DSDT 3FF905C0, 8F8C (r1 A0637 A06370000 INTL 20060113) ACPI: FACS 3FF9E000, 0040 ACPI: APIC 3FF90390, 006C (r1 MSTEST OEMAPIC 12000611 MSFT 97) ACPI: MCFG 3FF90400, 003C (r1 MSTEST OEMMCFG 12000611 MSFT 97) ACPI: SLIC 3FF90440, 0176 (r1 KOZIRO FRONTIER 12000611 MSFT 97) ACPI: OEMB 3FF9E040, 007B (r1 MSTEST AMI_OEM 12000611 MSFT 97) ACPI: HPET 3FF99550, 0038 (r1 MSTEST OEMHPET 12000611 MSFT 97) ACPI: PM-Timer IO Port: 0x808 ACPI: Local APIC address 0xfee0 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 6
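
One way to confirm that IRQ 0 has really stopped while LOC keeps counting is to sample /proc/interrupts twice and compare the per-line totals. A minimal sketch of the parsing half, assuming the usual "label: count count ... description" layout; irq_line_total() is a hypothetical helper name, not an existing API:

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts-style line,
 * e.g. "  0:  114  0  IO-APIC-edge  timer" gives 114.
 * Returns -1 if the line has no "label:" prefix.
 * (Hypothetical helper for illustration only.)
 */
long long irq_line_total(const char *line)
{
    const char *p = line;
    long long total = 0;

    /* skip the label up to and including the colon */
    while (*p && *p != ':')
        p++;
    if (*p != ':')
        return -1;
    p++;

    for (;;) {
        char *end;
        long long v;

        while (*p == ' ' || *p == '\t')
            p++;
        /* the first non-numeric token is the chip name: stop there */
        if (!isdigit((unsigned char)*p))
            break;
        v = strtoll(p, &end, 10);
        /* a counter must be followed by whitespace or end of line */
        if (*end != '\0' && !isspace((unsigned char)*end))
            break;
        total += v;
        p = end;
    }
    return total;
}
```

Calling it on the same line from two samples taken a few seconds apart and comparing the results would show whether a given interrupt is still advancing.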
Re: Linux 2.6.21-rc1 [git bisect]
On Wed, Feb 21, 2007 at 01:06:17PM -0800, Linus Torvalds wrote:
> That said, one thing to worry about when doing bisection: the kernel
> configuration.

This bit me badly the one time I did a git bisect; it kept ping-ponging
around a big change (sata? xtables?) that required me to answer the same
two dozen questions about ten different times.

If git-bisect itself could manage .config as part of the process,
reverting to the one used with the most-recently-in-the-past try when it
backs up to a revision, that would be, you know, wonderful.

--Pete

--
Pete Harlan
ArtSelect, Inc.
[EMAIL PROTECTED]
http://www.artselect.com
ArtSelect is a subsidiary of a21, Inc.

Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 13:06 -0800, Linus Torvalds wrote:
>
> On Wed, 21 Feb 2007, Daniel Walker wrote:
> >
> > Here's the final commit from the bisect which caused it . It says "No
> > changes to existing functionality" ?
>
> Ok, it wouldn't be the first time some change that is supposed to change
> nothing does actually change something.

Yeah , maybe I screwed something up. First time I've done a git bisect.

> That said, one thing to worry about when doing bisection: the kernel
> configuration.
>
> If you always just do "make oldconfig" or something, the kernel config for
> the thing you test will depend on the _previous_ kernel you compiled, and
> that is not always what you want. I've once had a failing kernel, did
> bisection, and it turned out that since I had gone back in time to before
> the option that caused the failure even existed, I had (by mistake) then
> compiled some of the later kernels without that option enabled, and called
> them "good".

In this case I don't think anything was specifically turned on, beyond
SMP. For instance HRT/dynamic tick was off. I didn't run "make oldconfig",
but just running "make" asked for options that just got added, which was
nice.

> The end result: "git bisect" didn't actually end up pointing to the right
> commit, just because I had effectively lied to it.
>
> That said, considering that you did get a commit that doesn't look
> entirely unlikely (and that clearly changes things that are relevant), I
> suspect you did actually find the right one.

I think if it's not that exact commit it's still one in that set. I mainly
wanted to confirm that it was an hrt/dynamic tick issue , and not some
left field patches..

Daniel

Re: request_module: runaway loop modprobe net-pf-1 (is Re: Linux 2.6.21-rc1)
Greg KH <[EMAIL PROTECTED]> writes:

> On Thu, Feb 22, 2007 at 04:12:04AM +0900, OGAWA Hirofumi wrote:
>> YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:
>>
>> > In article <[EMAIL PROTECTED]> (at Tue, 20 Feb 2007 20:53:45 -0800 (PST)),
>> > Linus Torvalds <[EMAIL PROTECTED]> says:
>> >
>> >> But there's a ton of architecture updates (arm, mips, powerpc, x86, you
>> >> name it), ACPI updates, and lots of driver work. And just a lot of
>> >> cleanups.
>> >
>> > I cannot boot 2.6.21-rc1; it falls into OOM-Killer.
>> >
>> > Interesting error message I can see is:
>> >    request_module: runaway loop modprobe net-pf-1
>> >
>> > After bisecting, the commit
>> > Driver core: let request_module() send a /sys/modules/kmod/-uevent
>> > (id c353c3fb0700a3c17ea2b0237710a184232ccd7f) is to blame.
>> >
>> > Reverting it fixes the issue to me.
>>
>> /sbin/hotplug needs some module, but request_module() call /sbin/hotplug
>> loop?
>> Hm.. does the patch fix the problem?
>
> How does it loop?

E.g. something calls request_module(), and if hotplug is using
socket(PF_UNIX) and af_unix is a module, it also calls request_module()?
Just my guess though...
--
OGAWA Hirofumi <[EMAIL PROTECTED]>

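
The loop guessed at above can be modelled in userspace. This is only a sketch under the assumption that the helper itself needs the module being requested; request_module_model(), helper_model() and the limit of 50 are illustrative names and values, not the kernel code:

```c
#include <stdio.h>

#define MAX_KMOD_CONCURRENT 50  /* assumed nesting limit for this model */

int kmod_concurrent;            /* current nesting depth */
int deepest;                    /* deepest nesting reached, for inspection */

int request_module_model(const char *name);

/*
 * Model of a usermode helper that opens a PF_UNIX socket while af_unix
 * is not loaded yet: the socket() call asks for "net-pf-1" again.
 */
int helper_model(void)
{
    return request_module_model("net-pf-1");
}

int request_module_model(const char *name)
{
    int ret;

    /* refuse to nest any deeper once the limit is exceeded */
    if (++kmod_concurrent > MAX_KMOD_CONCURRENT) {
        fprintf(stderr, "request_module: runaway loop modprobe %s\n", name);
        kmod_concurrent--;
        return -1;
    }
    if (kmod_concurrent > deepest)
        deepest = kmod_concurrent;

    ret = helper_model();       /* the helper re-enters us */
    kmod_concurrent--;
    return ret;
}
```

In this model a single request nests until the limit trips and every level then unwinds with an error, which would match a boot that floods with helper processes before failing.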
Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 13:06 -0800, Linus Torvalds wrote:
> That said, considering that you did get a commit that doesn't look
> entirely unlikely (and that clearly changes things that are relevant), I
> suspect you did actually find the right one.

Yup, that's the one which switches off PIT after we have the local APIC
timers up and running. Which turns out to cause the nmi_watchdog not
working anymore.

	tglx

Re: Linux 2.6.21-rc1
On Wed, 21 Feb 2007, Daniel Walker wrote:
>
> Here's the final commit from the bisect which caused it . It says "No
> changes to existing functionality" ?

Ok, it wouldn't be the first time some change that is supposed to change
nothing does actually change something.

That said, one thing to worry about when doing bisection: the kernel
configuration.

If you always just do "make oldconfig" or something, the kernel config for
the thing you test will depend on the _previous_ kernel you compiled, and
that is not always what you want. I've once had a failing kernel, did
bisection, and it turned out that since I had gone back in time to before
the option that caused the failure even existed, I had (by mistake) then
compiled some of the later kernels without that option enabled, and called
them "good".

The end result: "git bisect" didn't actually end up pointing to the right
commit, just because I had effectively lied to it.

That said, considering that you did get a commit that doesn't look
entirely unlikely (and that clearly changes things that are relevant), I
suspect you did actually find the right one.

		Linus

---
> commit e9e2cdb412412326c4827fc78ba27f410d837e6e
> Author: Thomas Gleixner <[EMAIL PROTECTED]>
> Date:   Fri Feb 16 01:28:04 2007 -0800
>
>     [PATCH] clockevents: i386 drivers
>
>     Add clockevent drivers for i386: lapic (local) and PIT/HPET (global).
>     Update the timer IRQ to call into the PIT/HPET driver's event handler
>     and the lapic-timer IRQ to call into the lapic clockevent driver. The
>     assignement of timer functionality is delegated to the core framework
>     code and replaces the compile and runtime evalution in
>     do_timer_interrupt_hook()
>
>     Use the clockevents broadcast support and implement the
>     lapic_broadcast function for ACPI.
>
>     No changes to existing functionality.
Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 21:43 +0100, Thomas Gleixner wrote:
> On Wed, 2007-02-21 at 12:00 -0800, Daniel Walker wrote:
> > There's a compile failure during my bisect.
> >
> > distcc[3863] ERROR: compile /tmp//hrtimer.tmp.dwalker1.3795.i on dwalker3/120 failed
> > kernel/hrtimer.c: In function 'hrtimer_cpu_notify':
> > kernel/hrtimer.c:884: warning: implicit declaration of function 'clockevents_notify'
> > kernel/hrtimer.c:884: error: 'CLOCK_EVT_NOTIFY_CPU_DEAD' undeclared (first use in this function)
> > kernel/hrtimer.c:884: error: (Each undeclared identifier is reported only once
> > kernel/hrtimer.c:884: error: for each function it appears in.)
> > drivers/ide/setup-pci.c: In function 'ide_scan_pcibus':
> > drivers/ide/setup-pci.c:866: warning: ignoring return value of '__pci_register_driver', declared with attribute warn_unused_result
> > make[1]: *** [kernel/hrtimer.o] Error 1
>
> hrmpf. we made it bisectable at some point.

It was related to some code under a hotplug ifdef ..

Here's the final commit from the bisect which caused it . It says "No
changes to existing functionality" ?

e9e2cdb412412326c4827fc78ba27f410d837e6e is first bad commit
commit e9e2cdb412412326c4827fc78ba27f410d837e6e
Author: Thomas Gleixner <[EMAIL PROTECTED]>
Date:   Fri Feb 16 01:28:04 2007 -0800

    [PATCH] clockevents: i386 drivers

    Add clockevent drivers for i386: lapic (local) and PIT/HPET (global).
    Update the timer IRQ to call into the PIT/HPET driver's event handler
    and the lapic-timer IRQ to call into the lapic clockevent driver. The
    assignement of timer functionality is delegated to the core framework
    code and replaces the compile and runtime evalution in
    do_timer_interrupt_hook()

    Use the clockevents broadcast support and implement the
    lapic_broadcast function for ACPI.

    No changes to existing functionality.
Re: request_module: runaway loop modprobe net-pf-1 (is Re: Linux 2.6.21-rc1)
On Thu, Feb 22, 2007 at 04:12:04AM +0900, OGAWA Hirofumi wrote:
> YOSHIFUJI Hideaki / 吉藤英明 <[EMAIL PROTECTED]> writes:
>
> > In article <[EMAIL PROTECTED]> (at Tue, 20 Feb 2007 20:53:45 -0800 (PST)),
> > Linus Torvalds <[EMAIL PROTECTED]> says:
> >
> >> But there's a ton of architecture updates (arm, mips, powerpc, x86, you
> >> name it), ACPI updates, and lots of driver work. And just a lot of
> >> cleanups.
> >
> > I cannot boot 2.6.21-rc1; it falls into OOM-Killer.
> >
> > Interesting error message I can see is:
> >    request_module: runaway loop modprobe net-pf-1
> >
> > After bisecting, the commit
> > Driver core: let request_module() send a /sys/modules/kmod/-uevent
> > (id c353c3fb0700a3c17ea2b0237710a184232ccd7f) is to blame.
> >
> > Reverting it fixes the issue to me.
>
> /sbin/hotplug needs some module, but request_module() call /sbin/hotplug loop?
> Hm.. does the patch fix the problem?

How does it loop?

> BTW, mod_request_helper alias of /proc/sys/kernel/modprobe is really needed?

What do you mean?

> --
> OGAWA Hirofumi <[EMAIL PROTECTED]>
>
>
> Don't use uevent until udevd or something like other setup done.
>
> Signed-off-by: OGAWA Hirofumi <[EMAIL PROTECTED]>
> ---
>
>  kernel/kmod.c |8
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff -puN kernel/kmod.c~kmod-uevent-fix kernel/kmod.c
> --- linux-2.6/kernel/kmod.c~kmod-uevent-fix 2007-02-22 03:42:37.0 +0900
> +++ linux-2.6-hirofumi/kernel/kmod.c 2007-02-22 03:42:48.0 +0900
> @@ -90,11 +90,11 @@ int request_module(const char *fmt, ...)
>  	if (ret >= MODULE_NAME_LEN)
>  		return -ENAMETOOLONG;
>
> -	strcpy(&modalias[strlen("MODALIAS=")], module_name);
> -	kobject_uevent_env(&kmod_mk.kobj, KOBJ_CHANGE, uevent_envp);
> -
> -	if (modprobe_path[0] == '\0')
> +	if (modprobe_path[0] == '\0') {
> +		strcpy(&modalias[strlen("MODALIAS=")], module_name);
> +		kobject_uevent_env(&kmod_mk.kobj, KOBJ_CHANGE, uevent_envp);
>  		goto out;
> +	}

No, we want to still emit these messages, even if we have a real "helper"
application. I don't see how this would fix the problem.

thanks,

greg k-h

Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 12:00 -0800, Daniel Walker wrote:
> There's a compile failure during my bisect.
>
> distcc[3863] ERROR: compile /tmp//hrtimer.tmp.dwalker1.3795.i on dwalker3/120 failed
> kernel/hrtimer.c: In function 'hrtimer_cpu_notify':
> kernel/hrtimer.c:884: warning: implicit declaration of function 'clockevents_notify'
> kernel/hrtimer.c:884: error: 'CLOCK_EVT_NOTIFY_CPU_DEAD' undeclared (first use in this function)
> kernel/hrtimer.c:884: error: (Each undeclared identifier is reported only once
> kernel/hrtimer.c:884: error: for each function it appears in.)
> drivers/ide/setup-pci.c: In function 'ide_scan_pcibus':
> drivers/ide/setup-pci.c:866: warning: ignoring return value of '__pci_register_driver', declared with attribute warn_unused_result
> make[1]: *** [kernel/hrtimer.o] Error 1

hrmpf. we made it bisectable at some point.

	tglx

Re: Linux 2.6.21-rc1
On Wed, 21 Feb 2007, Daniel Walker wrote:
>
> There's a compile failure during my bisect.

When that happens, you need to pick another commit to try than the one git
selected for you automatically.

You can do that by doing

	git bisect visualize

and select another commit somewhere fairly close to a mid-point, and try
that with

	git reset --hard

instead.

		Linus

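
The "pick another commit close to the mid-point" advice can be sketched as a bisection over a linear history that steps around untestable commits. This is a toy model, not how git itself is implemented; bisect_first_bad(), the verdict enum and demo_history() are hypothetical names introduced for the example:

```c
#include <stddef.h>

enum verdict { GOOD, BAD, UNTESTABLE };

/* Callback reporting the verdict for commit index i (hypothetical). */
typedef enum verdict (*test_fn)(size_t i);

/*
 * Find the first bad commit in a linear history where commit `good` is
 * known good and commit `bad` is known bad (good < bad). When the midpoint
 * cannot be tested (e.g. it does not compile), try its neighbours, nearest
 * first, strictly between good and bad. Returns the first bad index, or
 * (size_t)-1 if untestable commits leave the answer ambiguous.
 */
size_t bisect_first_bad(size_t good, size_t bad, test_fn test)
{
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        size_t pick = (size_t)-1;
        enum verdict v = UNTESTABLE;

        for (size_t d = 0; d < bad - good; d++) {
            if (mid + d < bad) {
                v = test(mid + d);
                if (v != UNTESTABLE) { pick = mid + d; break; }
            }
            if (d > 0 && d < mid - good) {
                v = test(mid - d);
                if (v != UNTESTABLE) { pick = mid - d; break; }
            }
        }
        if (pick == (size_t)-1)
            return (size_t)-1;  /* nothing testable left in between */
        if (v == BAD)
            bad = pick;
        else
            good = pick;
    }
    return bad;
}

/* Demo history: commit 4 fails to build, commits 6 and later are bad. */
enum verdict demo_history(size_t i)
{
    if (i == 4)
        return UNTESTABLE;
    return i >= 6 ? BAD : GOOD;
}
```

With demo_history(), starting from good commit 0 and bad commit 9, the first midpoint (4) cannot be built, so the search moves to commit 5 and carries on from there.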
Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 20:23 +0100, Thomas Gleixner wrote:
> On Wed, 2007-02-21 at 10:23 -0800, Daniel Walker wrote:
> > Right, but eventually there isn't a regular timer interrupt through the
> > io-apic. I don't think in the past IRQ0 stops without the system
> > crashing, so check_timer() could assume the timer (IRQ0) is _always_
> > regular.
> >
> > do you know what the requirement are for routing the NMI through the
> > io-apic?
>
> Sorry. I checked. switching PIT off really breaks nmi_watchdog=1, as
> this just mirrors IRQ#0 to the NMI. No IRQ#0 from PIT, no NMI
>
> We could keep PIT running with an empty interrupt handler when
> nmi_watchdog=1 is set, but this interferes nicely with broadcasting.
>
> Does nmi_watchdog=2 work ? We might switch to that, when a local APIC is
> available.
>
> tglx
>
>

There's a compile failure during my bisect.

distcc[3863] ERROR: compile /tmp//hrtimer.tmp.dwalker1.3795.i on dwalker3/120 failed
kernel/hrtimer.c: In function 'hrtimer_cpu_notify':
kernel/hrtimer.c:884: warning: implicit declaration of function 'clockevents_notify'
kernel/hrtimer.c:884: error: 'CLOCK_EVT_NOTIFY_CPU_DEAD' undeclared (first use in this function)
kernel/hrtimer.c:884: error: (Each undeclared identifier is reported only once
kernel/hrtimer.c:884: error: for each function it appears in.)
drivers/ide/setup-pci.c: In function 'ide_scan_pcibus':
drivers/ide/setup-pci.c:866: warning: ignoring return value of '__pci_register_driver', declared with attribute warn_unused_result
make[1]: *** [kernel/hrtimer.o] Error 1

from this commit,

commit f8381cba04ba8173fd5a2b8e5cd8b3290ee13a98
Author: Thomas Gleixner <[EMAIL PROTECTED]>
Date:   Fri Feb 16 01:28:02 2007 -0800

    [PATCH] tick-management: broadcast functionality

    With Ingo Molnar <[EMAIL PROTECTED]>

    Add broadcast functionality, so per cpu clock event devices can be
    registered as dummy devices or switched from/to broadcast on demand.
    The broadcast function distributes the events via the broadcast
    function of the clock event device. This is primarily designed to
    replace the switch apic timer to / from IPI in power states, where
    the apic stops.

Re: Linux 2.6.21-rc1
On Wed, 21 Feb 2007 11:24:33 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote:

>
>
> On Wed, 21 Feb 2007, Kok, Auke wrote:
> >
> > I think we need to drop this now. The report that says that this *fixes*
> > something might have been on regular interrupts only. I currently suspect that
> > it breaks all MSI interrupts, which would make sense if I look a the code.
> > Very bad indeed.
> >
> > I'll try to come up with something else or send a patch that reverts it.
>
> I'm going to be off-line for a couple of days,

And I'll be offline for five or six days.

> so I just reverted it.

OK, but this change was needed because the new IRQ-debugging code reliably
causes e1000 to explode.

So perhaps until e1000 gets sorted out we should disable the debug code:

--- a/lib/Kconfig.debug~a
+++ a/lib/Kconfig.debug
@@ -79,7 +79,7 @@ config DEBUG_KERNEL
 
 config DEBUG_SHIRQ
 	bool "Debug shared IRQ handlers"
-	depends on DEBUG_KERNEL && GENERIC_HARDIRQS
+	depends on DEBUG_KERNEL && GENERIC_HARDIRQS && BROKEN
 	help
 	  Enable this to generate a spurious interrupt as soon as a shared
 	  interrupt handler is registered, and just before one is deregistered.
_

Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 20:23 +0100, Thomas Gleixner wrote:
> On Wed, 2007-02-21 at 10:23 -0800, Daniel Walker wrote:
> > Right, but eventually there isn't a regular timer interrupt through the
> > io-apic. I don't think in the past IRQ0 stops without the system
> > crashing, so check_timer() could assume the timer (IRQ0) is _always_
> > regular.
> >
> > do you know what the requirement are for routing the NMI through the
> > io-apic?
>
> Sorry. I checked. switching PIT off really breaks nmi_watchdog=1, as
> this just mirrors IRQ#0 to the NMI. No IRQ#0 from PIT, no NMI

That's what I suspected ..

> We could keep PIT running with an empty interrupt handler when
> nmi_watchdog=1 is set, but this interferes nicely with broadcasting.
>
> Does nmi_watchdog=2 work ? We might switch to that, when a local APIC is
> available.

Oddly, nmi_watchdog=2 doesn't work in 2.6.21-rc1, but it works in
2.6.20-rt8. However, I'm not sure of the config; it could be that
PREEMPT_RT was on.

Daniel

Re: Linux 2.6.21-rc1
On Wed, 21 Feb 2007, Kok, Auke wrote:
>
> I think we need to drop this now. The report that says that this *fixes*
> something might have been on regular interrupts only. I currently suspect that
> it breaks all MSI interrupts, which would make sense if I look a the code.
> Very bad indeed.
>
> I'll try to come up with something else or send a patch that reverts it.

I'm going to be off-line for a couple of days, so I just reverted it.

		Linus

---
commit b5bf28cde894b3bb3bd25c13a7647020562f9ea0
Author: Linus Torvalds <[EMAIL PROTECTED]>
Date:   Wed Feb 21 11:21:44 2007 -0800

    Revert "e1000: fix shared interrupt warning message"

    This reverts commit d2ed16356ff4fb9de23fbc5e5d582ce580390106.

    As Thomas Gleixner reports:
      "e1000 is not working anymore. ifup fails permanentely.
         ADDRCONF(NETDEV_UP): eth0: link is not ready
       nothing else"

    The broken commit was identified with "git bisect".

    Auke Kok says:
      "I think we need to drop this now. The report that says that this
       *fixes* something might have been on regular interrupts only. I
       currently suspect that it breaks all MSI interrupts, which would
       make sense if I look a the code. Very bad indeed."

    Cc: Jesse Brandeburg <[EMAIL PROTECTED]>
    Acked-by: Auke Kok <[EMAIL PROTECTED]>
    Cc: Andrew Morton <[EMAIL PROTECTED]>
    Cc: Jeff Garzik <[EMAIL PROTECTED]>
    Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index a710237..98215fd 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -1417,6 +1417,10 @@ e1000_open(struct net_device *netdev)
 	if ((err = e1000_setup_all_rx_resources(adapter)))
 		goto err_setup_rx;
 
+	err = e1000_request_irq(adapter);
+	if (err)
+		goto err_req_irq;
+
 	e1000_power_up_phy(adapter);
 
 	if ((err = e1000_up(adapter)))
@@ -1427,10 +1431,6 @@ e1000_open(struct net_device *netdev)
 		e1000_update_mng_vlan(adapter);
 	}
 
-	err = e1000_request_irq(adapter);
-	if (err)
-		goto err_req_irq;
-
 	/* If AMT is enabled, let the firmware know that the network
 	 * interface is now open */
 	if (adapter->hw.mac_type == e1000_82573 &&
@@ -1439,10 +1439,10 @@ e1000_open(struct net_device *netdev)
 
 	return E1000_SUCCESS;
 
-err_req_irq:
-	e1000_down(adapter);
 err_up:
 	e1000_power_down_phy(adapter);
+	e1000_free_irq(adapter);
+err_req_irq:
 	e1000_free_all_rx_resources(adapter);
 err_setup_rx:
 	e1000_free_all_tx_resources(adapter);

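
The label shuffle in the diff above follows the usual rule that error labels undo the setup steps in reverse order, so moving a step also moves its label. A minimal userspace sketch of that pattern; open_model(), step() and the one-letter log are illustrative assumptions and not the e1000 code:

```c
#include <string.h>

char unwind_log[64];    /* records setup steps and cleanups, for inspection */

/* Record a setup step; report failure when asked to. */
int step(const char *tag, int fail)
{
    strcat(unwind_log, tag);
    return fail ? -1 : 0;
}

/*
 * Model of an open() routine: set up tx (T), rx (R), irq (I), then bring
 * the device up (U). `fail_at` selects the failing step (0 = none).
 * Each label undoes only the steps that succeeded before the failure,
 * in reverse order: free irq (i), free rx (r), free tx (t).
 */
int open_model(int fail_at)
{
    int err;

    unwind_log[0] = '\0';
    if ((err = step("T", fail_at == 1)))
        goto err_setup_tx;
    if ((err = step("R", fail_at == 2)))
        goto err_setup_rx;
    if ((err = step("I", fail_at == 3)))
        goto err_req_irq;
    if ((err = step("U", fail_at == 4)))
        goto err_up;
    return 0;

err_up:
    strcat(unwind_log, "i");
err_req_irq:
    strcat(unwind_log, "r");
err_setup_rx:
    strcat(unwind_log, "t");
err_setup_tx:
    return err;
}
```

Reading unwind_log after a failed call shows which cleanups ran; a label placed in the wrong position would either skip a cleanup or free something that was never acquired.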
Re: Linux 2.6.21-rc1
On Wed, 21 Feb 2007, Faik Uygur wrote:
>
>   CHK     include/linux/version.h
>   CHK     include/linux/utsrelease.h
>   CHK     include/linux/compile.h
>   CC [M]  drivers/char/ip2/ip2main.o
> In file included from drivers/char/ip2/ip2main.c:285:
> drivers/char/ip2/i2lib.c: In function `iiSendPendingMail_t':
> drivers/char/ip2/i2lib.c:83: sorry, unimplemented: inlining failed in call to 'iiSendPendingMail': function body not available
> drivers/char/ip2/i2lib.c:157: sorry, unimplemented: called from here
> make[3]: *** [drivers/char/ip2/ip2main.o] Error 1
> make[2]: *** [drivers/char/ip2] Error 2
> make[1]: *** [drivers/char] Error 2
> make: *** [drivers] Error 2

Yeah, that thing was crud.

		Linus

---
commit 5fc7e655a50b0a19229a6b4a8a5e23bfedf700a4
Author: Linus Torvalds <[EMAIL PROTECTED]>
Date:   Wed Feb 21 11:18:26 2007 -0800

    Fix bogus 'inline' in drivers/char/ip2/i2lib.c

    Not only was the function way too big to be inlined in the first place,
    it was used before it was even defined.

    Noted-by: Faik Uygur <[EMAIL PROTECTED]>
    Cc: Jiri Slaby <[EMAIL PROTECTED]>
    Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

diff --git a/drivers/char/ip2/i2lib.c b/drivers/char/ip2/i2lib.c
index f86fa0c..e46120d 100644
--- a/drivers/char/ip2/i2lib.c
+++ b/drivers/char/ip2/i2lib.c
@@ -80,7 +80,7 @@ static int i2RetryFlushOutput(i2ChanStrPtr);
 // Not a documented part of the library routines (careful...) but the Diagnostic
 // i2diag.c finds them useful to help the throughput in certain limited
 // single-threaded operations.
-static inline void iiSendPendingMail(i2eBordStrPtr);
+static void iiSendPendingMail(i2eBordStrPtr);
 static void serviceOutgoingFifo(i2eBordStrPtr);
 
 // Functions defined in ip2.c as part of interrupt handling
@@ -166,7 +166,7 @@ static void iiSendPendingMail_t(unsigned long data)
 // If any outgoing mail bits are set and there is outgoing mailbox is empty,
 // send the mail and clear the bits.
 //**
-static inline void
+static void
 iiSendPendingMail(i2eBordStrPtr pB)
 {
 	if (pB->i2eOutMailWaiting && (!pB->i2eWaitingForEmptyFifo) )

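
The shape of the fix can be shown in isolation: a plain prototype is enough for a caller that appears earlier in the file than the definition, whereas an inline request at that point cannot be honoured because no body is available yet. A small sketch with made-up names (struct board, send_pending_mail() and send_pending_mail_timer() are not the ip2 driver's identifiers):

```c
struct board {
    int mail_waiting;   /* nonzero when there is mail to send */
    int sent;           /* number of mails sent so far */
};

/*
 * Forward declaration without 'inline': the body comes later in the file,
 * so the compiler only needs the prototype at the call site below.
 */
void send_pending_mail(struct board *b);

/* Caller placed before the definition, like a timer callback. */
void send_pending_mail_timer(struct board *b)
{
    send_pending_mail(b);
}

/* The definition, after its first use. */
void send_pending_mail(struct board *b)
{
    if (b->mail_waiting) {
        b->sent++;
        b->mail_waiting = 0;
    }
}
```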
Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 10:23 -0800, Daniel Walker wrote:
> Right, but eventually there isn't a regular timer interrupt through the
> io-apic. I don't think in the past IRQ0 stops without the system
> crashing, so check_timer() could assume the timer (IRQ0) is _always_
> regular.
>
> do you know what the requirement are for routing the NMI through the
> io-apic?

Sorry. I checked. switching PIT off really breaks nmi_watchdog=1, as
this just mirrors IRQ#0 to the NMI. No IRQ#0 from PIT, no NMI

We could keep PIT running with an empty interrupt handler when
nmi_watchdog=1 is set, but this interferes nicely with broadcasting.

Does nmi_watchdog=2 work ? We might switch to that, when a local APIC is
available.

	tglx

Re: Linux 2.6.21-rc1
Jiri Slaby wrote:
> Faik Uygur wrote:
>> Hi,
>
> Hi.
>
>> On Wed, 21 Feb 2007 at 06:53, Linus Torvalds wrote:
>>> Ok, the merge window for 2.6.21 has closed, and -rc1 is out there.
>>
>>   CHK     include/linux/version.h
>>   CHK     include/linux/utsrelease.h
>>   CHK     include/linux/compile.h
>>   CC [M]  drivers/char/ip2/ip2main.o
>> In file included from drivers/char/ip2/ip2main.c:285:
>> drivers/char/ip2/i2lib.c: In function `iiSendPendingMail_t':
>> drivers/char/ip2/i2lib.c:83: sorry, unimplemented: inlining failed in call to 'iiSendPendingMail': function body not available
>> drivers/char/ip2/i2lib.c:157: sorry, unimplemented: called from here
>> make[3]: *** [drivers/char/ip2/ip2main.o] Error 1
>> make[2]: *** [drivers/char/ip2] Error 2
>> make[1]: *** [drivers/char] Error 2
>> make: *** [drivers] Error 2
>>
>> With cleanup changes in commit 40565f1962c5be9b9e285e05af01ab7771534868
>> compilation fails.
>
> What compiler?

Oh, I can reproduce with gcc 3.4. Going to fix it.

thanks,
--
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E

Re: Linux 2.6.21-rc1
On Wed, Feb 21, 2007 at 07:34:01PM +0100, Andreas Schwab wrote:
> I'm getting an undefined symbol with CONFIG_AGP=m:
>
> WARNING: "compat_agp_ioctl" [drivers/char/agp/agpgart.ko] undefined!

Fix went to Linus an hour ago. It's been in -mm for a week, and
agpgart.git for a day or so.

	Dave

--
http://www.codemonkey.org.uk

Re: Linux 2.6.21-rc1
Faik Uygur wrote:
> Hi,

Hi.

> On Wed, 21 Feb 2007 at 06:53, Linus Torvalds wrote:
>> Ok, the merge window for 2.6.21 has closed, and -rc1 is out there.
>
>   CHK     include/linux/version.h
>   CHK     include/linux/utsrelease.h
>   CHK     include/linux/compile.h
>   CC [M]  drivers/char/ip2/ip2main.o
> In file included from drivers/char/ip2/ip2main.c:285:
> drivers/char/ip2/i2lib.c: In function `iiSendPendingMail_t':
> drivers/char/ip2/i2lib.c:83: sorry, unimplemented: inlining failed in call to 'iiSendPendingMail': function body not available
> drivers/char/ip2/i2lib.c:157: sorry, unimplemented: called from here
> make[3]: *** [drivers/char/ip2/ip2main.o] Error 1
> make[2]: *** [drivers/char/ip2] Error 2
> make[1]: *** [drivers/char] Error 2
> make: *** [drivers] Error 2
>
> With cleanup changes in commit 40565f1962c5be9b9e285e05af01ab7771534868
> compilation fails.

What compiler?

thanks,
--
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E

Re: Linux 2.6.21-rc1
I'm getting an undefined symbol with CONFIG_AGP=m:

WARNING: "compat_agp_ioctl" [drivers/char/agp/agpgart.ko] undefined!

Andreas.

--
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 19:18 +0100, Thomas Gleixner wrote:
> On Wed, 2007-02-21 at 09:38 -0800, Daniel Walker wrote:
> > > >
> > > > Could be the switch over then which confuses the NMI .
> > >
> > > Why? The switch just stops the PIT/HPET. It does not fiddle with IO_APIC
> > > and friends at all.
> >
> > I'm not an expert on the io-apic, but the check_timer() function seemed
> > to assume IRQ0 was happening regularly ..
>
> Again:
>
> check_timer() is called _BEFORE_ we even touch the local APIC timers. At
> this point PIT/HPET _IS_ firing IRQ0 with HZ frequency.

Right, but eventually there isn't a regular timer interrupt through the
io-apic. I don't think in the past IRQ0 stops without the system
crashing, so check_timer() could assume the timer (IRQ0) is _always_
regular.

do you know what the requirement are for routing the NMI through the
io-apic?

> > Well, I'm pretty sure it's HRT, cause in prior versions this only
> > happened when HRT is enabled. Then you guys went to the lapic all the
> > time, and now this is happening all the time ..
>
> The NMI is stuck:
>
> 	if (nmi_count(cpu) - prev_nmi_count[cpu] <= 5) {
> 		printk("CPU#%d: NMI appears to be stuck (%d->%d)!\n",
> 			cpu,
> 			prev_nmi_count[cpu],
> 			nmi_count(cpu));
>
> This has nothing to do with jiffies.

I think it has to do with IRQ0 . Did I mention this doesn't happen in
2.6.20 .

> There have been a bunch of changes in arch/i386/kernel/nmi.c as well.
>
> > You can't reproduce this?
>
> Nope.

Do you use nmi_watchdog=1 ?

Daniel

Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 09:38 -0800, Daniel Walker wrote:
> > >
> > > Could be the switch over then which confuses the NMI .
> >
> > Why? The switch just stops the PIT/HPET. It does not fiddle with IO_APIC
> > and friends at all.
>
> I'm not an expert on the io-apic, but the check_timer() function seemed
> to assume IRQ0 was happening regularly ..

Again:

check_timer() is called _BEFORE_ we even touch the local APIC timers. At
this point PIT/HPET _IS_ firing IRQ0 with HZ frequency.

> Well, I'm pretty sure it's HRT, cause in prior versions this only
> happened when HRT is enabled. Then you guys went to the lapic all the
> time, and now this is happening all the time ..

The NMI is stuck:

	if (nmi_count(cpu) - prev_nmi_count[cpu] <= 5) {
		printk("CPU#%d: NMI appears to be stuck (%d->%d)!\n",
			cpu,
			prev_nmi_count[cpu],
			nmi_count(cpu));

This has nothing to do with jiffies.

There have been a bunch of changes in arch/i386/kernel/nmi.c as well.

> You can't reproduce this?

Nope.

Also all my machines emit something like:

"ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])"

In your boot log nothing to see.

	tglx

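
The check quoted above compares each CPU's NMI count before and after a delay and complains when it advanced by 5 or fewer. A userspace model of just that comparison, as a sketch; count_stuck_nmi() and the fixed NCPUS are assumptions made for the example, and the real code reads per-CPU kernel counters rather than arrays:

```c
#include <stdio.h>

#define NCPUS 4     /* assumed CPU count for this model */

/*
 * Model of the boot-time NMI watchdog test: given the counts sampled
 * before and after a delay, flag every CPU whose NMI count advanced by
 * 5 or fewer. Returns the number of CPUs that look stuck.
 */
int count_stuck_nmi(const unsigned int prev[NCPUS],
                    const unsigned int now[NCPUS])
{
    int stuck = 0;

    for (int cpu = 0; cpu < NCPUS; cpu++) {
        if (now[cpu] - prev[cpu] <= 5) {
            printf("CPU#%d: NMI appears to be stuck (%u->%u)!\n",
                   cpu, prev[cpu], now[cpu]);
            stuck++;
        }
    }
    return stuck;
}
```

Feeding it the values from the report (24->24 on CPU#0 and 0->0 on the others) flags all four CPUs, since the test only looks at NMI counts and never at jiffies.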
Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 18:41 +0100, Thomas Gleixner wrote:
> On Wed, 2007-02-21 at 09:19 -0800, Daniel Walker wrote:
> > > At this point the PIT / HPET _is_ active and incrementing jiffies. The
> > > switch to local apic timers happens afterwards.
> >
> > Could be the switch over then which confuses the NMI .
>
> Why? The switch just stops the PIT/HPET. It does not fiddle with IO_APIC
> and friends at all.

I'm not an expert on the io-apic, but the check_timer() function seemed
to assume IRQ0 was happening regularly ..

> > ftp://source.mvista.com/pub/dwalker/tglx/
>
> Nothing obvious. Bisect time :(

Well, I'm pretty sure it's HRT, cause in prior versions this only
happened when HRT is enabled. Then you guys went to the lapic all the
time, and now this is happening all the time ..

You can't reproduce this?

Daniel
Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 09:19 -0800, Daniel Walker wrote:
> > At this point the PIT / HPET _is_ active and incrementing jiffies. The
> > switch to local apic timers happens afterwards.
>
> Could be the switch over then which confuses the NMI .

Why? The switch just stops the PIT/HPET. It does not fiddle with IO_APIC
and friends at all.

> ftp://source.mvista.com/pub/dwalker/tglx/

Nothing obvious. Bisect time :(

tglx
Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 18:07 +0100, Thomas Gleixner wrote:
> On Wed, 2007-02-21 at 08:24 -0800, Daniel Walker wrote:
> > > The most interesting core change may be the dyntick/nohz one, where timer
> > > ticks will only happen when needed. It's been brewing for a _loong_ time,
> > > but it's in the standard kernel now as an option.
> >
> > On i386 I get the following,
> >
> > TCP cubic registered
> > NET: Registered protocol family 1
> > NET: Registered protocol family 17
> > Testing NMI watchdog ... CPU#0: NMI appears to be stuck (24->24)!
> > CPU#1: NMI appears to be stuck (0->0)!
> > CPU#2: NMI appears to be stuck (0->0)!
> > CPU#3: NMI appears to be stuck (0->0)!
> >
> > when I add nmi_watchdog=1 to my boot args which worked on prior kernels.
> > On closer inspection it looks like arch/i386/kernel/io_apic.c :
> > check_timer() --> timer_irq_works() depends on IRQ0 incrementing jiffies
> > which is no longer the case AFAIK.
>
> At this point the PIT / HPET _is_ active and incrementing jiffies. The
> switch to local apic timers happens afterwards.

Could be the switch over then which confuses the NMI .

> > I'm not sure exactly how that relates to the NMI, but the check_timer()
> > function disabled the NMI through the io-apic if it can't get the
> > "timer" working through the io-apic.
>
> Boot log please.
>
> tglx
>

.config is in there too .

ftp://source.mvista.com/pub/dwalker/tglx/
Re: Linux 2.6.21-rc1
On Wed, 2007-02-21 at 08:24 -0800, Daniel Walker wrote:
> > The most interesting core change may be the dyntick/nohz one, where timer
> > ticks will only happen when needed. It's been brewing for a _loong_ time,
> > but it's in the standard kernel now as an option.
>
> On i386 I get the following,
>
> TCP cubic registered
> NET: Registered protocol family 1
> NET: Registered protocol family 17
> Testing NMI watchdog ... CPU#0: NMI appears to be stuck (24->24)!
> CPU#1: NMI appears to be stuck (0->0)!
> CPU#2: NMI appears to be stuck (0->0)!
> CPU#3: NMI appears to be stuck (0->0)!
>
> when I add nmi_watchdog=1 to my boot args which worked on prior kernels.
> On closer inspection it looks like arch/i386/kernel/io_apic.c :
> check_timer() --> timer_irq_works() depends on IRQ0 incrementing jiffies
> which is no longer the case AFAIK.

At this point the PIT / HPET _is_ active and incrementing jiffies. The
switch to local apic timers happens afterwards.

> I'm not sure exactly how that relates to the NMI, but the check_timer()
> function disabled the NMI through the io-apic if it can't get the
> "timer" working through the io-apic.

Boot log please.

tglx
Re: Linux 2.6.21-rc1
On Tue, 2007-02-20 at 20:53 -0800, Linus Torvalds wrote:
> Ok, the merge window for 2.6.21 has closed, and -rc1 is out there.
>
> There's a lot of changes, as is usual for an -rc1 thing, but at least so
> far it would seem that 2.6.20 has been a good base, and I don't think we
> have anything *really* scary here.
>
> The most interesting core change may be the dyntick/nohz one, where timer
> ticks will only happen when needed. It's been brewing for a _loong_ time,
> but it's in the standard kernel now as an option.

On i386 I get the following,

TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Testing NMI watchdog ... CPU#0: NMI appears to be stuck (24->24)!
CPU#1: NMI appears to be stuck (0->0)!
CPU#2: NMI appears to be stuck (0->0)!
CPU#3: NMI appears to be stuck (0->0)!

when I add nmi_watchdog=1 to my boot args which worked on prior kernels.
On closer inspection it looks like arch/i386/kernel/io_apic.c :
check_timer() --> timer_irq_works() depends on IRQ0 incrementing jiffies
which is no longer the case AFAIK.

I'm not sure exactly how that relates to the NMI, but the check_timer()
function disabled the NMI through the io-apic if it can't get the
"timer" working through the io-apic.

Daniel
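The dependency described in the report above, that timer_irq_works() judges IRQ0 by whether jiffies advances, can be sketched as a small user-space model. Everything here is an assumption made for illustration: `timer_irq_seems_to_work` and `MIN_TICKS_SEEN` are hypothetical names, and the threshold of 4 ticks over a window of roughly ten tick periods is recalled from kernels of that era rather than taken from this thread.

```c
#include <stdbool.h>

/* Assumed threshold; it is not stated anywhere in this thread. */
#define MIN_TICKS_SEEN 4

/*
 * Hypothetical model: jiffies_before and jiffies_after are snapshots of
 * the tick counter taken around a delay of about ten tick periods. The
 * timer interrupt is considered working only if more than MIN_TICKS_SEEN
 * ticks were counted in between.
 */
static bool timer_irq_seems_to_work(unsigned long jiffies_before,
				    unsigned long jiffies_after)
{
	return jiffies_after - jiffies_before > MIN_TICKS_SEEN;
}
```

Under this model, a window in which IRQ0 never advances jiffies (for example 100 -> 100) fails the check, which is the situation the report is worried about.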
request_module: runaway loop modprobe net-pf-1 (is Re: Linux 2.6.21-rc1)
Hello.

In article <[EMAIL PROTECTED]> (at Tue, 20 Feb 2007 20:53:45 -0800 (PST)), Linus Torvalds <[EMAIL PROTECTED]> says:

> But there's a ton of architecture updates (arm, mips, powerpc, x86, you
> name it), ACPI updates, and lots of driver work. And just a lot of
> cleanups.

I cannot boot 2.6.21-rc1; it ends up in the OOM killer.
The interesting error message I see is:

    request_module: runaway loop modprobe net-pf-1

After bisecting, the commit
    Driver core: let request_module() send a /sys/modules/kmod/-uevent
(id c353c3fb0700a3c17ea2b0237710a184232ccd7f) is to blame.
Reverting it fixes the issue for me.

Regards,

--yoshfuji
Re: Linux 2.6.21-rc1
Thomas Gleixner wrote:
> On Tue, 2007-02-20 at 20:53 -0800, Linus Torvalds wrote:
> > But there's a ton of architecture updates (arm, mips, powerpc, x86, you
> > name it), ACPI updates, and lots of driver work. And just a lot of
> > cleanups.
> >
> > Have fun,
>
> Yup. Fun starts in drivers/net/e1000
>
> e1000 is not working anymore. ifup fails permanently.
>
> ADDRCONF(NETDEV_UP): eth0: link is not ready
>
> nothing else
>
> bisect identifies:
>
> d2ed16356ff4fb9de23fbc5e5d582ce580390106 is first bad commit
> commit d2ed16356ff4fb9de23fbc5e5d582ce580390106

I think we need to drop this now. The report that says that this *fixes*
something might have been on regular interrupts only. I currently suspect
that it breaks all MSI interrupts, which would make sense if I look at the
code. Very bad indeed.

I'll try to come up with something else or send a patch that reverts it.

Auke
Re: Linux 2.6.21-rc1
On Tue, 2007-02-20 at 20:53 -0800, Linus Torvalds wrote:
> But there's a ton of architecture updates (arm, mips, powerpc, x86, you
> name it), ACPI updates, and lots of driver work. And just a lot of
> cleanups.
>
> Have fun,

Yup. Fun starts in drivers/net/e1000

e1000 is not working anymore. ifup fails permanently.

ADDRCONF(NETDEV_UP): eth0: link is not ready

nothing else

bisect identifies:

d2ed16356ff4fb9de23fbc5e5d582ce580390106 is first bad commit
commit d2ed16356ff4fb9de23fbc5e5d582ce580390106
Author: Kok, Auke <[EMAIL PROTECTED]>
Date:   Fri Feb 16 14:39:26 2007 -0800

    e1000: fix shared interrupt warning message

    Signed-off-by: Jesse Brandeburg <[EMAIL PROTECTED]>
    Signed-off-by: Auke Kok <[EMAIL PROTECTED]>
    Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
    Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

Reverting this patch on top of -rc1 helps as well.

tglx

lspci output:

04:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
    Subsystem: Intel Corporation Unknown device 30a5
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
Re: Linux 2.6.21-rc1
Hi,

On Wed, 21 Feb 2007 at 06:53, Linus Torvalds wrote:
> Ok, the merge window for 2.6.21 has closed, and -rc1 is out there.

  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  CHK     include/linux/compile.h
  CC [M]  drivers/char/ip2/ip2main.o
In file included from drivers/char/ip2/ip2main.c:285:
drivers/char/ip2/i2lib.c: In function `iiSendPendingMail_t':
drivers/char/ip2/i2lib.c:83: sorry, unimplemented: inlining failed in call to 'iiSendPendingMail': function body not available
drivers/char/ip2/i2lib.c:157: sorry, unimplemented: called from here
make[3]: *** [drivers/char/ip2/ip2main.o] Error 1
make[2]: *** [drivers/char/ip2] Error 2
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2

Compilation fails with the cleanup changes in commit
40565f1962c5be9b9e285e05af01ab7771534868.

Regards,
- Faik
Linux 2.6.21-rc1
Ok, the merge window for 2.6.21 has closed, and -rc1 is out there.

There's a lot of changes, as is usual for an -rc1 thing, but at least so
far it would seem that 2.6.20 has been a good base, and I don't think we
have anything *really* scary here.

The most interesting core change may be the dyntick/nohz one, where timer
ticks will only happen when needed. It's been brewing for a _loong_ time,
but it's in the standard kernel now as an option.

But there's a ton of architecture updates (arm, mips, powerpc, x86, you
name it), ACPI updates, and lots of driver work. And just a lot of
cleanups.

Have fun,

Linus