Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
On Sunday 30 September 2007 16:06:59 Thomas Gleixner wrote:
> On Sun, 30 Sep 2007, Andi Kleen wrote:
>
> > > > OK, this explains 2) and 3). I just looked into the code and the logic
> > > > vs. noapictimer on SMP is completely broken.
> >
> > noapictimer really doesn't make any sense on non SMP imho with the old
> > timer architecture. That is why I never bothered to implement it.
> > It's purely a UP hack.
>
> It does not matter whether it makes sense to you or not. It is a command
> line option which bricks systems.

A lot of command line options do that -- if not they would be usually
default or automatically used by the kernel.

> There is neither an explanation in
> Documentation/kernel-parameters.txt nor a check in the code, which
> disables this completely.

Fair enough. I can add a warning in the Documentation.

> It makes a lot of sense even with the existing architecture. Trouble
> shooting a box, where the local apic timer does not work correctly is not
> an UP only requirement.

It should not be needed with current systems as far as I know
(see my previous mail)

> I understand the code quite well. I'm just surprised from time to time by
> interesting hacks in the so clean x86_64 tree.

No hack in this area as far as I know.

> > [1] Or let's call it "I trust all my time to the CPU" and no more
> > southbridge aka put all eggs in one basket. Given the trends in CPU
> > power saving that is a quite dangerous strategy.
>
> No, it's not dangerous.

It definitely caused a lot of problems in the single socket multi core
world; but yes you probably worked around all of them that I'm aware of
currently.

What I just objected to was that you complained that the current x86-64
time code -- which works much more conservatively and thus needs less
workarounds -- doesn't have all of them. You basically tried to apply the
special debugging strategies for clockevents to the old code and then
complained that they don't work.

> We spent quite some time to make the clock events
> layer flexible enough to handle the current problems and the design allows
> to add more infrastructure when necessary.

Grand words for relatively simple changes.

Anyways as far as I know even for hypothetical future C2+ capable multi
socket systems the current x86-64 time code should work -- it should
automatically select broadcasting.

The only thing it relies on is that there are no multi socket C1E systems
with broken APIC timers. That could only be future CPUs anyway, and I
haven't seen any indication that any of the upcoming CPUs will have such
a broken C1.

> The maybe new (mis)features of
> upcoming CPUs need to be addressed with or without clock events and they
> need to be done careful and not by random hacks.

Not sure what random hacks you refer to.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
On Sun, 30 Sep 2007, Andi Kleen wrote:
> > OK, this explains 2) and 3). I just looked into the code and the logic
> > vs. noapictimer on SMP is completely broken.
>
> noapictimer really doesn't make any sense on non SMP imho with the old
> timer architecture. That is why I never bothered to implement it.
> It's purely a UP hack.

It does not matter whether it makes sense to you or not. It is a command
line option which bricks systems. There is neither an explanation in
Documentation/kernel-parameters.txt nor a check in the code, which
disables this completely.

It makes a lot of sense even with the existing architecture. Trouble
shooting a box, where the local apic timer does not work correctly is not
an UP only requirement.

Yes, it is a hack, a _bad_ hack.

> > ..and thanks for the explanation.
> >
> > Thanks for finding it so quickly guys. Sounds like this will be fixed
> > properly in 2.6.24 with the x86 merge (which hopefully brings in the hrt
> > patch too)
>
> There is nothing really to fix currently. Clockevents changes behaviour
> majorly (always using APIC timers without irq 0 backups[1]) and that
> causes problems that need new workarounds and new fixes (surprise
> surprise!)
>
> That merge would probably fix a few more such "Thomas doesn't understand
> the code" bugs I guess because he hacks much more on i386 than x86-64;
> but if the overall result will be really better is a totally different
> question.

I understand the code quite well. I'm just surprised from time to time by
interesting hacks in the so clean x86_64 tree.

> [1] Or let's call it "I trust all my time to the CPU" and no more
> southbridge aka put all eggs in one basket. Given the trends in CPU
> power saving that is a quite dangerous strategy.

No, it's not dangerous. We spent quite some time to make the clock events
layer flexible enough to handle the current problems and the design allows
to add more infrastructure when necessary.

The maybe new (mis)features of upcoming CPUs need to be addressed with or
without clock events and they need to be done careful and not by random
hacks.

	tglx
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
> PIT keeps jiffies (and the system) running, but the local APIC timer
> interrupts can get out of sync due to this C1E effect.

The way C1E works on AMD is that even when one core is woken up by the PIT
the APIC timer resumes on the other core on the socket too because the
deep power saving that breaks the APIC timer is only active with both
cores idle.

And on true multi socket systems there is currently no such deep C1E --
apic timer should always work.

At least that is how it was supposed to work and while I admit I haven't
read every mail in this endless thread closely I didn't think Rafael's box
contradicted that.

> I don't think this is a critical problem, but it is wrong nevertheless.
>
> I think it's safe to revert the C1E patch

Yes the C1E patch is completely redundant on a non clockevents kernel.

> and postpone the fix to the
> clock events conversion.

Well, a change is only needed together with clockevent's
"apicrunsmaintimer" default; but not on any non clockevents kernel.

-Andi
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
> > OK, this explains 2) and 3). I just looked into the code and the logic
> > vs. noapictimer on SMP is completely broken.

noapictimer really doesn't make any sense on non SMP imho with the old
timer architecture. That is why I never bothered to implement it.
It's purely a UP hack.

> ..and thanks for the explanation.
>
> Thanks for finding it so quickly guys. Sounds like this will be fixed
> properly in 2.6.24 with the x86 merge (which hopefully brings in the hrt
> patch too)

There is nothing really to fix currently. Clockevents changes behaviour
majorly (always using APIC timers without irq 0 backups[1]) and that
causes problems that need new workarounds and new fixes (surprise
surprise!)

That merge would probably fix a few more such "Thomas doesn't understand
the code" bugs I guess because he hacks much more on i386 than x86-64;
but if the overall result will be really better is a totally different
question.

-Andi

[1] Or let's call it "I trust all my time to the CPU" and no more
southbridge aka put all eggs in one basket. Given the trends in CPU power
saving that is a quite dangerous strategy.
Re: [REGRESSION from 2.6.23-rc8]
On Fri, 2007-09-28 at 11:07 -0400, Chuck Ebbert wrote:
> On 09/26/2007 06:35 PM, Thomas Gleixner wrote:
> > It's even worse than I thought on the first check:
> >
> > "noapictimer" on the command line of an SMP box prevents _ONLY_ the boot
> > CPU apic timer from being used. But the secondary CPU is still
> > unconditionally setting up the APIC timer and uses the non calibrated
> > variable calibration_result, which is of course 0, to setup the APIC
> > timer. Wreckage guaranteed.
> >
>
> Is this why I get 1000 spurious interrupts/second on IRQ7 when booting
> x86_64 with "noapic"?

No, that's a different problem. The wreckage is a stuck local apic timer
interrupt.

	tglx
Re: [REGRESSION from 2.6.23-rc8]
On 09/26/2007 06:35 PM, Thomas Gleixner wrote:
> It's even worse than I thought on the first check:
>
> "noapictimer" on the command line of an SMP box prevents _ONLY_ the boot
> CPU apic timer from being used. But the secondary CPU is still
> unconditionally setting up the APIC timer and uses the non calibrated
> variable calibration_result, which is of course 0, to setup the APIC
> timer. Wreckage guaranteed.
>

Is this why I get 1000 spurious interrupts/second on IRQ7 when booting
x86_64 with "noapic"?
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
On Thursday, 27 September 2007 01:21, Thomas Gleixner wrote:
> On Thu, 2007-09-27 at 01:30 +0200, Rafael J. Wysocki wrote:
> > > > Tested for a couple of times with each kernel, the results seem to be
> > > > reproducible 100% of the time.
> > >
> > > Thanks for going through this debug marathon.
> >
> > No big deal. I'm glad that you've found what's up.
> >
> > Well, we still have the "CPU hotplug during suspend w/ the hrt patch"
> > problem to debug ... ;-)
>
> Yeah. Knowing the actual line of code where it breaks might be helpful.

Instead, I have a fix (appended, against 2.6.23-rc8-mm2). :-)

Next, I'm going to enable NO_HZ and HIGH_RES_TIMERS and see what happens. ;-)

Greetings,
Rafael

---
From: Rafael J. Wysocki <[EMAIL PROTECTED]>

Fix CPU hotplug breakage on HP nx6325 and similar boxes caused by the
reference to disable_apic_timer (labeled as __initdata) from the CPU
initialization code.

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 arch/x86_64/kernel/apic.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23-rc8-mm2/arch/x86_64/kernel/apic.c
===================================================================
--- linux-2.6.23-rc8-mm2.orig/arch/x86_64/kernel/apic.c
+++ linux-2.6.23-rc8-mm2/arch/x86_64/kernel/apic.c
@@ -42,7 +42,7 @@ int apic_verbosity;
 
 static int apic_calibrate_pmtmr __initdata;
 
-int disable_apic_timer __initdata;
+int disable_apic_timer __cpuinitdata;
 
 /* Local APIC timer works in C2? */
 int local_apic_timer_c2_ok;
Re: [REGRESSION from 2.6.23-rc8]
On 09/26/2007 06:35 PM, Thomas Gleixner wrote:
>
> It's even worse than I thought on the first check:
>
> "noapictimer" on the command line of an SMP box prevents _ONLY_ the boot
> CPU apic timer from being used. But the secondary CPU is still
> unconditionally setting up the APIC timer and uses the non calibrated
> variable calibration_result, which is of course 0, to setup the APIC
> timer. Wreckage guaranteed.
>

Well, that would explain a lot of the things I've been seeing...
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
On Thu, 2007-09-27 at 01:30 +0200, Rafael J. Wysocki wrote:
> > > Tested for a couple of times with each kernel, the results seem to be
> > > reproducible 100% of the time.
> >
> > Thanks for going through this debug marathon.
>
> No big deal. I'm glad that you've found what's up.
>
> Well, we still have the "CPU hotplug during suspend w/ the hrt patch"
> problem to debug ... ;-)

Yeah. Knowing the actual line of code where it breaks might be helpful.

	tglx
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
Thomas,

On Wednesday, 26 September 2007 23:34, Thomas Gleixner wrote:
> Rafael,
>
> On Wed, 2007-09-26 at 23:00 +0200, Rafael J. Wysocki wrote:
> > > > > First, with the "x86-64: Disable local APIC timer use on AMD
> > > > > systems with C1E" patch and my collection of suspend patches
> > > > > applied, the box doesn't boot (the suspend patches don't even
> > > > > touch the boot code, so they should be irrelevant here).
> > > > > However, it boots if patch-2.6.23-rc7-hrt1.patch (adjusted for
> > > > > 2.6.23-rc8) is applied in addition. Is this expected?
> > > >
> > > > No. That's odd. It is nothing else than adding "noapictimer" to the
> > > > kernel command line.
> > >
> > > Seems to be reproducible, though. I'll investigate further.
> >
> > So far, the results are the following:
> >
> > 1) current Linus' tree doesn't boot with any command line (regression)
> >
> > [ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0
> >
> >   x86-64: Disable local APIC timer use on AMD systems with C1E
> >
> >   It's not necessary for 2.6.23 and actually kills the box that it's
> >   supposed to fix. ]
> >
> > 2) 2.6.23-rc8 w/ the "x86-64: Disable local APIC timer use on AMD
> >    systems with C1E" patch applied behaves like the current -git
> >
> > 3) 2.6.23-rc8 w/o this patch doesn't boot with either "noapictimer" _or_
>
> OK, this explains 2) and 3). I just looked into the code and the logic
> vs. noapictimer on SMP is completely broken.
>
> On i386 the noapictimer option not only disables the local APIC timer,
> it also registers the CPUs for broadcasting via IPI on SMP systems.
>
> The x86_64 code uses the broadcast only when the local apic timer is
> active, i.e. "noapictimer" is not on the command line. This defeats the
> whole purpose of "noapictimer". It should be there to make boxen work,
> where the local APIC timer actually has a hardware problem, e.g. the
> nx6325.
>
> The current implementation of x86_64 only fixes the ACPI c-states
> related problem where the APIC timer stops in C3(2), nothing else.
>
> On nx6325 and other AMD X2 equipped systems which have the C1E enabled
> we run into the following:
>
> PIT keeps jiffies (and the system) running, but the local APIC timer
> interrupts can get out of sync due to this C1E effect.
>
> I don't think this is a critical problem, but it is wrong nevertheless.
>
> I think it's safe to revert the C1E patch and postpone the fix to the
> clock events conversion.
>
> > "apicmaintimer"
>
> on your box is not going to work. See the C1E patch. "apicmaintimer"
> switches off PIT and then waits for ever for the local APIC timer
> interrupts.
>
> > 4) 2.6.22 behaves like 2.6.23-rc8
>
> No surprise
>
> > 5) 2.6.23-rc8 with (adjusted) patch-2.6.23-rc7-hrt1.patch boots only
> >    with "noapictimer"
> >
> > 6) 2.6.23-rc8 with (adjusted) patch-2.6.23-rc7-hrt1.patch and with the
> >    "x86-64: Disable local APIC timer use on AMD systems with C1E" patch
> >    boots without any extra command line options
>
> That's consistent behaviour.
>
> > Tested for a couple of times with each kernel, the results seem to be
> > reproducible 100% of the time.
>
> Thanks for going through this debug marathon.

No big deal. I'm glad that you've found what's up.

Well, we still have the "CPU hotplug during suspend w/ the hrt patch"
problem to debug ... ;-)

Greetings,
Rafael
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
On Wed, 2007-09-26 at 15:22 -0700, Linus Torvalds wrote:
>
> On Wed, 26 Sep 2007, Thomas Gleixner wrote:
> > >
> > > 1) current Linus' tree doesn't boot with any command line (regression)
> > >
> > > [ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0
>
> Reverted.
>
> > OK, this explains 2) and 3). I just looked into the code and the logic
> > vs. noapictimer on SMP is completely broken.
>
> ..and thanks for the explanation.
>
> Thanks for finding it so quickly guys. Sounds like this will be fixed
> properly in 2.6.24 with the x86 merge (which hopefully brings in the hrt
> patch too)

It's even worse than I thought on the first check:

"noapictimer" on the command line of an SMP box prevents _ONLY_ the boot
CPU apic timer from being used. But the secondary CPU is still
unconditionally setting up the APIC timer and uses the non calibrated
variable calibration_result, which is of course 0, to setup the APIC
timer. Wreckage guaranteed.

	tglx
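The failure described in this message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the code in arch/x86_64/kernel/apic.c: `disable_apic_timer` and `calibration_result` are the names used in the thread, but every function here is hypothetical and the calibration value is made up. The guarded variant shows one possible kind of check, not necessarily the fix that was eventually merged.

```c
#include <stdbool.h>

/* Set when "noapictimer" is given on the command line (model only). */
bool disable_apic_timer;

/* Filled in by calibration on the boot CPU; remains 0 if that is skipped. */
unsigned int calibration_result;

/* Stand-in for the real calibration; the value is arbitrary. */
unsigned int fake_calibrate(void)
{
    return 62500;
}

/* Boot CPU path: honours the option, so calibration never runs. */
void boot_cpu_timer_setup(void)
{
    if (disable_apic_timer)
        return;
    calibration_result = fake_calibrate();
}

/* Secondary CPU path as described in the thread: it does not look at the
 * option and programs the timer with whatever calibration_result holds.
 * Returns the count that would be programmed. */
unsigned int secondary_cpu_timer_setup_unchecked(void)
{
    return calibration_result; /* 0 if the boot CPU skipped calibration */
}

/* Hypothetical guarded variant: refuses to program the timer when the
 * option is set or when no calibration result is available. */
bool secondary_cpu_timer_setup_guarded(unsigned int *count)
{
    if (disable_apic_timer || calibration_result == 0)
        return false;
    *count = calibration_result;
    return true;
}
```

With `disable_apic_timer` set, `boot_cpu_timer_setup()` leaves `calibration_result` at 0, so the unchecked path hands a zero count to the timer, which corresponds to the "wreckage" above; the guarded path returns false and leaves the caller's count untouched.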
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
On Wed, 26 Sep 2007, Thomas Gleixner wrote:
> >
> > 1) current Linus' tree doesn't boot with any command line (regression)
> >
> > [ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0

Reverted.

> OK, this explains 2) and 3). I just looked into the code and the logic
> vs. noapictimer on SMP is completely broken.

..and thanks for the explanation.

Thanks for finding it so quickly guys. Sounds like this will be fixed
properly in 2.6.24 with the x86 merge (which hopefully brings in the hrt
patch too)

		Linus
Re: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
Rafael,

On Wed, 2007-09-26 at 23:00 +0200, Rafael J. Wysocki wrote:
> > > > First, with the "x86-64: Disable local APIC timer use on AMD
> > > > systems with C1E" patch and my collection of suspend patches
> > > > applied, the box doesn't boot (the suspend patches don't even
> > > > touch the boot code, so they should be irrelevant here).
> > > > However, it boots if patch-2.6.23-rc7-hrt1.patch (adjusted for
> > > > 2.6.23-rc8) is applied in addition. Is this expected?
> > >
> > > No. That's odd. It is nothing else than adding "noapictimer" to the
> > > kernel command line.
> >
> > Seems to be reproducible, though. I'll investigate further.
>
> So far, the results are the following:
>
> 1) current Linus' tree doesn't boot with any command line (regression)
>
> [ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0
>
>   x86-64: Disable local APIC timer use on AMD systems with C1E
>
>   It's not necessary for 2.6.23 and actually kills the box that it's
>   supposed to fix. ]
>
> 2) 2.6.23-rc8 w/ the "x86-64: Disable local APIC timer use on AMD systems
>    with C1E" patch applied behaves like the current -git
>
> 3) 2.6.23-rc8 w/o this patch doesn't boot with either "noapictimer" _or_

OK, this explains 2) and 3). I just looked into the code and the logic
vs. noapictimer on SMP is completely broken.

On i386 the noapictimer option not only disables the local APIC timer,
it also registers the CPUs for broadcasting via IPI on SMP systems.

The x86_64 code uses the broadcast only when the local apic timer is
active, i.e. "noapictimer" is not on the command line. This defeats the
whole purpose of "noapictimer". It should be there to make boxen work,
where the local APIC timer actually has a hardware problem, e.g. the
nx6325.

The current implementation of x86_64 only fixes the ACPI c-states
related problem where the APIC timer stops in C3(2), nothing else.

On nx6325 and other AMD X2 equipped systems which have the C1E enabled
we run into the following:

PIT keeps jiffies (and the system) running, but the local APIC timer
interrupts can get out of sync due to this C1E effect.

I don't think this is a critical problem, but it is wrong nevertheless.

I think it's safe to revert the C1E patch and postpone the fix to the
clock events conversion.

> "apicmaintimer"

on your box is not going to work. See the C1E patch. "apicmaintimer"
switches off PIT and then waits for ever for the local APIC timer
interrupts.

> 4) 2.6.22 behaves like 2.6.23-rc8

No surprise

> 5) 2.6.23-rc8 with (adjusted) patch-2.6.23-rc7-hrt1.patch boots only with
>    "noapictimer"
>
> 6) 2.6.23-rc8 with (adjusted) patch-2.6.23-rc7-hrt1.patch and with the
>    "x86-64: Disable local APIC timer use on AMD systems with C1E" patch
>    boots without any extra command line options

That's consistent behaviour.

> Tested for a couple of times with each kernel, the results seem to be
> reproducible 100% of the time.

Thanks for going through this debug marathon.

	tglx
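The i386 versus x86_64 difference described in this message can be written down as a small decision table. The C below is a simplified userspace model of the behaviour as stated here, not the kernel logic itself: the struct, the function names and the `apic_stops_in_cstate` parameter are all hypothetical, and other situations in which either architecture might use broadcasting are deliberately left out.

```c
#include <stdbool.h>

/* Which tick sources a configuration ends up with (model only). */
struct timer_plan {
    bool use_local_apic_timer;
    bool broadcast_registered; /* CPUs registered for tick broadcast via IPI */
};

/* i386 as described: "noapictimer" disables the local APIC timer and,
 * on SMP, also registers the CPUs for broadcasting. */
struct timer_plan plan_i386(bool noapictimer, bool smp)
{
    struct timer_plan p;
    p.use_local_apic_timer = !noapictimer;
    p.broadcast_registered = noapictimer && smp;
    return p;
}

/* x86_64 as described: broadcast is only used while the local APIC timer
 * is active, to cover the case where it stops in a deep C-state. */
struct timer_plan plan_x86_64(bool noapictimer, bool apic_stops_in_cstate)
{
    struct timer_plan p;
    p.use_local_apic_timer = !noapictimer;
    p.broadcast_registered = !noapictimer && apic_stops_in_cstate;
    return p;
}

/* A secondary CPU needs at least one of the two sources to receive ticks. */
bool secondary_cpus_get_ticks(struct timer_plan p)
{
    return p.use_local_apic_timer || p.broadcast_registered;
}
```

In this model `plan_i386(true, true)` still delivers ticks to secondary CPUs through the broadcast, whereas `plan_x86_64(true, ...)` leaves them with neither source, which is the sense in which the option "defeats the whole purpose" on x86_64.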
[REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
On Wednesday, 26 September 2007 21:49, Rafael J. Wysocki wrote:
> On Wednesday, 26 September 2007 20:51, Thomas Gleixner wrote:
> > On Wed, 2007-09-26 at 17:25 +0200, Rafael J. Wysocki wrote:
> > > There still are some oddities.
> > >
> > > First, with the "x86-64: Disable local APIC timer use on AMD systems
> > > with C1E" patch and my collection of suspend patches applied, the box
> > > doesn't boot (the suspend patches don't even touch the boot code, so
> > > they should be irrelevant here). However, it boots if
> > > patch-2.6.23-rc7-hrt1.patch (adjusted for 2.6.23-rc8) is applied in
> > > addition. Is this expected?
> >
> > No. That's odd. It is nothing else than adding "noapictimer" to the
> > kernel command line.
>
> Seems to be reproducible, though. I'll investigate further.

So far, the results are the following:

1) current Linus' tree doesn't boot with any command line (regression)

[ Linus, please revert commit e66485d747505e9d960b864fc6c37f8b2afafaf0

  x86-64: Disable local APIC timer use on AMD systems with C1E

  It's not necessary for 2.6.23 and actually kills the box that it's
  supposed to fix. ]

2) 2.6.23-rc8 w/ the "x86-64: Disable local APIC timer use on AMD systems
   with C1E" patch applied behaves like the current -git

3) 2.6.23-rc8 w/o this patch doesn't boot with either "noapictimer" _or_
   "apicmaintimer"

4) 2.6.22 behaves like 2.6.23-rc8

5) 2.6.23-rc8 with (adjusted) patch-2.6.23-rc7-hrt1.patch boots only with
   "noapictimer"

6) 2.6.23-rc8 with (adjusted) patch-2.6.23-rc7-hrt1.patch and with the
   "x86-64: Disable local APIC timer use on AMD systems with C1E" patch
   boots without any extra command line options

Tested for a couple of times with each kernel, the results seem to be
reproducible 100% of the time.

Greetings,
Rafael