Re: [BUG] 2.6.24 refuses to boot - ATA problem?
Gene Heskett wrote:
> On Sunday 03 February 2008, Ingo Molnar wrote:
>> * Gene Heskett [EMAIL PROTECTED] wrote:
>>> I believe it's the same, but lemme paste it for sure, yes:
>>>
>>> [ 26.339926] ENABLING IO-APIC IRQs
>>> [ 26.340119] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
>>> [ 26.350129] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>>> [ 26.350182] ...trying to set up timer (IRQ0) through the 8259A ... failed.
>>> [ 26.350185] ...trying to set up timer as Virtual Wire IRQ... failed.
>>> [ 26.360186] ...trying to set up timer as ExtINT IRQ... works.
>>>
>>> The third line is the only line that makes it to the screen during
>>> the boot trace. Now, what does this tell us?
>>
>> the question would be:
>>
>>  - if you remove the acpi_use_timer_override boot flag
>>  - and if you boot a kernel with this hack applied
>>
>>  => do those weird PATA failures come back?
>>
>> If the failures do _not_ come back then the problem is somehow
>> affected/worked-around by the IO-APIC code that generates the above
>> 4 lines. If the failures are still the same then the above 4 lines
>> are really just an uninteresting side-effect of the
>> acpi_use_timer_override flag - and the real side-effects (that fix
>> PATA on your box) are to be found elsewhere. Sadly, the latter
>> variant is the expected answer.
>>
>>     Ingo
>
> And at this point, I can't tell. This reboot was from a cold start,
> without the argument, and cold for long enough to make the rounds about
> the house and pick up a beer, but not take my evening pillbox. A minute
> cold, maybe 2 max.
>
> The log is clean since except for a kudzu nag of some sort:
> ..

Just to muddy your observations: it is quite possible that a cold
(power-off) reboot may be required to properly observe what happens here.

Cheers
-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG] 2.6.24 refuses to boot - ATA problem?
On Monday 04 February 2008, Mark Lord wrote:
> Gene Heskett wrote:
>> On Sunday 03 February 2008, Ingo Molnar wrote:
>>> * Gene Heskett [EMAIL PROTECTED] wrote:
>>>> I believe it's the same, but lemme paste it for sure, yes:
>>>>
>>>> [ 26.339926] ENABLING IO-APIC IRQs
>>>> [ 26.340119] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
>>>> [ 26.350129] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>>>> [ 26.350182] ...trying to set up timer (IRQ0) through the 8259A ... failed.
>>>> [ 26.350185] ...trying to set up timer as Virtual Wire IRQ... failed.
>>>> [ 26.360186] ...trying to set up timer as ExtINT IRQ... works.
>>>>
>>>> The third line is the only line that makes it to the screen during
>>>> the boot trace. Now, what does this tell us?
>>>
>>> the question would be:
>>>
>>>  - if you remove the acpi_use_timer_override boot flag
>>>  - and if you boot a kernel with this hack applied
>>>
>>>  => do those weird PATA failures come back?
>>>
>>> If the failures do _not_ come back then the problem is somehow
>>> affected/worked-around by the IO-APIC code that generates the above
>>> 4 lines. If the failures are still the same then the above 4 lines
>>> are really just an uninteresting side-effect of the
>>> acpi_use_timer_override flag - and the real side-effects (that fix
>>> PATA on your box) are to be found elsewhere. Sadly, the latter
>>> variant is the expected answer.
>>>
>>>     Ingo
>>
>> And at this point, I can't tell. This reboot was from a cold start,
>> without the argument, and cold for long enough to make the rounds about
>> the house and pick up a beer, but not take my evening pillbox. A minute
>> cold, maybe 2 max.
>>
>> The log is clean since except for a kudzu nag of some sort:
>> ..
>
> Just to muddy your observations: it is quite possible that a cold
> (power-off) reboot may be required to properly observe what happens here.

Precisely why I've now done that twice, without using the extra argument.
No recurrence dammit.

> Cheers

--
Cheers, Gene
There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)
He who makes a beast of himself gets rid of the pain of being a man.
    -- Dr. Johnson
Re: [BUG] 2.6.24 refuses to boot - ATA problem?
Chris Rankin wrote:
> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata1.00: status: { DRDY }
> ata1: soft resetting link
> ata1.00: configured for UDMA/66
> ata1: EH complete
> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata1.00: status: { DRDY }
> ata1: soft resetting link

Had at least one other report like this... Sleepiness prevents me from
recalling more at the moment, but I think the other report was fixed with
a special ACPI switch...

/me puts in pile for Monday...

    Jeff
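For anyone who wants to put a number on how often this error fires in a saved kernel log, here is a minimal sketch in C. It is not part of libata or any kernel tool: the helper names (`count_occurrences`, `count_ata_timeouts`) are made up for illustration, and the match string is simply copied from the `res ... Emask 0x4 (timeout)` line in the report above. Other libata failures print different text and would need their own pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of `needle` in `log`.
 * Hypothetical helper, not taken from any kernel or libata source. */
static size_t count_occurrences(const char *log, const char *needle)
{
    assert(log != NULL && needle != NULL);

    size_t len = strlen(needle);
    size_t n = 0;

    if (len == 0)
        return 0; /* an empty pattern would never advance */

    for (const char *p = strstr(log, needle); p != NULL;
         p = strstr(p + len, needle))
        n++;

    return n;
}

/* Count command timeouts, using the text of the "res" line quoted in
 * the report. The "exception Emask 0x0" header line does not match. */
static size_t count_ata_timeouts(const char *log)
{
    return count_occurrences(log, "Emask 0x4 (timeout)");
}
```

Called on the excerpt in the report, `count_ata_timeouts()` would count the two `res` lines and ignore the two `exception Emask 0x0` headers, so a growing count over an afternoon's log is a rough proxy for "many times an hour".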
Re: [BUG] 2.6.24 refuses to boot - ATA problem?
On Saturday 02 February 2008, Jeff Garzik wrote:
> Chris Rankin wrote:
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
>>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata1.00: status: { DRDY }
>> ata1: soft resetting link
>> ata1.00: configured for UDMA/66
>> ata1: EH complete
>> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
>>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata1.00: status: { DRDY }
>> ata1: soft resetting link
>
> Had at least one other report like this... Sleepiness prevents me from
> recalling more at the moment, but I think the other report was fixed with
> a special ACPI switch...

I think that one came from me, but it also gets over 14,000 hits on Google.

Now Jeff, here is the strange part. That error was killing me, many times
an hour and eventually crashing completely, repeatedly. I applied that
kernel argument acpi_use_timer_override once and have not had the error
since, and that includes one test of a full let-it-cool-for-a-minute
powerdown reboot to see if it would come back, which it did not.

That argument causes the kernel to log this as its responding to that
command:

[ 27.097095] ENABLING IO-APIC IRQs
[ 27.097287] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 27.107291] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 27.107343] ...trying to set up timer (IRQ0) through the 8259A ... failed.
[ 27.107346] ...trying to set up timer as Virtual Wire IRQ... failed.
[ 27.117353] ...trying to set up timer as ExtINT IRQ... works.

The last 4 lines above are not logged without that argument.

So my theory ATM is that this forced the kernel to initialize something in
the board's registers that it does not initialize without that command, and
that its going fubar as shown in the msg quoted above is a totally random
thing, perhaps dependent on the phase of one of Jupiter's moons as to what
state it powers up in. And I got lucky, so far in that my single powerdown
reset didn't trigger it again... And you _know_ what that knocking sound is
by now. :)

That's my admittedly hardware oriented view of the goings on. But I also
think it should be a good clue as to what piece of the ACPI code needs
walked around in and its tires kicked again, with an eye toward making that
item a wee bit more intelligently done. If you can cobble up something that
will extract the data and prove what fails, I'll be glad to play guinea
pig. With ccache, a kernel build is 15 minutes to actually running it.

My $0.02 in 1934 dollars. Adjust for inflation since.

> /me puts in pile for Monday...
>
>     Jeff

Thanks Jeff. I'm glad to see that this isn't scheduled to 'fall through the
cracks' as does happen when folks get busy.

--
Cheers, Gene
There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)
What!? Me worry?
    -- Alfred E. Newman
Re: [BUG] 2.6.24 refuses to boot - ATA problem?
* Gene Heskett [EMAIL PROTECTED] wrote:

> I think that one came from me, but it also gets over 14,000 hits on
> Google.
>
> Now Jeff, here is the strange part. That error was killing me, many times
> an hour and eventually crashing completely, repeatedly. I applied that
> kernel argument acpi_use_timer_override once and have not had the error
> since, and that includes one test of a full let-it-cool-for-a-minute
> powerdown reboot to see if it would come back, which it did not.
>
> That argument causes the kernel to log this as its responding to that
> command:
>
> [ 27.097095] ENABLING IO-APIC IRQs
> [ 27.097287] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> [ 27.107291] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> [ 27.107343] ...trying to set up timer (IRQ0) through the 8259A ... failed.
> [ 27.107346] ...trying to set up timer as Virtual Wire IRQ... failed.
> [ 27.117353] ...trying to set up timer as ExtINT IRQ... works.
>
> The last 4 lines above are not logged without that argument.
>
> So my theory ATM is that this forced the kernel to initialize something
> in the board's registers that it does not initialize without that
> command, and that its going fubar as shown in the msg quoted above is a
> totally random thing, perhaps dependent on the phase of one of Jupiter's
> moons as to what state it powers up in. And I got lucky, so far in that
> my single powerdown reset didn't trigger it again... And you _know_ what
> that knocking sound is by now. :)

that's weird. Could you try the hack below and _remove_ the
acpi_use_timer_override flag? The change should artificially cause the
above 4 lines to appear again, in all cases.

This would test the following aspects of your theory: is this unknown
side-effect of the acpi_use_timer_override flag related to the timer
setup sequence in io_apic_32.c? If not, then the difference most likely
lies in the different ACPI setup sequence.

    Ingo

---
 arch/x86/kernel/io_apic_32.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/arch/x86/kernel/io_apic_32.c
===================================================================
--- linux.orig/arch/x86/kernel/io_apic_32.c
+++ linux/arch/x86/kernel/io_apic_32.c
@@ -2208,7 +2208,7 @@ static inline void __init check_timer(vo
          * Ok, does IRQ0 through the IOAPIC work?
          */
         unmask_IO_APIC_irq(0);
-        if (timer_irq_works()) {
+        if (timer_irq_works() && 0) {
                 if (nmi_watchdog == NMI_IO_APIC) {
                         disable_8259A_irq(0);
                         setup_nmi();
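The one-line hack works through C's short-circuit `&&`: the left operand is still evaluated, so the timer probe runs as before, but the overall condition is always false and the code falls through to the fallback attempts that print the MP-BIOS lines. Below is a small user-space sketch of that control flow, not kernel code. `fake_timer_irq_works`, `pick_route`, `route_demo` and the `probe_calls` counter are hypothetical names invented for the illustration; the printed string is copied from the log quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Which way the sketch decided to route the timer interrupt. */
enum timer_route { ROUTE_IOAPIC, ROUTE_FALLBACK };

static int probe_calls; /* how many times the probe actually ran */

/* Hypothetical stand-in for the kernel's timer_irq_works(); the caller
 * chooses the answer so both outcomes can be exercised. */
static bool fake_timer_irq_works(bool hw_ok)
{
    probe_calls++;
    return hw_ok;
}

/* hack_applied == true models "timer_irq_works() && 0": the probe still
 * runs, but its result is discarded and the fallback path is taken. */
static enum timer_route pick_route(bool hw_ok, bool hack_applied)
{
    if (fake_timer_irq_works(hw_ok) && !hack_applied)
        return ROUTE_IOAPIC;

    /* Same text as the log line quoted in this thread. */
    puts("..MP-BIOS bug: 8254 timer not connected to IO-APIC");
    return ROUTE_FALLBACK;
}

/* Usage example: a working timer takes the IO-APIC route unless the
 * hack is applied, and the probe is called either way. */
static void route_demo(void)
{
    int before = probe_calls;

    assert(pick_route(true, false) == ROUTE_IOAPIC);
    assert(pick_route(true, true) == ROUTE_FALLBACK);
    assert(probe_calls == before + 2);
}
```

The point of the sketch is only that the hack changes which branch is taken, not whether the probe runs, which is why it can reproduce the log lines without the boot flag.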
Re: [BUG] 2.6.24 refuses to boot - ATA problem?
* Ingo Molnar [EMAIL PROTECTED] wrote:

>> [ 27.097095] ENABLING IO-APIC IRQs
>> [ 27.097287] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
>> [ 27.107291] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>> [ 27.107343] ...trying to set up timer (IRQ0) through the 8259A ... failed.
>> [ 27.107346] ...trying to set up timer as Virtual Wire IRQ... failed.
>> [ 27.117353] ...trying to set up timer as ExtINT IRQ... works.
>>
>> The last 4 lines above are not logged without that argument.
>>
>> So my theory ATM is that this forced the kernel to initialize something
>> in the board's registers that it does not initialize without that
>> command, and that its going fubar as shown in the msg quoted above is a
>> totally random thing, perhaps dependent on the phase of one of Jupiter's
>> moons as to what state it powers up in. And I got lucky, so far in that
>> my single powerdown reset didn't trigger it again... And you _know_ what
>> that knocking sound is by now. :)
>
> that's weird. Could you try the hack below and _remove_ the
> acpi_use_timer_override flag? The change should artificially cause the
> above 4 lines to appear again, in all cases.
>
> This would test the following aspects of your theory: is this unknown
> side-effect of the acpi_use_timer_override flag related to the timer
> setup sequence in io_apic_32.c? If not, then the difference most likely
> lies in the different ACPI setup sequence.

i tried that patch on a box here, and it produces similar 4 lines:

[    0.172141] ENABLING IO-APIC IRQs
[    0.175498] init IO_APIC IRQs
[    0.176059]  IO-APIC (apicid-pin) 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
[    0.187942] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
[    0.233859] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[    0.236014] ...trying to set up timer (IRQ0) through the 8259A ... failed.
[    0.236014] ...trying to set up timer as Virtual Wire IRQ... failed.
[    0.236014] ...trying to set up timer as ExtINT IRQ... works.
[    0.277879] Using local APIC timer interrupts.

but ... in all likelihood it's some ACPI side-effects of the
acpi_use_timer_override flag, not really this IO-APIC/timer-setup detail
that matters.

    Ingo
Re: [BUG] 2.6.24 refuses to boot - ATA problem?
On Saturday 02 February 2008, Ingo Molnar wrote:
> * Gene Heskett [EMAIL PROTECTED] wrote:
>> I think that one came from me, but it also gets over 14,000 hits on
>> Google.
>>
>> Now Jeff, here is the strange part. That error was killing me, many
>> times an hour and eventually crashing completely, repeatedly. I applied
>> that kernel argument acpi_use_timer_override once and have not had the
>> error since, and that includes one test of a full
>> let-it-cool-for-a-minute powerdown reboot to see if it would come back,
>> which it did not.
>>
>> That argument causes the kernel to log this as its responding to that
>> command:
>>
>> [ 27.097095] ENABLING IO-APIC IRQs
>> [ 27.097287] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
>> [ 27.107291] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>> [ 27.107343] ...trying to set up timer (IRQ0) through the 8259A ... failed.
>> [ 27.107346] ...trying to set up timer as Virtual Wire IRQ... failed.
>> [ 27.117353] ...trying to set up timer as ExtINT IRQ... works.
>>
>> The last 4 lines above are not logged without that argument.
>>
>> So my theory ATM is that this forced the kernel to initialize something
>> in the board's registers that it does not initialize without that
>> command, and that its going fubar as shown in the msg quoted above is a
>> totally random thing, perhaps dependent on the phase of one of Jupiter's
>> moons as to what state it powers up in. And I got lucky, so far in that
>> my single powerdown reset didn't trigger it again... And you _know_ what
>> that knocking sound is by now. :)
>
> that's weird. Could you try the hack below and _remove_ the
> acpi_use_timer_override flag? The change should artificially cause the
> above 4 lines to appear again, in all cases.
>
> This would test the following aspects of your theory: is this unknown
> side-effect of the acpi_use_timer_override flag related to the timer
> setup sequence in io_apic_32.c? If not, then the difference most likely
> lies in the different ACPI setup sequence.
>
>     Ingo
>
> ---
>  arch/x86/kernel/io_apic_32.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux/arch/x86/kernel/io_apic_32.c
> ===================================================================
> --- linux.orig/arch/x86/kernel/io_apic_32.c
> +++ linux/arch/x86/kernel/io_apic_32.c
> @@ -2208,7 +2208,7 @@ static inline void __init check_timer(vo
>           * Ok, does IRQ0 through the IOAPIC work?
>           */
>          unmask_IO_APIC_irq(0);
> -        if (timer_irq_works()) {
> +        if (timer_irq_works() && 0) {
>                  if (nmi_watchdog == NMI_IO_APIC) {
>                          disable_8259A_irq(0);
>                          setup_nmi();

I believe it's the same, but lemme paste it for sure, yes:

[ 26.339926] ENABLING IO-APIC IRQs
[ 26.340119] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
[ 26.350129] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 26.350182] ...trying to set up timer (IRQ0) through the 8259A ... failed.
[ 26.350185] ...trying to set up timer as Virtual Wire IRQ... failed.
[ 26.360186] ...trying to set up timer as ExtINT IRQ... works.

The third line is the only line that makes it to the screen during the boot
trace. Now, what does this tell us?

--
Cheers, Gene
There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)
As far as the laws of mathematics refer to reality, they are not
certain, and as far as they are certain, they do not refer to reality.
    -- Albert Einstein
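The two `..TIMER:` lines pasted in this thread are close but not identical: the boot with acpi_use_timer_override reports pin1=2, while the boot with the hack and no flag reports pin1=0. A minimal C sketch for pulling those fields out of a log line, so two boots can be compared field by field rather than by eye, follows. The struct and function names (`timer_line`, `parse_timer_line`) are hypothetical and the format string is inferred only from the lines quoted here; whether the pin1 difference matters is not something this thread settles.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of a "..TIMER:" boot log line as quoted in this thread. */
struct timer_line {
    unsigned int vector;
    int apic1, pin1, apic2, pin2;
};

/* Parse one log line, skipping any "[ timestamp]" prefix.
 * Returns 0 on success, -1 if the line is not a complete TIMER line.
 * Hypothetical helper; the layout is assumed from the quoted logs. */
static int parse_timer_line(const char *line, struct timer_line *out)
{
    assert(line != NULL && out != NULL);

    const char *p = strstr(line, "..TIMER:");
    if (p == NULL)
        return -1;

    /* %x accepts the "0x" prefix; %d accepts the negative values. */
    int n = sscanf(p, "..TIMER: vector=%x apic1=%d pin1=%d apic2=%d pin2=%d",
                   &out->vector, &out->apic1, &out->pin1,
                   &out->apic2, &out->pin2);

    return n == 5 ? 0 : -1;
}
```

Parsing the line from the flagged boot and the line from the hacked boot and comparing the structs shows every field equal except `pin1`.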
Re: [BUG] 2.6.24 refuses to boot - ATA problem?
* Gene Heskett [EMAIL PROTECTED] wrote:

> I believe it's the same, but lemme paste it for sure, yes:
>
> [ 26.339926] ENABLING IO-APIC IRQs
> [ 26.340119] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
> [ 26.350129] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> [ 26.350182] ...trying to set up timer (IRQ0) through the 8259A ... failed.
> [ 26.350185] ...trying to set up timer as Virtual Wire IRQ... failed.
> [ 26.360186] ...trying to set up timer as ExtINT IRQ... works.
>
> The third line is the only line that makes it to the screen during the
> boot trace. Now, what does this tell us?

the question would be:

 - if you remove the acpi_use_timer_override boot flag
 - and if you boot a kernel with this hack applied

 => do those weird PATA failures come back?

If the failures do _not_ come back then the problem is somehow
affected/worked-around by the IO-APIC code that generates the above 4
lines.

If the failures are still the same then the above 4 lines are really just
an uninteresting side-effect of the acpi_use_timer_override flag - and the
real side-effects (that fix PATA on your box) are to be found elsewhere.

Sadly, the latter variant is the expected answer.

    Ingo
Re: [BUG] 2.6.24 refuses to boot - ATA problem?
On Sunday 03 February 2008, Ingo Molnar wrote:
> * Gene Heskett [EMAIL PROTECTED] wrote:
>> I believe it's the same, but lemme paste it for sure, yes:
>>
>> [ 26.339926] ENABLING IO-APIC IRQs
>> [ 26.340119] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
>> [ 26.350129] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>> [ 26.350182] ...trying to set up timer (IRQ0) through the 8259A ... failed.
>> [ 26.350185] ...trying to set up timer as Virtual Wire IRQ... failed.
>> [ 26.360186] ...trying to set up timer as ExtINT IRQ... works.
>>
>> The third line is the only line that makes it to the screen during the
>> boot trace. Now, what does this tell us?
>
> the question would be:
>
>  - if you remove the acpi_use_timer_override boot flag
>  - and if you boot a kernel with this hack applied
>
>  => do those weird PATA failures come back?
>
> If the failures do _not_ come back then the problem is somehow
> affected/worked-around by the IO-APIC code that generates the above 4
> lines.
>
> If the failures are still the same then the above 4 lines are really just
> an uninteresting side-effect of the acpi_use_timer_override flag - and
> the real side-effects (that fix PATA on your box) are to be found
> elsewhere.
>
> Sadly, the latter variant is the expected answer.
>
>     Ingo

And at this point, I can't tell. This reboot was from a cold start, without
the argument, and cold for long enough to make the rounds about the house
and pick up a beer, but not take my evening pillbox. A minute cold, maybe 2
max.

The log is clean since except for a kudzu nag of some sort:

[ 50.535388] warning: process `kudzu' used the deprecated sysctl system call with 1.23.

which isn't your problem, but Fedora's.

As I said before, that error has not returned since the first time I used
that argument, and I have booted several times now without it. Uptime now
is just over an hour though, so I'm not taking bets just yet. :)

--
Cheers, Gene
There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order.
-Ed Howdershelt (Author)
Now I lay me down to sleep
I pray the double lock will keep;
May no brick through the window break,
And, no one rob me till I awake.