Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Wed, Jan 25, 2006 at 08:04:07AM -0700, Scott Long wrote:
> Either that, or the read imposes enough delay to let whatever was
> happening during the DELAY call work. I find it hard to believe that
> uncached writes would get delayed like this. I've lost the original
> posting on this, could you provide the dmesg and computer make/model
> again?

It's a Toshiba Satellite L25-S1192. The chipset is ATI Radeon Xpress
200M (RS480).

Verbose dmesgs are up at http://www.gank.org/freebsd/l25

acpi+apic.txt is a 6.0-RELEASE GENERIC kernel (before I upgraded the
memory, but the APIC thing is independent of that).

apic2.txt is a verbose dmesg with my current kernel (stock 6.0-STABLE +
read-after-write change to local_apic.c).

Craig

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Weird PCI interrupt delivery problem (resolution, sort of)
John Baldwin wrote:
> On Tuesday 24 January 2006 19:34, Craig Boston wrote:
> > On Tue, Jan 24, 2006 at 10:43:49AM -0500, John Baldwin wrote:
> > > What if you do a read of the lapic before the write? Maybe doing
> > > 'x = lapic->eoi; lapic->eoi = 0;'?
> >
> > Reading the lapic before the write has no effect.
> >
> > Reading the lapic after the write makes it work.
>
> Hmm, perhaps the read forces the write to post? Scott?

Either that, or the read imposes enough delay to let whatever was
happening during the DELAY call work. I find it hard to believe that
uncached writes would get delayed like this. I've lost the original
posting on this, could you provide the dmesg and computer make/model
again?

Scott
Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Tuesday 24 January 2006 19:34, Craig Boston wrote:
> On Tue, Jan 24, 2006 at 10:43:49AM -0500, John Baldwin wrote:
> > What if you do a read of the lapic before the write? Maybe doing
> > 'x = lapic->eoi; lapic->eoi = 0;'?
>
> Reading the lapic before the write has no effect.
>
> Reading the lapic after the write makes it work.

Hmm, perhaps the read forces the write to post? Scott?

--
John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Tue, Jan 24, 2006 at 10:43:49AM -0500, John Baldwin wrote:
> What if you do a read of the lapic before the write? Maybe doing
> 'x = lapic->eoi; lapic->eoi = 0;'?

Reading the lapic before the write has no effect.

Reading the lapic after the write makes it work.

Craig
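The read-after-write workaround reported above could look roughly like
the following. This is a minimal userland sketch, not the actual change
to local_apic.c: the thread does not say which register was read back,
so the `id` field, the `fake_lapic` structure, and the function names
are assumptions made for illustration. The register block here is
ordinary RAM so that the sketch compiles and runs outside a kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the memory-mapped local APIC registers.
 * In a kernel this would be uncached device memory; here it is plain
 * RAM so the sketch can run in userland.
 */
struct fake_lapic {
    volatile uint32_t eoi;  /* end-of-interrupt register, written with 0 */
    volatile uint32_t id;   /* assumed: some harmless register to read */
};

static struct fake_lapic fake_regs = { 0xffffffffu, 0x01000000u };
static struct fake_lapic *lapic = &fake_regs;

/*
 * Write EOI, then read a register on the same device. Both accesses are
 * volatile, so the compiler must emit them, and in program order.
 */
uint32_t
lapic_eoi_readback(void)
{
    lapic->eoi = 0;
    return (lapic->id);
}

/* Small self-check that exercises the sketch. */
void
lapic_eoi_selfcheck(void)
{
    fake_regs.eoi = 0xffffffffu;
    (void)lapic_eoi_readback();
    assert(fake_regs.eoi == 0);
}
```

Whether the read helps because it flushes a posted write or simply
because it adds delay is exactly the open question in this thread; the
sketch only shows the ordering of the two accesses.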
Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Monday 23 January 2006 21:25, Craig Boston wrote:
> On Fri, Jan 20, 2006 at 03:42:21PM -0500, John Baldwin wrote:
> > On Thu, Jan 19, 2006 at 10:17:39PM -0700, Scott Long wrote:
> > > This points to a bus coherency problem. I wonder if your BIOS is
> > > incorrectly setting the memory region of the apics as cachable.
> > > You'll want to bug Baldwin about this.
> >
> > Hmm, well, you can actually try the PAT patch if you are feeling
> > brave as it maps all devices (including APICs) as uncacheable.
>
> Tried the updated PAT patch (with s/pmap_unmapbios/pmap_unmap_bios/ to
> get ACPI to compile). Unfortunately if it is a caching problem, PAT
> isn't able to fix it. Same result as stock kernel -- interrupts stop
> arriving after a dozen or so. AFAICT the local APIC is the only
> memory-mapped I/O region that seems to be problematic.

Ok.

> Instead of writing the value twice, I also tried inserting an
> __asm("nop") before the write with no effect. Also, a single write to
> an unrelated area doesn't help:
>
> +static volatile int dummyeoi;
> +
>  lapic_eoi(void)
>  {
>
> +        dummyeoi = 1;
>          lapic->eoi = 0;
> +        dummyeoi = 2;
>  }
>
> I'm _reasonably_ certain that marking dummyeoi volatile and leaving it
> uninitialized will prevent gcc from optimizing that out. Forcing R/W
> cycles (++dummyeoi) before and after doesn't work either.
>
> A DELAY(1) before the lapic->eoi write does the trick, but DELAY does
> lots of complicated things so I don't know how useful of a data point
> that is.
>
> I'm probably missing something, but if bad cache behavior was causing
> writes to the lapic EOI register to not always take effect, wouldn't
> the _next_ irq (even if it's a different line) cause the one that's
> currently pending to be acknowledged?

What if you do a read of the lapic before the write? Maybe doing
'x = lapic->eoi; lapic->eoi = 0;'?

--
John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Fri, Jan 20, 2006 at 03:42:21PM -0500, John Baldwin wrote:
> On Thu, Jan 19, 2006 at 10:17:39PM -0700, Scott Long wrote:
> > This points to a bus coherency problem. I wonder if your BIOS is
> > incorrectly setting the memory region of the apics as cachable.
> > You'll want to bug Baldwin about this.
>
> Hmm, well, you can actually try the PAT patch if you are feeling brave
> as it maps all devices (including APICs) as uncacheable.

Tried the updated PAT patch (with s/pmap_unmapbios/pmap_unmap_bios/ to
get ACPI to compile). Unfortunately if it is a caching problem, PAT
isn't able to fix it. Same result as stock kernel -- interrupts stop
arriving after a dozen or so. AFAICT the local APIC is the only
memory-mapped I/O region that seems to be problematic.

Instead of writing the value twice, I also tried inserting an
__asm("nop") before the write with no effect. Also, a single write to
an unrelated area doesn't help:

    +static volatile int dummyeoi;
    +
     lapic_eoi(void)
     {

    +        dummyeoi = 1;
             lapic->eoi = 0;
    +        dummyeoi = 2;
     }

I'm _reasonably_ certain that marking dummyeoi volatile and leaving it
uninitialized will prevent gcc from optimizing that out. Forcing R/W
cycles (++dummyeoi) before and after doesn't work either.

A DELAY(1) before the lapic->eoi write does the trick, but DELAY does
lots of complicated things so I don't know how useful of a data point
that is.

I'm probably missing something, but if bad cache behavior was causing
writes to the lapic EOI register to not always take effect, wouldn't the
_next_ irq (even if it's a different line) cause the one that's
currently pending to be acknowledged?

Craig
Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Friday 20 January 2006 16:26, Craig Boston wrote:
> On Fri, Jan 20, 2006 at 03:42:21PM -0500, John Baldwin wrote:
> > Hmm, well, you can actually try the PAT patch if you are feeling
> > brave as it maps all devices (including APICs) as uncacheable.
>
> Heh, took me a minute to find. I first found the one at
> http://people.freebsd.org/~jhb/patches/pat.patch
> but it maps devices as write-back. I'm guessing you mean to use the
> version in perforce?

Yeah, I need to generate an updated patch.

--
John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Fri, Jan 20, 2006 at 03:42:21PM -0500, John Baldwin wrote:
> Hmm, well, you can actually try the PAT patch if you are feeling brave
> as it maps all devices (including APICs) as uncacheable.

Heh, took me a minute to find. I first found the one at
http://people.freebsd.org/~jhb/patches/pat.patch
but it maps devices as write-back. I'm guessing you mean to use the
version in perforce?

I'll give it a try tonight. Could hardly make things worse -- I just
noticed that X now randomly locks up hard, ever since I bumped up the
memory from 256MB to 2GB -- though text mode still works fine. (Yes, I
tried reverting all my local patches and testing the memory.)

Craig
Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Friday 20 January 2006 10:27, Craig Boston wrote:
> On Thu, Jan 19, 2006 at 10:17:39PM -0700, Scott Long wrote:
> > This points to a bus coherency problem. I wonder if your BIOS is
> > incorrectly setting the memory region of the apics as cachable.
> > You'll want to bug Baldwin about this.
>
> I CC-ed him on my post since he was working with me on the problem
> before. For some reason the Cc: header got wiped out when it went to
> the list (but I checked my server logs and it did deliver a copy of
> the message to him).

Hmm, well, you can actually try the PAT patch if you are feeling brave
as it maps all devices (including APICs) as uncacheable.

--
John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Re: Weird PCI interrupt delivery problem (resolution, sort of)
On Thu, Jan 19, 2006 at 10:17:39PM -0700, Scott Long wrote:
> This points to a bus coherency problem. I wonder if your BIOS is
> incorrectly setting the memory region of the apics as cachable. You'll
> want to bug Baldwin about this.

I CC-ed him on my post since he was working with me on the problem
before. For some reason the Cc: header got wiped out when it went to
the list (but I checked my server logs and it did deliver a copy of the
message to him).

Craig
Re: Weird PCI interrupt delivery problem (resolution, sort of)
Craig Boston wrote:
> After trying everything I could think of to do to the I/O APIC code
> and coming up empty, tonight I went back to the local APIC. I had
> previously ruled it out since the lapic timer interrupt continued to
> work fine even when the others stopped. However, adding some DELAY(1)
> calls at key points caused it to work, much like adding WITNESS does.
>
> I managed to get it down to a single change that makes APIC mode work
> on this laptop:
>
> --- local_apic.c.orig   Thu Jan 19 18:32:37 2006
> +++ local_apic.c        Thu Jan 19 18:32:28 2006
> @@ -599,4 +599,5 @@
>  lapic_eoi(void)
>  {
>          lapic->eoi = 0;
> +        lapic->eoi = 0;
>  }
>
> ...and welcome to bizarro world. There's absolutely no reason I can
> think of why that would change anything, other than buggy hardware.
>
> I looked at what Linux was doing, and they're also using a single
> write to EOI interrupts, so long as the X86_GOOD_APIC config option is
> enabled (and it is for P5/MMX or newer). Otherwise it does an extra
> read before writing to any APIC register. I don't know if Linux works
> on this hardware or not -- the live CD I tried wasn't compiled for
> APIC support.
>
> At this point, since AFAIK nobody else has reported the same problem,
> I'm content with a local workaround. It's just... weird.
>
> Craig

This points to a bus coherency problem. I wonder if your BIOS is
incorrectly setting the memory region of the apics as cachable. You'll
want to bug Baldwin about this.

Scott
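The Linux behaviour Craig describes (a single write when X86_GOOD_APIC
is enabled, an extra read before each APIC register write otherwise)
can be sketched as a compile-time switch. This is not Linux's actual
code: `GOOD_APIC`, `reg_read`, `reg_write_around`, and the `fake_eoi`
variable are hypothetical names chosen for illustration, and the
register is plain RAM so the sketch runs in userland.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical knob, loosely modelled on the X86_GOOD_APIC idea. */
#ifndef GOOD_APIC
#define GOOD_APIC 0
#endif

/* Stand-in for the EOI register; a real one would be device memory. */
static volatile uint32_t fake_eoi = 0xffffffffu;

/* Counts reads so the effect of the switch can be observed. */
static unsigned read_count;

static uint32_t
reg_read(volatile uint32_t *reg)
{
    read_count++;
    return (*reg);
}

/* Write a register, preceded by a dummy read unless GOOD_APIC is set. */
static void
reg_write_around(volatile uint32_t *reg, uint32_t val)
{
#if !GOOD_APIC
    (void)reg_read(reg);    /* dummy read before the write */
#endif
    *reg = val;
}

/* Acknowledge one interrupt and report how many dummy reads it cost. */
unsigned
eoi_reads_for_one_ack(void)
{
    read_count = 0;
    reg_write_around(&fake_eoi, 0);
    assert(fake_eoi == 0);
    return (read_count);
}
```

Note that this is a read before the write, which Craig later reports
has no effect on this laptop; only a read after the write helped.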