On 21.07.2010, at 14:36, Andriy Gapon wrote:

> on 21/07/2010 15:25 Markus Gebert said the following:
>> On 21.07.2010, at 10:33, Andriy Gapon wrote:
>>
>>> on 21/07/2010 03:57 Markus Gebert said the following:
>>>> Another thing though: Today I compared verbose boot output from 8-stable
>>>> and the current box. I saw that the ioapic sets up IRQ routing differently
>>>> on these two systems although the hardware is the same. This seemed not so
>>>> interesting at first, but then I noticed that 8-stable sets up two routes
>>>> (to lapic0 and lapic2, or sometimes lapic3) for IRQ58 (mpt0), while current
>>>> only uses one route (to lapic0).
>>> My understanding is that it's not "two routes", but re-routing. During early
>>> boot all interrupts are bound to BSP; later, when APs become online, the
>>> interrupts are re-distributed among available CPUs.
>>
>> I guess you're right, misinterpretation on my side. Thanks for clarifying
>> this.
>>
>>
>> Now being aware of this, it seems to me that in the machdep.lapic_allclocks=0
>> case, there might just be more interrupts to be assigned/routed due to "more
>> clocks being used". If that's true, maybe it's just "luck" that in this case
>> the mpt interrupt gets assigned to lapic0/cpu0 and the box runs fine. I'm
>> just guessing though, since I have no clue how interrupts are assigned to
>> lapics exactly (round-robin? some logic?).
>
> Yes, round-robin, for interrupts that are not explicitly bound to specific
> CPUs. The process is deterministic, but hard to predict indeed.

I see.

>>>> I used 'cpuset -c -l 0 -x 58' in an attempt to make my 8-stable box behave
>>>> like the one running current. Indeed, this seems to have changed IRQ58 to
>>>> be routed to lapic0 only. And the box was running for hours without showing
>>>> the symptoms.
>>>>
>>>> I just checked boot verbose output of my 8-stable box again (booted with
>>>> machdep.lapic_allclocks=0 as mentioned above). And now it seems to have set
>>>> up IRQ routes just like the current box (one route for IRQ58 to lapic0).
>>> Not sure how to interpret this properly. One possibility is a hardware
>>> problem where the interrupt message route between ioapic2 and the CPU to
>>> which lapic3 belongs is flaky. Perhaps this might be a FreeBSD problem: it
>>> could be that the system somehow tells us not to set up such routes, but we
>>> don't listen. But this is far-fetched.
>>
>>
>> I'm not sure either. If my "theory" above proved to be true, it would have
>> been just luck that 6.x and 7.x (and current) run just fine on the X4100M2. A
>> (short) test on Ubuntu didn't trigger the problem, so the Linux kernel is
>> either lucky too by selecting an interrupt route that is "not flaky", or
>> there's indeed some way to figure out not to use some lapics for some
>> interrupts. Or we didn't test Linux thoroughly enough.
>
> Yep, it would be interesting to see how interrupts were distributed among
> CPUs on that Linux.

Well, I can't provide this kind of information about _that_ Ubuntu Linux right
now, because it was wiped from the second test machine to test current.
But we have a few production X4100M2 machines running Debian, and there it
looks like this:

----
# uname -a
Linux XX 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010 x86_64 GNU/Linux
# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
   0:         36          0          0          1   IO-APIC-edge     timer
   1:          0          0          0          2   IO-APIC-edge     i8042
   7:          1          0          0          0   IO-APIC-edge
   8:          0          0          0          1   IO-APIC-edge     rtc0
   9:          0          0          0          0   IO-APIC-fasteoi  acpi
  12:          0          0          0          4   IO-APIC-edge     i8042
  14:          0          0          0         74   IO-APIC-edge     ide0
  21:          0          0          0          2   IO-APIC-fasteoi  ehci_hcd:usb2
  22:          0          0          1         31   IO-APIC-fasteoi  ohci_hcd:usb1
  56:      52836  302759221        129      50868   IO-APIC-fasteoi  eth2
  57:     288921 1070387307        225      98210   IO-APIC-fasteoi  eth3
1271:      92146   45282139          9       4885   PCI-MSI-edge     ioc0
 NMI:          0          0          0          0   Non-maskable interrupts
 LOC:  258132347  312890202  166484456  147070084   Local timer interrupts
 RES:  118623017   84540907  100591028  107693244   Rescheduling interrupts
 CAL:     108384      89281     110429     104206   function call interrupts
 TLB:   14719843   24105630   12456528   18955140   TLB shootdowns
 TRM:          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 SPU:          0          0          0          0   Spurious interrupts
 ERR:          1
----

Not sure how to interpret this. At first sight there is no IRQ58, but I guess
they might be using MSI for mpt, which might avoid the problem entirely.


Markus

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
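
For comparing /proc/interrupts dumps like the Debian one above across machines, a small script can show which CPU services most of each interrupt source. This is only a rough sketch: the names parse_interrupts and busiest_cpu are made up for illustration, the sample rows are copied from the Debian output above, and it assumes the usual layout (a header row of CPU names, then one "label: counts..." row per source).

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: [count per CPU]}.

    Assumes the first non-empty line is the CPU header row and every
    following line looks like "label: n0 n1 ... description".
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the type/description columns
            counts.append(int(field))
        table[label.strip()] = counts
    return table


def busiest_cpu(counts):
    """Return the index of the CPU with the highest count."""
    return max(range(len(counts)), key=counts.__getitem__)


# Rows copied from the Debian output quoted in this mail.
SAMPLE = """\
            CPU0       CPU1       CPU2       CPU3
  56:      52836  302759221        129      50868   IO-APIC-fasteoi  eth2
  57:     288921 1070387307        225      98210   IO-APIC-fasteoi  eth3
1271:      92146   45282139          9       4885   PCI-MSI-edge     ioc0
 ERR:          1
"""

if __name__ == "__main__":
    for label, counts in parse_interrupts(SAMPLE).items():
        if len(counts) < 2:
            continue  # rows such as ERR carry a single global counter
        cpu = busiest_cpu(counts)
        share = counts[cpu] / sum(counts)
        print(f"{label:>5}: mostly CPU{cpu} ({share:.1%} of {sum(counts)})")
```

On the sample rows, CPU1 comes out as the busiest CPU for eth2, eth3 and ioc0, which matches what can be read off the table by eye.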