On Mar 9, 2010, at 4:27 PM, John Baldwin wrote:

> On Tuesday 09 March 2010 3:40:26 pm Kevin Day wrote:
>> 
>> 
>> If I boot up on an Opteron 2218 system, it boots normally. If I boot the 
>> exact same VM moved to a 2352, I get:
>> 
>> acpi0: <INTEL 440BX> on motherboard
>> PCIe: Memory Mapped configuration base @ 0xe0000000
>>   (very long pause)
>> ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 0 vector 48
>> acpi0: [MPSAFE]
>> acpi0: [ITHREAD]
>> 
>> then booting normally.
> 
> It's probably worth adding some printfs to narrow down where the pause is 
> happening.  This looks to be all during the acpi_attach() routine, so maybe 
> you can start there.

Okay, good pointer. Here's what I've narrowed it down to:

acpi_enable_pcie() calls pcie_cfgregopen(); in this case it's called as 
pcie_cfgregopen(0xe0000000, 0, 255). Inside pcie_cfgregopen(), the pause starts 
here:

        /* XXX: We should make sure this really fits into the direct map. */
        pcie_base = (vm_offset_t)pmap_mapdev(base, (maxbus + 1) << 20);

pmap_mapdev() calls pmap_mapdev_attr(), and in there this condition evaluates to true:

        /*
         * If the specified range of physical addresses fits within the direct
         * map window, use the direct map. 
         */
        if (pa < dmaplimit && pa + size < dmaplimit) {

so we call pmap_change_attr(), which calls pmap_change_attr_locked(). It's 
changing 0x10000000 bytes starting at 0xffffff00e0000000. The very last line 
before returning from pmap_change_attr_locked() is:

                pmap_invalidate_cache_range(base, tmpva);

And this is where the delay is. This call executes MFENCE/CLFLUSH in a loop 8 
million times. We actually had a problem with CLFLUSH causing panics on these 
same CPUs under Xen, which is partially why we're looking at VMware now (see 
kern/138863). I'm wondering if VMware hit the same problem and replaced CLFLUSH 
with a software-emulated version that is far slower; judging by the speed, it's 
probably invalidating the entire cache on every iteration. A quick change to 
pmap_invalidate_cache_range() to just flush the entire cache when the range 
being invalidated is over 8MB seems to have fixed it, i.e. changing:

        else if (cpu_feature & CPUID_CLFSH)  {

to

        else if ((cpu_feature & CPUID_CLFSH) && ((eva-sva) < (2<<22))) {


However, I'm a little blurry on whether everything leading up to this point is 
correct. It ends up mapping 256MB of memory for the PCIe config area, which 
seems really excessive. Is that just because it wants room for 256 buses, 
or...? Does anyone know this code path well enough to say whether this is 
deviating from the norm?

-- Kevin

_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers