CC'ing Xiao and Alex again. On 10/29/15 19:39, Jordan Justen wrote: > On 2015-10-29 04:45:37, Laszlo Ersek wrote: >> On 10/29/15 02:32, Jordan Justen wrote: >>> + ASSERT (MaxProcessors > 0); >>> + PcdSet32 (PcdCpuMaxLogicalProcessorNumber, MaxProcessors); >> >> I think that when this branch is active, then >> PcdCpuApInitTimeOutInMicroSeconds should *also* be set, namely to >> MAX_UINT32 (~71 minutes, the closest we can get to "infinity"). When >> this hint is available from QEMU, then we should practically disable >> the timeout option in CpuDxe's AP counting. > > I think this is a good idea, but I don't think 71 minutes is useful. > Perhaps 30 seconds? This seems more than adequate for hundreds of > processors to startup. Or perhaps some timeout based on the number of > processors? > > Janusz and I were discussing > https://github.com/tianocore/edk2/issues/21 on irc. We increased the > timeout to 10 seconds, and with only 8 processors it was still timing > out. > > Obviously we are somehow failing to start the processors correctly, or > QEMU/KVM is doing something wrong. > > Have you been able to reproduce this issue? It seems like we need to > set the timeout to 71 minutes, and then debug QEMU/KVM to see what > state the APs are in... > > Unfortunately I haven't yet been able to reproduce the bug on my > system. :(
I've been staring at the following things for a few tens of minutes now: (1) Kernel commit b18d5431acc7. Note that the commit changes the return value of the vmx_get_mt_mask() function *exactly* in the following case: kvm_arch_has_noncoherent_dma(vcpu->kvm) && (kvm_read_cr0(vcpu) & X86_CR0_CD) The first sub-condition is satisfied by GPU passthrough / device assignment, I think; the second part depends on the VCPU having turned on (or having *left* on) CR0.CD. (2) Consult the vmx_vcpu_reset() function in "arch/x86/kvm/vmx.c" (current upstream). You will find: cr0 = X86_CR0_NW | X86_CR0_CD | X86_CR0_ET; vmx_set_cr0(vcpu, cr0); /* enter rmode */ Meaning a VCPU will start with CD and NW set, in real mode, after re-set. This setting dates back to the birth of KVM: commit 6aa8b732ca01c3d7a54e93f4d701b8aabbe60fb7 Author: Avi Kivity <a...@qumranet.com> Date: Sun Dec 10 02:21:36 2006 -0800 [PATCH] kvm: userspace interface Search that commit for "0x60000010" (the second hit, although the comment that contains the first hit is quite telling as well). (3) Consult the Intel SDM, Table 11-5. "Cache Operating Modes". The (CD, NW) == (1, 1) setting in CR0 is documented as: - "Memory coherency is not maintained." - "(P6 family and Pentium processors.) State of the processor after a power up or reset. " - [in footnote 2] "The Pentium 4 and more recent processor families do not support this mode; setting the CD and NW bits to 1 selects the no-fill cache mode." In other words, the settings implemented by vmx_vcpu_reset() actually invoke the behavior of the "no-fill cache mode" (which is (CD, NW) == (1, 0)) for all practical purposes. (4) Same reference. The (CD, NW) == (1, 0) setting in CR0 is documented as: - "No-fill Cache Mode. Memory coherency is maintained." - "(Pentium 4 and later processor families.) State of processor after a power up or reset. " (5) The AsmEnableCache() function in "MdePkg/Library/BaseLib/Ia32/EnableCache.c". It clears both CD and NW in CR0. (6) This setting ((CD, NW) == (0, 0))is documented in the Intel SDM as: - "Normal Cache Mode. Highest performance cache operation." (7) The AsmEnableCache() function is invoked by MtrrLib [UefiCpuPkg/Library/MtrrLib/MtrrLib.c] after any and all MTRR changes. Consider: PostMtrrChange() | MtrrSetAllMtrrs() PostMtrrChangeEnableCache() AsmEnableCache() Where MtrrSetAllMtrrs() is a public function of the library; plus PostMtrrChange() is invoked by all of the following public functions: - MtrrSetMemoryAttribute() - MtrrSetVariableMtrr() - MtrrSetFixedMtrr() (8) Because we call MtrrLib in PlatformPei first, there are two consequences: (a) The boot VCPU has CR0.CD *set* in all parts of OVMF that run earlier than that. This caused a widely reported boot perf regression in SEC (the LZMA decompression). Ultimately another MTRR change in KVM was reverted, so (as far as I know) this symptom has not been seen recently. (In any case, we should probably fix this sometime...) (b) The other consequence is that the boot VCPU's CR0.CD is clear in the rest of OVMF. Which is what makes its speed acceptable, I guess (as long as no APs are started up). (9) Our AP startup code massages CR0, but only for mode switches. CR0.CD and CR0.NW are never touched. Now, I guess this could be easily added to the assembly encoded as a C array ("mStartupCodeTemplate" in "UefiCpuPkg/CpuDxe/ApStartup.c") -- when cr0 is massaged anyway, just clear bits 29 and 30 too; same as in AsmEnableCache(). However, for testing the idea, perhaps the following one-liner suffices too -- this is the earliest an AP executes C code: > diff --git a/UefiCpuPkg/CpuDxe/CpuMp.c b/UefiCpuPkg/CpuDxe/CpuMp.c > index 3f56faa..e7f5b41 100644 > --- a/UefiCpuPkg/CpuDxe/CpuMp.c > +++ b/UefiCpuPkg/CpuDxe/CpuMp.c > @@ -1451,6 +1451,8 @@ ApEntryPointInC ( > VOID* TopOfApStack; > UINTN ProcessorNumber; > > + AsmEnableCache (); > + > if (!mAPsAlreadyInitFinished) { > FillInProcessorInformation (FALSE, mMpSystemData.NumberOfProcessors); > TopOfApStack = (UINT8*)mApStackStart + gApStackSize; This should clear CR0.CD, and "undo" kernel commit b18d5431acc7 for the AP (by falsifying the second subcondition seen in (1)). Janusz, can you please test this one-liner (with no other out-of-tree patch applied)? Thanks! Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel