On 11/03/15 15:35, Xiao Guangrong wrote: > > > On 11/03/2015 08:34 PM, Paolo Bonzini wrote: >> >> >> On 03/11/2015 02:14, Laszlo Ersek wrote: >>> Anyway, with the following host kernel change, the AP startup problem >>> goes away (tested on top of v4.3): >>> >>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>>>> index 9a9a198..4f978ad 100644 >>>>> --- a/arch/x86/kvm/x86.c >>>>> +++ b/arch/x86/kvm/x86.c >>>>> @@ -622,7 +622,8 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned >>>>> long cr0) >>>>> if ((cr0 ^ old_cr0) & update_bits) >>>>> kvm_mmu_reset_context(vcpu); >>>>> >>>>> - if ((cr0 ^ old_cr0) & X86_CR0_CD) >>>>> + if (!kvm_check_has_quirk(vcpu->kvm, >>>>> KVM_X86_QUIRK_CD_NW_CLEARED) && >>>>> + (cr0 ^ old_cr0) & X86_CR0_CD) >>>>> kvm_zap_gfn_range(vcpu->kvm, 0, ~0ULL); >>>>> >>>>> return 0; >>> (Honestly I just imitated fb279950ba here; I'm not making any better >>> argument for this diff. But, independently, I wonder why this hunk >>> didn't have the noncoherent DMA check either, originally.) >> >> Great job. I look forward to the testing results. >> >> It should also have the noncoherent DMA check, in fact, though that's >> just an optimization and it would have masked the bug on your system. >> > > Hmm... but kvm_zap_gfn_range even other shadow page zapping operations > are really usual behaviour in host - it depends on how we handle memory > overcommit/memory-layout-change on host. > > I doubt it is really a right way to go. kvm_zap_gfn_range() is just a time > delay factor to trigger OVMF boot issue but there must have other factors > have such delay too, for example, more vcpus in OVMF, high overload on > host, > etc.
You are right, but practical details, concrete numbers and timeouts matter. * When APs are started up all at once by OVMF, the delay is especially hurtful (each separate AP causes the *entire* gfn range to be zapped), exactly while the complete set of the APs is timed against a timeout. This is (probably) very different from the time profile of physical hardware. * Jordan and Janusz actually measured the time required for 7 APs to boot up: <http://thread.gmane.org/gmane.comp.bios.edk2.devel/3537/focus=3573>. 10 seconds weren't enough. That sounds quite unexpected. * We should (hopefully!) address this in OVMF in the long term, but for now, this -- technically likely correct -- host side change (a) regressed OVMF for a bunch of users (who use OVMF mainly on bleeding-edge distros, in connection with GPU assignment / gaming) (b) allocates guest side development and review resources from work that has been going on for a long time. I'm sure you are aware of Linus's opinion (IIRC) about the glibc change when glibc decided that memcpy() would *actually* copy from high addresses to low addresses. This broke the flash plugin for users. Yes, the flash plugin should have used memmove() in the first place, but users didn't care. And, as I said, you probably know Linus's opinion too about breaking user scernarios, in that specific case, and in general. :) * I had actually grepped the following subdirs for kvm_zap_gfn_range before posting my email yesterday: - arch/x86/kvm - virt/kvm There are only two (direct) callers. One of them is what we're discussing. The other is update_mtrr() [arch/x86/kvm/mtrr.c], and that one doesn't (generally) zap the entire range, only what is covered by the MTRR in question. And, even that call is protected by kvm_arch_has_noncoherent_dma()! * "more vcpus in OVMF, high overload on host" might be possible vectors for this, and I guess they should be addressed in the long term. But end-users who care about OVMF today (for gaming) take extreme care *anyway* to shed any and all unwanted load on the host. The setup is optimized with hugepages, VCPU pinning, even PCPU isolation. Alex's blog <http://vfio.blogspot.com/> and the vfio-users list have a bunch of tips in this regard. So, while I do not doubt that you are technically correct, and host scenarios could be constructed to hammer on this UefiCpuPkg/CpuDxe issue (when built into OVMF), *in practice* they shouldn't matter *now*. In due time I'm quite positive the edk2 community will come to address those issues. (I guess Adobe fixed the flash plugin to use memmove() as well...) Thanks Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel