On 11/03/15 15:35, Xiao Guangrong wrote:
> 
> 
> On 11/03/2015 08:34 PM, Paolo Bonzini wrote:
>>
>>
>> On 03/11/2015 02:14, Laszlo Ersek wrote:
>>> Anyway, with the following host kernel change, the AP startup problem
>>> goes away (tested on top of v4.3):
>>>
>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>> index 9a9a198..4f978ad 100644
>>>>> --- a/arch/x86/kvm/x86.c
>>>>> +++ b/arch/x86/kvm/x86.c
>>>>> @@ -622,7 +622,8 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned
>>>>> long cr0)
>>>>>       if ((cr0 ^ old_cr0) & update_bits)
>>>>>           kvm_mmu_reset_context(vcpu);
>>>>>
>>>>> -    if ((cr0 ^ old_cr0) & X86_CR0_CD)
>>>>> +    if (!kvm_check_has_quirk(vcpu->kvm,
>>>>> KVM_X86_QUIRK_CD_NW_CLEARED) &&
>>>>> +        (cr0 ^ old_cr0) & X86_CR0_CD)
>>>>>           kvm_zap_gfn_range(vcpu->kvm, 0, ~0ULL);
>>>>>
>>>>>       return 0;
>>> (Honestly I just imitated fb279950ba here; I'm not making any better
>>> argument for this diff. But, independently, I wonder why this hunk
>>> didn't have the noncoherent DMA check either, originally.)
>>
>> Great job.  I look forward to the testing results.
>>
>> It should also have the noncoherent DMA check, in fact, though that's
>> just an optimization and it would have masked the bug on your system.
>>
> 
> Hmm... but kvm_zap_gfn_range even other shadow page zapping operations
> are really usual behaviour in host - it depends on how we handle memory
> overcommit/memory-layout-change on host.
> 
> I doubt it is really a right way to go. kvm_zap_gfn_range() is just a time
> delay factor to trigger OVMF boot issue but there must have other factors
> have such delay too, for example, more vcpus in OVMF, high overload on
> host,
> etc.

You are right, but practical details, concrete numbers and timeouts matter.

* When APs are started up all at once by OVMF, the delay is especially
hurtful (each separate AP causes the *entire* gfn range to be zapped),
exactly while the complete set of the APs is timed against a timeout.
This is (probably) very different from the time profile of physical
hardware.

* Jordan and Janusz actually measured the time required for 7 APs to
boot up:
<http://thread.gmane.org/gmane.comp.bios.edk2.devel/3537/focus=3573>. 10
seconds weren't enough. That sounds quite unexpected.

* We should (hopefully!) address this in OVMF in the long term, but for
now, this -- technically likely correct -- host side change

(a) regressed OVMF for a bunch of users (who use OVMF mainly on
bleeding-edge distros, in connection with GPU assignment / gaming)

(b) allocates guest side development and review resources from work that
has been going on for a long time.

I'm sure you are aware of Linus's opinion (IIRC) about the glibc change
when glibc decided that memcpy() would *actually* copy from high
addresses to low addresses. This broke the flash plugin for users. Yes,
the flash plugin should have used memmove() in the first place, but
users didn't care. And, as I said, you probably know Linus's opinion too
about breaking user scernarios, in that specific case, and in general. :)

* I had actually grepped the following subdirs for kvm_zap_gfn_range
before posting my email yesterday:

- arch/x86/kvm
- virt/kvm

There are only two (direct) callers. One of them is what we're discussing.

The other is update_mtrr() [arch/x86/kvm/mtrr.c], and that one doesn't
(generally) zap the entire range, only what is covered by the MTRR in
question. And, even that call is protected by
kvm_arch_has_noncoherent_dma()!

* "more vcpus in OVMF, high overload on host" might be possible vectors
for this, and I guess they should be addressed in the long term. But
end-users who care about OVMF today (for gaming) take extreme care
*anyway* to shed any and all unwanted load on the host. The setup is
optimized with hugepages, VCPU pinning, even PCPU isolation. Alex's blog
<http://vfio.blogspot.com/> and the vfio-users list have a bunch of tips
in this regard.


So, while I do not doubt that you are technically correct, and host
scenarios could be constructed to hammer on this UefiCpuPkg/CpuDxe issue
(when built into OVMF), *in practice* they shouldn't matter *now*. In
due time I'm quite positive the edk2 community will come to address
those issues. (I guess Adobe fixed the flash plugin to use memmove() as
well...)

Thanks
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to