Radim Krčmář <rkrc...@redhat.com> wrote:

> 2014-11-26 19:01+0200, Nadav Amit:
>> Sorry for the late and long reply, but I got an issue with the new version
>> (and my previous version as well). Indeed, the SDM states that DFR should
>> be the same for enabled CPUs, and that the BIOS should get all CPUs in
>> either xAPIC or x2APIC. Yet, there is nothing that says all CPUs need to be
>> in xAPIC/x2APIC mode.
>> 
>> In my tests (which pass on bare-metal), I got a scenario in which some CPUs
>> are in xAPIC mode, the BSP changed (which is currently not handled correctly
>> by KVM) and the BSP has x2APIC enabled.
> 
> How many (V)CPUs were you using?
> (We fail hard with logical destination x2APIC and 16+ VCPUs.)
2 at the moment. What failure do you refer to?

> 
>>                                        All the core APICs are
>> software-enabled. The expected behaviour is that the CPUs with x2APIC
>> enabled would work in x2APIC mode.
> 
> (Nice, I bet that made some Intel designers happy.)
> 
> There shouldn't be any message conflict when using APIC IDs <255, so it
> might be possible if the x2APIC isn't programmed to issue weird
> messages, like physical to nonexistent APIC ID 0xff000000, which would
> be also interpreted as xAPIC broadcast.
> 
>> I think such a transitory scenario is possible on real-systems as well,
>> perhaps during CPU hot-plug. It appears the previous version (before all of
>> our changes) handled it better. I presume the most efficient way is to start
>> determining the APIC logical mode from the BSP, and if it is disabled,
>> traverse the rest of the CPUs until finding the first one with APIC enabled.
>> Yet, I have not finished doing and checking the BSP fix and other dependent
>> INIT signal handling fixes.
>> 
>> In the meanwhile, would you be ok with restoring some of the previous
>> behaviour - i.e., x2APIC is enabled if any CPU turned it on (regardless to
>> whether APIC is software enabled), otherwise use the configuration of the
>> last enabled APIC?
> 
> I don't think this patch improves anything.
> (Both behaviors are wrong and I think the current one is a bit less so.)
> 
> Our x2APIC implementation is a hack that allowed faster IPI thanks to 1
> MSR exit instead of 2 MMIO ones.  No OS, that doesn't know KVM's
> limitations, should have enabled it because we didn't emulate interrupt
> remapping, which is an architectural requirement for x2APIC …
It is a shame - I was under the impression QEMU emulation of the Intel IOMMU
would include it as well, and I now see they only did DMAR…

> And for more concrete points:
> - Physical x2APIC isn't affected (only broadcast, which is incorrect
>  either way)
> 
> - Logical x2APIC and xAPIC don't work at the same time
No, but it is important to determine what is the “consensus” APIC mode.

>  - Btw. logical x2APIC isn't supposed to work (see KVM_X2APIC_CID_BITS)
Why? It is as if there is only a single cluster. You can still send an APIC
message to multiple CPUs within the same cluster (0).

>  - Logical xAPIC is shifted incorrectly in x2APIC mode, so they are all
>    going to be inaccessible (ldr = 0)
>  - Our map isn't designed to allow x2APIC and xAPIC at the same time
> 
> - Your patch does not cover the case where sw-disabled x2APIC is
>  "before" sw-enabled xAPIC, only if it is after.
I thought I covered it. The rationale was that if any lapic is in x2APIC
mode, then the are all in x2APIC mode. It is done similarly to the previous
version (3.18).

Anyhow, I have my workarounds, so do as you find appropriately. Once I deal
with the BSP issues, I may resubmit another version.

Nadav

>> -- >8 —
>> Subject: [PATCH] KVM: x86: Traverse all CPUs during recalculate_apic_map
>> 
>> ---
>> arch/x86/kvm/lapic.c | 20 +++++++++++---------
>> 1 file changed, 11 insertions(+), 9 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 9c90d31..6dc2be6 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -139,6 +139,7 @@ static void recalculate_apic_map(struct kvm *kvm)
>>      struct kvm_apic_map *new, *old = NULL;
>>      struct kvm_vcpu *vcpu;
>>      int i;
>> +    bool any_enabled = false;
>> 
>>      new = kzalloc(sizeof(struct kvm_apic_map), GFP_KERNEL);
>> 
>> @@ -160,13 +161,21 @@ static void recalculate_apic_map(struct kvm *kvm)
>>              if (!kvm_apic_present(vcpu))
>>                      continue;
>> 
>> +            /*
>> +             * All APICs DFRs have to be configured the same mode by an OS.
>> +             * We take advatage of this while building logical id lookup
>> +             * table. After reset APICs are in software disabled mode, so if
>> +             * we find apic with different setting we assume this is the 
>> mode
>> +             * OS wants all apics to be in; build lookup table accordingly.
>> +             */
>>              if (apic_x2apic_mode(apic)) {
>>                      new->ldr_bits = 32;
>>                      new->cid_shift = 16;
>>                      new->cid_mask = (1 << KVM_X2APIC_CID_BITS) - 1;
>>                      new->lid_mask = 0xffff;
>>                      new->broadcast = X2APIC_BROADCAST;
>> -            } else if (kvm_apic_get_reg(apic, APIC_LDR)) {
>> +                    break;
>> +            } else if (!any_enabled && kvm_apic_get_reg(apic, APIC_LDR)) {
>>                      if (kvm_apic_get_reg(apic, APIC_DFR) ==
>>                                                      APIC_DFR_CLUSTER) {
>>                              new->cid_shift = 4;
>> @@ -179,15 +188,8 @@ static void recalculate_apic_map(struct kvm *kvm)
>>                      }
>>              }
>> 
>> -            /*
>> -             * All APICs have to be configured in the same mode by an OS.
>> -             * We take advatage of this while building logical id loockup
>> -             * table. After reset APICs are in software disabled mode, so if
>> -             * we find apic with different setting we assume this is the 
>> mode
>> -             * OS wants all apics to be in; build lookup table accordingly.
>> -             */
>>              if (kvm_apic_sw_enabled(apic))
>> -                    break;
>> +                    any_enabled = true;
>>      }
>> 
>>      kvm_for_each_vcpu(i, vcpu, kvm) {
>> -- 
>> 1.9.1


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to