Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Yang Zhang

On 2016/7/11 23:52, Radim Krčmář wrote:
> 2016-07-11 16:14+0200, Paolo Bonzini:
>> On 11/07/2016 15:48, Radim Krčmář wrote:
>>>>> I guess the easiest solution is to replace kvm_apic_id with a field in
>>>>> struct kvm_lapic, which is already shifted right by 24 in xAPIC mode.
>>>
>>> (I guess the fewest LOC is to look at vcpu->vcpu_id, which is equal to
>>>  x2apic id.  xapic id cannot be greater than 255 and all of those are
>>>  covered by the initial value of max_id.)
>>
>> Yes, this would work too.  Or even better perhaps, look at vcpu->vcpu_id
>> in kvm_apic_id?
>
> APIC ID is writeable in xAPIC mode, which would make the implementation
> weird without an extra variable.  Always read-only APIC ID would be
> best, IMO.
>
>>>> Or we can just simply put the assignment of apic_base to the end.
>>>
>>> Yes, this would work, I'd also remove recalculates from
>>> kvm_apic_set_*apic_id() and add a compiler barrier with comment for good
>>> measure, even though set_virtual_x2apic_mode() serves as one.
>>
>> Why a compiler barrier?
>
> True, it should be a proper pair of smp_wmb() and smp_rmb() in
> recalculate ... and current kvm_apic_id() reads in a wrong order, so
> changing the apic_base alone update wouldn't get rid of this race.
>
>>> (What makes a bit wary is that it doesn't avoid the same problem if we
>>>  changed KVM to reset apic id to xapic id first when disabling apic.)
>>
>> Yes, this is why I prefer it fixed once and for all in kvm_apic_id...
>
> Seems most reasonable.  We'll need to be careful to have a correct value
> in the apic page, but there shouldn't be any races there.

Yes, it is more reasonable.

>>> Races in recalculation and APIC ID changes also lead to invalid physical
>>> maps, which haven't been taken care of properly ...
>>
>> Hmm, true, but can be fixed separately.  Probably the mutex should be
>> renamed so that it can be taken outside recalculate_apic_map...
>
> Good point, it'll make reasoning easier and shouldn't introduce any
> extra scalability issues.


If we can ensure that all the updates to LDR, DFR, ID and apic mode happen
in the correct sequence and are followed by an apic map recalculation, it
should be enough.  It's the guest's responsibility to update the apic at the
right time (that is, when no interrupt is in flight); otherwise the
interrupt may be delivered to the wrong VCPU.


--
best regards
yang


Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Paolo Bonzini


On 11/07/2016 17:52, Radim Krčmář wrote:
> 2016-07-11 16:14+0200, Paolo Bonzini:
>> On 11/07/2016 15:48, Radim Krčmář wrote:
>>>>> I guess the easiest solution is to replace kvm_apic_id with a field in
>>>>> struct kvm_lapic, which is already shifted right by 24 in xAPIC mode.
>>>
>>> (I guess the fewest LOC is to look at vcpu->vcpu_id, which is equal to
>>>  x2apic id.  xapic id cannot be greater than 255 and all of those are
>>>  covered by the initial value of max_id.)
>>
>> Yes, this would work too.  Or even better perhaps, look at vcpu->vcpu_id
>> in kvm_apic_id?
> 
> APIC ID is writeable in xAPIC mode, which would make the implementation
> weird without an extra variable.  Always read-only APIC ID would be
> best, IMO.

You can do

	if (x2apic mode)
		return lapic->vcpu->vcpu_id;
	else
		return get_reg(APIC_ID) >> 24;

The point is to avoid returning the xAPIC-format APIC_ID (ID in bits
31:24) without shifting it.

The alternative of course is just caching it, which at this point is not
particularly harder...
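
A minimal, self-contained sketch of the first option, for illustration
only.  The types sketch_vcpu and sketch_lapic and the x2apic_mode flag are
hypothetical stand-ins, not the real KVM structures; only the ">> 24" shift
and the use of vcpu_id come from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM structures; not the kernel definitions. */
struct sketch_vcpu {
	uint32_t vcpu_id;		/* equals the x2APIC ID, per the thread */
};

struct sketch_lapic {
	struct sketch_vcpu *vcpu;
	bool x2apic_mode;		/* stand-in for the real mode check */
	uint32_t apic_id_reg;		/* raw APIC_ID register value */
};

/* Return the APIC ID in the same units regardless of mode. */
static uint32_t sketch_apic_id(const struct sketch_lapic *apic)
{
	if (apic->x2apic_mode)
		return apic->vcpu->vcpu_id;

	/* xAPIC keeps the 8-bit ID in bits 31:24 of the register. */
	return apic->apic_id_reg >> 24;
}
```

With this shape, a stale xAPIC-format register that is read while x2APIC is
already enabled cannot leak through as a huge ID, because the x2APIC branch
never looks at the register.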

Paolo

>>> (What makes a bit wary is that it doesn't avoid the same problem if we
>>>  changed KVM to reset apic id to xapic id first when disabling apic.)
>>
>> Yes, this is why I prefer it fixed once and for all in kvm_apic_id...
> 
> Seems most reasonable.  We'll need to be careful to have a correct value
> in the apic page, but there shouldn't be any races there.
> 
>>> Races in recalculation and APIC ID changes also lead to invalid physical
>>> maps, which haven't been taken care of properly ...
>>
>> Hmm, true, but can be fixed separately.  Probably the mutex should be
>> renamed so that it can be taken outside recalculate_apic_map...
> 
> Good point, it'll make reasoning easier and shouldn't introduce any
> extra scalability issues.



Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Radim Krčmář
2016-07-11 16:14+0200, Paolo Bonzini:
> On 11/07/2016 15:48, Radim Krčmář wrote:
>>>> I guess the easiest solution is to replace kvm_apic_id with a field in
>>>> struct kvm_lapic, which is already shifted right by 24 in xAPIC mode.
>> 
>> (I guess the fewest LOC is to look at vcpu->vcpu_id, which is equal to
>>  x2apic id.  xapic id cannot be greater than 255 and all of those are
>>  covered by the initial value of max_id.)
> 
> Yes, this would work too.  Or even better perhaps, look at vcpu->vcpu_id
> in kvm_apic_id?

APIC ID is writeable in xAPIC mode, which would make the implementation
weird without an extra variable.  Always read-only APIC ID would be
best, IMO.

>>> Or we can just simply put the assignment of apic_base to the end.
>> 
>> Yes, this would work, I'd also remove recalculates from
>> kvm_apic_set_*apic_id() and add a compiler barrier with comment for good
>> measure, even though set_virtual_x2apic_mode() serves as one.
> 
> Why a compiler barrier?

True, it should be a proper pair of smp_wmb() and smp_rmb() in
recalculate ... and the current kvm_apic_id() reads in the wrong order, so
changing the apic_base update alone wouldn't get rid of this race.
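
For illustration, the ordering such a barrier pair would enforce can be
sketched in portable C11, with release and acquire fences standing in for
smp_wmb() and smp_rmb().  This is an analogy under assumptions, not kernel
code: the struct and function names are hypothetical, and C11 fences are
not identical to the kernel primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-APIC state; field names are illustrative only. */
struct sketch_apic_state {
	_Atomic uint32_t id;	/* APIC ID in its final, unshifted form */
	_Atomic bool x2apic;	/* mode flag, published last */
};

/* Writer: store the new ID first, then publish the mode. */
static void sketch_enable_x2apic(struct sketch_apic_state *s, uint32_t x2apic_id)
{
	atomic_store_explicit(&s->id, x2apic_id, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* role of smp_wmb() */
	atomic_store_explicit(&s->x2apic, true, memory_order_relaxed);
}

/* Reader: load the mode first, then the ID. */
static uint32_t sketch_read_id(struct sketch_apic_state *s, bool *x2apic)
{
	*x2apic = atomic_load_explicit(&s->x2apic, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);	/* role of smp_rmb() */
	return atomic_load_explicit(&s->id, memory_order_relaxed);
}
```

A reader that observes the mode flag set is then guaranteed to see the new
ID as well.  The reverse does not hold: a reader that still sees the old
mode may already see the new ID, so this only covers the enable direction.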

>> (What makes a bit wary is that it doesn't avoid the same problem if we
>>  changed KVM to reset apic id to xapic id first when disabling apic.)
> 
> Yes, this is why I prefer it fixed once and for all in kvm_apic_id...

Seems most reasonable.  We'll need to be careful to have a correct value
in the apic page, but there shouldn't be any races there.

>> Races in recalculation and APIC ID changes also lead to invalid physical
>> maps, which haven't been taken care of properly ...
> 
> Hmm, true, but can be fixed separately.  Probably the mutex should be
> renamed so that it can be taken outside recalculate_apic_map...

Good point, it'll make reasoning easier and shouldn't introduce any
extra scalability issues.


Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Paolo Bonzini


On 11/07/2016 15:48, Radim Krčmář wrote:
>>> I guess the easiest solution is to replace kvm_apic_id with a field in
>>> struct kvm_lapic, which is already shifted right by 24 in xAPIC mode.
> 
> (I guess the fewest LOC is to look at vcpu->vcpu_id, which is equal to
>  x2apic id.  xapic id cannot be greater than 255 and all of those are
>  covered by the initial value of max_id.)

Yes, this would work too.  Or even better perhaps, look at vcpu->vcpu_id
in kvm_apic_id?

>> Or we can just simply put the assignment of apic_base to the end.
> 
> Yes, this would work, I'd also remove recalculates from
> kvm_apic_set_*apic_id() and add a compiler barrier with comment for good
> measure, even though set_virtual_x2apic_mode() serves as one.

Why a compiler barrier?

> (What makes a bit wary is that it doesn't avoid the same problem if we
>  changed KVM to reset apic id to xapic id first when disabling apic.)

Yes, this is why I prefer it fixed once and for all in kvm_apic_id...

> Races in recalculation and APIC ID changes also lead to invalid physical
> maps, which haven't been taken care of properly ...

Hmm, true, but can be fixed separately.  Probably the mutex should be
renamed so that it can be taken outside recalculate_apic_map...

Paolo

> Having apic id stored in big endian, or "0-7,8-31" format, would be
> safest.  I wanted to change the apic map to do incremental updates with
> respect to the APIC that has changed, instead of being completely
> recomputed, so maybe the time is now. :)
> 


Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Radim Krčmář
2016-07-11 18:14+0800, Yang Zhang:
> On 2016/7/11 15:43, Paolo Bonzini wrote:
>> On 11/07/2016 08:07, Yang Zhang wrote:
>> > > 
>> > >  	mutex_lock(&kvm->arch.apic_map_lock);
>> > >
>> > > +	kvm_for_each_vcpu(i, vcpu, kvm)
>> > > +		if (kvm_apic_present(vcpu))
>> > > +			max_id = max(max_id, kvm_apic_id(vcpu->arch.apic));
>> > > +
>> > > +	new = kzalloc(sizeof(struct kvm_apic_map) +
>> > > +	              sizeof(struct kvm_lapic *) * (max_id + 1), GFP_KERNEL);
>> > > +
>> > 
>> > I think this may cause the host runs out of memory if a malicious guest
>> > did follow thing:
>> > 1. vcpu a is doing apic map recalculation.
>> > 2. vcpu b write the apic id with 0xff
>> > 3. then vcpu b enable the x2apic: in kvm_lapic_set_base(), we will set
>> > apic_base to new value before reset the apic id.
>> > 4. vcpu a may see the x2apic enabled in vcpu b plus an old apic
>> > id(0xff), and max_id will become (0xff << 24).

Indeed, thanks.  The guest doesn't even have to be malicious ...

>> The bug is not really here but in patch 6---but you're right nevertheless!

Yes.

>> I guess the easiest solution is to replace kvm_apic_id with a field in
>> struct kvm_lapic, which is already shifted right by 24 in xAPIC mode.

(I guess the fewest LOC is to look at vcpu->vcpu_id, which is equal to
 x2apic id.  xapic id cannot be greater than 255 and all of those are
 covered by the initial value of max_id.)
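
As a sketch of that variant (hypothetical helper name, a plain array
instead of the vcpu list): max_id starts at 255, which already covers any
xAPIC ID, and only vcpu_id can raise it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Start at 255 so that every possible xAPIC ID already fits; only
 * vcpu_id (equal to the x2APIC ID, per the thread) can raise the bound.
 */
static uint32_t sketch_max_apic_id(const uint32_t *vcpu_ids, size_t n)
{
	uint32_t max_id = 255;

	for (size_t i = 0; i < n; i++)
		if (vcpu_ids[i] > max_id)
			max_id = vcpu_ids[i];

	return max_id;
}
```

Because the guest-writable APIC_ID register is never consulted, a racing
write to it cannot inflate the bound.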

> Or we can just simply put the assignment of apic_base to the end.

Yes, this would work, I'd also remove recalculates from
kvm_apic_set_*apic_id() and add a compiler barrier with comment for good
measure, even though set_virtual_x2apic_mode() serves as one.
(What makes me a bit wary is that it doesn't avoid the same problem if we
 changed KVM to reset apic id to xapic id first when disabling apic.)

Races in recalculation and APIC ID changes also lead to invalid physical
maps, which haven't been taken care of properly ...
Having apic id stored in big endian, or "0-7,8-31" format, would be
safest.  I wanted to change the apic map to do incremental updates with
respect to the APIC that has changed, instead of being completely
recomputed, so maybe the time is now. :)


Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Yang Zhang

On 2016/7/11 15:43, Paolo Bonzini wrote:
>
> On 11/07/2016 08:07, Yang Zhang wrote:
>>>
>>>  	mutex_lock(&kvm->arch.apic_map_lock);
>>>
>>> +	kvm_for_each_vcpu(i, vcpu, kvm)
>>> +		if (kvm_apic_present(vcpu))
>>> +			max_id = max(max_id, kvm_apic_id(vcpu->arch.apic));
>>> +
>>> +	new = kzalloc(sizeof(struct kvm_apic_map) +
>>> +	              sizeof(struct kvm_lapic *) * (max_id + 1), GFP_KERNEL);
>>> +
>>
>> I think this may cause the host runs out of memory if a malicious guest
>> did follow thing:
>> 1. vcpu a is doing apic map recalculation.
>> 2. vcpu b write the apic id with 0xff
>> 3. then vcpu b enable the x2apic: in kvm_lapic_set_base(), we will set
>> apic_base to new value before reset the apic id.
>> 4. vcpu a may see the x2apic enabled in vcpu b plus an old apic
>> id(0xff), and max_id will become (0xff << 24).
>
> The bug is not really here but in patch 6---but you're right nevertheless!
>
> I guess the easiest solution is to replace kvm_apic_id with a field in
> struct kvm_lapic, which is already shifted right by 24 in xAPIC mode.


Or we can just simply put the assignment of apic_base to the end.

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index fdc05ae..9c69059 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1745,7 +1745,6 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 		return;
 	}

-	vcpu->arch.apic_base = value;

 	/* update jump label if enable bit changes */
 	if ((old_value ^ value) & MSR_IA32_APICBASE_ENABLE) {
@@ -1753,7 +1752,6 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 			static_key_slow_dec_deferred(&apic_hw_disabled);
 		else
 			static_key_slow_inc(&apic_hw_disabled.key);
-		recalculate_apic_map(vcpu->kvm);
 	}

 	if ((old_value ^ value) & X2APIC_ENABLE) {
@@ -1764,6 +1762,8 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 			kvm_x86_ops->set_virtual_x2apic_mode(vcpu, false);
 	}

+	vcpu->arch.apic_base = value;
+	recalculate_apic_map(vcpu->kvm);
 	apic->base_address = apic->vcpu->arch.apic_base &
 			     MSR_IA32_APICBASE_BASE;


BTW, I noticed that there is no apic map recalculation after turning off
x2apic mode.  Is that correct?


--
best regards
yang


Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Paolo Bonzini


On 11/07/2016 08:07, Yang Zhang wrote:
>>
>>  	mutex_lock(&kvm->arch.apic_map_lock);
>>
>> +	kvm_for_each_vcpu(i, vcpu, kvm)
>> +		if (kvm_apic_present(vcpu))
>> +			max_id = max(max_id, kvm_apic_id(vcpu->arch.apic));
>> +
>> +	new = kzalloc(sizeof(struct kvm_apic_map) +
>> +	              sizeof(struct kvm_lapic *) * (max_id + 1), GFP_KERNEL);
>> +
> 
> I think this may cause the host runs out of memory if a malicious guest
> did follow thing:
> 1. vcpu a is doing apic map recalculation.
> 2. vcpu b write the apic id with 0xff
> 3. then vcpu b enable the x2apic: in kvm_lapic_set_base(), we will set
> apic_base to new value before reset the apic id.
> 4. vcpu a may see the x2apic enabled in vcpu b plus an old apic
> id(0xff), and max_id will become (0xff << 24).

The bug is not really here but in patch 6---but you're right nevertheless!

I guess the easiest solution is to replace kvm_apic_id with a field in
struct kvm_lapic, which is already shifted right by 24 in xAPIC mode.

It can be added easily in patch 6 itself, it's like 3 new lines of code
because all reads and writes go through kvm_apic_id and kvm_apic_set_id;
the kvm_apic_id wrapper can be kept for simplicity.
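
Roughly what that could look like, as a standalone sketch.  The struct
layout and the cached_* names are made up for illustration and are not the
real kvm_lapic or the actual setter helpers; the point is only that both
write paths keep one unshifted field up to date and the reader never
consults the mode.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_lapic with a cached ID field. */
struct cached_lapic {
	uint32_t regs_apic_id;	/* register image as the guest sees it */
	uint32_t cached_id;	/* always the plain ID, never shifted */
};

/* xAPIC write path: the register holds the ID in bits 31:24. */
static void cached_set_xapic_id(struct cached_lapic *apic, uint8_t id)
{
	apic->regs_apic_id = (uint32_t)id << 24;
	apic->cached_id = id;
}

/* x2APIC write path: the register holds the full 32-bit ID. */
static void cached_set_x2apic_id(struct cached_lapic *apic, uint32_t id)
{
	apic->regs_apic_id = id;
	apic->cached_id = id;
}

/* Reader: no mode check and no shift, so nothing to get out of sync. */
static uint32_t cached_apic_id(const struct cached_lapic *apic)
{
	return apic->cached_id;
}
```

During a mode switch the reader then sees either the old xAPIC ID or the
new x2APIC ID, but never a register value interpreted in the wrong format.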

Thanks again!

Paolo


Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Yang Zhang

On 2016/7/8 1:15, Radim Krčmář wrote:
> x2APIC supports up to 2^32-1 LAPICs, but most guests in coming years will
> have slightly fewer VCPUs.  Dynamic size saves memory at the cost of
> turning one constant into a variable.
>
> apic_map mutex had to be moved before allocation to avoid races with cpu
> hotplug.
>
> Signed-off-by: Radim Krčmář
> ---
>  v2:
>  * replaced size with max_apic_id to minimize chances of overflow [Andrew]
>  * fixed allocation size [Paolo]
>
>  arch/x86/include/asm/kvm_host.h |  3 ++-
>  arch/x86/kvm/lapic.c            | 18 +++++++++++++-----
>  arch/x86/kvm/lapic.h            |  2 +-
>  3 files changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3194b19b9c7b..643e3dffcd85 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -682,11 +682,12 @@ struct kvm_arch_memory_slot {
>  struct kvm_apic_map {
>  	struct rcu_head rcu;
>  	u8 mode;
> -	struct kvm_lapic *phys_map[256];
> +	u32 max_apic_id;
>  	union {
>  		struct kvm_lapic *xapic_flat_map[8];
>  		struct kvm_lapic *xapic_cluster_map[16][4];
>  	};
> +	struct kvm_lapic *phys_map[];
>  };
>
>  /* Hyper-V emulation context */
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 9880d03f533d..224fc1c5fcc6 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -120,7 +120,7 @@ static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
>  	switch (map->mode) {
>  	case KVM_APIC_MODE_X2APIC: {
>  		u32 offset = (dest_id >> 16) * 16;
> -		u32 max_apic_id = ARRAY_SIZE(map->phys_map) - 1;
> +		u32 max_apic_id = map->max_apic_id;
>
>  		if (offset <= max_apic_id) {
>  			u8 cluster_size = min(max_apic_id - offset + 1, 16U);
> @@ -152,14 +152,22 @@ static void recalculate_apic_map(struct kvm *kvm)
>  	struct kvm_apic_map *new, *old = NULL;
>  	struct kvm_vcpu *vcpu;
>  	int i;
> -
> -	new = kzalloc(sizeof(struct kvm_apic_map), GFP_KERNEL);
> +	u32 max_id = 255;
>
>  	mutex_lock(&kvm->arch.apic_map_lock);
>
> +	kvm_for_each_vcpu(i, vcpu, kvm)
> +		if (kvm_apic_present(vcpu))
> +			max_id = max(max_id, kvm_apic_id(vcpu->arch.apic));
> +
> +	new = kzalloc(sizeof(struct kvm_apic_map) +
> +	              sizeof(struct kvm_lapic *) * (max_id + 1), GFP_KERNEL);
> +


I think this may cause the host to run out of memory if a malicious guest
does the following:

1. vcpu a is doing apic map recalculation.
2. vcpu b writes 0xff to the apic id.
3. vcpu b then enables x2apic: in kvm_lapic_set_base(), we set
   apic_base to the new value before resetting the apic id.
4. vcpu a may see x2apic enabled in vcpu b together with the old apic
   id (0xff), and max_id will become (0xff << 24).
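
To put a number on it (a back-of-the-envelope sketch; the helper name is
hypothetical and 8-byte pointers are assumed): if the ID 0xff is still
stored in xAPIC format, in bits 31:24 of the register, and is read without
the shift, max_id becomes 0xff000000 and the phys_map tail alone is about
32 GiB.

```c
#include <stdint.h>

/*
 * Bytes needed for the phys_map tail alone, assuming 8-byte pointers.
 * The arithmetic is done in 64 bits so that max_id + 1 cannot wrap.
 */
static uint64_t sketch_phys_map_bytes(uint32_t max_id)
{
	return ((uint64_t)max_id + 1) * 8;
}
```

For the default max_id of 255 this is 2048 bytes; for 0xff000000 it is
34225520648 bytes.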



>  	if (!new)
>  		goto out;
>
> +	new->max_apic_id = max_id;
> +
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		struct kvm_lapic *apic = vcpu->arch.apic;
>  		struct kvm_lapic **cluster;
> @@ -172,7 +180,7 @@ static void recalculate_apic_map(struct kvm *kvm)
>  		aid = kvm_apic_id(apic);
>  		ldr = kvm_lapic_get_reg(apic, APIC_LDR);
>
> -		if (aid < ARRAY_SIZE(new->phys_map))
> +		if (aid <= new->max_apic_id)
>  			new->phys_map[aid] = apic;
>
>  		if (apic_x2apic_mode(apic)) {
> @@ -710,7 +718,7 @@ static inline bool kvm_apic_map_get_dest_lapic(struct kvm *kvm,
>  		return false;
>
>  	if (irq->dest_mode == APIC_DEST_PHYSICAL) {
> -		if (irq->dest_id >= ARRAY_SIZE(map->phys_map)) {
> +		if (irq->dest_id > map->max_apic_id) {
>  			*bitmap = 0;
>  		} else {
>  			*dst = &map->phys_map[irq->dest_id];
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index 336ba51bb16e..8d811139d2b3 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -200,7 +200,7 @@ static inline int kvm_lapic_latched_init(struct kvm_vcpu *vcpu)
>  	return lapic_in_kernel(vcpu) && test_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
>  }
>
> -static inline int kvm_apic_id(struct kvm_lapic *apic)
> +static inline u32 kvm_apic_id(struct kvm_lapic *apic)
>  {
>  	return (kvm_lapic_get_reg(apic, APIC_ID) >> 24) & 0xff;
>  }
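
For reference, the allocation pattern the patch relies on, a struct ending
in a flexible array member sized from max_apic_id, can be sketched in
userspace C as follows.  calloc() stands in for kzalloc(), and the sketch_*
names are hypothetical; only the "max_id + 1" sizing and the
"<= max_apic_id" bounds check mirror the patch.

```c
#include <stdint.h>
#include <stdlib.h>

struct sketch_lapic;			/* opaque stand-in for struct kvm_lapic */

struct sketch_apic_map {
	uint32_t max_apic_id;
	struct sketch_lapic *phys_map[];	/* flexible array member */
};

/* Allocate a zeroed map able to index IDs 0..max_id inclusive. */
static struct sketch_apic_map *sketch_map_alloc(uint32_t max_id)
{
	/* Assumes size_t is wider than 32 bits, so max_id + 1 cannot wrap. */
	size_t n = (size_t)max_id + 1;
	struct sketch_apic_map *map =
		calloc(1, sizeof(*map) + n * sizeof(map->phys_map[0]));

	if (map)
		map->max_apic_id = max_id;
	return map;
}

/* Bounds-checked lookup: NULL for IDs beyond the map or for empty slots. */
static struct sketch_lapic *sketch_map_lookup(const struct sketch_apic_map *map,
					      uint32_t id)
{
	return id <= map->max_apic_id ? map->phys_map[id] : NULL;
}
```

Storing the largest valid index rather than the element count means the
bound itself can never overflow a u32, which appears to be what the v2
changelog entry about max_apic_id refers to.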




--
best regards
yang


Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-11 Thread Yang Zhang

On 2016/7/8 1:15, Radim Krčmář wrote:

x2APIC supports up to 2^32-1 LAPICs, but most guest in coming years will
have slighly less VCPUs.  Dynamic size saves memory at the cost of
turning one constant into a variable.

apic_map mutex had to be moved before allocation to avoid races with cpu
hotplug.

Signed-off-by: Radim Krčmář 
---
 v2:
 * replaced size with max_apic_id to minimize chances of overflow [Andrew]
 * fixed allocation size [Paolo]

 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/lapic.c| 18 +-
 arch/x86/kvm/lapic.h|  2 +-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3194b19b9c7b..643e3dffcd85 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -682,11 +682,12 @@ struct kvm_arch_memory_slot {
 struct kvm_apic_map {
struct rcu_head rcu;
u8 mode;
-   struct kvm_lapic *phys_map[256];
+   u32 max_apic_id;
union {
struct kvm_lapic *xapic_flat_map[8];
struct kvm_lapic *xapic_cluster_map[16][4];
};
+   struct kvm_lapic *phys_map[];
 };

 /* Hyper-V emulation context */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9880d03f533d..224fc1c5fcc6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -120,7 +120,7 @@ static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
switch (map->mode) {
case KVM_APIC_MODE_X2APIC: {
u32 offset = (dest_id >> 16) * 16;
-   u32 max_apic_id = ARRAY_SIZE(map->phys_map) - 1;
+   u32 max_apic_id = map->max_apic_id;

if (offset <= max_apic_id) {
u8 cluster_size = min(max_apic_id - offset + 1, 16U);
@@ -152,14 +152,22 @@ static void recalculate_apic_map(struct kvm *kvm)
struct kvm_apic_map *new, *old = NULL;
struct kvm_vcpu *vcpu;
int i;
-
-   new = kzalloc(sizeof(struct kvm_apic_map), GFP_KERNEL);
+   u32 max_id = 255;

mutex_lock(&kvm->arch.apic_map_lock);

+   kvm_for_each_vcpu(i, vcpu, kvm)
+   if (kvm_apic_present(vcpu))
+   max_id = max(max_id, kvm_apic_id(vcpu->arch.apic));
+
+   new = kzalloc(sizeof(struct kvm_apic_map) +
+ sizeof(struct kvm_lapic *) * (max_id + 1), GFP_KERNEL);
+


I think this may cause the host to run out of memory if a malicious guest 
does the following:

1. vcpu a is doing the apic map recalculation.
2. vcpu b writes 0xff to its apic id.
3. vcpu b then enables x2apic: in kvm_lapic_set_base(), we set apic_base 
to the new value before resetting the apic id.
4. vcpu a may see x2apic enabled on vcpu b together with the old apic 
id (0xff), and max_id will become (0xff << 24).
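
To put a rough number on step 4, here is a small userspace sketch (not kernel code) of the phys_map size that the kzalloc() call in the patch would request for a given max_id. The helper name phys_map_bytes and the explicit ptr_size parameter are assumptions made for illustration; the formula mirrors sizeof(struct kvm_lapic *) * (max_id + 1) and ignores the fixed header of struct kvm_apic_map.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: number of bytes needed for
 * the phys_map[] part of the map, i.e. ptr_size * (max_id + 1).
 * The addition is done in 64 bits so max_id == UINT32_MAX does not wrap.
 */
static uint64_t phys_map_bytes(uint32_t max_id, uint64_t ptr_size)
{
	return ptr_size * ((uint64_t)max_id + 1);
}
```

With 8-byte pointers, phys_map_bytes(255, 8) is 2048 bytes, while phys_map_bytes(0xffu << 24, 8) is 34225520648 bytes, a little under 32 GiB for a single map.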



if (!new)
goto out;

+   new->max_apic_id = max_id;
+
kvm_for_each_vcpu(i, vcpu, kvm) {
struct kvm_lapic *apic = vcpu->arch.apic;
struct kvm_lapic **cluster;
@@ -172,7 +180,7 @@ static void recalculate_apic_map(struct kvm *kvm)
aid = kvm_apic_id(apic);
ldr = kvm_lapic_get_reg(apic, APIC_LDR);

-   if (aid < ARRAY_SIZE(new->phys_map))
+   if (aid <= new->max_apic_id)
new->phys_map[aid] = apic;

if (apic_x2apic_mode(apic)) {
@@ -710,7 +718,7 @@ static inline bool kvm_apic_map_get_dest_lapic(struct kvm *kvm,
return false;

if (irq->dest_mode == APIC_DEST_PHYSICAL) {
-   if (irq->dest_id >= ARRAY_SIZE(map->phys_map)) {
+   if (irq->dest_id > map->max_apic_id) {
*bitmap = 0;
} else {
*dst = &map->phys_map[irq->dest_id];
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 336ba51bb16e..8d811139d2b3 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -200,7 +200,7 @@ static inline int kvm_lapic_latched_init(struct kvm_vcpu *vcpu)
return lapic_in_kernel(vcpu) && test_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
 }

-static inline int kvm_apic_id(struct kvm_lapic *apic)
+static inline u32 kvm_apic_id(struct kvm_lapic *apic)
 {
return (kvm_lapic_get_reg(apic, APIC_ID) >> 24) & 0xff;
 }




--
best regards
yang


[PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

2016-07-07 Thread Radim Krčmář
x2APIC supports up to 2^32-1 LAPICs, but most guests in the coming years
will have slightly fewer VCPUs.  Dynamic size saves memory at the cost of
turning one constant into a variable.

apic_map mutex had to be moved before allocation to avoid races with cpu
hotplug.

Signed-off-by: Radim Krčmář 
---
 v2:
 * replaced size with max_apic_id to minimize chances of overflow [Andrew]
 * fixed allocation size [Paolo]

 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/lapic.c            | 18 +++++++++++++-----
 arch/x86/kvm/lapic.h            |  2 +-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3194b19b9c7b..643e3dffcd85 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -682,11 +682,12 @@ struct kvm_arch_memory_slot {
 struct kvm_apic_map {
struct rcu_head rcu;
u8 mode;
-   struct kvm_lapic *phys_map[256];
+   u32 max_apic_id;
union {
struct kvm_lapic *xapic_flat_map[8];
struct kvm_lapic *xapic_cluster_map[16][4];
};
+   struct kvm_lapic *phys_map[];
 };
 
 /* Hyper-V emulation context */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9880d03f533d..224fc1c5fcc6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -120,7 +120,7 @@ static inline bool kvm_apic_map_get_logical_dest(struct kvm_apic_map *map,
switch (map->mode) {
case KVM_APIC_MODE_X2APIC: {
u32 offset = (dest_id >> 16) * 16;
-   u32 max_apic_id = ARRAY_SIZE(map->phys_map) - 1;
+   u32 max_apic_id = map->max_apic_id;
 
if (offset <= max_apic_id) {
u8 cluster_size = min(max_apic_id - offset + 1, 16U);
@@ -152,14 +152,22 @@ static void recalculate_apic_map(struct kvm *kvm)
struct kvm_apic_map *new, *old = NULL;
struct kvm_vcpu *vcpu;
int i;
-
-   new = kzalloc(sizeof(struct kvm_apic_map), GFP_KERNEL);
+   u32 max_id = 255;
 
mutex_lock(&kvm->arch.apic_map_lock);
 
+   kvm_for_each_vcpu(i, vcpu, kvm)
+   if (kvm_apic_present(vcpu))
+   max_id = max(max_id, kvm_apic_id(vcpu->arch.apic));
+
+   new = kzalloc(sizeof(struct kvm_apic_map) +
+ sizeof(struct kvm_lapic *) * (max_id + 1), GFP_KERNEL);
+
if (!new)
goto out;
 
+   new->max_apic_id = max_id;
+
kvm_for_each_vcpu(i, vcpu, kvm) {
struct kvm_lapic *apic = vcpu->arch.apic;
struct kvm_lapic **cluster;
@@ -172,7 +180,7 @@ static void recalculate_apic_map(struct kvm *kvm)
aid = kvm_apic_id(apic);
ldr = kvm_lapic_get_reg(apic, APIC_LDR);
 
-   if (aid < ARRAY_SIZE(new->phys_map))
+   if (aid <= new->max_apic_id)
new->phys_map[aid] = apic;
 
if (apic_x2apic_mode(apic)) {
@@ -710,7 +718,7 @@ static inline bool kvm_apic_map_get_dest_lapic(struct kvm *kvm,
return false;
 
if (irq->dest_mode == APIC_DEST_PHYSICAL) {
-   if (irq->dest_id >= ARRAY_SIZE(map->phys_map)) {
+   if (irq->dest_id > map->max_apic_id) {
*bitmap = 0;
} else {
*dst = &map->phys_map[irq->dest_id];
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 336ba51bb16e..8d811139d2b3 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -200,7 +200,7 @@ static inline int kvm_lapic_latched_init(struct kvm_vcpu *vcpu)
return lapic_in_kernel(vcpu) && test_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
 }
 
-static inline int kvm_apic_id(struct kvm_lapic *apic)
+static inline u32 kvm_apic_id(struct kvm_lapic *apic)
 {
return (kvm_lapic_get_reg(apic, APIC_ID) >> 24) & 0xff;
 }
-- 
2.9.0
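
For readers following the struct change, this is a minimal userspace sketch of the pattern the patch relies on: a header that records max_apic_id, followed by a flexible array member sized at allocation time. It is not the kernel code. The names lapic_sketch, apic_map_sketch, apic_map_alloc and apic_map_lookup are made up for illustration, calloc() stands in for kzalloc(), and the overflow guard is an addition that the patch itself does not have.

```c
#include <stdint.h>
#include <stdlib.h>

struct lapic_sketch;	/* opaque stand-in for struct kvm_lapic */

struct apic_map_sketch {
	uint32_t max_apic_id;			/* highest valid index into phys_map */
	struct lapic_sketch *phys_map[];	/* flexible array: max_apic_id + 1 slots */
};

/* Allocate a zeroed map that can be indexed by 0..max_id inclusive. */
static struct apic_map_sketch *apic_map_alloc(uint32_t max_id)
{
	size_t slots = (size_t)max_id + 1;
	struct apic_map_sketch *map;

	/* Reject sizes that would wrap size_t (assumed extra check, not in the patch). */
	if (slots == 0 ||
	    slots > (SIZE_MAX - sizeof(*map)) / sizeof(map->phys_map[0]))
		return NULL;

	map = calloc(1, sizeof(*map) + slots * sizeof(map->phys_map[0]));
	if (map)
		map->max_apic_id = max_id;
	return map;
}

/* Bounds-checked lookup, mirroring the "dest_id > map->max_apic_id" test. */
static struct lapic_sketch *apic_map_lookup(const struct apic_map_sketch *map,
					    uint32_t id)
{
	return id > map->max_apic_id ? NULL : map->phys_map[id];
}
```

Storing the highest valid index rather than the element count keeps the bounds test a plain comparison against a u32, without a "+ 1" that could overflow, which is the motivation given in the v2 changelog.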


