Re: [PATCH] MAINTAINERS: Update my email address to @kernel.org

2019-07-16 Thread Will Deacon
On Tue, Jul 16, 2019 at 06:43:08PM +0100, Marc Zyngier wrote:
> I will soon lose access to my @arm.com email address, so let's
> update the MAINTAINERS file to point to my @kernel.org address,
> as well as .mailmap for good measure.
> 
> Note that my @arm.com address will still work, but someone else
> will be reading whatever is sent there. Don't say you didn't know!
> 
> Signed-off-by: Marc Zyngier 
> ---
> 
> Notes:
> Yes, I'm sending this from my ARM address. That's intentional.
> I'll probably send it as part of a pull request later in the
> cycle, but that's just so that people know what is coming.
> 
>  .mailmap| 1 +
>  MAINTAINERS | 8 
>  2 files changed, 5 insertions(+), 4 deletions(-)

Let's see if you manage a better job of getting people to use your new
address than I have:

Acked-by: Will Deacon 

Will


Re: BUG: KASAN: slab-out-of-bounds in kvm_pmu_get_canonical_pmc+0x48/0x78

2019-07-16 Thread Andrew Murray
On Tue, Jul 16, 2019 at 11:14:37PM +0800, Zenghui Yu wrote:
> 
> On 2019/7/16 23:05, Zenghui Yu wrote:
> > Hi folks,
> > 
> > Running the latest kernel with KASAN enabled, we will hit the following
> > KASAN BUG during guest's boot process.
> > 
> > I'm in commit 9637d517347e80ee2fe1c5d8ce45ba1b88d8b5cd.
> > 
> > Any problems in the chained PMU code? Or just a false positive?
> > 
> > ---8<---
> > 
> > [  654.706268]
> > ==
> > [  654.706280] BUG: KASAN: slab-out-of-bounds in
> > kvm_pmu_get_canonical_pmc+0x48/0x78
> > [  654.706286] Read of size 8 at addr 801d6c8fea38 by task
> > qemu-kvm/23268
> > 
> > [  654.706296] CPU: 2 PID: 23268 Comm: qemu-kvm Not tainted 5.2.0+ #178
> > [  654.706301] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.58
> > 10/24/2018
> > [  654.706305] Call trace:
> > [  654.706311]  dump_backtrace+0x0/0x238
> > [  654.706317]  show_stack+0x24/0x30
> > [  654.706325]  dump_stack+0xe0/0x134
> > [  654.706332]  print_address_description+0x80/0x408
> > [  654.706338]  __kasan_report+0x164/0x1a0
> > [  654.706343]  kasan_report+0xc/0x18
> > [  654.706348]  __asan_load8+0x88/0xb0
> > [  654.706353]  kvm_pmu_get_canonical_pmc+0x48/0x78
> 
> I noticed that we will use "pmc->idx" and the "chained" bitmap to
> determine if the pmc is chained, in kvm_pmu_pmc_is_chained().
> 
> Should we initialize the idx and the bitmap appropriately before
> doing kvm_pmu_stop_counter()?  Like:

Hi Zenghui,

Thanks for spotting this and investigating - I'll make sure to use KASAN
in the future when testing...

> 
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 3dd8238..cf3119a 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -224,12 +224,12 @@ void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu)
>   int i;
>   struct kvm_pmu *pmu = &vcpu->arch.pmu;
> 
> + bitmap_zero(vcpu->arch.pmu.chained, ARMV8_PMU_MAX_COUNTER_PAIRS);
> +
>   for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++) {
> - kvm_pmu_stop_counter(vcpu, &pmu->pmc[i]);
>   pmu->pmc[i].idx = i;
> + kvm_pmu_stop_counter(vcpu, &pmu->pmc[i]);
>   }
> -
> - bitmap_zero(vcpu->arch.pmu.chained, ARMV8_PMU_MAX_COUNTER_PAIRS);
>  }

We have to be a little careful here, as the vcpu may be reset after use.
Upon resetting we must ensure that any existing perf_events are released -
this is why kvm_pmu_stop_counter is called before bitmap_zero (as
kvm_pmu_stop_counter relies on kvm_pmu_pmc_is_chained).

(For example, by clearing the bitmap before stopping the counters, we will
attempt to release the perf event for both pmc's in a chained pair, whereas
we should only release the canonical pmc. It's actually OK right now, as the
non-canonical pmc's perf_event will be NULL - but who knows whether this
will hold true in the future. The code makes the assumption that the
non-canonical perf event isn't touched on a chained pair).

The KASAN bug gets fixed by moving the assignment of idx before 
kvm_pmu_stop_counter. Therefore I'd suggest you drop the bitmap_zero hunks.

Can you send a patch with just the idx assignment hunk please?

Thanks,

Andrew Murray
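
The ordering issue discussed above can be sketched with a small user-space
model. This is only an illustration, assuming that the owning vcpu is
recovered from a pmc by stepping back pmc->idx entries to pmc[0] (which is
what kvm_pmc_to_vcpu() appeared to do at the time). The model_* structures
and functions are hypothetical stand-ins, not the kernel's definitions, and
the field sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_COUNTERS 32	/* stands in for ARMV8_PMU_MAX_COUNTERS */

/* Hypothetical, simplified stand-ins for kvm_pmc, kvm_pmu and kvm_vcpu. */
struct model_pmc {
	uint8_t idx;		/* position of this counter in pmc[] */
	void *perf_event;
};

struct model_pmu {
	struct model_pmc pmc[MODEL_MAX_COUNTERS];
	unsigned long chained;	/* one bit per counter pair */
};

struct model_vcpu {
	char other_state[128];	/* placeholder for unrelated vcpu fields */
	struct model_pmu pmu;
};

/*
 * Derive the owning vcpu from a pmc the way an idx-based lookup would:
 * step back idx entries to reach pmc[0], then subtract the offset of
 * pmc[0] inside the vcpu. Returns how many bytes the derived address is
 * away from the real vcpu; 0 means the lookup found the right object.
 * Integer arithmetic is used so the model never forms a bad pointer.
 */
static ptrdiff_t model_lookup_error(const struct model_vcpu *vcpu,
				    const struct model_pmc *pmc)
{
	uintptr_t first = (uintptr_t)pmc - (uintptr_t)pmc->idx * sizeof(*pmc);
	uintptr_t derived = first - offsetof(struct model_vcpu, pmu)
				  - offsetof(struct model_pmu, pmc);

	return (ptrdiff_t)(derived - (uintptr_t)vcpu);
}

/* Reset order from the report: the counter is used before idx is set. */
static ptrdiff_t model_reset_old_order(struct model_vcpu *vcpu, int i)
{
	ptrdiff_t err = model_lookup_error(vcpu, &vcpu->pmu.pmc[i]);

	vcpu->pmu.pmc[i].idx = (uint8_t)i;
	return err;
}

/* Suggested order: assign idx first, then use the counter. */
static ptrdiff_t model_reset_fixed_order(struct model_vcpu *vcpu, int i)
{
	vcpu->pmu.pmc[i].idx = (uint8_t)i;
	return model_lookup_error(vcpu, &vcpu->pmu.pmc[i]);
}
```

On a freshly zeroed model_vcpu, model_reset_old_order() for counter i
derives an address that is i * sizeof(struct model_pmc) bytes beyond the
real vcpu, so any field read through it (such as the chained bitmap) can
land past the end of the allocation. model_reset_fixed_order() always
derives the real vcpu.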

> 
>  /**
> 
> 
> Thanks,
> zenghui
> 
> > [  654.706358]  kvm_pmu_stop_counter+0x28/0x118
> > [  654.706363]  kvm_pmu_vcpu_reset+0x60/0xa8
> > [  654.706369]  kvm_reset_vcpu+0x30/0x4d8
> > [  654.706376]  kvm_arch_vcpu_ioctl+0xa04/0xc18
> > [  654.706381]  kvm_vcpu_ioctl+0x17c/0xde8
> > [  654.706387]  do_vfs_ioctl+0x150/0xaf8
> > [  654.706392]  ksys_ioctl+0x84/0xb8
> > [  654.706397]  __arm64_sys_ioctl+0x4c/0x60
> > [  654.706403]  el0_svc_common.constprop.0+0xb4/0x208
> > [  654.706409]  el0_svc_handler+0x3c/0xa8
> > [  654.706414]  el0_svc+0x8/0xc
> > 
> > [  654.706422] Allocated by task 23268:
> > [  654.706429]  __kasan_kmalloc.isra.0+0xd0/0x180
> > [  654.706435]  kasan_slab_alloc+0x14/0x20
> > [  654.706440]  kmem_cache_alloc+0x17c/0x4a8
> > [  654.706445]  kvm_arch_vcpu_create+0xa0/0x130
> > [  654.706451]  kvm_vm_ioctl+0x844/0x1218
> > [  654.706456]  do_vfs_ioctl+0x150/0xaf8
> > [  654.706461]  ksys_ioctl+0x84/0xb8
> > [  654.706466]  __arm64_sys_ioctl+0x4c/0x60
> > [  654.706472]  el0_svc_common.constprop.0+0xb4/0x208
> > [  654.706478]  el0_svc_handler+0x3c/0xa8
> > [  654.706482]  el0_svc+0x8/0xc
> > 
> > [  654.706490] Freed by task 0:
> > [  654.706493] (stack is not available)
> > 
> > [  654.706501] The buggy address belongs to the object at 801d6c8fc010
> >   which belongs to the cache kvm_vcpu of size 10784
> > [  654.706507] The buggy address is located 8 bytes to the right of
> >   10784-byte region [801d6c8fc010, 801d6c8fea30)
> > [  654.706510] The buggy address belongs to the page:
> > [  654.706516] page:7e0075b23f00 refcount:1 mapcount:0
> > mapping:801db257e480 index:0x0 compound_mapcount: 0
> > [  654.706524] flags: 0xe010200(slab|head)
> > [  654.706532] raw: 0e010200 801db2586ee0 801db2586e

[PATCH] MAINTAINERS: Update my email address to @kernel.org

2019-07-16 Thread Marc Zyngier
I will soon lose access to my @arm.com email address, so let's
update the MAINTAINERS file to point to my @kernel.org address,
as well as .mailmap for good measure.

Note that my @arm.com address will still work, but someone else
will be reading whatever is sent there. Don't say you didn't know!

Signed-off-by: Marc Zyngier 
---

Notes:
Yes, I'm sending this from my ARM address. That's intentional.
I'll probably send it as part of a pull request later in the
cycle, but that's just so that people know what is coming.

 .mailmap| 1 +
 MAINTAINERS | 8 
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/.mailmap b/.mailmap
index 0fef932de3db..23cfed2e015c 100644
--- a/.mailmap
+++ b/.mailmap
@@ -132,6 +132,7 @@ Linus Lüssing  

 Li Yang  
 Li Yang  
 Maciej W. Rozycki  
+Marc Zyngier  
 Marcin Nowakowski  
 Mark Brown 
 Mark Yao  
diff --git a/MAINTAINERS b/MAINTAINERS
index 677ef41cb012..eff3dca4869d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1161,7 +1161,7 @@ F:include/uapi/linux/if_arcnet.h
 
 ARM ARCHITECTED TIMER DRIVER
 M: Mark Rutland 
-M: Marc Zyngier 
+M: Marc Zyngier 
 L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
 S: Maintained
 F: arch/arm/include/asm/arch_timer.h
@@ -8303,7 +8303,7 @@ S:Obsolete
 F: include/uapi/linux/ipx.h
 
 IRQ DOMAINS (IRQ NUMBER MAPPING LIBRARY)
-M: Marc Zyngier 
+M: Marc Zyngier 
 S: Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
 F: Documentation/IRQ-domain.txt
@@ -8321,7 +8321,7 @@ F:kernel/irq/
 IRQCHIP DRIVERS
 M: Thomas Gleixner 
 M: Jason Cooper 
-M: Marc Zyngier 
+M: Marc Zyngier 
 L: linux-ker...@vger.kernel.org
 S: Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
@@ -8633,7 +8633,7 @@ F:arch/x86/include/asm/svm.h
 F: arch/x86/kvm/svm.c
 
 KERNEL VIRTUAL MACHINE FOR ARM/ARM64 (KVM/arm, KVM/arm64)
-M: Marc Zyngier 
+M: Marc Zyngier 
 R: James Morse 
 R: Julien Thierry 
 R: Suzuki K Pouloze 
-- 
2.20.1



Re: [PATCH 53/59] KVM: arm64: nv: Implement maintenance interrupt forwarding

2019-07-16 Thread Alexandru Elisei
On 6/21/19 10:38 AM, Marc Zyngier wrote:
> When we take a maintenance interrupt, we need to decide whether
> it is generated on an action from the guest, or if it is something
> that needs to be forwarded to the guest hypervisor.
>
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/nested.c|  2 +-
>  virt/kvm/arm/vgic/vgic-init.c  | 30 ++
>  virt/kvm/arm/vgic/vgic-v3-nested.c | 25 +
>  3 files changed, 52 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index df2db9ab7cfb..ab61f0f30ee6 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -545,7 +545,7 @@ bool vgic_state_is_nested(struct kvm_vcpu *vcpu)
>   bool imo = __vcpu_sys_reg(vcpu, HCR_EL2) & HCR_IMO;
>   bool fmo = __vcpu_sys_reg(vcpu, HCR_EL2) & HCR_FMO;
>  
> - WARN(imo != fmo, "Separate virtual IRQ/FIQ settings not supported\n");
> + WARN_ONCE(imo != fmo, "Separate virtual IRQ/FIQ settings not 
> supported\n");
>  
>   return nested_virt_in_use(vcpu) && imo && fmo && !is_hyp_ctxt(vcpu);
>  }
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 3bdb31eaed64..ec54bc8d5126 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -17,9 +17,11 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "vgic.h"
>  
>  /*
> @@ -240,6 +242,16 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
>   if (!irqchip_in_kernel(vcpu->kvm))
>   return 0;
>  
> + if (nested_virt_in_use(vcpu)) {
> + /* FIXME: remove this hack */
> + if (vcpu->kvm->arch.vgic.maint_irq == 0)
> + vcpu->kvm->arch.vgic.maint_irq = 
> kvm_vgic_global_state.maint_irq;
> + ret = kvm_vgic_set_owner(vcpu, vcpu->kvm->arch.vgic.maint_irq,
> +  vcpu);
> + if (ret)
> + return ret;
> + }
> +
>   /*
>* If we are creating a VCPU with a GICv3 we must also register the
>* KVM io device for the redistributor that belongs to this VCPU.
> @@ -455,12 +467,23 @@ static int vgic_init_cpu_dying(unsigned int cpu)
>  
>  static irqreturn_t vgic_maintenance_handler(int irq, void *data)
>  {
> + struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)data;
> +
>   /*
>* We cannot rely on the vgic maintenance interrupt to be
>* delivered synchronously. This means we can only use it to
>* exit the VM, and we perform the handling of EOIed
>* interrupts on the exit path (see vgic_fold_lr_state).
>*/
> +
> + /* If not nested, deactivate */
> + if (!vcpu || !vgic_state_is_nested(vcpu)) {
> + irq_set_irqchip_state(irq, IRQCHIP_STATE_ACTIVE, false);
> + return IRQ_HANDLED;
> + }
> +
> + /* Assume nested from now */
> + vgic_v3_handle_nested_maint_irq(vcpu);
>   return IRQ_HANDLED;
>  }
>  
> @@ -531,6 +554,13 @@ int kvm_vgic_hyp_init(void)
>   return ret;
>   }
>  
> + ret = irq_set_vcpu_affinity(kvm_vgic_global_state.maint_irq,
> + kvm_get_running_vcpus());
> + if (ret) {
> + kvm_err("Error setting vcpu affinity\n");
> + goto out_free_irq;
> + }
> +
>   ret = cpuhp_setup_state(CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
>   "kvm/arm/vgic:starting",
>   vgic_init_cpu_starting, vgic_init_cpu_dying);
> diff --git a/virt/kvm/arm/vgic/vgic-v3-nested.c 
> b/virt/kvm/arm/vgic/vgic-v3-nested.c
> index c917d49e4a14..7c5f82ae68e0 100644
> --- a/virt/kvm/arm/vgic/vgic-v3-nested.c
> +++ b/virt/kvm/arm/vgic/vgic-v3-nested.c
> @@ -172,10 +172,20 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
>  void vgic_v3_load_nested(struct kvm_vcpu *vcpu)
>  {
>   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_irq *irq;
> + unsigned long flags;
>  
>   vgic_cpu->shadow_vgic_v3 = vgic_cpu->nested_vgic_v3;
>   vgic_v3_create_shadow_lr(vcpu);
>   __vgic_v3_restore_state(vcpu_shadow_if(vcpu));
> +
> + irq = vgic_get_irq(vcpu->kvm, vcpu, vcpu->kvm->arch.vgic.maint_irq);
> + raw_spin_lock_irqsave(&irq->irq_lock, flags);
> + if (irq->line_level || irq->active)
> + irq_set_irqchip_state(kvm_vgic_global_state.maint_irq,
> +   IRQCHIP_STATE_ACTIVE, true);
> + raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
> + vgic_put_irq(vcpu->kvm, irq);
>  }
>  
>  void vgic_v3_put_nested(struct kvm_vcpu *vcpu)
> @@ -190,11 +200,14 @@ void vgic_v3_put_nested(struct kvm_vcpu *vcpu)
>*/
>   vgic_v3_fixup_shadow_lr_state(vcpu);
>   vgic_cpu->nested_vgic_v3 = vgic_cpu->shadow_vgic_v3;
> + irq_set_irqchip_state(kvm_vgic_global_state.maint_irq,
> +   IRQCHIP

Re: BUG: KASAN: slab-out-of-bounds in kvm_pmu_get_canonical_pmc+0x48/0x78

2019-07-16 Thread Zenghui Yu


On 2019/7/16 23:05, Zenghui Yu wrote:

Hi folks,

Running the latest kernel with KASAN enabled, we will hit the following
KASAN BUG during guest's boot process.

I'm in commit 9637d517347e80ee2fe1c5d8ce45ba1b88d8b5cd.

Any problems in the chained PMU code? Or just a false positive?

---8<---

[  654.706268] ==
[  654.706280] BUG: KASAN: slab-out-of-bounds in kvm_pmu_get_canonical_pmc+0x48/0x78
[  654.706286] Read of size 8 at addr 801d6c8fea38 by task qemu-kvm/23268

[  654.706296] CPU: 2 PID: 23268 Comm: qemu-kvm Not tainted 5.2.0+ #178
[  654.706301] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.58 10/24/2018

[  654.706305] Call trace:
[  654.706311]  dump_backtrace+0x0/0x238
[  654.706317]  show_stack+0x24/0x30
[  654.706325]  dump_stack+0xe0/0x134
[  654.706332]  print_address_description+0x80/0x408
[  654.706338]  __kasan_report+0x164/0x1a0
[  654.706343]  kasan_report+0xc/0x18
[  654.706348]  __asan_load8+0x88/0xb0
[  654.706353]  kvm_pmu_get_canonical_pmc+0x48/0x78


I noticed that we will use "pmc->idx" and the "chained" bitmap to
determine if the pmc is chained, in kvm_pmu_pmc_is_chained().

Should we initialize the idx and the bitmap appropriately before
doing kvm_pmu_stop_counter()?  Like:


diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 3dd8238..cf3119a 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -224,12 +224,12 @@ void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu)
 	int i;
 	struct kvm_pmu *pmu = &vcpu->arch.pmu;
 
+	bitmap_zero(vcpu->arch.pmu.chained, ARMV8_PMU_MAX_COUNTER_PAIRS);
+
 	for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++) {
-		kvm_pmu_stop_counter(vcpu, &pmu->pmc[i]);
 		pmu->pmc[i].idx = i;
+		kvm_pmu_stop_counter(vcpu, &pmu->pmc[i]);
 	}
-
-	bitmap_zero(vcpu->arch.pmu.chained, ARMV8_PMU_MAX_COUNTER_PAIRS);
 }

 /**


Thanks,
zenghui


[  654.706358]  kvm_pmu_stop_counter+0x28/0x118
[  654.706363]  kvm_pmu_vcpu_reset+0x60/0xa8
[  654.706369]  kvm_reset_vcpu+0x30/0x4d8
[  654.706376]  kvm_arch_vcpu_ioctl+0xa04/0xc18
[  654.706381]  kvm_vcpu_ioctl+0x17c/0xde8
[  654.706387]  do_vfs_ioctl+0x150/0xaf8
[  654.706392]  ksys_ioctl+0x84/0xb8
[  654.706397]  __arm64_sys_ioctl+0x4c/0x60
[  654.706403]  el0_svc_common.constprop.0+0xb4/0x208
[  654.706409]  el0_svc_handler+0x3c/0xa8
[  654.706414]  el0_svc+0x8/0xc

[  654.706422] Allocated by task 23268:
[  654.706429]  __kasan_kmalloc.isra.0+0xd0/0x180
[  654.706435]  kasan_slab_alloc+0x14/0x20
[  654.706440]  kmem_cache_alloc+0x17c/0x4a8
[  654.706445]  kvm_arch_vcpu_create+0xa0/0x130
[  654.706451]  kvm_vm_ioctl+0x844/0x1218
[  654.706456]  do_vfs_ioctl+0x150/0xaf8
[  654.706461]  ksys_ioctl+0x84/0xb8
[  654.706466]  __arm64_sys_ioctl+0x4c/0x60
[  654.706472]  el0_svc_common.constprop.0+0xb4/0x208
[  654.706478]  el0_svc_handler+0x3c/0xa8
[  654.706482]  el0_svc+0x8/0xc

[  654.706490] Freed by task 0:
[  654.706493] (stack is not available)

[  654.706501] The buggy address belongs to the object at 801d6c8fc010
  which belongs to the cache kvm_vcpu of size 10784
[  654.706507] The buggy address is located 8 bytes to the right of
  10784-byte region [801d6c8fc010, 801d6c8fea30)
[  654.706510] The buggy address belongs to the page:
[  654.706516] page:7e0075b23f00 refcount:1 mapcount:0 mapping:801db257e480 index:0x0 compound_mapcount: 0
[  654.706524] flags: 0xe010200(slab|head)
[  654.706532] raw: 0e010200 801db2586ee0 801db2586ee0 801db257e480
[  654.706538] raw:  00010001 0001 
[  654.706542] page dumped because: kasan: bad access detected

[  654.706549] Memory state around the buggy address:
[  654.706554]  801d6c8fe900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  654.706560]  801d6c8fe980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  654.706565] >801d6c8fea00: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
[  654.706568] ^
[  654.706573]  801d6c8fea80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  654.706578]  801d6c8feb00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  654.706582] ==


___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


BUG: KASAN: slab-out-of-bounds in kvm_pmu_get_canonical_pmc+0x48/0x78

2019-07-16 Thread Zenghui Yu

Hi folks,

Running the latest kernel with KASAN enabled, we will hit the following
KASAN BUG during guest's boot process.

I'm in commit 9637d517347e80ee2fe1c5d8ce45ba1b88d8b5cd.

Any problems in the chained PMU code? Or just a false positive?

---8<---

[  654.706268] ==
[  654.706280] BUG: KASAN: slab-out-of-bounds in kvm_pmu_get_canonical_pmc+0x48/0x78
[  654.706286] Read of size 8 at addr 801d6c8fea38 by task qemu-kvm/23268

[  654.706296] CPU: 2 PID: 23268 Comm: qemu-kvm Not tainted 5.2.0+ #178
[  654.706301] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.58 10/24/2018
[  654.706305] Call trace:
[  654.706311]  dump_backtrace+0x0/0x238
[  654.706317]  show_stack+0x24/0x30
[  654.706325]  dump_stack+0xe0/0x134
[  654.706332]  print_address_description+0x80/0x408
[  654.706338]  __kasan_report+0x164/0x1a0
[  654.706343]  kasan_report+0xc/0x18
[  654.706348]  __asan_load8+0x88/0xb0
[  654.706353]  kvm_pmu_get_canonical_pmc+0x48/0x78
[  654.706358]  kvm_pmu_stop_counter+0x28/0x118
[  654.706363]  kvm_pmu_vcpu_reset+0x60/0xa8
[  654.706369]  kvm_reset_vcpu+0x30/0x4d8
[  654.706376]  kvm_arch_vcpu_ioctl+0xa04/0xc18
[  654.706381]  kvm_vcpu_ioctl+0x17c/0xde8
[  654.706387]  do_vfs_ioctl+0x150/0xaf8
[  654.706392]  ksys_ioctl+0x84/0xb8
[  654.706397]  __arm64_sys_ioctl+0x4c/0x60
[  654.706403]  el0_svc_common.constprop.0+0xb4/0x208
[  654.706409]  el0_svc_handler+0x3c/0xa8
[  654.706414]  el0_svc+0x8/0xc

[  654.706422] Allocated by task 23268:
[  654.706429]  __kasan_kmalloc.isra.0+0xd0/0x180
[  654.706435]  kasan_slab_alloc+0x14/0x20
[  654.706440]  kmem_cache_alloc+0x17c/0x4a8
[  654.706445]  kvm_arch_vcpu_create+0xa0/0x130
[  654.706451]  kvm_vm_ioctl+0x844/0x1218
[  654.706456]  do_vfs_ioctl+0x150/0xaf8
[  654.706461]  ksys_ioctl+0x84/0xb8
[  654.706466]  __arm64_sys_ioctl+0x4c/0x60
[  654.706472]  el0_svc_common.constprop.0+0xb4/0x208
[  654.706478]  el0_svc_handler+0x3c/0xa8
[  654.706482]  el0_svc+0x8/0xc

[  654.706490] Freed by task 0:
[  654.706493] (stack is not available)

[  654.706501] The buggy address belongs to the object at 801d6c8fc010
 which belongs to the cache kvm_vcpu of size 10784
[  654.706507] The buggy address is located 8 bytes to the right of
 10784-byte region [801d6c8fc010, 801d6c8fea30)
[  654.706510] The buggy address belongs to the page:
[  654.706516] page:7e0075b23f00 refcount:1 mapcount:0 mapping:801db257e480 index:0x0 compound_mapcount: 0
[  654.706524] flags: 0xe010200(slab|head)
[  654.706532] raw: 0e010200 801db2586ee0 801db2586ee0 801db257e480
[  654.706538] raw:  00010001 0001 
[  654.706542] page dumped because: kasan: bad access detected

[  654.706549] Memory state around the buggy address:
[  654.706554]  801d6c8fe900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  654.706560]  801d6c8fe980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  654.706565] >801d6c8fea00: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
[  654.706568] ^
[  654.706573]  801d6c8fea80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  654.706578]  801d6c8feb00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  654.706582] ==



Thanks,
zenghui
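
The numbers in the report can be cross-checked with a little arithmetic.
The sketch below assumes generic KASAN, where one shadow byte describes 8
bytes of memory, and uses the addresses exactly as they appear in the log
above. The model_* helper names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_KASAN_GRANULE 8	/* bytes per shadow byte, assuming generic KASAN */

/* First address past an object that starts at `start` and is `size` bytes. */
static uint64_t model_object_end(uint64_t start, uint64_t size)
{
	return start + size;
}

/* How far `addr` lies beyond the end of the object. */
static uint64_t model_bytes_past_end(uint64_t start, uint64_t size,
				     uint64_t addr)
{
	return addr - model_object_end(start, size);
}

/* Position of the shadow byte for `addr` within a row starting at row_base. */
static unsigned int model_shadow_index(uint64_t row_base, uint64_t addr)
{
	return (unsigned int)((addr - row_base) / MODEL_KASAN_GRANULE);
}
```

With start 0x801d6c8fc010 and size 10784 (0x2a20), the object ends at
0x801d6c8fea30, and the faulting address 0x801d6c8fea38 is 8 bytes beyond
it, matching "8 bytes to the right". In the row starting at 801d6c8fea00,
the end of the object falls on shadow byte 6 (the first fc) and the bad
read on shadow byte 7, which agrees with the six 00 bytes followed by fc in
the memory state dump.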



Re: [PATCH 50/59] KVM: arm64: nv: Nested GICv3 Support

2019-07-16 Thread Alexandru Elisei
On 6/21/19 10:38 AM, Marc Zyngier wrote:
> From: Jintack Lim 
>
> When entering a nested VM, we set up the hypervisor control interface
> based on what the guest hypervisor has set. Especially, we investigate
> each list register written by the guest hypervisor whether HW bit is
> set.  If so, we translate hw irq number from the guest's point of view
> to the real hardware irq number if there is a mapping.
>
> Signed-off-by: Jintack Lim 
> [Rewritten to support GICv3 instead of GICv2]
> Signed-off-by: Marc Zyngier 
> [Redesigned execution flow around vcpu load/put]
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm/include/asm/kvm_emulate.h |   1 +
>  arch/arm/include/asm/kvm_host.h|   6 +-
>  arch/arm64/include/asm/kvm_host.h  |   5 +-
>  arch/arm64/kvm/Makefile|   1 +
>  arch/arm64/kvm/nested.c|  10 ++
>  arch/arm64/kvm/sys_regs.c  | 178 -
>  include/kvm/arm_vgic.h |  18 +++
>  virt/kvm/arm/arm.c |   7 +-
>  virt/kvm/arm/vgic/vgic-v3-nested.c | 177 
>  virt/kvm/arm/vgic/vgic-v3.c|  28 +
>  virt/kvm/arm/vgic/vgic.c   |  32 ++
>  11 files changed, 456 insertions(+), 7 deletions(-)
>  create mode 100644 virt/kvm/arm/vgic/vgic-v3-nested.c
>
> diff --git a/arch/arm/include/asm/kvm_emulate.h 
> b/arch/arm/include/asm/kvm_emulate.h
> index 865ce545b465..a53f19041e16 100644
> --- a/arch/arm/include/asm/kvm_emulate.h
> +++ b/arch/arm/include/asm/kvm_emulate.h
> @@ -334,5 +334,6 @@ static inline unsigned long 
> vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
>  static inline void vcpu_ptrauth_setup_lazy(struct kvm_vcpu *vcpu) {}
>  
>  static inline bool is_hyp_ctxt(struct kvm_vcpu *vcpu) { return false; }
> +static inline int kvm_inject_nested_irq(struct kvm_vcpu *vcpu) { BUG(); }
>  
>  #endif /* __ARM_KVM_EMULATE_H__ */
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index cc761610e41e..d6923ed55796 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -35,10 +35,12 @@
>  #define KVM_MAX_VCPUS VGIC_V2_MAX_CPUS
>  #endif
>  
> +/* KVM_REQ_GUEST_HYP_IRQ_PENDING is actually unused */
>  #define KVM_REQ_SLEEP \
>   KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> -#define KVM_REQ_IRQ_PENDING  KVM_ARCH_REQ(1)
> -#define KVM_REQ_VCPU_RESET   KVM_ARCH_REQ(2)
> +#define KVM_REQ_IRQ_PENDING  KVM_ARCH_REQ(1)
> +#define KVM_REQ_VCPU_RESET   KVM_ARCH_REQ(2)
> +#define KVM_REQ_GUEST_HYP_IRQ_PENDINGKVM_ARCH_REQ(3)
>  
>  DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
>  
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index e0fe9acb46bf..e2e44cc650bf 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -53,8 +53,9 @@
>  
>  #define KVM_REQ_SLEEP \
>   KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> -#define KVM_REQ_IRQ_PENDING  KVM_ARCH_REQ(1)
> -#define KVM_REQ_VCPU_RESET   KVM_ARCH_REQ(2)
> +#define KVM_REQ_IRQ_PENDING  KVM_ARCH_REQ(1)
> +#define KVM_REQ_VCPU_RESET   KVM_ARCH_REQ(2)
> +#define KVM_REQ_GUEST_HYP_IRQ_PENDINGKVM_ARCH_REQ(3)
>  
>  DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
>  
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index f11bd8b0d837..045a8f18f465 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -38,3 +38,4 @@ kvm-$(CONFIG_KVM_ARM_PMU) += $(KVM)/arm/pmu.o
>  
>  kvm-$(CONFIG_KVM_ARM_HOST) += nested.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += emulate-nested.o
> +kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-v3-nested.o
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 214d59019935..df2db9ab7cfb 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -539,3 +539,13 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
>   kvm->arch.nested_mmus_size = 0;
>   kvm_free_stage2_pgd(&kvm->arch.mmu);
>  }
> +
> +bool vgic_state_is_nested(struct kvm_vcpu *vcpu)
> +{
> + bool imo = __vcpu_sys_reg(vcpu, HCR_EL2) & HCR_IMO;
> + bool fmo = __vcpu_sys_reg(vcpu, HCR_EL2) & HCR_FMO;
> +
> + WARN(imo != fmo, "Separate virtual IRQ/FIQ settings not supported\n");
> +
> + return nested_virt_in_use(vcpu) && imo && fmo && !is_hyp_ctxt(vcpu);
> +}
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 2031a59fcf49..ba3bcd29c02d 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -26,6 +26,8 @@
>  #include 
>  #include 
>  
> +#include 
> +
>  #include 
>  #include 
>  #include 
> @@ -505,6 +507,18 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
>   return true;
>  }
>  
> +/*
> + * The architecture says that non-secure write accesses to this register from
> + * EL1 are trapped to EL2, if either:
> + *  - HCR_EL2.FMO==1, or
> + *  - HCR_EL2.IMO==1
> + */
> +stati

Re: [RFC] Add virtual SDEI support in qemu

2019-07-16 Thread Dave Martin
On Mon, Jul 15, 2019 at 03:44:46PM +0100, Mark Rutland wrote:
> On Mon, Jul 15, 2019 at 03:26:39PM +0100, James Morse wrote:
> > On 15/07/2019 14:48, Mark Rutland wrote:
> > > On Mon, Jul 15, 2019 at 02:41:00PM +0100, Dave Martin wrote:
> > >> One option (suggested to me by James Morse) would be to allow userspace
> > > >> to disable the in-kernel PSCI implementation and provide its own
> > >> PSCI to the guest via SMC -- in which case userspace that wants to
> > >> implement SDEI would have to implement PSCI as well.
> > > 
> > > I think this would be the best approach, since it puts userspace in
> > > charge of everything.
> > > 
> > > However, this interacts poorly with FW-based mitigations that we
> > > implement in hyp. I suspect we'd probably need a mechanism to delegate
> > > that responsibility back to the kernel, and figure out if that has any
> > > > interaction with things that got punted to userspace...
> > 
> > This has come up before:
> > https://lore.kernel.org/r/59c139d0.3040...@arm.com
> > 
> > > I agree Qemu should opt-in to this, it needs to be a feature that is enabled.
> > 
> > I had an early version of something like this for testing SDEI before
> > there was firmware available. The review feedback from Christoffer was
> > that it should include HVC and SMC, their immediates, and shouldn't be
> > tied to SMC-CC ranges.
> > 
> > I think this should be a catch-all as Heyi describes to deliver
> > 'unhandled SMC/HVC' to user-space as hypercall exits. We should
> > include the immediate in the struct.
> > 
> > We can allow Qemu to disable the in-kernel PSCI implementation, which
> > would let it be done in user-space via this catch-all mechanism. (PSCI
> > in user-space has come up on another thread recently). The in-kernel
> > PSCI needs to be default-on for backwards compatibility.
> > 
> > As Mark points out, the piece that's left is the 'arch workaround'
> > stuff. We always need to handle these in the kernel. I don't think
> > these should be routed-back, they should be un-obtainable by
> > user-space.
> 
> Sure; I meant that those should be handled in the kernel rather than
> going to host userspace and back.
> 
> What I was suggesting was that userspace would opt into taking ownership of
> all HVC calls, then explicitly opt-in to the kernel handling specific
> (sets of) calls.

The most logical thing to do would be to have userspace handle all
calls, but add an ioctl to forward a call to KVM.  This puts userspace
in charge of the SMCCC interface, with KVM handling only those things
that userspace can't do for itself, on request.

If the performance overhead is unacceptable for certain calls, we could
have a way to delegate specific function IDs to KVM.  I suspect that
will be the exception rather than the rule.

> There are probably issues with that, but I suspect defining "all
> unhandled calls" will be problematic otherwise.

Agreed: the set of calls not handled by KVM will mutate over time.

Cheers
---Dave
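
As a rough sketch of the routing idea above: userspace receives every call
and decides per function ID whether to handle it itself or to hand it back
to KVM. The field layout and the three example function IDs follow the Arm
SMC Calling Convention, PSCI and SDEI specifications; the enum, the helper
names and the policy itself are hypothetical, and the forwarding ioctl is
only a proposal in this thread, so it is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SMCCC function ID fields. */
#define MODEL_SMCCC_FAST_CALL		(UINT32_C(1) << 31)
#define MODEL_SMCCC_OWNER_SHIFT		24
#define MODEL_SMCCC_OWNER_MASK		UINT32_C(0x3f)
#define MODEL_SMCCC_OWNER_ARCH		0	/* e.g. arch workarounds */
#define MODEL_SMCCC_OWNER_STANDARD	4	/* e.g. PSCI, SDEI */

/* Hypothetical outcome of the userspace dispatcher. */
enum model_route {
	MODEL_ROUTE_USERSPACE,		/* emulate the call in the VMM */
	MODEL_ROUTE_FORWARD_TO_KVM,	/* hand the call back to the kernel */
};

/* Extract the owning entity number from a function ID. */
static uint32_t model_smccc_owner(uint32_t func_id)
{
	return (func_id >> MODEL_SMCCC_OWNER_SHIFT) & MODEL_SMCCC_OWNER_MASK;
}

/*
 * Illustrative policy: fast calls owned by the architecture (such as the
 * workaround calls) are things userspace cannot do for itself, so they
 * are forwarded. Everything else stays in userspace.
 */
static enum model_route model_route_call(uint32_t func_id)
{
	bool fast = (func_id & MODEL_SMCCC_FAST_CALL) != 0;

	if (fast && model_smccc_owner(func_id) == MODEL_SMCCC_OWNER_ARCH)
		return MODEL_ROUTE_FORWARD_TO_KVM;

	return MODEL_ROUTE_USERSPACE;
}
```

Under this policy 0x80008000 (ARM_SMCCC_ARCH_WORKAROUND_1) would be
forwarded, while 0x84000003 (PSCI CPU_ON) and 0xC4000020 (SDEI_VERSION)
would be handled in userspace. Delegating specific function IDs to KVM for
performance would amount to adding further cases to model_route_call().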


Re: [RFC] Add virtual SDEI support in qemu

2019-07-16 Thread Marc Zyngier
On 16/07/2019 09:30, Dave Martin wrote:
> On Mon, Jul 15, 2019 at 02:48:49PM +0100, Mark Rutland wrote:
>> On Mon, Jul 15, 2019 at 02:41:00PM +0100, Dave Martin wrote:
> 
> [...]
> 
>>> So long as KVM_EXIT_HYPERCALL reports sufficient information so that
>>> userspace can identify the cause as an SMC and retrieve the SMC
>>> immediate field, this seems feasible.
>>>
>>> For its own SMCCC APIs, KVM exclusively uses HVC, so rerouting SMC to
>>> userspace shouldn't conflict.
>>
>> Be _very_ careful here! In systems without EL3 (and without NV), SMC
>> UNDEFs rather than trapping to EL2. Given that, we shouldn't build a
>> hypervisor ABI that depends on SMC.
> 
> Good point.  I was hoping that was all ancient history by now, but if
> not...

Unfortunately, XGene-1 is still a thing...

M.
-- 
Jazz is not dead. It just smells funny...


Re: [RFC PATCH v2 0/3] Support CPU hotplug for ARM64

2019-07-16 Thread Marc Zyngier
Hi Jia,

On 16/07/2019 08:59, Jia He wrote:
> Hi Marc
> 
> On 2019/7/10 17:15, Marc Zyngier wrote:
>> On 09/07/2019 20:06, Maran Wilson wrote:
>>> On 7/5/2019 3:12 AM, James Morse wrote:
>>>> Hi guys,
>>>>
>>>> (CC: +kvmarm list)
>>>>
>>>> On 29/06/2019 03:42, Xiongfeng Wang wrote:
>>>>> This patchset mark all the GICC node in MADT as possible CPUs even though it
>>>>> is disabled. But only those enabled GICC node are marked as present CPUs.
>>>>> So that kernel will initialize some CPU related data structure in advance before
>>>>> the CPU is actually hot added into the system. This patchset also implement
>>>>> 'acpi_(un)map_cpu()' and 'arch_(un)register_cpu()' for ARM64. These functions are
>>>>> needed to enable CPU hotplug.
>>>>>
>>>>> To support CPU hotplug, we need to add all the possible GICC node in MADT
>>>>> including those CPUs that are not present but may be hot added later. Those
>>>>> CPUs are marked as disabled in GICC nodes.
>>>> ... what do you need this for?
>>>>
>>>> (The term cpu-hotplug in the arm world almost never means hot-adding a new package/die to
>>>> the platform, we usually mean taking CPUs online/offline for power management. e.g.
>>>> cpuhp_offline_cpu_device())
>>>>
>>>> It looks like you're adding support for hot-adding a new package/die to the platform ...
>>>> but only for virtualisation.
>>>>
>>>> I don't see why this is needed for virtualisation. The in-kernel irqchip needs to know
>>>> these vcpu exist before you can enter the guest for the first time. You can't create them
>>>> late. At best you're saving the host scheduling a vcpu that is offline. Is this really a
>>>> problem?
>>>>
>>>> If we moved PSCI support to user-space, you could avoid creating host vcpu threads until
>>>> the guest brings the vcpu online, which would solve that problem, and save the host
>>>> resources for the thread too. (and its acpi/dt agnostic)
>>>>
>>>> I don't see the difference here between booting the guest with 'maxcpus=1', and bringing
>>>> the vcpu online later. The only real difference seems to be moving the can-be-online
>>>> policy into the hypervisor/VMM...
>>> Isn't that an important distinction from a cloud service provider's
>>> perspective?
>>>
>>> As far as I understand it, you also need CPU hotplug capabilities to
>>> support things like Kata runtime under Kubernetes. i.e. when
>>> implementing your containers in the form of light weight VMs for the
>>> additional security ... and the orchestration layer cannot determine
>>> ahead of time how much CPU/memory resources are going to be needed to
>>> run the pod(s).
>> Why would it be any different? You can pre-allocate your vcpus, leave
>> them parked until some external agent decides to signal the container
>> that it can use another bunch of CPUs. At that point, the container
>> must actively boot these vcpus (they aren't going to come up by magic).
>>
>> Given that you must have sized your virtual platform to deal with the
>> maximum set of resources you anticipate (think of the GIC
>> redistributors, for example), I really wonder what you gain here.
> I agree with your point about the GIC. Making GIC resources hotpluggable in
> qemu would mess things up.

It is far worse than just a mess. You'd need to come up with a way to
place your redistributors in memory, and tell the running guest where
these redistributors are. Currently, there is no method to describe such
changes to the address space, and I certainly don't want QEMU to invent
one. This needs to be modeled after what would happen on real HW.

> But it would also be better if the VMM only started a limited number of
> vcpu threads.
> 
> How about:
> 
> 1. qemu starts only N vcpu threads (-smp N, maxcpus=M)
> 
> 2. qemu reserves the GIC resources for the maximum of M vcpus

Note that this implies actually initializing M vcpus in the VM. You may
not have created the corresponding (M - N) threads, but the vcpus will
exist. Can you please quantify how much you'd save by doing that?

> 3. when the qmp cmd cpu hotplug-add is triggered, send a GED event to the guest kernel
> 
> 4. the guest kernel receives it and triggers the acpi plug process.
> 
> Currently ACPI_CPU_HOTPLUG is enabled in Kconfig but does not work at all.

Well, there is so far *zero* CPU_HOTPLUG in the arm64 kernel other than
getting CPUs in and out of PSCI.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [RFC] Add virtual SDEI support in qemu

2019-07-16 Thread Dave Martin
On Mon, Jul 15, 2019 at 02:48:49PM +0100, Mark Rutland wrote:
> On Mon, Jul 15, 2019 at 02:41:00PM +0100, Dave Martin wrote:

[...]

> > So long as KVM_EXIT_HYPERCALL reports sufficient information so that
> > userspace can identify the cause as an SMC and retrieve the SMC
> > immediate field, this seems feasible.
> > 
> > For its own SMCCC APIs, KVM exclusively uses HVC, so rerouting SMC to
> > userspace shouldn't conflict.
> 
> Be _very_ careful here! In systems without EL3 (and without NV), SMC
> UNDEFs rather than trapping to EL2. Given that, we shouldn't build a
> hypervisor ABI that depends on SMC.

Good point.  I was hoping that was all ancient history by now, but if
not...

[...]

Cheers
---Dave
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 0/3] Support CPU hotplug for ARM64

2019-07-16 Thread Jia He

Hi Marc

On 2019/7/10 17:15, Marc Zyngier wrote:

On 09/07/2019 20:06, Maran Wilson wrote:

On 7/5/2019 3:12 AM, James Morse wrote:

Hi guys,

(CC: +kvmarm list)

On 29/06/2019 03:42, Xiongfeng Wang wrote:

This patchset marks all the GICC nodes in MADT as possible CPUs even if they
are disabled. But only the enabled GICC nodes are marked as present CPUs,
so that the kernel will initialize some CPU related data structures in advance before
the CPU is actually hot added into the system. This patchset also implements
'acpi_(un)map_cpu()' and 'arch_(un)register_cpu()' for ARM64. These functions are
needed to enable CPU hotplug.

To support CPU hotplug, we need to add all the possible GICC nodes in MADT,
including those for CPUs that are not present but may be hot added later. Those
CPUs are marked as disabled in GICC nodes.

... what do you need this for?

(The term cpu-hotplug in the arm world almost never means hot-adding a new package/die to
the platform, we usually mean taking CPUs online/offline for power management. e.g.
cpuhp_offline_cpu_device())

It looks like you're adding support for hot-adding a new package/die to the platform ...
but only for virtualisation.

I don't see why this is needed for virtualisation. The in-kernel irqchip needs to know
these vcpu exist before you can enter the guest for the first time. You can't create them
late. At best you're saving the host scheduling a vcpu that is offline. Is this really a
problem?

If we moved PSCI support to user-space, you could avoid creating host vcpu threads until
the guest brings the vcpu online, which would solve that problem, and save the host
resources for the thread too. (and it's acpi/dt agnostic)

I don't see the difference here between booting the guest with 'maxcpus=1', and bringing
the vcpu online later. The only real difference seems to be moving the can-be-online
policy into the hypervisor/VMM...

Isn't that an important distinction from a cloud service provider's
perspective?

As far as I understand it, you also need CPU hotplug capabilities to
support things like Kata runtime under Kubernetes. i.e. when
implementing your containers in the form of light weight VMs for the
additional security ... and the orchestration layer cannot determine
ahead of time how much CPU/memory resources are going to be needed to
run the pod(s).

Why would it be any different? You can pre-allocate your vcpus, leave
them parked until some external agent decides to signal the container
that it can use another bunch of CPUs. At that point, the container
must actively boot these vcpus (they aren't going to come up by magic).

Given that you must have sized your virtual platform to deal with the
maximum set of resources you anticipate (think of the GIC
redistributors, for example), I really wonder what you gain here.

I agree with your point about the GIC. Making GIC resources hotpluggable in
qemu would mess things up. But it would also be better if the VMM only
started a limited number of vcpu threads.

How about:

1. qemu starts only N vcpu threads (-smp N, maxcpus=M)

2. qemu reserves the GIC resources for the maximum of M vcpus

3. when the qmp cmd cpu hotplug-add is triggered, send a GED event to the guest kernel

4. the guest kernel receives it and triggers the acpi plug process.

Currently ACPI_CPU_HOTPLUG is enabled in Kconfig but does not work at all.
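
The four steps above can be sketched as a toy model. This is only an
illustration of the proposed flow, not QEMU or kernel code: ToyVmm,
ToyGuest, hotplug_add and handle_events are hypothetical names, and the
"GED event" is reduced to an entry in a list.

```python
# Toy model of the proposed flow. All names are hypothetical and do not
# correspond to real QEMU or kernel interfaces.
class ToyVmm:
    def __init__(self, n_started, m_max):
        if not 0 < n_started <= m_max:
            raise ValueError("need 0 < N <= M")
        # Step 1: only N vcpu threads are started at boot.
        self.threads = set(range(n_started))
        # Step 2: GIC resources are sized for the maximum of M vcpus.
        self.gic_slots = m_max
        # Events queued for the guest (stands in for the GED event).
        self.pending_events = []

    def hotplug_add(self, cpu):
        # Step 3: a management command starts the thread and queues an
        # event for the guest kernel.
        if not 0 <= cpu < self.gic_slots:
            raise ValueError("no GIC resource reserved for this vcpu")
        if cpu in self.threads:
            raise ValueError("vcpu already started")
        self.threads.add(cpu)
        self.pending_events.append(cpu)


class ToyGuest:
    def __init__(self, vmm):
        # CPUs whose threads exist at boot are present from the start.
        self.present = set(vmm.threads)

    def handle_events(self, vmm):
        # Step 4: the guest receives each event and marks the CPU present.
        while vmm.pending_events:
            self.present.add(vmm.pending_events.pop(0))
```

In this model the GIC sizing never changes after construction; only the
set of started threads and the guest's view of present CPUs grow.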


---
Cheers,
Jia


Re: [RFC PATCH v2 0/3] Support CPU hotplug for ARM64

2019-07-16 Thread Xiongfeng Wang



On 2019/7/5 18:12, James Morse wrote:
> Hi guys,
> 
> (CC: +kvmarm list)
> 
> On 29/06/2019 03:42, Xiongfeng Wang wrote:
>> This patchset marks all the GICC nodes in MADT as possible CPUs even if they
>> are disabled. But only the enabled GICC nodes are marked as present CPUs,
>> so that the kernel will initialize some CPU related data structures in advance before
>> the CPU is actually hot added into the system. This patchset also implements
>> 'acpi_(un)map_cpu()' and 'arch_(un)register_cpu()' for ARM64. These functions are
>> needed to enable CPU hotplug.
>>
>> To support CPU hotplug, we need to add all the possible GICC nodes in MADT,
>> including those for CPUs that are not present but may be hot added later. Those
>> CPUs are marked as disabled in GICC nodes.
> 
> ... what do you need this for?
> 
> (The term cpu-hotplug in the arm world almost never means hot-adding a new package/die to
> the platform, we usually mean taking CPUs online/offline for power management. e.g.
> cpuhp_offline_cpu_device())
> 
> It looks like you're adding support for hot-adding a new package/die to the platform ...
> but only for virtualisation.

I have been reading the GIC driver these days. It is a lot of work to configure
the GIC at runtime, and this patchset doesn't support this.
Actually, my original idea is hot-adding cores to the platform, and it is only
for virtualisation.
These cores need to be on the same physical package. The GIC is initialized when
the kernel boots, and the GICR is initialized when the core is hot-added and
brought up.
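
To make the possible/present split from the cover letter concrete, here is
a minimal sketch. It is a model only, not the patch itself: classify_gicc
and MADT_GICC_ENABLED are hypothetical names, and the input is a simplified
list of (ACPI processor UID, flags) pairs rather than a real MADT. The one
fact it relies on is that bit 0 of the GICC Flags field is the Enabled bit.

```python
# Bit 0 of the MADT GICC "Flags" field is the Enabled bit.
# The constant name is chosen for this sketch only.
MADT_GICC_ENABLED = 1 << 0


def classify_gicc(entries):
    """Split GICC entries into possible and present CPU sets.

    entries: iterable of (acpi_uid, flags) pairs, a simplified stand-in
    for the GICC structures found in the MADT.
    """
    possible = set()
    present = set()
    for uid, flags in entries:
        # Every GICC entry, enabled or not, counts as a possible CPU.
        possible.add(uid)
        # Only entries with the Enabled bit set are present at boot.
        if flags & MADT_GICC_ENABLED:
            present.add(uid)
    return possible, present
```

With entries for four CPUs of which two are enabled, all four end up in
the possible set and only the two enabled ones in the present set.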

> 
> I don't see why this is needed for virtualisation. The in-kernel irqchip needs to know
> these vcpu exist before you can enter the guest for the first time. You can't create them
> late. At best you're saving the host scheduling a vcpu that is offline. Is this really a
> problem?
> 
> If we moved PSCI support to user-space, you could avoid creating host vcpu threads until
> the guest brings the vcpu online, which would solve that problem, and save the host
> resources for the thread too. (and it's acpi/dt agnostic)
> 
> I don't see the difference here between booting the guest with 'maxcpus=1', and bringing
> the vcpu online later. The only real difference seems to be moving the can-be-online
> policy into the hypervisor/VMM...
> 
> 
> I think physical package/die hotadd is a much bigger, uglier problem than doing the same
> under virtualisation. It's best to do this on real hardware first so we don't miss
> something. (cpu-topology, numa, memory, errata, timers?)
> I'm worried that doing virtualisation first means the firmware-requirements for physical
> hotadd stuff is "whatever Qemu does".
> 
> 
> Thanks,
> 
> James
> 
> .
> 
