On Fri, 5 Jan 2018 22:47:22 -0200 Jose Ricardo Ziviani <jos...@linux.vnet.ibm.com> wrote:
> Power9 supports 4 HW threads/core but it's possible to emulate
> doorbells to implement virtual SMT. KVM has the KVM_CAP_PPC_SMT_POSSIBLE
> which returns a bitmap with all SMT modes supported by the host.
>
> Today, QEMU forces the SMT mode based on PVR compat table, this is
> silently done in spapr_fixup_cpu_dt. Then, if user passes thread=8 the
> guest will end up with 4 threads/core without any feedback to the user.
> It is confusing and will crash QEMU if a cpu is hotplugged in that
> guest.
>
> This patch makes use of KVM_CAP_PPC_SMT_POSSIBLE to check if the host
> supports the SMT mode so it allows Power9 guests to have 8 threads/core
> if desired.
>
> Reported-by: Satheesh Rajendran <sathe...@in.ibm.com>
> Signed-off-by: Jose Ricardo Ziviani <jos...@linux.vnet.ibm.com>
> ---

Hi,

I agree with the general idea but I have a few questions.

The MIN(smp_threads, ppc_compat_max_threads(cpu)) computation is performed
in spapr_fixup_cpu_dt() at CAS, but it is also performed in
spapr_populate_cpu_dt() at machine reset or when a CPU is added. Shouldn't
your patch address the latter as well?

>  hw/ppc/spapr.c       | 14 +++++++++++++-
>  hw/ppc/trace-events  |  1 +
>  target/ppc/kvm.c     |  5 +++++
>  target/ppc/kvm_ppc.h |  6 ++++++
>  4 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d1acfe8858..ea2503cd2f 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -345,7 +345,19 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
>          int index = spapr_vcpu_id(cpu);
> -        int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));

Considering that we have:

int ppc_compat_max_threads(PowerPCCPU *cpu)
{
    const CompatInfo *compat = compat_by_pvr(cpu->compat_pvr);
    int n_threads = CPU(cpu)->nr_threads;

    if (cpu->compat_pvr) {
        g_assert(compat);
        n_threads = MIN(n_threads, compat->max_threads);
    }

    return n_threads;
}

and

void qemu_init_vcpu(CPUState *cpu)
{
    cpu->nr_cores = smp_cores;
    cpu->nr_threads = smp_threads;
    ...
}

ppc_compat_max_threads() already returns the smaller value of smp_threads
and the maximum number of HW threads for the PVR. I don't quite understand
why we had this compat_smt calculation in the first place...

> +
> +        /* set smt to maximum for this current pvr if the number
> +         * passed is higher than defined by PVR compat mode AND
> +         * if KVM cannot emulate it.*/
> +        int compat_smt = smp_threads;
> +        if ((kvmppc_cap_smt_possible() & smp_threads) != smp_threads &&
> +                smp_threads > ppc_compat_max_threads(cpu)) {
> +            compat_smt = ppc_compat_max_threads(cpu);
> +
> +            trace_spapr_fixup_cpu_smt(index, smp_threads,
> +                    kvmppc_cap_smt_possible(),
> +                    ppc_compat_max_threads(cpu));
> +        }

... so I'm wondering if the above shouldn't be performed in
ppc_compat_max_threads() directly?

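To make the suggestion a bit more concrete, here is a minimal standalone
sketch of what folding the KVM check into the thread-limit helper could
look like. It is not QEMU code: struct sketch_cpu,
sketch_compat_max_threads() and the smt_possible parameter are hypothetical
stand-ins for PowerPCCPU, ppc_compat_max_threads() and the value returned
by kvmppc_cap_smt_possible(). It only assumes that the bitmap has bit value
N set when SMT mode N is available.

```c
#include <assert.h>

/* Hypothetical stand-in for the CPU state; not a QEMU type. */
struct sketch_cpu {
    int nr_threads;         /* plays the role of CPU(cpu)->nr_threads */
    int compat_max_threads; /* compat->max_threads, or 0 if no compat PVR */
};

/*
 * Return the number of threads/core to advertise.
 *
 * smt_possible is assumed to mirror the KVM_CAP_PPC_SMT_POSSIBLE bitmap:
 * bit value N set means SMT mode N can be used. 0 means "no capability".
 */
static int sketch_compat_max_threads(const struct sketch_cpu *cpu,
                                     int smt_possible)
{
    int n_threads = cpu->nr_threads;

    assert(n_threads > 0);

    /*
     * Cap to the compat limit only when a compat PVR is in effect, the
     * request exceeds that limit, and KVM cannot emulate the requested mode.
     */
    if (cpu->compat_max_threads &&
        n_threads > cpu->compat_max_threads &&
        (smt_possible & n_threads) != n_threads) {
        n_threads = cpu->compat_max_threads;
    }

    return n_threads;
}
```

With nr_threads = 8 and a compat limit of 4, this keeps 8 when the bitmap
is 0x0f and falls back to 4 when the bitmap is 0x07 or 0. Note that the
mask test is only meaningful when the requested thread count is a power of
two.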
>
>          if ((index % smt) != 0) {
>              continue;
> diff --git a/hw/ppc/trace-events b/hw/ppc/trace-events
> index b7c3e64b5e..a8e29d7ab1 100644
> --- a/hw/ppc/trace-events
> +++ b/hw/ppc/trace-events
> @@ -16,6 +16,7 @@ spapr_irq_alloc(int irq) "irq %d"
>  spapr_irq_alloc_block(int first, int num, bool lsi, int align) "first irq %d, %d irqs, lsi=%d, alignnum %d"
>  spapr_irq_free(int src, int irq, int num) "Source#%d, first irq %d, %d irqs"
>  spapr_irq_free_warn(int src, int irq) "Source#%d, irq %d is already free"
> +spapr_fixup_cpu_smt(int idx, int smpt, int kvmt, int pvrt) "CPU(%d): expected smt %d, kvm support %d, max smt pvr %d"
>
>  # hw/ppc/spapr_hcall.c
>  spapr_cas_pvr_try(uint32_t pvr) "0x%x"
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 518dd06e98..aac5667bf4 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -2456,6 +2456,11 @@ bool kvmppc_has_cap_mmu_hash_v3(void)
>      return cap_mmu_hash_v3;
>  }
>
> +int kvmppc_cap_smt_possible(void)
> +{
> +    return cap_ppc_smt_possible;
> +}
> +
>  PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void)
>  {
>      uint32_t host_pvr = mfpvr();
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index ecb55493cc..6ac33d2b4a 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -59,6 +59,7 @@ bool kvmppc_has_cap_fixup_hcalls(void);
>  bool kvmppc_has_cap_htm(void);
>  bool kvmppc_has_cap_mmu_radix(void);
>  bool kvmppc_has_cap_mmu_hash_v3(void);
> +int kvmppc_cap_smt_possible(void);
>  int kvmppc_enable_hwrng(void);
>  int kvmppc_put_books_sregs(PowerPCCPU *cpu);
>  PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void);
> @@ -290,6 +291,11 @@ static inline bool kvmppc_has_cap_mmu_hash_v3(void)
>      return false;
>  }
>
> +static inline int kvmppc_cap_smt_possible(void)
> +{
> +    return -1;

When CONFIG_KVM is set, the semantics of kvmppc_cap_smt_possible() is:
- a bitmap with supported SMT modes if KVM has KVM_CAP_PPC_SMT_POSSIBLE
- 0 if KVM doesn't have KVM_CAP_PPC_SMT_POSSIBLE or we're running in TCG mode

so it looks a bit weird to return -1 when CONFIG_KVM isn't set (when running
in TCG mode, we would get different values depending on how the QEMU binary
was compiled). Shouldn't this stub return 0 instead?

Cheers,

--
Greg

> +}
> +
>  static inline int kvmppc_enable_hwrng(void)
>  {
>      return -1;
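
P.S. A small standalone sketch of why the stub's return value matters,
assuming callers keep the mask test from the spapr.c hunk.
sketch_smt_mode_possible() is a hypothetical helper written for
illustration, not a QEMU function.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the test used in the spapr.c hunk:
 * smt_possible is the value a kvmppc_cap_smt_possible()-like function
 * returns, threads is the requested threads/core.
 */
static bool sketch_smt_mode_possible(int smt_possible, int threads)
{
    assert(threads > 0);

    /* True only if every bit of 'threads' is also set in the bitmap. */
    return (smt_possible & threads) == threads;
}
```

On two's complement targets -1 has every bit set, so a stub returning -1
makes this test succeed for any thread count, while 0 makes it fail for all
of them. A TCG guest would therefore be capped or not depending on whether
the binary was built with CONFIG_KVM.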