[PATCH v3 3/9] powerpc/xics: Add icp_native_cause_ipi_rm
Add a function to cause an IPI by directly updating the MFRR register in the
XICS. The function is meant for real-mode callers, since they cannot use the
smp_ops->cause_ipi function, which uses an ioremapped address.

Normal usage is for the KVM real-mode code to set the IPI message using
smp_muxed_ipi_set_message and then invoke icp_native_cause_ipi_rm to cause
the actual IPI.

The function requires kvm_hstate.xics_phys to have been initialized with the
physical address of the XICS.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/include/asm/xics.h       |  1 +
 arch/powerpc/sysdev/xics/icp-native.c | 21 +
 2 files changed, 22 insertions(+)

diff --git a/arch/powerpc/include/asm/xics.h b/arch/powerpc/include/asm/xics.h
index 0e25bdb..2546048 100644
--- a/arch/powerpc/include/asm/xics.h
+++ b/arch/powerpc/include/asm/xics.h
@@ -30,6 +30,7 @@
 #ifdef CONFIG_PPC_ICP_NATIVE
 extern int icp_native_init(void);
 extern void icp_native_flush_interrupt(void);
+extern void icp_native_cause_ipi_rm(int cpu);
 #else
 static inline int icp_native_init(void) { return -ENODEV; }
 #endif
diff --git a/arch/powerpc/sysdev/xics/icp-native.c b/arch/powerpc/sysdev/xics/icp-native.c
index eae3265..afdf62f 100644
--- a/arch/powerpc/sysdev/xics/icp-native.c
+++ b/arch/powerpc/sysdev/xics/icp-native.c
@@ -159,6 +159,27 @@ static void icp_native_cause_ipi(int cpu, unsigned long data)
     icp_native_set_qirr(cpu, IPI_PRIORITY);
 }
 
+#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+void icp_native_cause_ipi_rm(int cpu)
+{
+    /*
+     * Currently not used to send IPIs to another CPU
+     * on the same core. Only caller is KVM real mode.
+     * Need the physical address of the XICS to be
+     * previously saved in kvm_hstate in the paca.
+     */
+    unsigned long xics_phys;
+
+    /*
+     * Just like the cause_ipi functions, it is required to
+     * include a full barrier (out8 includes a sync) before
+     * causing the IPI.
+     */
+    xics_phys = paca[cpu].kvm_hstate.xics_phys;
+    out_rm8((u8 *)(xics_phys + XICS_MFRR), IPI_PRIORITY);
+}
+#endif
+
 /*
  * Called when an interrupt is received on an off-line CPU to
  * clear the interrupt, so that the CPU can go back to nap mode.
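For orientation, here is a minimal sketch of the calling sequence the commit
message describes, assuming the helpers introduced in patches 2/9 and 3/9 of
this series; the wrapper name kvmppc_rm_wake_cpu is purely illustrative and
is not part of the series:

  /*
   * Illustrative only: how a real-mode caller is expected to combine
   * smp_muxed_ipi_set_message() (patch 2/9) with icp_native_cause_ipi_rm()
   * (this patch).  kvmppc_rm_wake_cpu() is a hypothetical wrapper.
   */
  static void kvmppc_rm_wake_cpu(int cpu, int msg)
  {
          /* Publish the message byte; a full barrier is issued first. */
          smp_muxed_ipi_set_message(cpu, msg);

          /*
           * Raise the IPI by writing MFRR through the real-mode-safe
           * accessor; out_rm8() includes a sync, so the message store
           * above is visible to the target before the interrupt arrives.
           */
          icp_native_cause_ipi_rm(cpu);
  }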
[PATCH v3 8/9] KVM: PPC: Book3S HV: Send IPI to host core to wake VCPU
This patch adds support to real-mode KVM to search for a core running in the
host partition and send it an IPI message with the VCPU to be woken. This
avoids having to switch to the host partition to complete an H_IPI hypercall
when the VCPU which is the target of the H_IPI is not loaded (is not running
in the guest).

The patch also includes the support in the IPI handler running in the host
to do the wakeup by calling kvmppc_xics_ipi_action for the
PPC_MSG_RM_HOST_ACTION message.

When a guest is being destroyed, we need to ensure that there are no pending
IPIs waiting to wake up a VCPU before we free the VCPUs of the guest. This is
accomplished by:
- Forcing a PPC_MSG_CALL_FUNCTION IPI to be completed by all CPUs before
  freeing any VCPUs in kvm_arch_destroy_vm()
- Executing any PPC_MSG_RM_HOST_ACTION messages before any other
  PPC_MSG_CALL_FUNCTION messages

Signed-off-by: Suresh Warrier
---
 arch/powerpc/kernel/smp.c            | 11 +
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 81 ++--
 arch/powerpc/kvm/powerpc.c           | 10 +
 3 files changed, 99 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index e222efc..cb8be5d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -257,6 +257,17 @@ irqreturn_t smp_ipi_demux(void)
 
     do {
         all = xchg(&info->messages, 0);
+#if defined(CONFIG_KVM_XICS) && defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE)
+        /*
+         * Must check for PPC_MSG_RM_HOST_ACTION messages
+         * before PPC_MSG_CALL_FUNCTION messages because when
+         * a VM is destroyed, we call kick_all_cpus_sync()
+         * to ensure that any pending PPC_MSG_RM_HOST_ACTION
+         * messages have completed before we free any VCPUs.
+         */
+        if (all & IPI_MESSAGE(PPC_MSG_RM_HOST_ACTION))
+            kvmppc_xics_ipi_action();
+#endif
         if (all & IPI_MESSAGE(PPC_MSG_CALL_FUNCTION))
             generic_smp_call_function_interrupt();
         if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 43ffbfe..a8ca3ed 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -51,11 +51,70 @@ static void ics_rm_check_resend(struct kvmppc_xics *xics,
 
 /* -- ICP routines -- */
 
+/*
+ * We start the search from our current CPU Id in the core map
+ * and go in a circle until we get back to our ID looking for a
+ * core that is running in host context and that hasn't already
+ * been targeted for another rm_host_ops.
+ *
+ * In the future, could consider using a fairer algorithm (one
+ * that distributes the IPIs better)
+ *
+ * Returns -1, if no CPU could be found in the host
+ * Else, returns a CPU Id which has been reserved for use
+ */
+static inline int grab_next_hostcore(int start,
+        struct kvmppc_host_rm_core *rm_core, int max, int action)
+{
+    bool success;
+    int core;
+    union kvmppc_rm_state old, new;
+
+    for (core = start + 1; core < max; core++) {
+        old = new = READ_ONCE(rm_core[core].rm_state);
+
+        if (!old.in_host || old.rm_action)
+            continue;
+
+        /* Try to grab this host core if not taken already. */
+        new.rm_action = action;
+
+        success = cmpxchg64(&rm_core[core].rm_state.raw,
+                old.raw, new.raw) == old.raw;
+        if (success) {
+            /*
+             * Make sure that the store to the rm_action is made
+             * visible before we return to caller (and the
+             * subsequent store to rm_data) to synchronize with
+             * the IPI handler.
+             */
+            smp_wmb();
+            return core;
+        }
+    }
+
+    return -1;
+}
+
+static inline int find_available_hostcore(int action)
+{
+    int core;
+    int my_core = smp_processor_id() >> threads_shift;
+    struct kvmppc_host_rm_core *rm_core = kvmppc_host_rm_ops_hv->rm_core;
+
+    core = grab_next_hostcore(my_core, rm_core, cpu_nr_cores(), action);
+    if (core == -1)
+        core = grab_next_hostcore(core, rm_core, my_core, action);
+
+    return core;
+}
+
 static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
                 struct kvm_vcpu *this_vcpu)
 {
     struct kvmppc_icp *this_icp = this_vcpu->arch.icp;
     int cpu;
+    int hcore, hcpu;
 
     /* Mark the target VCPU as having an interrupt pending */
     vcpu->stat.queue_intr++;
@@ -67,11 +126,25 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
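The quoted diff is truncated before the hunk that actually uses the reserved
core. Based on the description above, the posting side looks roughly like the
sketch below; this is a reconstruction for orientation only (the hypothetical
helper rm_redirect_wakeup stands in for the real changes to
icp_rm_set_vcpu_irq), and the actual patch may differ in detail:

  /*
   * Rough sketch (not the patch itself) of redirecting a VCPU wakeup to a
   * core that is still running in the host.
   */
  static void rm_redirect_wakeup(struct kvm_vcpu *vcpu)
  {
          int hcore, hcpu;

          /* Reserve a host core; this publishes rm_action via cmpxchg64(). */
          hcore = find_available_hostcore(XICS_RM_KICK_VCPU);
          if (hcore == -1)
                  return;         /* no host core: fall back to exiting to the host */

          hcpu = hcore << threads_shift;

          /* Hand over the target VCPU for the host-side IPI handler. */
          kvmppc_host_rm_ops_hv->rm_core[hcore].rm_data = vcpu;

          /* Set the message and ring the doorbell with the real-mode helpers. */
          smp_muxed_ipi_set_message(hcpu, PPC_MSG_RM_HOST_ACTION);
          icp_native_cause_ipi_rm(hcpu);
  }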
[PATCH v3 4/9] KVM: PPC: Book3S HV: Host-side RM data structures
This patch defines the data structures to support the setting up of host-side
operations while running in real mode in the guest, and also the functions to
allocate and free them.

The operations are for now limited to virtual XICS operations. Currently, we
have only defined one operation in the data structure:
- Wake up a VCPU sleeping in the host when it receives a virtual interrupt

The operations are assigned at the core level because PowerKVM requires that
the host run in SMT-off mode. For each core, we will need to manage its state
atomically, where the state is defined by:
1. Is the core running in the host?
2. Is there a Real Mode (RM) operation pending on the host?

Currently, core state is only managed at the whole-core level even when the
system is in split-core mode. This just limits the number of free or
"available" cores in the host to perform any host-side operations.

The kvmppc_host_rm_core.rm_data field allows any data to be passed by KVM in
real mode to the host core along with the operation to be performed.

The kvmppc_host_rm_ops structure is allocated the very first time a guest VM
is started. Initial core state is also set - all online cores are in the
host. This structure is never deleted, not even when there are no active
guests. However, it needs to be freed when the module is unloaded, because
kvmppc_host_rm_ops_hv can contain function pointers to kvm-hv.ko functions
for the different supported host operations.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/include/asm/kvm_ppc.h   | 31
 arch/powerpc/kvm/book3s_hv.c         | 70
 arch/powerpc/kvm/book3s_hv_builtin.c |  3 ++
 3 files changed, 104 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index c6ef05b..47cd441 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -437,6 +437,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
 {
     return vcpu->arch.irq_type == KVMPPC_IRQ_XICS;
 }
+extern void kvmppc_alloc_host_rm_ops(void);
+extern void kvmppc_free_host_rm_ops(void);
 extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu);
 extern int kvmppc_xics_create_icp(struct kvm_vcpu *vcpu, unsigned long server);
 extern int kvm_vm_ioctl_xics_irq(struct kvm *kvm, struct kvm_irq_level *args);
@@ -446,6 +448,8 @@ extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
 extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev,
             struct kvm_vcpu *vcpu, u32 cpu);
 #else
+static inline void kvmppc_alloc_host_rm_ops(void) {};
+static inline void kvmppc_free_host_rm_ops(void) {};
 static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
     { return 0; }
 static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { }
@@ -459,6 +463,33 @@ static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd)
     { return 0; }
 #endif
 
+/*
+ * Host-side operations we want to set up while running in real
+ * mode in the guest operating on the xics.
+ * Currently only VCPU wakeup is supported.
+ */
+
+union kvmppc_rm_state {
+    unsigned long raw;
+    struct {
+        u32 in_host;
+        u32 rm_action;
+    };
+};
+
+struct kvmppc_host_rm_core {
+    union kvmppc_rm_state rm_state;
+    void *rm_data;
+    char pad[112];
+};
+
+struct kvmppc_host_rm_ops {
+    struct kvmppc_host_rm_core *rm_core;
+    void (*vcpu_kick)(struct kvm_vcpu *vcpu);
+};
+
+extern struct kvmppc_host_rm_ops *kvmppc_host_rm_ops_hv;
+
 static inline unsigned long kvmppc_get_epr(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 54b45b7..4042623 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2967,6 +2967,73 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu)
     goto out_srcu;
 }
 
+#ifdef CONFIG_KVM_XICS
+/*
+ * Allocate a per-core structure for managing state about which cores are
+ * running in the host versus the guest and for exchanging data between
+ * real mode KVM and CPU running in the host.
+ * This is only done for the first VM.
+ * The allocated structure stays even if all VMs have stopped.
+ * It is only freed when the kvm-hv module is unloaded.
+ * It's OK for this routine to fail, we just don't support host
+ * core operations like redirecting H_IPI wakeups.
+ */
+void kvmppc_alloc_host_rm_ops(void)
+{
+    struct kvmppc_host_rm_ops *ops;
+    unsigned long l_ops;
+    int cpu, core;
+    int size;
+
+    /* Not the first time here ? */
+    if (kvmppc_host_rm_ops_hv != NULL)
+        return;
+
+    ops = kzalloc(sizeof(struct kvmppc_host_rm_ops), GFP_KERNEL);
+    if (!ops)
+        return;
+
+    size = cpu_nr_cores() * sizeof(struct kvmppc_host_rm_core);
+    ops-
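The reason kvmppc_rm_state is a union over a single unsigned long is that
both fields can then be read and claimed with one 64-bit cmpxchg, so real-mode
code needs no lock. A small illustration of that idiom (not part of the
patch; the helper name try_reserve_core is hypothetical):

  /* Illustration only: atomically claim a core that is idle in the host. */
  static bool try_reserve_core(struct kvmppc_host_rm_core *rmc, u32 action)
  {
          union kvmppc_rm_state old, new;

          old = new = READ_ONCE(rmc->rm_state);
          if (!old.in_host || old.rm_action)
                  return false;   /* not in the host, or already claimed */

          new.rm_action = action; /* in_host is carried over unchanged */

          /* Succeeds only if neither field changed since the read above. */
          return cmpxchg64(&rmc->rm_state.raw, old.raw, new.raw) == old.raw;
  }

The 112 bytes of padding then round each kvmppc_host_rm_core up to 128 bytes,
which presumably keeps each core's state in its own cache line.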
[PATCH v3 2/9] powerpc/smp: Add smp_muxed_ipi_set_message
smp_muxed_ipi_message_pass() invokes smp_ops->cause_ipi, which uses an
ioremapped address to access registers on the XICS interrupt controller to
cause the IPI. Because of this, real-mode callers cannot call
smp_muxed_ipi_message_pass() for IPI messaging.

This patch creates a separate function, smp_muxed_ipi_set_message, just to
set the IPI message without the cause_ipi routine. After calling this
function to set the IPI message, real-mode callers must cause the IPI by
writing to the XICS registers directly.

As part of this, we also change smp_muxed_ipi_message_pass to call
smp_muxed_ipi_set_message to set the message instead of doing it directly
inside the routine.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/include/asm/smp.h | 1 +
 arch/powerpc/kernel/smp.c      | 9 -
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 9ef9c37..78083ed 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -124,6 +124,7 @@ extern const char *smp_ipi_name[];
 /* for irq controllers with only a single ipi */
 extern void smp_muxed_ipi_set_data(int cpu, unsigned long data);
 extern void smp_muxed_ipi_message_pass(int cpu, int msg);
+extern void smp_muxed_ipi_set_message(int cpu, int msg);
 extern irqreturn_t smp_ipi_demux(void);
 
 void smp_init_pSeries(void);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index a53a130..e222efc 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -218,7 +218,7 @@ void smp_muxed_ipi_set_data(int cpu, unsigned long data)
     info->data = data;
 }
 
-void smp_muxed_ipi_message_pass(int cpu, int msg)
+void smp_muxed_ipi_set_message(int cpu, int msg)
 {
     struct cpu_messages *info = &per_cpu(ipi_message, cpu);
     char *message = (char *)&info->messages;
@@ -228,6 +228,13 @@ void smp_muxed_ipi_message_pass(int cpu, int msg)
      */
     smp_mb();
     message[msg] = 1;
+}
+
+void smp_muxed_ipi_message_pass(int cpu, int msg)
+{
+    struct cpu_messages *info = &per_cpu(ipi_message, cpu);
+
+    smp_muxed_ipi_set_message(cpu, msg);
     /*
      * cause_ipi functions are required to include a full barrier
      * before doing whatever causes the IPI.
[PATCH v3 6/9] KVM: PPC: Book3S HV: kvmppc_host_rm_ops - handle offlining CPUs
The kvmppc_host_rm_ops structure keeps track of which cores are in the host
by maintaining a bitmask of active/runnable online CPUs that have not entered
the guest. This patch adds support to manage the bitmask when a CPU is
offlined or onlined in the host.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/kvm/book3s_hv.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 95a2ed3..da2cc56 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3012,6 +3012,36 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu)
 }
 
 #ifdef CONFIG_KVM_XICS
+static int kvmppc_cpu_notify(struct notifier_block *self, unsigned long action,
+            void *hcpu)
+{
+    unsigned long cpu = (long)hcpu;
+
+    switch (action) {
+    case CPU_UP_PREPARE:
+    case CPU_UP_PREPARE_FROZEN:
+        kvmppc_set_host_core(cpu);
+        break;
+
+#ifdef CONFIG_HOTPLUG_CPU
+    case CPU_DEAD:
+    case CPU_DEAD_FROZEN:
+    case CPU_UP_CANCELED:
+    case CPU_UP_CANCELED_FROZEN:
+        kvmppc_clear_host_core(cpu);
+        break;
+#endif
+    default:
+        break;
+    }
+
+    return NOTIFY_OK;
+}
+
+static struct notifier_block kvmppc_cpu_notifier = {
+    .notifier_call = kvmppc_cpu_notify,
+};
+
 /*
  * Allocate a per-core structure for managing state about which cores are
  * running in the host versus the guest and for exchanging data between
@@ -3045,6 +3075,8 @@ void kvmppc_alloc_host_rm_ops(void)
         return;
     }
 
+    get_online_cpus();
+
     for (cpu = 0; cpu < nr_cpu_ids; cpu += threads_per_core) {
         if (!cpu_online(cpu))
             continue;
@@ -3063,14 +3095,21 @@ void kvmppc_alloc_host_rm_ops(void)
     l_ops = (unsigned long) ops;
 
     if (cmpxchg64((unsigned long *)&kvmppc_host_rm_ops_hv, 0, l_ops)) {
+        put_online_cpus();
         kfree(ops->rm_core);
         kfree(ops);
+        return;
     }
+
+    register_cpu_notifier(&kvmppc_cpu_notifier);
+
+    put_online_cpus();
 }
 
 void kvmppc_free_host_rm_ops(void)
 {
     if (kvmppc_host_rm_ops_hv) {
+        unregister_cpu_notifier(&kvmppc_cpu_notifier);
         kfree(kvmppc_host_rm_ops_hv->rm_core);
         kfree(kvmppc_host_rm_ops_hv);
         kvmppc_host_rm_ops_hv = NULL;
[PATCH v3 9/9] KVM: PPC: Book3S HV: Add tunable to control H_IPI redirection
Redirecting the wakeup of a VCPU from the H_IPI hypercall to a core running
in the host is usually a good idea; most workloads seemed to benefit.
However, in one heavily interrupt-driven SMT1 workload, some regression was
observed. This patch adds a kvm_hv module parameter called h_ipi_redirect to
control this feature.

The default value for this tunable is 1 - that is, the feature is enabled.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/include/asm/kvm_ppc.h   |  1 +
 arch/powerpc/kvm/book3s_hv.c         | 11 +++
 arch/powerpc/kvm/book3s_hv_rm_xics.c |  5 -
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 1b93519..29d1442 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -448,6 +448,7 @@ extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
 extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev,
             struct kvm_vcpu *vcpu, u32 cpu);
 extern void kvmppc_xics_ipi_action(void);
+extern int h_ipi_redirect;
 #else
 static inline void kvmppc_alloc_host_rm_ops(void) {};
 static inline void kvmppc_free_host_rm_ops(void) {};
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d6280ed..182ec84 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -81,6 +81,17 @@ static int target_smt_mode;
 module_param(target_smt_mode, int, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(target_smt_mode, "Target threads per core (0 = max)");
 
+#ifdef CONFIG_KVM_XICS
+static struct kernel_param_ops module_param_ops = {
+    .set = param_set_int,
+    .get = param_get_int,
+};
+
+module_param_cb(h_ipi_redirect, &module_param_ops, &h_ipi_redirect,
+        S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(h_ipi_redirect, "Redirect H_IPI wakeup to a free host core");
+#endif
+
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index a8ca3ed..4c062e7 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -24,6 +24,9 @@
 
 #define DEBUG_PASSUP
 
+int h_ipi_redirect = 1;
+EXPORT_SYMBOL(h_ipi_redirect);
+
 static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp *icp,
             u32 new_irq);
 
@@ -134,7 +137,7 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
     cpu = vcpu->arch.thread_cpu;
     if (cpu < 0 || cpu >= nr_cpu_ids) {
         hcore = -1;
-        if (kvmppc_host_rm_ops_hv)
+        if (kvmppc_host_rm_ops_hv && h_ipi_redirect)
             hcore = find_available_hostcore(XICS_RM_KICK_VCPU);
         if (hcore != -1) {
             hcpu = hcore << threads_shift;
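As a usage note (an assumption based on standard module-parameter behaviour,
not something stated in the patch): with S_IWUSR set, the tunable should be
adjustable at run time through /sys/module/kvm_hv/parameters/h_ipi_redirect,
as well as at module load time via h_ipi_redirect=0.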
[PATCH v3 7/9] KVM: PPC: Book3S HV: Host side kick VCPU when poked by real-mode KVM
This patch adds support for the kick VCPU operation to kvmppc_host_rm_ops.
The kvmppc_xics_ipi_action() function is the handler invoked for a host-side
operation when the host is poked by real-mode KVM. This is initiated by KVM
sending an IPI to any free host core.

KVM real mode must set rm_action to XICS_RM_KICK_VCPU and rm_data to point to
the VCPU to be woken up before sending the IPI. Note that we have allocated
one kvmppc_host_rm_core structure per core. The above values need to be set
in the structure corresponding to the core to which the IPI will be sent.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/include/asm/kvm_ppc.h   |  1 +
 arch/powerpc/kvm/book3s_hv.c         |  2 ++
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 36
 3 files changed, 39 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 47cd441..1b93519 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -447,6 +447,7 @@ extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu);
 extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
 extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev,
             struct kvm_vcpu *vcpu, u32 cpu);
+extern void kvmppc_xics_ipi_action(void);
 #else
 static inline void kvmppc_alloc_host_rm_ops(void) {};
 static inline void kvmppc_free_host_rm_ops(void) {};
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index da2cc56..d6280ed 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3085,6 +3085,8 @@ void kvmppc_alloc_host_rm_ops(void)
         ops->rm_core[core].rm_state.in_host = 1;
     }
 
+    ops->vcpu_kick = kvmppc_fast_vcpu_kick_hv;
+
     /*
      * Make the contents of the kvmppc_host_rm_ops structure visible
      * to other CPUs before we assign it to the global variable.
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 24f5807..43ffbfe 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "book3s_xics.h"
@@ -623,3 +624,38 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr)
  bail:
     return check_too_hard(xics, icp);
 }
+
+/* --- Non-real mode XICS-related built-in routines --- */
+
+/**
+ * Host Operations poked by RM KVM
+ */
+static void rm_host_ipi_action(int action, void *data)
+{
+    switch (action) {
+    case XICS_RM_KICK_VCPU:
+        kvmppc_host_rm_ops_hv->vcpu_kick(data);
+        break;
+    default:
+        WARN(1, "Unexpected rm_action=%d data=%p\n", action, data);
+        break;
+    }
+
+}
+
+void kvmppc_xics_ipi_action(void)
+{
+    int core;
+    unsigned int cpu = smp_processor_id();
+    struct kvmppc_host_rm_core *rm_corep;
+
+    core = cpu >> threads_shift;
+    rm_corep = &kvmppc_host_rm_ops_hv->rm_core[core];
+
+    if (rm_corep->rm_data) {
+        rm_host_ipi_action(rm_corep->rm_state.rm_action,
+                rm_corep->rm_data);
+        rm_corep->rm_data = NULL;
+        rm_corep->rm_state.rm_action = 0;
+    }
+}
[PATCH v3 5/9] KVM: PPC: Book3S HV: Manage core host state
Update the core host state in kvmppc_host_rm_ops whenever the primary thread
of the core enters the guest or returns.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/kvm/book3s_hv.c | 44
 1 file changed, 44 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 4042623..95a2ed3 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2261,6 +2261,46 @@ static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
 }
 
 /*
+ * Clear core from the list of active host cores as we are about to
+ * enter the guest. Only do this if it is the primary thread of the
+ * core (not if a subcore) that is entering the guest.
+ */
+static inline void kvmppc_clear_host_core(int cpu)
+{
+    int core;
+
+    if (!kvmppc_host_rm_ops_hv || cpu_thread_in_core(cpu))
+        return;
+    /*
+     * Memory barrier can be omitted here as we will do a smp_wmb()
+     * later in kvmppc_start_thread and we need ensure that state is
+     * visible to other CPUs only after we enter guest.
+     */
+    core = cpu >> threads_shift;
+    kvmppc_host_rm_ops_hv->rm_core[core].rm_state.in_host = 0;
+}
+
+/*
+ * Advertise this core as an active host core since we exited the guest
+ * Only need to do this if it is the primary thread of the core that is
+ * exiting.
+ */
+static inline void kvmppc_set_host_core(int cpu)
+{
+    int core;
+
+    if (!kvmppc_host_rm_ops_hv || cpu_thread_in_core(cpu))
+        return;
+
+    /*
+     * Memory barrier can be omitted here because we do a spin_unlock
+     * immediately after this which provides the memory barrier.
+     */
+    core = cpu >> threads_shift;
+    kvmppc_host_rm_ops_hv->rm_core[core].rm_state.in_host = 1;
+}
+
+/*
  * Run a set of guest threads on a physical core.
  * Called with vc->lock held.
  */
@@ -2372,6 +2412,8 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
         }
     }
 
+    kvmppc_clear_host_core(pcpu);
+
     /* Start all the threads */
     active = 0;
     for (sub = 0; sub < core_info.n_subcores; ++sub) {
@@ -2468,6 +2510,8 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
             kvmppc_ipi_thread(pcpu + i);
     }
 
+    kvmppc_set_host_core(pcpu);
+
     spin_unlock(&vc->lock);
 
     /* make sure updates to secondary vcpu structs are visible now */
[PATCH v3 0/9] KVM: PPC: Book3S HV: Optimize wakeup VCPU from H_IPI
When the VCPU target of an H_IPI hypercall is not running in the guest, we
need to do a kick VCPU (wake the VCPU thread) to make it runnable. The
real-mode version of the H_IPI hypercall cannot do this because it involves
waking a sleeping thread. Thus the hcall returns H_TOO_HARD, which forces a
switch back to the host so that the H_IPI call can be completed in virtual
mode. This has been found to cause a slowdown for many workloads like YCSB
MongoDB, small-message networking, etc.

One solution is to hand off this job of waking the VCPU to a CPU that is
running in the host by sending it a message through the IPI mechanism from
the hypercall.

This patch set optimizes the wakeup of the target VCPU by posting a free core
already running in the host to do the wakeup, thus avoiding the switch to the
host and back. It requires maintaining a bitmask of all the available cores
in the system to indicate if they are in the host or running in some guest.
It also requires the H_IPI hypercall to search for a free host core and send
it a new IPI message, PPC_MSG_RM_HOST_ACTION, after stashing away some
parameters like the pointer to the VCPU for the IPI handler. Locks are
avoided by using atomic operations to save core state, to find and reserve a
core in the host, etc.

Note that it is possible for a guest to be destroyed and its VCPUs freed
before the IPI handler gets to run. This case is handled by ensuring that any
pending PPC_MSG_RM_HOST_ACTION IPIs are completed before proceeding with
freeing the VCPUs.

Currently, powerpc only supports 4 IPI messages, and all 4 are already taken
for other purposes. This patch set also increases the number of supported IPI
messages to 8. It also provides the code to send an IPI from a hypercall
running in real mode, since the existing cause_ipi functions cannot be
executed in real mode.

A tunable, h_ipi_redirect, is also included in the patch set to disable the
feature.

v3:
* Updated/clarified commit logs.
* Fix build break when building without KVM.

v2:
* Complete patch set sent to both kvm and linuxppc mailing lists to avoid
  build breaks.
* Broke up the real-mode IPI messaging function into two pieces - one to set
  the message and one to cause the IPI. New function icp_native_cause_ipi_rm
  added to arch/powerpc/sysdev/xics/icp-native.c.

Suresh Warrier (9):
  powerpc/smp: Support more IPI messages
  powerpc/smp: Add smp_muxed_ipi_set_message
  powerpc/xics: Add icp_native_cause_ipi_rm
  KVM: PPC: Book3S HV: Host-side RM data structures
  KVM: PPC: Book3S HV: Manage core host state
  KVM: PPC: Book3S HV: kvmppc_host_rm_ops - handle offlining CPUs
  KVM: PPC: Book3S HV: Host side kick VCPU when poked by real-mode KVM
  KVM: PPC: Book3S HV: Send IPI to host core to wake VCPU
  KVM: PPC: Book3S HV: Add tunable to control H_IPI redirection

 arch/powerpc/include/asm/kvm_ppc.h    |  33 +++
 arch/powerpc/include/asm/smp.h        |   4 +
 arch/powerpc/include/asm/xics.h       |   1 +
 arch/powerpc/kernel/smp.c             |  28 +-
 arch/powerpc/kvm/book3s_hv.c          | 166 ++
 arch/powerpc/kvm/book3s_hv_builtin.c  |   3 +
 arch/powerpc/kvm/book3s_hv_rm_xics.c  | 120 +++-
 arch/powerpc/kvm/powerpc.c            |  10 ++
 arch/powerpc/sysdev/xics/icp-native.c |  21 +
 9 files changed, 378 insertions(+), 8 deletions(-)
[PATCH v3 1/9] powerpc/smp: Support more IPI messages
This patch increases the number of demuxed messages for a controller with a
single IPI to 8 for 64-bit systems.

This is required because we want to use the IPI mechanism to send messages
from a CPU running in KVM real mode in a guest to a CPU in the host to take
some action. Currently, we only support 4 messages and all 4 are already
taken.

Define a fifth message, PPC_MSG_RM_HOST_ACTION, for this purpose.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/include/asm/smp.h | 3 +++
 arch/powerpc/kernel/smp.c      | 8
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 825663c..9ef9c37 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -114,6 +114,9 @@ extern int cpu_to_core_id(int cpu);
 #define PPC_MSG_TICK_BROADCAST 2
 #define PPC_MSG_DEBUGGER_BREAK 3
 
+/* This is only used by the powernv kernel */
+#define PPC_MSG_RM_HOST_ACTION 4
+
 /* for irq controllers that have dedicated ipis per message (4) */
 extern int smp_request_message_ipi(int virq, int message);
 extern const char *smp_ipi_name[];
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ec9ec20..a53a130 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -206,7 +206,7 @@ int smp_request_message_ipi(int virq, int msg)
 
 #ifdef CONFIG_PPC_SMP_MUXED_IPI
 struct cpu_messages {
-    int messages;           /* current messages */
+    long messages;          /* current messages */
     unsigned long data;     /* data for cause ipi */
 };
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct cpu_messages, ipi_message);
@@ -236,15 +236,15 @@ void smp_muxed_ipi_message_pass(int cpu, int msg)
 }
 
 #ifdef __BIG_ENDIAN__
-#define IPI_MESSAGE(A) (1 << (24 - 8 * (A)))
+#define IPI_MESSAGE(A) (1uL << ((BITS_PER_LONG - 8) - 8 * (A)))
 #else
-#define IPI_MESSAGE(A) (1 << (8 * (A)))
+#define IPI_MESSAGE(A) (1uL << (8 * (A)))
 #endif
 
 irqreturn_t smp_ipi_demux(void)
 {
     struct cpu_messages *info = this_cpu_ptr(&ipi_message);
-    unsigned int all;
+    unsigned long all;
 
     mb();   /* order any irq clear */
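A worked illustration (not from the patch) of the new IPI_MESSAGE() encoding
may help: on a 64-bit kernel BITS_PER_LONG is 64, and each message number
selects one byte lane of the now-long "messages" word, which is also written
bytewise as message[msg] = 1 by the sender.

  /* Illustration only, assuming BITS_PER_LONG == 64. */
  #ifdef __BIG_ENDIAN__
  #define IPI_MESSAGE(A) (1uL << ((BITS_PER_LONG - 8) - 8 * (A)))  /* msg 0 -> bits 56..63 */
  #else
  #define IPI_MESSAGE(A) (1uL << (8 * (A)))                        /* msg 0 -> bits 0..7  */
  #endif

  /*
   * Big endian:    IPI_MESSAGE(0) == 0x0100000000000000, IPI_MESSAGE(4) == 0x0000000001000000
   * Little endian: IPI_MESSAGE(0) == 0x0000000000000001, IPI_MESSAGE(4) == 0x0000000100000000
   * In both cases message N lands in byte N of the char array overlaying
   * "messages", so a 64-bit long has room for 8 one-byte message slots.
   */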
[PATCH 3/6] KVM: PPC: Book3S HV: kvmppc_host_rm_ops - handle offlining CPUs
The kvmppc_host_rm_ops structure keeps track of which cores are in the host by maintaining a bitmask of active/runnable online CPUs that have not entered the guest. This patch adds support to manage the bitmask when a CPU is offlined or onlined in the host.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/kvm/book3s_hv.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 7a62aaa..390af5b 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -3012,6 +3012,36 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu) } #ifdef CONFIG_KVM_XICS +static int kvmppc_cpu_notify(struct notifier_block *self, unsigned long action, + void *hcpu) +{ + unsigned long cpu = (long)hcpu; + + switch (action) { + case CPU_UP_PREPARE: + case CPU_UP_PREPARE_FROZEN: + kvmppc_set_host_core(cpu); + break; + +#ifdef CONFIG_HOTPLUG_CPU + case CPU_DEAD: + case CPU_DEAD_FROZEN: + case CPU_UP_CANCELED: + case CPU_UP_CANCELED_FROZEN: + kvmppc_clear_host_core(cpu); + break; +#endif + default: + break; + } + + return NOTIFY_OK; +} + +static struct notifier_block kvmppc_cpu_notifier = { + .notifier_call = kvmppc_cpu_notify, +}; + /* * Allocate a per-core structure for managing state about which cores are * running in the host versus the guest and for exchanging data between @@ -3045,6 +3075,8 @@ void kvmppc_alloc_host_rm_ops(void) return; } + get_online_cpus(); + for (cpu = 0; cpu < nr_cpu_ids; cpu += threads_per_core) { if (!cpu_online(cpu)) continue; @@ -3063,14 +3095,21 @@ void kvmppc_alloc_host_rm_ops(void) l_ops = (unsigned long) ops; if (cmpxchg64((unsigned long *)&kvmppc_host_rm_ops_hv, 0, l_ops)) { + put_online_cpus(); kfree(ops->rm_core); kfree(ops); + return; } + + register_cpu_notifier(&kvmppc_cpu_notifier); + + put_online_cpus(); } void kvmppc_free_host_rm_ops(void) { if (kvmppc_host_rm_ops_hv) { + unregister_cpu_notifier(&kvmppc_cpu_notifier); kfree(kvmppc_host_rm_ops_hv->rm_core); kfree(kvmppc_host_rm_ops_hv); kvmppc_host_rm_ops_hv = NULL;
-- 1.8.3.4
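For debugging, the per-core state that this notifier maintains can be inspected by walking rm_core[]. The helper below is a hypothetical aid added here purely for illustration (it is not part of this patch); it relies only on structures and helpers defined earlier in the series:

	/* Hypothetical debug helper: report which cores are currently
	 * advertised as running in the host.
	 */
	static void kvmppc_dump_host_cores(void)
	{
		int core;

		if (!kvmppc_host_rm_ops_hv)
			return;

		for (core = 0; core < cpu_nr_cores(); core++) {
			union kvmppc_rm_state s =
				READ_ONCE(kvmppc_host_rm_ops_hv->rm_core[core].rm_state);

			pr_info("core %d: in_host=%u rm_action=%u\n",
				core, s.in_host, s.rm_action);
		}
	}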
[PATCH 5/6] KVM: PPC: Book3S HV: Send IPI to host core to wake VCPU
This patch adds support to real-mode KVM to search for a core running in the host partition and send it an IPI message with the VCPU to be woken. This avoids having to switch to the host partition to complete an H_IPI hypercall when the VCPU which is the target of the H_IPI is not loaded (is not running in the guest).

The patch also includes the support in the IPI handler running in the host to do the wakeup by calling kvmppc_xics_ipi_action for the PPC_MSG_RM_HOST_ACTION message.

When a guest is being destroyed, we need to ensure that there are no pending IPIs waiting to wake up a VCPU before we free the VCPUs of the guest. This is accomplished by:
 - forcing a PPC_MSG_CALL_FUNCTION IPI to be completed by all CPUs before freeing any VCPUs in kvm_arch_destroy_vm()
 - executing any pending PPC_MSG_RM_HOST_ACTION messages before any other PPC_MSG_CALL_FUNCTION messages

Signed-off-by: Suresh Warrier
---
 arch/powerpc/kernel/smp.c | 11 +
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 81 ++--
 arch/powerpc/kvm/powerpc.c | 10 +
 3 files changed, 99 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index 8c07bfad..8958c2a 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -280,6 +280,17 @@ irqreturn_t smp_ipi_demux(void) do { all = xchg(&info->messages, 0); +#if defined(CONFIG_KVM_XICS) && defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE) + /* +* Must check for PPC_MSG_RM_HOST_ACTION messages +* before PPC_MSG_CALL_FUNCTION messages because when +* a VM is destroyed, we call kick_all_cpus_sync() +* to ensure that any pending PPC_MSG_RM_HOST_ACTION +* messages have completed before we free any VCPUs. +*/ + if (all & IPI_MESSAGE(PPC_MSG_RM_HOST_ACTION)) + kvmppc_xics_ipi_action(); +#endif if (all & IPI_MESSAGE(PPC_MSG_CALL_FUNCTION)) generic_smp_call_function_interrupt(); if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c b/arch/powerpc/kvm/book3s_hv_rm_xics.c index 43ffbfe..b8d6403 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_xics.c +++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c @@ -51,11 +51,70 @@ static void ics_rm_check_resend(struct kvmppc_xics *xics, /* -- ICP routines -- */ +/* + * We start the search from our current CPU Id in the core map + * and go in a circle until we get back to our ID looking for a + * core that is running in host context and that hasn't already + * been targeted for another rm_host_ops. + * + * In the future, could consider using a fairer algorithm (one + * that distributes the IPIs better) + * + * Returns -1, if no CPU could be found in the host + * Else, returns a CPU Id which has been reserved for use + */ +static inline int grab_next_hostcore(int start, + struct kvmppc_host_rm_core *rm_core, int max, int action) +{ + bool success; + int core; + union kvmppc_rm_state old, new; + + for (core = start + 1; core < max; core++) { + old = new = READ_ONCE(rm_core[core].rm_state); + + if (!old.in_host || old.rm_action) + continue; + + /* Try to grab this host core if not taken already. */ + new.rm_action = action; + + success = cmpxchg64(&rm_core[core].rm_state.raw, + old.raw, new.raw) == old.raw; + if (success) { + /* +* Make sure that the store to the rm_action is made +* visible before we return to caller (and the +* subsequent store to rm_data) to synchronize with +* the IPI handler. 
+*/ + smp_wmb(); + return core; + } + } + + return -1; +} + +static inline int find_available_hostcore(int action) +{ + int core; + int my_core = smp_processor_id() >> threads_shift; + struct kvmppc_host_rm_core *rm_core = kvmppc_host_rm_ops_hv->rm_core; + + core = grab_next_hostcore(my_core, rm_core, cpu_nr_cores(), action); + if (core == -1) + core = grab_next_hostcore(core, rm_core, my_core, action); + + return core; +} + static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu, struct kvm_vcpu *this_vcpu) { struct kvmppc_icp *this_icp = this_vcpu->arch.icp; int cpu; + int hcore, hcpu; /* Mark the target VCPU as having an interrupt pending */ vcpu->stat.queue_intr++; @@ -67,11 +126,25 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
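To make the wrap-around in find_available_hostcore() concrete: with cpu_nr_cores() = 4 and the caller running on core 2, the first grab_next_hostcore() call scans only core 3; if that fails it returns -1, and the second call, grab_next_hostcore(-1, rm_core, 2, action), starts at core 0 and scans cores 0 and 1. The caller's own core, which is in the guest and hence issuing the hypercall, is never considered. This is my reading of the code above, not text from the patch.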
[PATCH 1/6] KVM: PPC: Book3S HV: Host-side RM data structures
This patch defines the data structures to support the setting up of host side operations while running in real mode in the guest, and also the functions to allocate and free it. The operations are for now limited to virtual XICS operations. Currently, we have only defined one operation in the data structure: - Wake up a VCPU sleeping in the host when it receives a virtual interrupt The operations are assigned at the core level because PowerKVM requires that the host run in SMT off mode. For each core, we will need to manage its state atomically - where the state is defined by: 1. Is the core running in the host? 2. Is there a Real Mode (RM) operation pending on the host? Currently, core state is only managed at the whole-core level even when the system is in split-core mode. This just limits the number of free or "available" cores in the host to perform any host-side operations. The kvmppc_host_rm_core.rm_data allows any data to be passed by KVM in real mode to the host core along with the operation to be performed. The kvmppc_host_rm_ops structure is allocated the very first time a guest VM is started. Initial core state is also set - all online cores are in the host. This structure is never deleted, not even when there are no active guests. However, it needs to be freed when the module is unloaded because the kvmppc_host_rm_ops_hv can contain function pointers to kvm-hv.ko functions for the different supported host operations. Signed-off-by: Suresh Warrier --- arch/powerpc/include/asm/kvm_ppc.h | 31 arch/powerpc/kvm/book3s_hv.c | 70 arch/powerpc/kvm/book3s_hv_builtin.c | 3 ++ 3 files changed, 104 insertions(+) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index c6ef05b..47cd441 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -437,6 +437,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu) { return vcpu->arch.irq_type == KVMPPC_IRQ_XICS; } +extern void kvmppc_alloc_host_rm_ops(void); +extern void kvmppc_free_host_rm_ops(void); extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu); extern int kvmppc_xics_create_icp(struct kvm_vcpu *vcpu, unsigned long server); extern int kvm_vm_ioctl_xics_irq(struct kvm *kvm, struct kvm_irq_level *args); @@ -446,6 +448,8 @@ extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval); extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu, u32 cpu); #else +static inline void kvmppc_alloc_host_rm_ops(void) {}; +static inline void kvmppc_free_host_rm_ops(void) {}; static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu) { return 0; } static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { } @@ -459,6 +463,33 @@ static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd) { return 0; } #endif +/* + * Host-side operations we want to set up while running in real + * mode in the guest operating on the xics. + * Currently only VCPU wakeup is supported. 
+ */ + +union kvmppc_rm_state { + unsigned long raw; + struct { + u32 in_host; + u32 rm_action; + }; +}; + +struct kvmppc_host_rm_core { + union kvmppc_rm_state rm_state; + void *rm_data; + char pad[112]; +}; + +struct kvmppc_host_rm_ops { + struct kvmppc_host_rm_core *rm_core; + void(*vcpu_kick)(struct kvm_vcpu *vcpu); +}; + +extern struct kvmppc_host_rm_ops *kvmppc_host_rm_ops_hv; + static inline unsigned long kvmppc_get_epr(struct kvm_vcpu *vcpu) { #ifdef CONFIG_KVM_BOOKE_HV diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 9c26c5a..9176e56 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2967,6 +2967,73 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu) goto out_srcu; } +#ifdef CONFIG_KVM_XICS +/* + * Allocate a per-core structure for managing state about which cores are + * running in the host versus the guest and for exchanging data between + * real mode KVM and CPU running in the host. + * This is only done for the first VM. + * The allocated structure stays even if all VMs have stopped. + * It is only freed when the kvm-hv module is unloaded. + * It's OK for this routine to fail, we just don't support host + * core operations like redirecting H_IPI wakeups. + */ +void kvmppc_alloc_host_rm_ops(void) +{ + struct kvmppc_host_rm_ops *ops; + unsigned long l_ops; + int cpu, core; + int size; + + /* Not the first time here ? */ + if (kvmppc_host_rm_ops_hv != NULL) + return; + + ops = kzalloc(sizeof(struct kvmppc_host_rm_ops), GFP_KERNEL); + if (!ops) + return; + + size = cpu_nr_cores() * sizeof(struct kvmppc_host_rm_core); + ops-
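The reason rm_state is a union becomes clearer with a small model: in_host and rm_action share a single 64-bit word, so one cmpxchg64() on .raw can check "in the host, no action pending" and claim the core in a single atomic step. A minimal user-space sketch of that idea (my illustration, not kernel code; the names and types here are made up):

	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stdint.h>

	union rm_state {
		uint64_t raw;
		struct {
			uint32_t in_host;
			uint32_t rm_action;
		};
	};

	static bool try_claim_core(_Atomic uint64_t *state, uint32_t action)
	{
		union rm_state old, new;

		old.raw = atomic_load(state);
		if (!old.in_host || old.rm_action)
			return false;		/* not in host, or already claimed */

		new = old;
		new.rm_action = action;
		/* succeeds only if nobody changed the word in between */
		return atomic_compare_exchange_strong(state, &old.raw, new.raw);
	}

The pad[112] in kvmppc_host_rm_core likewise looks like cache-line padding, so that two cores updating their rm_state words do not false-share, though the patch does not say so explicitly.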
[PATCH 4/6] KVM: PPC: Book3S HV: Host side kick VCPU when poked by real-mode KVM
This patch adds support for the kick VCPU operation to kvmppc_host_rm_ops. The kvmppc_xics_ipi_action() function is the host-side handler invoked when real-mode KVM pokes a free host core with an IPI. Before sending the IPI, real-mode KVM must set rm_action to XICS_RM_KICK_VCPU and rm_data to point to the VCPU to be woken up.

Note that we allocate one kvmppc_host_rm_core structure per core; the above values need to be set in the structure corresponding to the core to which the IPI will be sent.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/include/asm/kvm_ppc.h | 1 +
 arch/powerpc/kvm/book3s_hv.c | 2 ++
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 36
 3 files changed, 39 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 47cd441..1b93519 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -447,6 +447,7 @@ extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu); extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval); extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu, u32 cpu); +extern void kvmppc_xics_ipi_action(void); #else static inline void kvmppc_alloc_host_rm_ops(void) {}; static inline void kvmppc_free_host_rm_ops(void) {};
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 390af5b..80b1eb3 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -3085,6 +3085,8 @@ void kvmppc_alloc_host_rm_ops(void) ops->rm_core[core].rm_state.in_host = 1; } + ops->vcpu_kick = kvmppc_fast_vcpu_kick_hv; + /* * Make the contents of the kvmppc_host_rm_ops structure visible * to other CPUs before we assign it to the global variable.
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c b/arch/powerpc/kvm/book3s_hv_rm_xics.c index 24f5807..43ffbfe 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_xics.c +++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include "book3s_xics.h" @@ -623,3 +624,38 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr) bail: return check_too_hard(xics, icp); } + +/* --- Non-real mode XICS-related built-in routines --- */ + +/** + * Host Operations poked by RM KVM + */ +static void rm_host_ipi_action(int action, void *data) +{ + switch (action) { + case XICS_RM_KICK_VCPU: + kvmppc_host_rm_ops_hv->vcpu_kick(data); + break; + default: + WARN(1, "Unexpected rm_action=%d data=%p\n", action, data); + break; + } + +} + +void kvmppc_xics_ipi_action(void) +{ + int core; + unsigned int cpu = smp_processor_id(); + struct kvmppc_host_rm_core *rm_corep; + + core = cpu >> threads_shift; + rm_corep = &kvmppc_host_rm_ops_hv->rm_core[core]; + + if (rm_corep->rm_data) { + rm_host_ipi_action(rm_corep->rm_state.rm_action, + rm_corep->rm_data); + rm_corep->rm_data = NULL; + rm_corep->rm_state.rm_action = 0; + } +}
-- 1.8.3.4
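rm_host_ipi_action() is structured as a dispatch switch even though only one action exists so far. Purely as an illustration of how a second host-side action could plug in later (XICS_RM_SAMPLE_ICP and its handler are invented names, not part of this series):

	static void rm_host_ipi_action(int action, void *data)
	{
		switch (action) {
		case XICS_RM_KICK_VCPU:
			kvmppc_host_rm_ops_hv->vcpu_kick(data);
			break;
		case XICS_RM_SAMPLE_ICP:		/* hypothetical new action */
			sample_icp_state(data);		/* hypothetical handler */
			break;
		default:
			WARN(1, "Unexpected rm_action=%d data=%p\n", action, data);
			break;
		}
	}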
[PATCH 2/6] KVM: PPC: Book3S HV: Manage core host state
Update the core host state in kvmppc_host_rm_ops whenever the primary thread of the core enters the guest or returns to the host.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/kvm/book3s_hv.c | 44
 1 file changed, 44 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 9176e56..7a62aaa 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2261,6 +2261,46 @@ static void post_guest_process(struct kvmppc_vcore *vc, bool is_master) } /* + * Clear core from the list of active host cores as we are about to + * enter the guest. Only do this if it is the primary thread of the + * core (not a subcore thread) that is entering the guest. + */ +static inline void kvmppc_clear_host_core(int cpu) +{ + int core; + + if (!kvmppc_host_rm_ops_hv || cpu_thread_in_core(cpu)) + return; + /* +* Memory barrier can be omitted here as we will do a smp_wmb() +* later in kvmppc_start_thread and we need to ensure that state is +* visible to other CPUs only after we enter guest. +*/ + core = cpu >> threads_shift; + kvmppc_host_rm_ops_hv->rm_core[core].rm_state.in_host = 0; +} + +/* + * Advertise this core as an active host core since we exited the guest. + * Only need to do this if it is the primary thread of the core that is + * exiting. + */ +static inline void kvmppc_set_host_core(int cpu) +{ + int core; + + if (!kvmppc_host_rm_ops_hv || cpu_thread_in_core(cpu)) + return; + + /* +* Memory barrier can be omitted here because we do a spin_unlock +* immediately after this which provides the memory barrier. +*/ + core = cpu >> threads_shift; + kvmppc_host_rm_ops_hv->rm_core[core].rm_state.in_host = 1; +} + +/* * Run a set of guest threads on a physical core. * Called with vc->lock held. */ @@ -2372,6 +2412,8 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc) } } + kvmppc_clear_host_core(pcpu); + /* Start all the threads */ active = 0; for (sub = 0; sub < core_info.n_subcores; ++sub) { @@ -2468,6 +2510,8 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc) kvmppc_ipi_thread(pcpu + i); } + kvmppc_set_host_core(pcpu); + spin_unlock(&vc->lock); /* make sure updates to secondary vcpu structs are visible now */
-- 1.8.3.4
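A worked example of the cpu-to-core mapping used above (my numbers, assuming SMT8, i.e. threads_shift = 3): CPU 35 is thread 35 & 7 = 3 of core 35 >> 3 = 4, so cpu_thread_in_core(35) is non-zero and both helpers return early; only CPU 32, the primary thread of core 4, actually flips rm_core[4].rm_state.in_host when the core enters or leaves the guest.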
[PATCH 6/6] KVM: PPC: Book3S HV: Add tunable to control H_IPI redirection
Redirecting the wakeup of a VCPU from the H_IPI hypercall to a core running in the host is usually a good idea; most workloads seemed to benefit. However, in one heavily interrupt-driven SMT1 workload, some regression was observed. This patch adds a kvm_hv module parameter called h_ipi_redirect to control this feature. The default value for this tunable is 1, i.e. the feature is enabled.

Signed-off-by: Suresh Warrier
---
 arch/powerpc/include/asm/kvm_ppc.h | 1 +
 arch/powerpc/kvm/book3s_hv.c | 11 +++
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 5 -
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 1b93519..29d1442 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -448,6 +448,7 @@ extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval); extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu, u32 cpu); extern void kvmppc_xics_ipi_action(void); +extern int h_ipi_redirect; #else static inline void kvmppc_alloc_host_rm_ops(void) {}; static inline void kvmppc_free_host_rm_ops(void) {};
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 80b1eb3..20b2598 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -81,6 +81,17 @@ static int target_smt_mode; module_param(target_smt_mode, int, S_IRUGO | S_IWUSR); MODULE_PARM_DESC(target_smt_mode, "Target threads per core (0 = max)"); +#ifdef CONFIG_KVM_XICS +static struct kernel_param_ops module_param_ops = { + .set = param_set_int, + .get = param_get_int, +}; + +module_param_cb(h_ipi_redirect, &module_param_ops, &h_ipi_redirect, + S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(h_ipi_redirect, "Redirect H_IPI wakeup to a free host core"); +#endif + static void kvmppc_end_cede(struct kvm_vcpu *vcpu); static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c b/arch/powerpc/kvm/book3s_hv_rm_xics.c index b8d6403..b162f41 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_xics.c +++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c @@ -24,6 +24,9 @@ #define DEBUG_PASSUP +int h_ipi_redirect = 1; +EXPORT_SYMBOL(h_ipi_redirect); + static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp *icp, u32 new_irq); @@ -134,7 +137,7 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu, cpu = vcpu->arch.thread_cpu; if (cpu < 0 || cpu >= nr_cpu_ids) { hcore = -1; - if (kvmppc_host_rm_ops_hv) + if (kvmppc_host_rm_ops_hv && h_ipi_redirect) hcore = find_available_hostcore(XICS_RM_KICK_VCPU); if (hcore != -1) { hcpu = hcore << threads_shift;
-- 1.8.3.4
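Since the parameter is registered with S_IRUGO | S_IWUSR, it should be readable and writable at runtime through sysfs, presumably at /sys/module/kvm_hv/parameters/h_ipi_redirect (that path is my assumption from the usual module_param_cb behaviour, not something stated in the patch); writing 0 there would fall back to the original H_TOO_HARD path without reloading kvm-hv.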
[PATCH 0/6] KVM: PPC: Book3S HV: Optimize wakeup VCPU from H_IPI
When the VCPU target of an H_IPI hypercall is not running in the guest, we need to kick the VCPU (wake the VCPU thread) to make it runnable. The real-mode version of the H_IPI hypercall cannot do this because it involves waking a sleeping thread. Thus the hcall returns H_TOO_HARD, which forces a switch back to the host so that the H_IPI call can be completed in virtual mode. This has been found to cause a slowdown for many workloads, like YCSB MongoDB, small-message networking, etc.

This patch set optimizes the wakeup of the target VCPU by handing the wakeup off to a free core already running in the host, thus avoiding the switch to host and back. It requires maintaining a bitmask of all the available cores in the system to indicate whether they are in the host or running in some guest. It also requires the H_IPI hypercall to search for a free host core and send it a new IPI message PPC_MSG_RM_HOST_ACTION after stashing away some parameters, like the pointer to the VCPU, for the IPI handler. Locks are avoided by using atomic operations to save core state, to find and reserve a core in the host, etc.

Note that it is possible for a guest to be destroyed and its VCPUs freed before the IPI handler gets to run. This case is handled by ensuring that any pending PPC_MSG_RM_HOST_ACTION IPIs are completed before proceeding with freeing the VCPUs.

A tunable h_ipi_redirect is also included in the patch set to disable the feature.

This patch set depends upon powerpc patches that increase the number of supported IPI messages to 8 and also define the PPC_MSG_RM_HOST_ACTION message.

Suresh Warrier (6):
  KVM: PPC: Book3S HV: Host-side RM data structures
  KVM: PPC: Book3S HV: Manage core host state
  KVM: PPC: Book3S HV: kvmppc_host_rm_ops - handle offlining CPUs
  KVM: PPC: Book3S HV: Host side kick VCPU when poked by real-mode KVM
  KVM: PPC: Book3S HV: Send IPI to host core to wake VCPU
  KVM: PPC: Book3S HV: Add tunable to control H_IPI redirection

 arch/powerpc/include/asm/kvm_ppc.h | 33 +++
 arch/powerpc/kernel/smp.c | 11 +++
 arch/powerpc/kvm/book3s_hv.c | 166 +++
 arch/powerpc/kvm/book3s_hv_builtin.c | 3 +
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 120 -
 arch/powerpc/kvm/powerpc.c | 10 +++
 6 files changed, 340 insertions(+), 3 deletions(-)

-- 1.8.3.4