Re: [RFC PATCH v0 5/5] pseries: Asynchronous page fault support
On Fri, Aug 13, 2021 at 02:06:40PM +1000, Nicholas Piggin wrote:
> Excerpts from Bharata B Rao's message of August 5, 2021 5:24 pm:
> > Add asynchronous page fault support for pseries guests.
> >
> > 1. Setup the guest to handle async-pf
> >    - Issue H_REG_SNS hcall to register the SNS region.
> >    - Setup the subvention interrupt irq.
> >    - Enable async-pf by updating the byte_b9 of VPA for each
> >      CPU.
> > 2. Check if the page fault is an expropriation notification
> >    (SRR1_PROGTRAP set in SRR1) and if so put the task on
> >    wait queue based on the expropriation correlation number
> >    read from the VPA.
> > 3. Handle subvention interrupt to wake any waiting tasks.
> >    The wait and wakeup mechanism from x86 async-pf implementation
> >    is being reused here.
>
> I don't know too much about the background of this.
>
> How much benefit does this give? What situations?

I haven't yet gotten into measuring the benefit of this. Once the
patches are bit more stable than what they are currently, we need to
measure and evaluate the benefits.

> Does PowerVM implement it?

I suppose so, need to check though.

> Do other architectures KVM have something similar?

Yes, x86 and s390 KVM have had this feature for a while now
and generic KVM interfaces exist to support it.

>
> The SRR1 setting for the DSI is in PAPR? In that case it should be okay,
> it might be good to add a small comment in exceptions-64s.S.

Yes, SRR1 setting is part of PAPR.

>
> [...]
>
> > @@ -395,6 +395,11 @@ static int ___do_page_fault(struct pt_regs *regs,
> > unsigned long address,
> >  	vm_fault_t fault, major = 0;
> >  	bool kprobe_fault = kprobe_page_fault(regs, 11);
> >
> > +#ifdef CONFIG_PPC_PSERIES
> > +	if (handle_async_page_fault(regs, address))
> > +		return 0;
> > +#endif
> > +
> >  	if (unlikely(debugger_fault_handler(regs) || kprobe_fault))
> >  		return 0;
>
> [...]
>
> > +int handle_async_page_fault(struct pt_regs *regs, unsigned long addr)
> > +{
> > +	struct async_pf_sleep_node n;
> > +	DECLARE_SWAITQUEUE(wait);
> > +	unsigned long exp_corr_nr;
> > +
> > +	/* Is this Expropriation notification? */
> > +	if (!(mfspr(SPRN_SRR1) & SRR1_PROGTRAP))
> > +		return 0;
>
> Yep this should be an inline that is guarded by a static key, and then
> probably have an inline check for SRR1_PROGTRAP. You shouldn't need to
> mfspr here, but just use regs->msr.

Right.

>
> > +
> > +	if (unlikely(!user_mode(regs)))
> > +		panic("Host injected async PF in kernel mode\n");
>
> Hmm. Is there anything in the PAPR interface that specifies that the
> OS can only deal with problem state access faults here? Or is that
> inherent in the expropriation feature?

Didn't see anything specific to that effect in PAPR. However since
this puts the faulting guest process to sleep until the page
becomes ready in the host, I have limited it to guest user space
faults.

Regards,
Bharata.
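The sleep-until-ready behaviour discussed above hinges on matching a fault to its wakeup by correlation number, including the case where the wakeup arrives first. A minimal userspace sketch of that matching follows. It is not the patch's code: the fixed-size table, all names, and the wake side are assumptions, modelled on the "dummy entry" comment in async_pf_queue_task(), and locking is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SLOTS 16	/* arbitrary table size, for illustration only */

/* Hypothetical stand-in for struct async_pf_sleep_node. */
struct sleeper {
	uint64_t token;	/* expropriation correlation number */
	bool in_use;
	bool dummy;	/* left behind by a wakeup that beat the fault */
};

static struct sleeper slots[NR_SLOTS];

static struct sleeper *find_slot(uint64_t token)
{
	for (size_t i = 0; i < NR_SLOTS; i++)
		if (slots[i].in_use && slots[i].token == token)
			return &slots[i];
	return NULL;
}

static struct sleeper *alloc_slot(uint64_t token, bool dummy)
{
	for (size_t i = 0; i < NR_SLOTS; i++) {
		if (!slots[i].in_use) {
			slots[i].token = token;
			slots[i].in_use = true;
			slots[i].dummy = dummy;
			return &slots[i];
		}
	}
	return NULL;
}

/* Fault side: true means the task has to sleep, false means no sleep. */
static bool queue_task(uint64_t token)
{
	struct sleeper *s = find_slot(token);

	if (s) {
		/* An entry already exists: the wakeup came first. */
		s->in_use = false;
		return false;
	}
	return alloc_slot(token, false) != NULL;
}

/* Wake side (assumed): true means a queued sleeper was released. */
static bool wake_task(uint64_t token)
{
	struct sleeper *s = find_slot(token);

	if (s) {
		s->in_use = false;
		return true;
	}
	/* No sleeper yet: leave a dummy entry for the fault to find. */
	alloc_slot(token, true);
	return false;
}
```

Calling queue_task() and then wake_task() with the same token models the normal order; calling them the other way round exercises the dummy-entry path, where the fault side returns without sleeping.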
Re: [RFC PATCH v0 5/5] pseries: Asynchronous page fault support
Excerpts from Bharata B Rao's message of August 5, 2021 5:24 pm:
> Add asynchronous page fault support for pseries guests.
>
> 1. Setup the guest to handle async-pf
>    - Issue H_REG_SNS hcall to register the SNS region.
>    - Setup the subvention interrupt irq.
>    - Enable async-pf by updating the byte_b9 of VPA for each
>      CPU.
> 2. Check if the page fault is an expropriation notification
>    (SRR1_PROGTRAP set in SRR1) and if so put the task on
>    wait queue based on the expropriation correlation number
>    read from the VPA.
> 3. Handle subvention interrupt to wake any waiting tasks.
>    The wait and wakeup mechanism from x86 async-pf implementation
>    is being reused here.

I don't know too much about the background of this.

How much benefit does this give? What situations?

Does PowerVM implement it?

Do other architectures KVM have something similar?

The SRR1 setting for the DSI is in PAPR? In that case it should be okay,
it might be good to add a small comment in exceptions-64s.S.

[...]

> @@ -395,6 +395,11 @@ static int ___do_page_fault(struct pt_regs *regs,
> unsigned long address,
>  	vm_fault_t fault, major = 0;
>  	bool kprobe_fault = kprobe_page_fault(regs, 11);
>
> +#ifdef CONFIG_PPC_PSERIES
> +	if (handle_async_page_fault(regs, address))
> +		return 0;
> +#endif
> +
>  	if (unlikely(debugger_fault_handler(regs) || kprobe_fault))
>  		return 0;

[...]

> +int handle_async_page_fault(struct pt_regs *regs, unsigned long addr)
> +{
> +	struct async_pf_sleep_node n;
> +	DECLARE_SWAITQUEUE(wait);
> +	unsigned long exp_corr_nr;
> +
> +	/* Is this Expropriation notification? */
> +	if (!(mfspr(SPRN_SRR1) & SRR1_PROGTRAP))
> +		return 0;

Yep this should be an inline that is guarded by a static key, and then
probably have an inline check for SRR1_PROGTRAP. You shouldn't need to
mfspr here, but just use regs->msr.

> +
> +	if (unlikely(!user_mode(regs)))
> +		panic("Host injected async PF in kernel mode\n");

Hmm. Is there anything in the PAPR interface that specifies that the
OS can only deal with problem state access faults here? Or is that
inherent in the expropriation feature?

Thanks,
Nick
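The static-key suggestion above could look roughly like the following userspace sketch. It is not kernel code: a plain bool stands in for the static key, struct pt_regs_sketch and every *_sketch name are hypothetical, and the SRR1_PROGTRAP value is assumed to match the kernel's reg.h definition.

```c
#include <stdbool.h>

/* Assumed to mirror the SRR1_PROGTRAP bit from the kernel's reg.h. */
#define SRR1_PROGTRAP 0x00020000UL

/* Hypothetical stand-in for struct pt_regs; only msr is modelled. */
struct pt_regs_sketch {
	unsigned long msr;	/* holds the SRR1 contents saved at entry */
};

/*
 * Stand-in for a static key. In the kernel this would be a
 * DEFINE_STATIC_KEY_FALSE() key enabled once during async-pf setup.
 */
static bool async_pf_enabled_sketch;

/* Slow-path placeholder; the real one would queue the task and sleep. */
static int handle_async_page_fault_slow_sketch(struct pt_regs_sketch *regs)
{
	(void)regs;
	return 1;
}

/*
 * Inline fast path: bail out early when the feature is off or when
 * the saved SRR1 bits do not flag an expropriation notification.
 * No mfspr is needed because regs->msr already carries those bits.
 */
static inline int handle_async_page_fault_sketch(struct pt_regs_sketch *regs)
{
	if (!async_pf_enabled_sketch)
		return 0;
	if (!(regs->msr & SRR1_PROGTRAP))
		return 0;
	return handle_async_page_fault_slow_sketch(regs);
}
```

With this shape, a guest that never enables async-pf pays only for the key check in the page fault path, and the out-of-line slow path is reached only for faults that carry the notification bit.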
[RFC PATCH v0 5/5] pseries: Asynchronous page fault support
Add asynchronous page fault support for pseries guests.

1. Setup the guest to handle async-pf
   - Issue H_REG_SNS hcall to register the SNS region.
   - Setup the subvention interrupt irq.
   - Enable async-pf by updating the byte_b9 of VPA for each
     CPU.
2. Check if the page fault is an expropriation notification
   (SRR1_PROGTRAP set in SRR1) and if so put the task on
   wait queue based on the expropriation correlation number
   read from the VPA.
3. Handle subvention interrupt to wake any waiting tasks.
   The wait and wakeup mechanism from x86 async-pf implementation
   is being reused here.

TODO:
- Check how to keep this feature together with other CMO features.
- The async-pf check in the page fault handler path is limited to
  guest with an #ifdef. This isn't sufficient and hence needs to
  be replaced by an appropriate check.

Signed-off-by: Bharata B Rao
---
 arch/powerpc/include/asm/async-pf.h       |  12 ++
 arch/powerpc/mm/fault.c                   |   7 +-
 arch/powerpc/platforms/pseries/Makefile   |   2 +-
 arch/powerpc/platforms/pseries/async-pf.c | 219 ++
 4 files changed, 238 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/include/asm/async-pf.h
 create mode 100644 arch/powerpc/platforms/pseries/async-pf.c

diff --git a/arch/powerpc/include/asm/async-pf.h b/arch/powerpc/include/asm/async-pf.h
new file mode 100644
index ..95d6c3da9f50
--- /dev/null
+++ b/arch/powerpc/include/asm/async-pf.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Async page fault support via PAPR Expropriation/Subvention Notification
+ * option(ESN)
+ *
+ * Copyright 2020 Bharata B Rao, IBM Corp.
+ */
+
+#ifndef _ASM_POWERPC_ASYNC_PF_H
+int handle_async_page_fault(struct pt_regs *regs, unsigned long addr);
+#define _ASM_POWERPC_ASYNC_PF_H
+#endif
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index a8d0ce85d39a..bbdc61605885 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -44,7 +44,7 @@
 #include
 #include
 #include
-
+#include
 
 /*
  * do_page_fault error handling helpers
@@ -395,6 +395,11 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 	vm_fault_t fault, major = 0;
 	bool kprobe_fault = kprobe_page_fault(regs, 11);
 
+#ifdef CONFIG_PPC_PSERIES
+	if (handle_async_page_fault(regs, address))
+		return 0;
+#endif
+
 	if (unlikely(debugger_fault_handler(regs) || kprobe_fault))
 		return 0;
 
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index 4cda0ef87be0..e0ada605ef20 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -6,7 +6,7 @@ obj-y			:= lpar.o hvCall.o nvram.o reconfig.o \
 			   of_helpers.o \
 			   setup.o iommu.o event_sources.o ras.o \
 			   firmware.o power.o dlpar.o mobility.o rng.o \
-			   pci.o pci_dlpar.o eeh_pseries.o msi.o
+			   pci.o pci_dlpar.o eeh_pseries.o msi.o async-pf.o
 obj-$(CONFIG_SMP)		+= smp.o
 obj-$(CONFIG_SCANLOG)		+= scanlog.o
 obj-$(CONFIG_KEXEC_CORE)	+= kexec.o
diff --git a/arch/powerpc/platforms/pseries/async-pf.c b/arch/powerpc/platforms/pseries/async-pf.c
new file mode 100644
index ..c2f3bbc0d674
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/async-pf.c
@@ -0,0 +1,219 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Async page fault support via PAPR Expropriation/Subvention Notification
+ * option(ESN)
+ *
+ * Copyright 2020 Bharata B Rao, IBM Corp.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+static char sns_buffer[PAGE_SIZE] __aligned(4096);
+static uint16_t *esn_q = (uint16_t *)sns_buffer + 1;
+static unsigned long next_eq_entry, nr_eq_entries;
+
+#define ASYNC_PF_SLEEP_HASHBITS 8
+#define ASYNC_PF_SLEEP_HASHSIZE (1 [...] token == token)
+			return n;
+	}
+
+	return NULL;
+}
+static int async_pf_queue_task(u64 token, struct async_pf_sleep_node *n)
+{
+	u64 key = hash_64(token, ASYNC_PF_SLEEP_HASHBITS);
+	struct async_pf_sleep_head *b = &async_pf_sleepers[key];
+	struct async_pf_sleep_node *e;
+
+	raw_spin_lock(&b->lock);
+	e = _find_apf_task(b, token);
+	if (e) {
+		/* dummy entry exist -> wake up was delivered ahead of PF */
+		hlist_del(&e->link);
+		raw_spin_unlock(&b->lock);
+		kfree(e);
+		return false;
+	}
+
+	n->token = token;
+	n->cpu = smp_processor_id();
+	init_swait_queue_head(&n->wq);
+	hlist_add_head(&n->link, &b->list);
+	raw_spin_unlock(&b->lock);
+	return true;
+}
+
+/*
+ * Handle Expropriation no