Re: [Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()
Hi Andrew,

On 2/21/19 12:22 PM, Andrew Cooper wrote:
> pci_release_devices() takes the global PCI lock. Once pci_release_devices()
> has completed, it will be called redundantly each time paging_teardown() and
> vcpu_destroy_pagetables() continue.
>
> This is liable to happen millions of times for a reasonably sized guest, and
> is a serialising bottleneck now that domain_kill() can be run concurrently on
> different domains.
>
> Instead of propagating the open-coding of the relinquish state machine, take
> the opportunity to clean it up.
>
> Leave a proper set of comments explaining that domain_relinquish_resources()
> implements a co-routine. Introduce a documented PROGRESS() macro to avoid
> latent bugs such as the RELMEM_xen case, and make the new PROG_* states
> private to domain_relinquish_resources().
>
> Signed-off-by: Andrew Cooper
> ---
> CC: Jan Beulich
> CC: Wei Liu
> CC: Roger Pau Monné
> CC: Juergen Gross
> CC: Stefano Stabellini
> CC: Julien Grall
>
> So I know Xen 4.12 isn't going to crash and burn without this change, but I
> also can't un-see the unnecessary global PCI lock contention.
>
> In terms of risk, this is extremely low - this function has complete coverage
> in testing, and its behaviour isn't changing dramatically.
>
> ARM: There are no problems, latent or otherwise, with your version of
> domain_relinquish_resources(), but I'd recommend the same cleanup in due
> course.

I will add this cleanup to my todo list!

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()
>>> On 21.02.19 at 14:31, wrote:
> On Thu, Feb 21, 2019 at 12:22:13PM +0000, Andrew Cooper wrote:
>> pci_release_devices() takes the global PCI lock. Once pci_release_devices()
>> has completed, it will be called redundantly each time paging_teardown() and
>> vcpu_destroy_pagetables() continue.
>>
>> This is liable to happen millions of times for a reasonably sized guest, and
>> is a serialising bottleneck now that domain_kill() can be run concurrently
>> on different domains.
>>
>> Instead of propagating the open-coding of the relinquish state machine, take
>> the opportunity to clean it up.
>>
>> Leave a proper set of comments explaining that domain_relinquish_resources()
>> implements a co-routine. Introduce a documented PROGRESS() macro to avoid
>> latent bugs such as the RELMEM_xen case, and make the new PROG_* states
>> private to domain_relinquish_resources().
>>
>> Signed-off-by: Andrew Cooper
>
> Reviewed-by: Wei Liu

Acked-by: Jan Beulich
Re: [Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()
On Thu, Feb 21, 2019 at 12:22:13PM +0000, Andrew Cooper wrote:
> pci_release_devices() takes the global PCI lock. Once pci_release_devices()
> has completed, it will be called redundantly each time paging_teardown() and
> vcpu_destroy_pagetables() continue.
>
> This is liable to happen millions of times for a reasonably sized guest, and
> is a serialising bottleneck now that domain_kill() can be run concurrently on
> different domains.
>
> Instead of propagating the open-coding of the relinquish state machine, take
> the opportunity to clean it up.
>
> Leave a proper set of comments explaining that domain_relinquish_resources()
> implements a co-routine. Introduce a documented PROGRESS() macro to avoid
> latent bugs such as the RELMEM_xen case, and make the new PROG_* states
> private to domain_relinquish_resources().
>
> Signed-off-by: Andrew Cooper

Reviewed-by: Wei Liu
Re: [Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()
On 21/02/2019 13:22, Andrew Cooper wrote:
> pci_release_devices() takes the global PCI lock. Once pci_release_devices()
> has completed, it will be called redundantly each time paging_teardown() and
> vcpu_destroy_pagetables() continue.
>
> This is liable to happen millions of times for a reasonably sized guest, and
> is a serialising bottleneck now that domain_kill() can be run concurrently on
> different domains.
>
> Instead of propagating the open-coding of the relinquish state machine, take
> the opportunity to clean it up.
>
> Leave a proper set of comments explaining that domain_relinquish_resources()
> implements a co-routine. Introduce a documented PROGRESS() macro to avoid
> latent bugs such as the RELMEM_xen case, and make the new PROG_* states
> private to domain_relinquish_resources().
>
> Signed-off-by: Andrew Cooper

Release-acked-by: Juergen Gross


Juergen
[Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()
pci_release_devices() takes the global PCI lock. Once pci_release_devices()
has completed, it will be called redundantly each time paging_teardown() and
vcpu_destroy_pagetables() continue.

This is liable to happen millions of times for a reasonably sized guest, and
is a serialising bottleneck now that domain_kill() can be run concurrently on
different domains.

Instead of propagating the open-coding of the relinquish state machine, take
the opportunity to clean it up.

Leave a proper set of comments explaining that domain_relinquish_resources()
implements a co-routine. Introduce a documented PROGRESS() macro to avoid
latent bugs such as the RELMEM_xen case, and make the new PROG_* states
private to domain_relinquish_resources().

Signed-off-by: Andrew Cooper
---
CC: Jan Beulich
CC: Wei Liu
CC: Roger Pau Monné
CC: Juergen Gross
CC: Stefano Stabellini
CC: Julien Grall

So I know Xen 4.12 isn't going to crash and burn without this change, but I
also can't un-see the unnecessary global PCI lock contention.

In terms of risk, this is extremely low - this function has complete coverage
in testing, and its behaviour isn't changing dramatically.

ARM: There are no problems, latent or otherwise, with your version of
domain_relinquish_resources(), but I'd recommend the same cleanup in due
course.
---
 xen/arch/x86/domain.c        | 71 +---
 xen/include/asm-x86/domain.h | 10 +--
 2 files changed, 48 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253..7a29435 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -475,8 +475,6 @@ int arch_domain_create(struct domain *d,
     int rc;
 
     INIT_LIST_HEAD(&d->arch.pdev_list);
-
-    d->arch.relmem = RELMEM_not_started;
     INIT_PAGE_LIST_HEAD(&d->arch.relmem_list);
 
     spin_lock_init(&d->arch.e820_lock);
@@ -2020,18 +2018,51 @@ int domain_relinquish_resources(struct domain *d)
 
     BUG_ON(!cpumask_empty(d->dirty_cpumask));
 
-    switch ( d->arch.relmem )
+    /*
+     * This hypercall can take minutes of wallclock time to complete.  This
+     * logic implements a co-routine, stashing state in struct domain across
+     * hypercall continuation boundaries.
+     */
+    switch ( d->arch.rel_priv )
     {
-    case RELMEM_not_started:
+        /*
+         * Record the current progress.  Subsequent hypercall continuations
+         * will logically restart work from this point.
+         *
+         * PROGRESS() markers must not be in the middle of loops.  The loop
+         * variable isn't preserved across a continuation.
+         *
+         * To avoid redundant work, there should be a marker before each
+         * function which may return -ERESTART.
+         */
+#define PROGRESS(x)                                                     \
+        d->arch.rel_priv = PROG_ ## x; /* Fallthrough */ case PROG_ ## x
+
+        enum {
+            PROG_paging = 1,
+            PROG_vcpu_pagetables,
+            PROG_shared,
+            PROG_xen,
+            PROG_l4,
+            PROG_l3,
+            PROG_l2,
+            PROG_done,
+        };
+
+    case 0:
         ret = pci_release_devices(d);
         if ( ret )
             return ret;
 
+    PROGRESS(paging):
+
         /* Tear down paging-assistance stuff. */
         ret = paging_teardown(d);
         if ( ret )
             return ret;
 
+    PROGRESS(vcpu_pagetables):
+
         /* Drop the in-use references to page-table bases. */
         for_each_vcpu ( d, v )
         {
@@ -2058,10 +2089,7 @@ int domain_relinquish_resources(struct domain *d)
             d->arch.auto_unmask = 0;
         }
 
-        d->arch.relmem = RELMEM_shared;
-        /* fallthrough */
-
-    case RELMEM_shared:
+    PROGRESS(shared):
 
         if ( is_hvm_domain(d) )
         {
@@ -2072,45 +2100,40 @@ int domain_relinquish_resources(struct domain *d)
                 return ret;
         }
 
-        d->arch.relmem = RELMEM_xen;
-
         spin_lock(&d->page_alloc_lock);
         page_list_splice(&d->arch.relmem_list, &d->page_list);
         INIT_PAGE_LIST_HEAD(&d->arch.relmem_list);
         spin_unlock(&d->page_alloc_lock);
 
-        /* Fallthrough. Relinquish every page of memory. */
-    case RELMEM_xen:
+    PROGRESS(xen):
+
         ret = relinquish_memory(d, &d->xenpage_list, ~0UL);
         if ( ret )
             return ret;
 
-        d->arch.relmem = RELMEM_l4;
-        /* fallthrough */
-
-    case RELMEM_l4:
+    PROGRESS(l4):
+
         ret = relinquish_memory(d, &d->page_list, PGT_l4_page_table);
         if ( ret )
             return ret;
 
-        d->arch.relmem = RELMEM_l3;
-        /* fallthrough */
-
-    case RELMEM_l3:
+    PROGRESS(l3):
+
         ret = relinquish_memory(d, &d->page_list, PGT_l3_page_table);
         if ( ret )
             return ret;
 
-        d->arch.relmem = RELMEM_l2;
-        /* fallthrough */
-
-    case RELMEM_l2:
+    PROGRESS(l2):
+
         ret = relinquish_memory(d, &d->page_list,
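The switch-based co-routine that the patch describes can be modelled outside Xen. Below is a minimal, hypothetical sketch, not Xen's real code: the `struct domain` fields, the `ERESTART` value, and the step names are stand-ins. A `PROGRESS()` marker stores the resume point and then falls through into its own `case` label, so a continuation re-enters the switch at the last marker and completed one-shot steps never run again.

```c
#include <assert.h>

#define ERESTART 1  /* hypothetical stand-in for Xen's -ERESTART errno value */

struct domain {
    int rel_priv;         /* progress marker, persists across continuations */
    int devices_released; /* counts how often the one-shot step ran */
    int pages_left;       /* simulated bulk work, one unit retired per call */
};

enum {
    PROG_paging = 1,
    PROG_memory,
    PROG_done,
};

/*
 * Record the current progress, then fall through into the matching case
 * label.  A later re-entry jumps straight to the label, skipping every
 * earlier step (mirrors the PROGRESS() macro in the patch).
 */
#define PROGRESS(x) \
    d->rel_priv = PROG_ ## x; /* Fallthrough */ case PROG_ ## x

/* Simulated step that can only retire one unit of work per invocation. */
static int relinquish_some(struct domain *d)
{
    if ( d->pages_left == 0 )
        return 0;
    d->pages_left--;
    return -ERESTART;  /* more work remains: ask for a continuation */
}

int domain_relinquish_resources(struct domain *d)
{
    int ret;

    switch ( d->rel_priv )
    {
    case 0:
        /* One-shot step (stands in for pci_release_devices()). */
        d->devices_released++;

    PROGRESS(paging):
        /* Another one-shot step would go here. */

    PROGRESS(memory):
        ret = relinquish_some(d);
        if ( ret )
            return ret;  /* re-entered later at PROG_memory, not at case 0 */

    PROGRESS(done):
        break;
    }

    return 0;
}
```

This also shows why the markers must not sit inside loops: a continuation re-enters at the `case` label with all of the function's local variables freshly uninitialised, so any loop counter would restart from scratch; only state stashed in the persistent structure survives.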