Re: [Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()

2019-02-27 Thread Julien Grall

Hi Andrew,

On 2/21/19 12:22 PM, Andrew Cooper wrote:

pci_release_devices() takes the global PCI lock. [...]

ARM: There are no problems, latent or otherwise, with your version of
domain_relinquish_resources(), but I'd recommend the same cleanup in due
course.


I will add this cleanup to my todo list!

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()

2019-02-22 Thread Jan Beulich
>>> On 21.02.19 at 14:31,  wrote:
> On Thu, Feb 21, 2019 at 12:22:13PM +, Andrew Cooper wrote:
>> pci_release_devices() takes the global PCI lock. [...]
>> 
>> Signed-off-by: Andrew Cooper 
> 
> Reviewed-by: Wei Liu 

Acked-by: Jan Beulich 




Re: [Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()

2019-02-21 Thread Wei Liu
On Thu, Feb 21, 2019 at 12:22:13PM +, Andrew Cooper wrote:
> pci_release_devices() takes the global PCI lock. [...]
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Wei Liu 


Re: [Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()

2019-02-21 Thread Juergen Gross
On 21/02/2019 13:22, Andrew Cooper wrote:
> pci_release_devices() takes the global PCI lock. [...]
> 
> Signed-off-by: Andrew Cooper 

Release-acked-by: Juergen Gross 


Juergen


[Xen-devel] [PATCH for-4.12] x86: Improve the efficiency of domain_relinquish_resources()

2019-02-21 Thread Andrew Cooper
pci_release_devices() takes the global PCI lock.  Once pci_release_devices()
has completed, it will be called redundantly each time paging_teardown() and
vcpu_destroy_pagetables() continue.

This is liable to happen millions of times for a reasonably sized guest, and
is a serialising bottleneck now that domain_kill() can be run concurrently on
different domains.

Instead of propagating the open-coding of the relinquish state machine, take
the opportunity to clean it up.

Leave a proper set of comments explaining that domain_relinquish_resources()
implements a co-routine.  Introduce a documented PROGRESS() macro to avoid
latent bugs such as the RELMEM_xen case, and make the new PROG_* states
private to domain_relinquish_resources().

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Juergen Gross 
CC: Stefano Stabellini 
CC: Julien Grall 

So I know Xen 4.12 isn't going to crash and burn without this change, but I
also can't un-see the unnecessary global PCI lock contention.  In terms of
risk, this is extremely low - this function has complete coverage in testing,
and its behaviour isn't changing dramatically.

ARM: There are no problems, latent or otherwise, with your version of
domain_relinquish_resources(), but I'd recommend the same cleanup in due
course.
---
 xen/arch/x86/domain.c        | 71 ++++++++++++++++++++++++++++---------------
 xen/include/asm-x86/domain.h | 10 +++-------
 2 files changed, 48 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253..7a29435 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -475,8 +475,6 @@ int arch_domain_create(struct domain *d,
     int rc;
 
     INIT_LIST_HEAD(&d->arch.pdev_list);
-
-    d->arch.relmem = RELMEM_not_started;
     INIT_PAGE_LIST_HEAD(&d->arch.relmem_list);
 
     spin_lock_init(&d->arch.e820_lock);
@@ -2020,18 +2018,51 @@ int domain_relinquish_resources(struct domain *d)
 
     BUG_ON(!cpumask_empty(d->dirty_cpumask));
 
-    switch ( d->arch.relmem )
+    /*
+     * This hypercall can take minutes of wallclock time to complete.  This
+     * logic implements a co-routine, stashing state in struct domain across
+     * hypercall continuation boundaries.
+     */
+    switch ( d->arch.rel_priv )
     {
-    case RELMEM_not_started:
+        /*
+         * Record the current progress.  Subsequent hypercall continuations
+         * will logically restart work from this point.
+         *
+         * PROGRESS() markers must not be in the middle of loops.  The loop
+         * variable isn't preserved across a continuation.
+         *
+         * To avoid redundant work, there should be a marker before each
+         * function which may return -ERESTART.
+         */
+#define PROGRESS(x)                                                     \
+        d->arch.rel_priv = PROG_ ## x; /* Fallthrough */ case PROG_ ## x
+
+        enum {
+            PROG_paging = 1,
+            PROG_vcpu_pagetables,
+            PROG_shared,
+            PROG_xen,
+            PROG_l4,
+            PROG_l3,
+            PROG_l2,
+            PROG_done,
+        };
+
+    case 0:
         ret = pci_release_devices(d);
         if ( ret )
             return ret;
 
+    PROGRESS(paging):
+
         /* Tear down paging-assistance stuff. */
         ret = paging_teardown(d);
         if ( ret )
             return ret;
 
+    PROGRESS(vcpu_pagetables):
+
         /* Drop the in-use references to page-table bases. */
         for_each_vcpu ( d, v )
         {
@@ -2058,10 +2089,7 @@ int domain_relinquish_resources(struct domain *d)
             d->arch.auto_unmask = 0;
         }
 
-        d->arch.relmem = RELMEM_shared;
-        /* fallthrough */
-
-    case RELMEM_shared:
+    PROGRESS(shared):
 
         if ( is_hvm_domain(d) )
         {
@@ -2072,45 +2100,40 @@ int domain_relinquish_resources(struct domain *d)
                 return ret;
         }
 
-        d->arch.relmem = RELMEM_xen;
-
         spin_lock(&d->page_alloc_lock);
         page_list_splice(&d->arch.relmem_list, &d->page_list);
         INIT_PAGE_LIST_HEAD(&d->arch.relmem_list);
         spin_unlock(&d->page_alloc_lock);
 
-        /* Fallthrough. Relinquish every page of memory. */
-    case RELMEM_xen:
+    PROGRESS(xen):
+
         ret = relinquish_memory(d, &d->xenpage_list, ~0UL);
         if ( ret )
             return ret;
-        d->arch.relmem = RELMEM_l4;
-        /* fallthrough */
 
-    case RELMEM_l4:
+    PROGRESS(l4):
+
         ret = relinquish_memory(d, &d->page_list, PGT_l4_page_table);
         if ( ret )
             return ret;
-        d->arch.relmem = RELMEM_l3;
-        /* fallthrough */
 
-    case RELMEM_l3:
+    PROGRESS(l3):
+
         ret = relinquish_memory(d, &d->page_list, PGT_l3_page_table);
         if ( ret )
             return ret;
-        d->arch.relmem = RELMEM_l2;
-        /* fallthrough */
 
-    case RELMEM_l2:
+    PROGRESS(l2):
+
         ret = relinquish_memory(d, &d->page_list,
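
The technique at the heart of the patch - a switch-based co-routine whose
PROGRESS() markers double as case labels, so that a function returning
-ERESTART can be re-invoked and resume where it left off - can be seen in
isolation in the following minimal sketch.  The struct, the toy workload and
the ERESTART stand-in are illustrative assumptions for demonstration only,
not the Xen code.

/*
 * Minimal, standalone sketch of the continuation pattern (illustrative
 * names throughout; this is not the Xen implementation).
 */
#include <stdio.h>

#define ERESTART 85   /* stand-in for the hypervisor's -ERESTART */

struct toy_domain {
    int progress;     /* persists across re-invocations */
    int work_left;    /* pretend preemptible workload */
};

/* A step that must be re-entered several times before it completes. */
static int toy_step(struct toy_domain *d)
{
    if ( d->work_left > 0 )
    {
        d->work_left--;
        return -ERESTART;        /* preempted; call us again */
    }

    return 0;
}

static int toy_relinquish(struct toy_domain *d)
{
    int ret;

    switch ( d->progress )
    {
        /*
         * Record progress.  A later invocation re-enters at the case
         * label, so completed work (e.g. the one-shot setup in case 0)
         * is never repeated.
         */
#define PROGRESS(x)                                                 \
        d->progress = PROG_ ## x; /* Fallthrough */ case PROG_ ## x

        enum { PROG_step = 1, PROG_done, };

    case 0:
        puts("one-shot setup");  /* runs exactly once */

    PROGRESS(step):

        ret = toy_step(d);
        if ( ret )
            return ret;          /* next call resumes at PROG_step */

    PROGRESS(done):
        break;

#undef PROGRESS
    }

    return 0;
}

int main(void)
{
    struct toy_domain d = { 0, 3 };
    int restarts = 0;

    while ( toy_relinquish(&d) == -ERESTART )
        restarts++;

    printf("completed after %d restarts\n", restarts);
    return 0;
}

Running the sketch prints "one-shot setup" exactly once and reports three
restarts: each -ERESTART return models a hypercall continuation, and the
stored progress value keeps earlier stages (such as the pci_release_devices()
call in the real patch) from being executed redundantly.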