Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
>>> On 05.07.16 at 16:35, wrote:
> On Thu, Jun 23, 2016 at 4:42 PM, Tamas K Lengyel wrote:
+            if ( !atomic_read(&d->pause_count) ||
+                 !atomic_read(&cd->pause_count) )
+            {
+                rcu_unlock_domain(cd);
+                rc = -EINVAL;
+                goto out;
+            }
>>>
>>> I realize that Jan asked for this, but I'm really not sure what good
>>> this is supposed to do, since the guest can be un-paused at any point
>>> halfway through the transaction.
>>>
>>> Wouldn't it make more sense to have this function pause and unpause
>>> the domains itself?
>>
>> The domains are paused by the user when this op is called, so this is
>> just a sanity check ensuring you are not issuing the op on some other
>> domain. So if this function would pause the domain as well, all it
>> would do is increase the pause count. This is the only intended
>> use-case for this function at this time. It would make no sense to try
>> to issue this op on domains that are running, as that pretty much
>> guarantees that some of their memory has already diverged, and thus
>> bulk-sharing their memory would have some unintended side-effects.
>
> Yes, I understand all that. But what I'm saying (and mostly this is
> actually to Jan that I'm saying it) is that this check, as written,
> seems pointless to me. We're effectively *not* trusting the toolstack
> to pause the domain, but we *are* trusting the toolstack not to unpause
> the domain before the operation is completed. It seems to me we
> should either trust the toolstack to pause the domain and leave it
> paused -- letting it suffer the consequences if it fails to do so --
> or we should pause and unpause the domain ourselves.
>
> Adding an extra pause count is simple and harmless. If we're going to
> make an effort to think about buggy toolstacks, we might as well just
> make it 100% safe.
>
>>> To start with, it seems like having a "bulk share" option which could
>>> do just a specific range would be useful, as well as a "bulk share"
>>> which automatically deduped the entire set of memory.
>>
>> I have no need for such options.
>
> Yes, but it's not unlikely that somebody else may need them at some
> point in the future; and it's only a tiny bit of adjustment to make
> them more generally useful than just your current specific use case,
> so that we can avoid changing the interface, or adding yet another
> hypercall in the future.

Well, I don't view it as a guarantee, but simply a sanity check. IIRC I
did ask for it simply because similar checks exist elsewhere.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
On Tue, Jul 5, 2016 at 8:35 AM, George Dunlap wrote:
> On Thu, Jun 23, 2016 at 4:42 PM, Tamas K Lengyel wrote:
+            if ( !atomic_read(&d->pause_count) ||
+                 !atomic_read(&cd->pause_count) )
+            {
+                rcu_unlock_domain(cd);
+                rc = -EINVAL;
+                goto out;
+            }
>>>
>>> I realize that Jan asked for this, but I'm really not sure what good
>>> this is supposed to do, since the guest can be un-paused at any point
>>> halfway through the transaction.
>>>
>>> Wouldn't it make more sense to have this function pause and unpause
>>> the domains itself?
>>
>> The domains are paused by the user when this op is called, so this is
>> just a sanity check ensuring you are not issuing the op on some other
>> domain. So if this function would pause the domain as well, all it
>> would do is increase the pause count. This is the only intended
>> use-case for this function at this time. It would make no sense to try
>> to issue this op on domains that are running, as that pretty much
>> guarantees that some of their memory has already diverged, and thus
>> bulk-sharing their memory would have some unintended side-effects.
>
> Yes, I understand all that. But what I'm saying (and mostly this is
> actually to Jan that I'm saying it) is that this check, as written,
> seems pointless to me. We're effectively *not* trusting the toolstack
> to pause the domain, but we *are* trusting the toolstack not to unpause
> the domain before the operation is completed. It seems to me we
> should either trust the toolstack to pause the domain and leave it
> paused -- letting it suffer the consequences if it fails to do so --
> or we should pause and unpause the domain ourselves.
>
> Adding an extra pause count is simple and harmless. If we're going to
> make an effort to think about buggy toolstacks, we might as well just
> make it 100% safe.

Well, there are many more ways the toolstack could cause issues for
this op, so I would say let's just trust the toolstack to behave as
intended. I'll remove the pause check and point out in a comment that
the intended use-case assumes both domains are paused and remain
paused for the duration of the op.

>
>>> To start with, it seems like having a "bulk share" option which could
>>> do just a specific range would be useful, as well as a "bulk share"
>>> which automatically deduped the entire set of memory.
>>
>> I have no need for such options.
>
> Yes, but it's not unlikely that somebody else may need them at some
> point in the future; and it's only a tiny bit of adjustment to make
> them more generally useful than just your current specific use case,
> so that we can avoid changing the interface, or adding yet another
> hypercall in the future.
>
>>> Secondly, structuring the information like the other memory operations
>>> -- for example, "nr_extents" and "nr_done" -- would make it more
>>> consistent, and would also allow you to avoid overwriting one of
>>> the "in" values and having to restore it when you were done.
>>
>> I don't see any harm in clearing this value to 0 when the op finishes,
>> so I don't think it would really make much difference if we have
>> another field just for scratch-space use. I'm fine with adding that
>> but it just seems a bit odd to me.
>
> Well, clearing the value to zero seems a bit odd to me. :-) But more
> to the point, it's valuable to have the interface 1) be flexible
> enough to bulk-share a range but not the entire address space and 2) be
> similar enough to existing interfaces that nobody needs to figure out
> how it works.

I really doubt there will be any users for this feature, but it's a
simple enough adjustment that I'm fine with the change.

Thanks,
Tamas
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
On Thu, Jun 23, 2016 at 4:42 PM, Tamas K Lengyel wrote:
>>> +            if ( !atomic_read(&d->pause_count) ||
>>> +                 !atomic_read(&cd->pause_count) )
>>> +            {
>>> +                rcu_unlock_domain(cd);
>>> +                rc = -EINVAL;
>>> +                goto out;
>>> +            }
>>
>> I realize that Jan asked for this, but I'm really not sure what good
>> this is supposed to do, since the guest can be un-paused at any point
>> halfway through the transaction.
>>
>> Wouldn't it make more sense to have this function pause and unpause
>> the domains itself?
>
> The domains are paused by the user when this op is called, so this is
> just a sanity check ensuring you are not issuing the op on some other
> domain. So if this function would pause the domain as well, all it
> would do is increase the pause count. This is the only intended
> use-case for this function at this time. It would make no sense to try
> to issue this op on domains that are running, as that pretty much
> guarantees that some of their memory has already diverged, and thus
> bulk-sharing their memory would have some unintended side-effects.

Yes, I understand all that. But what I'm saying (and mostly this is
actually to Jan that I'm saying it) is that this check, as written,
seems pointless to me. We're effectively *not* trusting the toolstack
to pause the domain, but we *are* trusting the toolstack not to unpause
the domain before the operation is completed. It seems to me we
should either trust the toolstack to pause the domain and leave it
paused -- letting it suffer the consequences if it fails to do so --
or we should pause and unpause the domain ourselves.

Adding an extra pause count is simple and harmless. If we're going to
make an effort to think about buggy toolstacks, we might as well just
make it 100% safe.

>> To start with, it seems like having a "bulk share" option which could
>> do just a specific range would be useful, as well as a "bulk share"
>> which automatically deduped the entire set of memory.
>
> I have no need for such options.

Yes, but it's not unlikely that somebody else may need them at some
point in the future; and it's only a tiny bit of adjustment to make
them more generally useful than just your current specific use case,
so that we can avoid changing the interface, or adding yet another
hypercall in the future.

>> Secondly, structuring the information like the other memory operations
>> -- for example, "nr_extents" and "nr_done" -- would make it more
>> consistent, and would also allow you to avoid overwriting one of
>> the "in" values and having to restore it when you were done.
>
> I don't see any harm in clearing this value to 0 when the op finishes,
> so I don't think it would really make much difference if we have
> another field just for scratch-space use. I'm fine with adding that
> but it just seems a bit odd to me.

Well, clearing the value to zero seems a bit odd to me. :-) But more
to the point, it's valuable to have the interface 1) be flexible
enough to bulk-share a range but not the entire address space and 2) be
similar enough to existing interfaces that nobody needs to figure out
how it works.

-George
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
On Wed, Jun 22, 2016 at 9:38 AM, George Dunlap wrote:
> On Sun, Jun 12, 2016 at 12:24 AM, Tamas K Lengyel wrote:
>> Currently mem-sharing can be performed on a page-by-page basis from the control
>> domain. However, when completely deduplicating (cloning) a VM, this requires
>> at least 3 hypercalls per page. As the user has to loop through all pages up
>> to max_gpfn, this process is very slow and wasteful.
>>
>> This patch introduces a new mem_sharing memop for bulk deduplication where
>> the user doesn't have to separately nominate each page in both the source and
>> destination domain, and the looping over all pages happens in the hypervisor.
>> This significantly reduces the overhead of completely deduplicating entire
>> domains.
>>
>> Signed-off-by: Tamas K Lengyel
>> Acked-by: Wei Liu
>> ---
>> Ian Jackson
>> George Dunlap
>> Jan Beulich
>> Andrew Cooper
>
> I'm sorry I'm a bit late to this party -- I'm not sure how I managed
> to miss the earlier posts of this. A couple of questions...
>
>> @@ -1468,6 +1516,79 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>>          }
>>          break;
>>
>> +        case XENMEM_sharing_op_bulk_share:
>> +        {
>> +            unsigned long max_sgfn, max_cgfn;
>> +            struct domain *cd;
>> +
>> +            rc = -EINVAL;
>> +            if( mso.u.bulk._pad[0] || mso.u.bulk._pad[1] || mso.u.bulk._pad[2] )
>> +                goto out;
>> +
>> +            if ( !mem_sharing_enabled(d) )
>> +                goto out;
>> +
>> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
>> +                                                   &cd);
>> +            if ( rc )
>> +                goto out;
>> +
>> +            /*
>> +             * We reuse XENMEM_sharing_op_share XSM check here as this is essentially
>> +             * the same concept repeated over multiple pages.
>> +             */
>> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, XENMEM_sharing_op_share);
>> +            if ( rc )
>> +            {
>> +                rcu_unlock_domain(cd);
>> +                goto out;
>> +            }
>> +
>> +            if ( !mem_sharing_enabled(cd) )
>> +            {
>> +                rcu_unlock_domain(cd);
>> +                rc = -EINVAL;
>> +                goto out;
>> +            }
>> +
>> +            if ( !atomic_read(&d->pause_count) ||
>> +                 !atomic_read(&cd->pause_count) )
>> +            {
>> +                rcu_unlock_domain(cd);
>> +                rc = -EINVAL;
>> +                goto out;
>> +            }
>
> I realize that Jan asked for this, but I'm really not sure what good
> this is supposed to do, since the guest can be un-paused at any point
> halfway through the transaction.
>
> Wouldn't it make more sense to have this function pause and unpause
> the domains itself?

The domains are paused by the user when this op is called, so this is
just a sanity check ensuring you are not issuing the op on some other
domain. So if this function would pause the domain as well, all it
would do is increase the pause count. This is the only intended
use-case for this function at this time. It would make no sense to try
to issue this op on domains that are running, as that pretty much
guarantees that some of their memory has already diverged, and thus
bulk-sharing their memory would have some unintended side-effects.

>> +
>> +            max_sgfn = domain_get_maximum_gpfn(d);
>> +            max_cgfn = domain_get_maximum_gpfn(cd);
>> +
>> +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
>> +            {
>> +                rcu_unlock_domain(cd);
>> +                rc = -EINVAL;
>> +                goto out;
>> +            }
>> +
>> +            rc = bulk_share(d, cd, max_sgfn + 1, &mso.u.bulk);
>> +            if ( rc > 0 )
>> +            {
>> +                if ( __copy_to_guest(arg, &mso, 1) )
>> +                    rc = -EFAULT;
>> +                else
>> +                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
>> +                                                       "lh", XENMEM_sharing_op,
>> +                                                       arg);
>> +            }
>> +            else
>> +            {
>> +                mso.u.bulk.start = 0;
>> +                mso.u.bulk.shared = atomic_read(&cd->shr_pages);
>> +            }
>> +
>> +            rcu_unlock_domain(cd);
>> +        }
>> +        break;
>> +
>>          case XENMEM_sharing_op_debug_gfn:
>>          {
>>              unsigned long gfn = mso.u.debug.u.gfn;
>> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
>> index 29ec571..084f06e 100644
>> --- a/xen/include/public/memory.h
>> +++ b/xen/include/public/memory.h
>> @@ -465,6
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
On Sun, Jun 12, 2016 at 12:24 AM, Tamas K Lengyel wrote:
> Currently mem-sharing can be performed on a page-by-page basis from the control
> domain. However, when completely deduplicating (cloning) a VM, this requires
> at least 3 hypercalls per page. As the user has to loop through all pages up
> to max_gpfn, this process is very slow and wasteful.
>
> This patch introduces a new mem_sharing memop for bulk deduplication where
> the user doesn't have to separately nominate each page in both the source and
> destination domain, and the looping over all pages happens in the hypervisor.
> This significantly reduces the overhead of completely deduplicating entire
> domains.
>
> Signed-off-by: Tamas K Lengyel
> Acked-by: Wei Liu
> ---
> Ian Jackson
> George Dunlap
> Jan Beulich
> Andrew Cooper

I'm sorry I'm a bit late to this party -- I'm not sure how I managed
to miss the earlier posts of this. A couple of questions...

> @@ -1468,6 +1516,79 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>      }
>      break;
>
> +        case XENMEM_sharing_op_bulk_share:
> +        {
> +            unsigned long max_sgfn, max_cgfn;
> +            struct domain *cd;
> +
> +            rc = -EINVAL;
> +            if( mso.u.bulk._pad[0] || mso.u.bulk._pad[1] || mso.u.bulk._pad[2] )
> +                goto out;
> +
> +            if ( !mem_sharing_enabled(d) )
> +                goto out;
> +
> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
> +                                                   &cd);
> +            if ( rc )
> +                goto out;
> +
> +            /*
> +             * We reuse XENMEM_sharing_op_share XSM check here as this is essentially
> +             * the same concept repeated over multiple pages.
> +             */
> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, XENMEM_sharing_op_share);
> +            if ( rc )
> +            {
> +                rcu_unlock_domain(cd);
> +                goto out;
> +            }
> +
> +            if ( !mem_sharing_enabled(cd) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            if ( !atomic_read(&d->pause_count) ||
> +                 !atomic_read(&cd->pause_count) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }

I realize that Jan asked for this, but I'm really not sure what good
this is supposed to do, since the guest can be un-paused at any point
halfway through the transaction.

Wouldn't it make more sense to have this function pause and unpause
the domains itself?

> +
> +            max_sgfn = domain_get_maximum_gpfn(d);
> +            max_cgfn = domain_get_maximum_gpfn(cd);
> +
> +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            rc = bulk_share(d, cd, max_sgfn + 1, &mso.u.bulk);
> +            if ( rc > 0 )
> +            {
> +                if ( __copy_to_guest(arg, &mso, 1) )
> +                    rc = -EFAULT;
> +                else
> +                    rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
> +                                                       "lh", XENMEM_sharing_op,
> +                                                       arg);
> +            }
> +            else
> +            {
> +                mso.u.bulk.start = 0;
> +                mso.u.bulk.shared = atomic_read(&cd->shr_pages);
> +            }
> +
> +            rcu_unlock_domain(cd);
> +        }
> +        break;
> +
>          case XENMEM_sharing_op_debug_gfn:
>          {
>              unsigned long gfn = mso.u.debug.u.gfn;
> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> index 29ec571..084f06e 100644
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -465,6 +465,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t);
>  #define XENMEM_sharing_op_debug_gref        5
>  #define XENMEM_sharing_op_add_physmap       6
>  #define XENMEM_sharing_op_audit             7
> +#define XENMEM_sharing_op_bulk_share        8
>
>  #define XENMEM_SHARING_OP_S_HANDLE_INVALID  (-10)
>  #define XENMEM_SHARING_OP_C_HANDLE_INVALID  (-9)
> @@ -500,7 +501,19 @@ struct xen_mem_sharing_op {
>              uint64_aligned_t client_gfn;    /* IN: the client gfn */
>              uint64_aligned_t client_handle; /* IN: handle to the client page */
>              domid_t client_domain;          /* IN: the client domain id */
> -        } share;
> +        } share;
> +        struct mem_sharing_op_bulk {        /* OP_BULK_SHARE */
> +
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
On Wed, Jun 15, 2016 at 02:14:15AM -0600, Jan Beulich wrote:
> >>> On 14.06.16 at 18:33, wrote:
> >> +        /* Check for continuation if it's not the last iteration. */
> >> +        if ( limit > ++bulk->start && hypercall_preempt_check() )
> >
> > I'm surprised the compiler didn't complain to you about lack of parentheses.
>
> I'm puzzled - what kind of warning would you expect here? The
> compiler - afaik - treats relational operators vs logical AND/OR as
> having well-known precedence wrt one another; what IIRC it
> would warn about is lack of parentheses in an expression using
> both of the logical operators, as people frequently don't know
> that && has higher precedence than ||.

Maybe those are the warnings I had seen in the past. I recall that with
a more modern compiler it would give me suggestions to use parentheses.
But I honestly can't recall the details. Either way - the code is fine.

> Jan
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
>>> On 14.06.16 at 18:33, wrote:
>> +        /* Check for continuation if it's not the last iteration. */
>> +        if ( limit > ++bulk->start && hypercall_preempt_check() )
>
> I'm surprised the compiler didn't complain to you about lack of parentheses.

I'm puzzled - what kind of warning would you expect here? The
compiler - afaik - treats relational operators vs logical AND/OR as
having well-known precedence wrt one another; what IIRC it
would warn about is lack of parentheses in an expression using
both of the logical operators, as people frequently don't know
that && has higher precedence than ||.

Jan
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
On Jun 14, 2016 10:33, "Konrad Rzeszutek Wilk" wrote:
>
> > diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> > index a522423..ba06fb0 100644
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -1294,6 +1294,54 @@ int relinquish_shared_pages(struct domain *d)
> >      return rc;
> >  }
> >
> > +static int bulk_share(struct domain *d, struct domain *cd, unsigned long limit,
> > +                      struct mem_sharing_op_bulk *bulk)
> > +{
> > +    int rc = 0;
> > +    shr_handle_t sh, ch;
> > +
> > +    while( limit > bulk->start )
>
> You are missing a space there.

Ack.

> > +    {
> > +        /*
> > +         * We only break out if we run out of memory as individual pages may
> > +         * legitimately be unsharable and we just want to skip over those.
> > +         */
> > +        rc = mem_sharing_nominate_page(d, bulk->start, 0, &sh);
> > +        if ( rc == -ENOMEM )
> > +            break;
> > +        if ( !rc )
> > +        {
> > +            rc = mem_sharing_nominate_page(cd, bulk->start, 0, &ch);
> > +            if ( rc == -ENOMEM )
> > +                break;
> > +            if ( !rc )
> > +            {
> > +                /* If we get here this should be guaranteed to succeed. */
> > +                rc = mem_sharing_share_pages(d, bulk->start, sh,
> > +                                             cd, bulk->start, ch);
> > +                ASSERT(!rc);
> > +            }
> > +        }
> > +
> > +        /* Check for continuation if it's not the last iteration. */
> > +        if ( limit > ++bulk->start && hypercall_preempt_check() )
>
> I'm surprised the compiler didn't complain to you about lack of parentheses.

This seems to be the standard way to create continuations, used in
multiple places throughout Xen. I don't personally like it much but I
guess it's better to be consistent.

> > +        {
> > +            rc = 1;
> > +            break;
> > +        }
> > +    }
> > +
> > +    /*
> > +     * We only propagate -ENOMEM as individual pages may fail with -EINVAL,
> > +     * and for bulk sharing we only care if -ENOMEM was encountered so we reset
> > +     * rc here.
> > +     */
> > +    if ( rc < 0 && rc != -ENOMEM )
> > +        rc = 0;
> > +
> > +    return rc;
> > +}
> > +
> >  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
> >  {
> >      int rc;
> > @@ -1468,6 +1516,79 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
> >      }
> >      break;
> >
> > +        case XENMEM_sharing_op_bulk_share:
> > +        {
> > +            unsigned long max_sgfn, max_cgfn;
> > +            struct domain *cd;
> > +
> > +            rc = -EINVAL;
> > +            if( mso.u.bulk._pad[0] || mso.u.bulk._pad[1] || mso.u.bulk._pad[2] )
>
> The "if("..

Ack.

Thanks,
Tamas
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> index a522423..ba06fb0 100644
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1294,6 +1294,54 @@ int relinquish_shared_pages(struct domain *d)
>      return rc;
>  }
>
> +static int bulk_share(struct domain *d, struct domain *cd, unsigned long limit,
> +                      struct mem_sharing_op_bulk *bulk)
> +{
> +    int rc = 0;
> +    shr_handle_t sh, ch;
> +
> +    while( limit > bulk->start )

You are missing a space there.

> +    {
> +        /*
> +         * We only break out if we run out of memory as individual pages may
> +         * legitimately be unsharable and we just want to skip over those.
> +         */
> +        rc = mem_sharing_nominate_page(d, bulk->start, 0, &sh);
> +        if ( rc == -ENOMEM )
> +            break;
> +        if ( !rc )
> +        {
> +            rc = mem_sharing_nominate_page(cd, bulk->start, 0, &ch);
> +            if ( rc == -ENOMEM )
> +                break;
> +            if ( !rc )
> +            {
> +                /* If we get here this should be guaranteed to succeed. */
> +                rc = mem_sharing_share_pages(d, bulk->start, sh,
> +                                             cd, bulk->start, ch);
> +                ASSERT(!rc);
> +            }
> +        }
> +
> +        /* Check for continuation if it's not the last iteration. */
> +        if ( limit > ++bulk->start && hypercall_preempt_check() )

I'm surprised the compiler didn't complain to you about lack of parentheses.

> +        {
> +            rc = 1;
> +            break;
> +        }
> +    }
> +
> +    /*
> +     * We only propagate -ENOMEM as individual pages may fail with -EINVAL,
> +     * and for bulk sharing we only care if -ENOMEM was encountered so we reset
> +     * rc here.
> +     */
> +    if ( rc < 0 && rc != -ENOMEM )
> +        rc = 0;
> +
> +    return rc;
> +}
> +
>  int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>  {
>      int rc;
> @@ -1468,6 +1516,79 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>      }
>      break;
>
> +        case XENMEM_sharing_op_bulk_share:
> +        {
> +            unsigned long max_sgfn, max_cgfn;
> +            struct domain *cd;
> +
> +            rc = -EINVAL;
> +            if( mso.u.bulk._pad[0] || mso.u.bulk._pad[1] || mso.u.bulk._pad[2] )

The "if("..
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
>>> On 12.06.16 at 01:24, wrote:
> @@ -1468,6 +1516,79 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>      }
>      break;
>
> +        case XENMEM_sharing_op_bulk_share:
> +        {
> +            unsigned long max_sgfn, max_cgfn;
> +            struct domain *cd;
> +
> +            rc = -EINVAL;
> +            if( mso.u.bulk._pad[0] || mso.u.bulk._pad[1] || mso.u.bulk._pad[2] )

With the missing blank added here (doable on commit as well),
hypervisor side
Reviewed-by: Jan Beulich

> +                goto out;
> +
> +            if ( !mem_sharing_enabled(d) )
> +                goto out;

Irrespective of the above I think a more descriptive error than -EINVAL
would be quite reasonable to expect for a sharing operation attempted
without having enabled sharing. By the name of it (but not by its
conventional meaning) -EILSEQ might be a candidate.

> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.bulk.client_domain,
> +                                                   &cd);
> +            if ( rc )
> +                goto out;
> +
> +            /*
> +             * We reuse XENMEM_sharing_op_share XSM check here as this is essentially
> +             * the same concept repeated over multiple pages.
> +             */
> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, XENMEM_sharing_op_share);
> +            if ( rc )
> +            {
> +                rcu_unlock_domain(cd);
> +                goto out;
> +            }
> +
> +            if ( !mem_sharing_enabled(cd) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            if ( !atomic_read(&d->pause_count) ||
> +                 !atomic_read(&cd->pause_count) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            max_sgfn = domain_get_maximum_gpfn(d);
> +            max_cgfn = domain_get_maximum_gpfn(cd);
> +
> +            if ( max_sgfn != max_cgfn || max_sgfn < mso.u.bulk.start )
> +            {
> +                rcu_unlock_domain(cd);

I'd also recommend adding a new label below to avoid these repeated
rcu_unlock_domain(cd) invocations. But for both of these - you're the
maintainer, so you know best.

Jan
Re: [Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
On Sat, Jun 11, 2016 at 5:24 PM, Tamas K Lengyel wrote:
> Currently mem-sharing can be performed on a page-by-page basis from the control
> domain. However, when completely deduplicating (cloning) a VM, this requires
> at least 3 hypercalls per page. As the user has to loop through all pages up
> to max_gpfn, this process is very slow and wasteful.
>
> This patch introduces a new mem_sharing memop for bulk deduplication where
> the user doesn't have to separately nominate each page in both the source and
> destination domain, and the looping over all pages happens in the hypervisor.
> This significantly reduces the overhead of completely deduplicating entire
> domains.
>
> Signed-off-by: Tamas K Lengyel
> Acked-by: Wei Liu
> ---
> Ian Jackson
> George Dunlap
> Jan Beulich
> Andrew Cooper

Missed Cc: here.

>
> v4: Add padding to bulk op and enforce it being 0
>     Use correct XSM permission check
> ---
>  tools/libxc/include/xenctrl.h |  15 ++
>  tools/libxc/xc_memshr.c       |  19 +++
>  xen/arch/x86/mm/mem_sharing.c | 121 ++
>  xen/include/public/memory.h   |  15 +-
>  4 files changed, 169 insertions(+), 1 deletion(-)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 6ae1a2b..34dcc72 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2327,6 +2327,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>                              domid_t client_domain,
>                              unsigned long client_gfn);
>
> +/* Allows deduplicating the entire memory of a client domain in bulk. Using
> + * this function is equivalent to calling xc_memshr_nominate_gfn for each gfn
> + * in the two domains followed by xc_memshr_share_gfns. If successful,
> + * returns the number of shared pages in 'shared'. Both domains must be paused.
> + *
> + * May fail with -EINVAL if the source and client domain have different
> + * memory size or if memory sharing is not enabled on either of the domains.
> + * May also fail with -ENOMEM if there isn't enough memory available to store
> + * the sharing metadata before deduplication can happen.
> + */
> +int xc_memshr_bulk_share(xc_interface *xch,
> +                         domid_t source_domain,
> +                         domid_t client_domain,
> +                         uint64_t *shared);
> +
>  /* Debug calls: return the number of pages referencing the shared frame backing
>   * the input argument. Should be one or greater.
>   *
> diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
> index deb0aa4..71350d2 100644
> --- a/tools/libxc/xc_memshr.c
> +++ b/tools/libxc/xc_memshr.c
> @@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
>      return xc_memshr_memop(xch, source_domain, &mso);
>  }
>
> +int xc_memshr_bulk_share(xc_interface *xch,
> +                         domid_t source_domain,
> +                         domid_t client_domain,
> +                         uint64_t *shared)
> +{
> +    int rc;
> +    xen_mem_sharing_op_t mso;
> +
> +    memset(&mso, 0, sizeof(mso));
> +
> +    mso.op = XENMEM_sharing_op_bulk_share;
> +    mso.u.bulk.client_domain = client_domain;
> +
> +    rc = xc_memshr_memop(xch, source_domain, &mso);
> +    if ( !rc && shared ) *shared = mso.u.bulk.shared;
> +
> +    return rc;
> +}
> +
>  int xc_memshr_domain_resume(xc_interface *xch,
>                              domid_t domid)
>  {
> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> index a522423..ba06fb0 100644
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1294,6 +1294,54 @@ int relinquish_shared_pages(struct domain *d)
>      return rc;
>  }
>
> +static int bulk_share(struct domain *d, struct domain *cd, unsigned long limit,
> +                      struct mem_sharing_op_bulk *bulk)
> +{
> +    int rc = 0;
> +    shr_handle_t sh, ch;
> +
> +    while( limit > bulk->start )
> +    {
> +        /*
> +         * We only break out if we run out of memory as individual pages may
> +         * legitimately be unsharable and we just want to skip over those.
> +         */
> +        rc = mem_sharing_nominate_page(d, bulk->start, 0, &sh);
> +        if ( rc == -ENOMEM )
> +            break;
> +        if ( !rc )
> +        {
> +            rc = mem_sharing_nominate_page(cd, bulk->start, 0, &ch);
> +            if ( rc == -ENOMEM )
> +                break;
> +            if ( !rc )
> +            {
> +                /* If we get here this should be guaranteed to succeed. */
> +                rc = mem_sharing_share_pages(d, bulk->start, sh,
> +                                             cd, bulk->start, ch);
> +                ASSERT(!rc);
> +            }
> +        }
> +
> +        /* Check for continuation if it's not the last
[Xen-devel] [PATCH v5 1/2] x86/mem-sharing: Bulk mem-sharing entire domains
Currently mem-sharing can be performed on a page-by-page basis from the control
domain. However, when completely deduplicating (cloning) a VM, this requires
at least 3 hypercalls per page. As the user has to loop through all pages up
to max_gpfn, this process is very slow and wasteful.

This patch introduces a new mem_sharing memop for bulk deduplication where
the user doesn't have to separately nominate each page in both the source and
destination domain, and the looping over all pages happens in the hypervisor.
This significantly reduces the overhead of completely deduplicating entire
domains.

Signed-off-by: Tamas K Lengyel
Acked-by: Wei Liu
---
Ian Jackson
George Dunlap
Jan Beulich
Andrew Cooper

v4: Add padding to bulk op and enforce it being 0
    Use correct XSM permission check
---
 tools/libxc/include/xenctrl.h |  15 ++
 tools/libxc/xc_memshr.c       |  19 +++
 xen/arch/x86/mm/mem_sharing.c | 121 ++
 xen/include/public/memory.h   |  15 +-
 4 files changed, 169 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 6ae1a2b..34dcc72 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2327,6 +2327,21 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
                             domid_t client_domain,
                             unsigned long client_gfn);

+/* Allows deduplicating the entire memory of a client domain in bulk. Using
+ * this function is equivalent to calling xc_memshr_nominate_gfn for each gfn
+ * in the two domains followed by xc_memshr_share_gfns. If successful,
+ * returns the number of shared pages in 'shared'. Both domains must be paused.
+ *
+ * May fail with -EINVAL if the source and client domain have different
+ * memory size or if memory sharing is not enabled on either of the domains.
+ * May also fail with -ENOMEM if there isn't enough memory available to store
+ * the sharing metadata before deduplication can happen.
+ */
+int xc_memshr_bulk_share(xc_interface *xch,
+                         domid_t source_domain,
+                         domid_t client_domain,
+                         uint64_t *shared);
+
 /* Debug calls: return the number of pages referencing the shared frame backing
  * the input argument. Should be one or greater.
  *
diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c
index deb0aa4..71350d2 100644
--- a/tools/libxc/xc_memshr.c
+++ b/tools/libxc/xc_memshr.c
@@ -181,6 +181,25 @@ int xc_memshr_add_to_physmap(xc_interface *xch,
     return xc_memshr_memop(xch, source_domain, &mso);
 }

+int xc_memshr_bulk_share(xc_interface *xch,
+                         domid_t source_domain,
+                         domid_t client_domain,
+                         uint64_t *shared)
+{
+    int rc;
+    xen_mem_sharing_op_t mso;
+
+    memset(&mso, 0, sizeof(mso));
+
+    mso.op = XENMEM_sharing_op_bulk_share;
+    mso.u.bulk.client_domain = client_domain;
+
+    rc = xc_memshr_memop(xch, source_domain, &mso);
+    if ( !rc && shared ) *shared = mso.u.bulk.shared;
+
+    return rc;
+}
+
 int xc_memshr_domain_resume(xc_interface *xch,
                             domid_t domid)
 {
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index a522423..ba06fb0 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1294,6 +1294,54 @@ int relinquish_shared_pages(struct domain *d)
     return rc;
 }

+static int bulk_share(struct domain *d, struct domain *cd, unsigned long limit,
+                      struct mem_sharing_op_bulk *bulk)
+{
+    int rc = 0;
+    shr_handle_t sh, ch;
+
+    while( limit > bulk->start )
+    {
+        /*
+         * We only break out if we run out of memory as individual pages may
+         * legitimately be unsharable and we just want to skip over those.
+         */
+        rc = mem_sharing_nominate_page(d, bulk->start, 0, &sh);
+        if ( rc == -ENOMEM )
+            break;
+        if ( !rc )
+        {
+            rc = mem_sharing_nominate_page(cd, bulk->start, 0, &ch);
+            if ( rc == -ENOMEM )
+                break;
+            if ( !rc )
+            {
+                /* If we get here this should be guaranteed to succeed. */
+                rc = mem_sharing_share_pages(d, bulk->start, sh,
+                                             cd, bulk->start, ch);
+                ASSERT(!rc);
+            }
+        }
+
+        /* Check for continuation if it's not the last iteration. */
+        if ( limit > ++bulk->start && hypercall_preempt_check() )
+        {
+            rc = 1;
+            break;
+        }
+    }
+
+    /*
+     * We only propagate -ENOMEM as individual pages may fail with -EINVAL,
+     * and for bulk sharing we only care if -ENOMEM was encountered so we reset
+     * rc here.
+