Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info

2020-03-12 Thread Jan Beulich
On 11.03.2020 16:28, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich
>> Sent: 11 March 2020 09:17
>> To: p...@xen.org
>> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant'; 'Stefano Stabellini'; 'Julien Grall'; 'Volodymyr Babchuk'; 'Andrew Cooper'; 'George Dunlap'; 'Ian Jackson'; 'Konrad Rzeszutek Wilk'; 'Wei Liu'
>> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for
>> shared_info
>>
>> On 10.03.2020 18:33, Paul Durrant wrote:
 -Original Message-
 From: Jan Beulich 
 Sent: 09 March 2020 15:56

 On 09.03.2020 11:23, p...@xen.org wrote:
> --- a/xen/common/time.c
> +++ b/xen/common/time.c
> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>  uint32_t *wc_version;
>  uint64_t sec;
>
> +if ( d != current->domain )
> +{
> +/*
> + * We need to check is_dying here as, if it is set, the
> + * shared_info may have been freed. To do this safely we need
> + * to hold the domain lock.
> + */
> +domain_lock(d);
> +if ( d->is_dying )
> +goto unlock;
> +}

 This shouldn't happen very often, but it's a pretty heavy lock.
 It's a fundamental aspect of xenheap pages that their disposal
 can be delayed until almost the last moment of guest cleanup. I
 continue to think it's not really a good idea to have special
 purpose allocation (and mapping) accompanied by these pages
 getting taken care of by the generic relinquish-resources logic
 here (from a more general pov such is of course often nice to
 have). Instead of freeing these pages there, couldn't they just
 be taken off d->page_list, with the unmapping and freeing left
 as it was?
>>>
>>> I don't think this can be achieved without being able to de-assign
>>> pages and I don't really want to have to invent new logic to do
>>> that (basically re-implementing what happens to xenheap pages).
>>
>> Where's the connection to being able to de-assign pages here?
>> There'll be one when the same conversion is to be done for
>> gnttab code, but I don't see it here - the shared info page is
>> never to be de-assigned. As to gnttab code, I think it was
>> noted before that we may be better off not "unpopulating"
>> status pages when switching back from v2 to v1. At which point
>> the de-assignment need would go away there, too.
> 
> Ok, maybe I'm misunderstanding something then. We need to call
> free_domheap_pages() on all pages assigned to a domain so that
> the domain references get dropped. The xenpage ref is dropped
> when d->xenheap_pages == 0. The domheap ref is dropped when
> domain_adjust_tot_pages() returns zero. (This is what I meant
> by de-assigning... but that was probably a poor choice of words).
> So, because domain_adjust_tot_pages() returns d->tot_pages
> (which includes the extra_pages count) it won't fall to zero
> until the last put_page() on any PGC_extra page. So how is it
> possible to free shared_info in domain destroy? We'll never get
> that far, because the domheap ref will never get dropped.

Well, now that these pages sit on a separate list, it would
look even less problematic than before to me to also give them
special treatment here: You wouldn't even have to take them
off the list anymore, but just call domain_adjust_tot_pages()
with -d->extra_pages (and suitably deal with the return value;
perhaps for consistency then followed by also zeroing
d->extra_pages, so overall accounting still looks correct).
Actually taking these pages off the list could (for dumping
purposes) then be done alongside their actual freeing. Such a
transition would apparently imply clearing PGC_extra alongside
the new domain_adjust_tot_pages() call.

I realize though that the end result won't be much different
from the current PGC_xen_heap handling (at least as far as
domain cleanup goes, but after all that's what I'm in fact
trying to convince you of), so the question would be whether
the whole transition then is worth it. Without having seen at
least a sketch of the LU code that is affected by the current
behavior, it remains hard for me to tell what issues might
remain, despite your and David's explanations.

> I
> guess this could be fixed by having domain_adjust_tot_pages()
> return the same value as domain_tot_pages() (i.e.
> tot_pages - extra_pages). Is that what you're suggesting?

That's an option, too, but imo less desirable.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info

2020-03-11 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich
> Sent: 11 March 2020 09:17
> To: p...@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Paul Durrant'; 'Stefano Stabellini'; 'Julien Grall'; 'Volodymyr Babchuk'; 'Andrew Cooper'; 'George Dunlap'; 'Ian Jackson'; 'Konrad Rzeszutek Wilk'; 'Wei Liu'
> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
> 
> On 10.03.2020 18:33, Paul Durrant wrote:
> >> -Original Message-
> >> From: Jan Beulich 
> >> Sent: 09 March 2020 15:56
> >>
> >> On 09.03.2020 11:23, p...@xen.org wrote:
> >>> --- a/xen/common/time.c
> >>> +++ b/xen/common/time.c
> >>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
> >>>  uint32_t *wc_version;
> >>>  uint64_t sec;
> >>>
> >>> +if ( d != current->domain )
> >>> +{
> >>> +/*
> >>> + * We need to check is_dying here as, if it is set, the
> >>> + * shared_info may have been freed. To do this safely we need
> >>> + * to hold the domain lock.
> >>> + */
> >>> +domain_lock(d);
> >>> +if ( d->is_dying )
> >>> +goto unlock;
> >>> +}
> >>
> >> This shouldn't happen very often, but it's a pretty heavy lock.
> >> It's a fundamental aspect of xenheap pages that their disposal
> >> can be delayed until almost the last moment of guest cleanup. I
> >> continue to think it's not really a good idea to have special
> >> purpose allocation (and mapping) accompanied by these pages
> >> getting taken care of by the generic relinquish-resources logic
> >> here (from a more general pov such is of course often nice to
> >> have). Instead of freeing these pages there, couldn't they just
> >> be taken off d->page_list, with the unmapping and freeing left
> >> as it was?
> >
> > I don't think this can be achieved without being able to de-assign
> > pages and I don't really want to have to invent new logic to do
> > that (basically re-implementing what happens to xenheap pages).
> 
> Where's the connection to being able to de-assign pages here?
> There'll be one when the same conversion is to be done for
> gnttab code, but I don't see it here - the shared info page is
> never to be de-assigned. As to gnttab code, I think it was
> noted before that we may be better off not "unpopulating"
> status pages when switching back from v2 to v1. At which point
> the de-assignment need would go away there, too.

Ok, maybe I'm misunderstanding something then. We need to call 
free_domheap_pages() on all pages assigned to a domain so that the domain 
references get dropped. The xenpage ref is dropped when d->xenheap_pages == 0. 
The domheap ref is dropped when domain_adjust_tot_pages() returns zero. (This 
is what I meant by de-assigning... but that was probably a poor choice of 
words). So, because domain_adjust_tot_pages() returns d->tot_pages (which 
includes the extra_pages count) it won't fall to zero until the last put_page() 
on any PGC_extra page. So how is it possible to free shared_info in domain 
destroy? We'll never get that far, because the domheap ref will never get 
dropped. I guess this could be fixed by having domain_adjust_tot_pages() return 
the same value as domain_tot_pages() (i.e. tot_pages - extra_pages). Is that 
what you're suggesting?

> 
> > I really don't think it is that bad to deal with shared info
> > and grant table pages as domheap pages. Yes, we have to be
> > careful, but in this case the lock is only taken in a
> > toolstack update of the wallclock and, apart from start of
> > day access, I think this is limited to XEN_DOMCTL_settimeoffset
> > and XEN_DOMCTL_setvcpucontext neither of which I believe is
> > particularly performance-critical.
> 
> It's not, I agree (and hence I had started my previous reply
> also with "This shouldn't happen very often"). How all of this
> is going to look with the new extra_page_list I'll have
> to see anyway. But for now I remain unconvinced of the want /
> need to de-allocate the shared info page early.
> 

Well, hopefully I've explained above why that is currently necessary if it 
becomes a domheap page.

  Paul




Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info

2020-03-11 Thread Jan Beulich
On 10.03.2020 18:33, Paul Durrant wrote:
>> -Original Message-
>> From: Jan Beulich 
>> Sent: 09 March 2020 15:56
>>
>> On 09.03.2020 11:23, p...@xen.org wrote:
>>> --- a/xen/common/time.c
>>> +++ b/xen/common/time.c
>>> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>>>  uint32_t *wc_version;
>>>  uint64_t sec;
>>>
>>> +if ( d != current->domain )
>>> +{
>>> +/*
>>> + * We need to check is_dying here as, if it is set, the
>>> + * shared_info may have been freed. To do this safely we need
>>> + * to hold the domain lock.
>>> + */
>>> +domain_lock(d);
>>> +if ( d->is_dying )
>>> +goto unlock;
>>> +}
>>
>> This shouldn't happen very often, but it's a pretty heavy lock.
>> It's a fundamental aspect of xenheap pages that their disposal
>> can be delayed until almost the last moment of guest cleanup. I
>> continue to think it's not really a good idea to have special
>> purpose allocation (and mapping) accompanied by these pages
>> getting taken care of by the generic relinquish-resources logic
>> here (from a more general pov such is of course often nice to
>> have). Instead of freeing these pages there, couldn't they just
>> be taken off d->page_list, with the unmapping and freeing left
>> as it was?
> 
> I don't think this can be achieved without being able to de-assign
> pages and I don't really want to have to invent new logic to do
> that (basically re-implementing what happens to xenheap pages).

Where's the connection to being able to de-assign pages here?
There'll be one when the same conversion is to be done for
gnttab code, but I don't see it here - the shared info page is
never to be de-assigned. As to gnttab code, I think it was
noted before that we may be better off not "unpopulating"
status pages when switching back from v2 to v1. At which point
the de-assignment need would go away there, too.

> I really don't think it is that bad to deal with shared info
> and grant table pages as domheap pages. Yes, we have to be
> careful, but in this case the lock is only taken in a
> toolstack update of the wallclock and, apart from start of
> day access, I think this is limited to XEN_DOMCTL_settimeoffset
> and XEN_DOMCTL_setvcpucontext neither of which I believe is
> particularly performance-critical.

It's not, I agree (and hence I had started my previous reply
also with "This shouldn't happen very often"). How all of this
is going to look with the new extra_page_list I'll have
to see anyway. But for now I remain unconvinced of the want /
need to de-allocate the shared info page early.

Jan


Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info

2020-03-10 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich
> Sent: 09 March 2020 15:56
> To: p...@xen.org
> Cc: xen-devel@lists.xenproject.org; Paul Durrant; Stefano Stabellini; Julien Grall; Volodymyr Babchuk; Andrew Cooper; George Dunlap; Ian Jackson; Konrad Rzeszutek Wilk; Wei Liu
> Subject: Re: [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info
> 
> On 09.03.2020 11:23, p...@xen.org wrote:
> > From: Paul Durrant 
> >
> > Currently shared_info is a shared xenheap page but shared xenheap pages
> > complicate future plans for live-update of Xen so it is desirable to,
> > where possible, not use them [1]. This patch therefore converts shared_info
> > into a PGC_extra domheap page. This does entail freeing shared_info during
> > domain_relinquish_resources() rather than domain_destroy() so care is
> > needed to avoid de-referencing a NULL shared_info pointer, hence some
> > extra checks of 'is_dying' are needed.
> >
> > Also, because shared_info will no longer be a xenheap page this patch adds
> > an extra dump function to make sure the shared_info MFN is included in the
> > output of dump_pageframe_info().
> 
> I've voiced my objection to such a model, and hence it'll take
> another REST maintainer to approve of this despite my arguments
> against it. (And of course, just to re-record this here, the
> APIC access page, converted by ea3daabff5f2, will want to get a
> dumping function added then, too.)
> 
> I wonder whether a domain's "extra" pages couldn't be put in a
> separate, singly-linked list, using the union the next_shadow
> field is in as the linking field. None of the other union
> members look to be applicable to "extra" pages.
> 
> > +void dump_shared_info(struct domain *d)
> > +{
> > +domain_lock(d);
> > +
> > +if ( d->shared_info.virt )
> > +printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));
> 
> count_info and type_info should be logged imo, just like
> dump_pageframe_info() does. On the whole I think the actual
> dumping might better be uniform, and these functions would
> then only exist to "know" which page(s) to dump.
> 

I think the extra page list should cover these issues.

> > --- a/xen/common/time.c
> > +++ b/xen/common/time.c
> > @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
> >  uint32_t *wc_version;
> >  uint64_t sec;
> >
> > +if ( d != current->domain )
> > +{
> > +/*
> > + * We need to check is_dying here as, if it is set, the
> > + * shared_info may have been freed. To do this safely we need
> > + * to hold the domain lock.
> > + */
> > +domain_lock(d);
> > +if ( d->is_dying )
> > +goto unlock;
> > +}
> 
> This shouldn't happen very often, but it's a pretty heavy lock.
> It's a fundamental aspect of xenheap pages that their disposal
> can be delayed until almost the last moment of guest cleanup. I
> continue to think it's not really a good idea to have special
> purpose allocation (and mapping) accompanied by these pages
> getting taken care of by the generic relinquish-resources logic
> here (from a more general pov such is of course often nice to
> have). Instead of freeing these pages there, couldn't they just
> be taken off d->page_list, with the unmapping and freeing left
> as it was?

I don't think this can be achieved without being able to de-assign pages and I 
don't really want to have to invent new logic to do that (basically 
re-implementing what happens to xenheap pages). I really don't think it is that 
bad to deal with shared info and grant table pages as domheap pages. Yes, we 
have to be careful, but in this case the lock is only taken in a toolstack 
update of the wallclock and, apart from start of day access, I think this is 
limited to XEN_DOMCTL_settimeoffset and XEN_DOMCTL_setvcpucontext neither of 
which I believe is particularly performance-critical.

  Paul

> 
> Jan



Re: [Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info

2020-03-09 Thread Jan Beulich
On 09.03.2020 11:23, p...@xen.org wrote:
> From: Paul Durrant 
> 
> Currently shared_info is a shared xenheap page but shared xenheap pages
> complicate future plans for live-update of Xen so it is desirable to,
> where possible, not use them [1]. This patch therefore converts shared_info
> into a PGC_extra domheap page. This does entail freeing shared_info during
> domain_relinquish_resources() rather than domain_destroy() so care is
> needed to avoid de-referencing a NULL shared_info pointer, hence some
> extra checks of 'is_dying' are needed.
> 
> Also, because shared_info will no longer be a xenheap page this patch adds
> an extra dump function to make sure the shared_info MFN is included in the
> output of dump_pageframe_info().

I've voiced my objection to such a model, and hence it'll take
another REST maintainer to approve of this despite my arguments
against it. (And of course, just to re-record this here, the
APIC access page, converted by ea3daabff5f2, will want to get a
dumping function added then, too.)

I wonder whether a domain's "extra" pages couldn't be put in a
separate, singly-linked list, using the union the next_shadow
field is in as the linking field. None of the other union
members look to be applicable to "extra" pages.

> +void dump_shared_info(struct domain *d)
> +{
> +domain_lock(d);
> +
> +if ( d->shared_info.virt )
> +printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));

count_info and type_info should be logged imo, just like
dump_pageframe_info() does. On the whole I think the actual
dumping might better be uniform, and these functions would
then only exist to "know" which page(s) to dump.

> --- a/xen/common/time.c
> +++ b/xen/common/time.c
> @@ -99,6 +99,18 @@ void update_domain_wallclock_time(struct domain *d)
>  uint32_t *wc_version;
>  uint64_t sec;
>  
> +if ( d != current->domain )
> +{
> +/*
> + * We need to check is_dying here as, if it is set, the
> + * shared_info may have been freed. To do this safely we need
> + * to hold the domain lock.
> + */
> +domain_lock(d);
> +if ( d->is_dying )
> +goto unlock;
> +}

This shouldn't happen very often, but it's a pretty heavy lock.
It's a fundamental aspect of xenheap pages that their disposal
can be delayed until almost the last moment of guest cleanup. I
continue to think it's not really a good idea to have special
purpose allocation (and mapping) accompanied by these pages
getting taken care of by the generic relinquish-resources logic
here (from a more general pov such is of course often nice to
have). Instead of freeing these pages there, couldn't they just
be taken off d->page_list, with the unmapping and freeing left
as it was?

Jan


[Xen-devel] [PATCH v5 6/6] domain: use PGC_extra domheap page for shared_info

2020-03-09 Thread paul
From: Paul Durrant 

Currently shared_info is a shared xenheap page but shared xenheap pages
complicate future plans for live-update of Xen so it is desirable to,
where possible, not use them [1]. This patch therefore converts shared_info
into a PGC_extra domheap page. This does entail freeing shared_info during
domain_relinquish_resources() rather than domain_destroy() so care is
needed to avoid de-referencing a NULL shared_info pointer, hence some
extra checks of 'is_dying' are needed.

Also, because shared_info will no longer be a xenheap page this patch adds
an extra dump function to make sure the shared_info MFN is included in the
output of dump_pageframe_info().

NOTE: For Arm, the call to free_shared_info() in arch_domain_destroy() is
  left in place since it is idempotent and called in the error path for
  arch_domain_create().

[1] See 
https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg02018.html

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Volodymyr Babchuk 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Wei Liu 

v5:
 - Incorporate new dump_shared_info() function

v2:
 - Addressed comments from Julien
 - Expanded the commit comment to explain why this patch is wanted
---
 xen/arch/arm/domain.c  |  2 ++
 xen/arch/x86/domain.c  |  5 -
 xen/common/domain.c| 38 ++
 xen/common/event_channel.c |  3 +++
 xen/common/time.c  | 15 +++
 xen/include/xen/domain.h   |  1 +
 6 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 5298d80bd2..741f6dd444 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -1005,6 +1005,8 @@ int domain_relinquish_resources(struct domain *d)
 BUG();
 }
 
+free_shared_info(d);
+
 return 0;
 }
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f6ed25e8ee..60623beba1 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -261,6 +261,8 @@ void dump_pageframe_info(struct domain *d)
page->count_info, page->u.inuse.type_info);
 }
 spin_unlock(&d->page_alloc_lock);
+
+dump_shared_info(d);
 }
 
 void update_guest_memory_policy(struct vcpu *v,
@@ -693,7 +695,6 @@ void arch_domain_destroy(struct domain *d)
 pv_domain_destroy(d);
 free_perdomain_mappings(d);
 
-free_shared_info(d);
 cleanup_domain_irq_mapping(d);
 
 psr_domain_free(d);
@@ -2249,6 +2250,8 @@ int domain_relinquish_resources(struct domain *d)
 if ( is_hvm_domain(d) )
 hvm_domain_relinquish_resources(d);
 
+free_shared_info(d);
+
 return 0;
 }
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index ba7a905258..0b1c722708 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1650,24 +1650,54 @@ int continue_hypercall_on_cpu(
 
 int alloc_shared_info(struct domain *d, unsigned int memflags)
 {
-if ( (d->shared_info.virt = alloc_xenheap_pages(0, memflags)) == NULL )
+struct page_info *pg;
+
+pg = alloc_domheap_page(d, MEMF_no_refcount | memflags);
+if ( !pg )
 return -ENOMEM;
 
-d->shared_info.mfn = virt_to_mfn(d->shared_info.virt);
+if ( !get_page_and_type(pg, d, PGT_writable_page) )
+{
+/*
+ * The domain should not be running at this point so there is
+ * no way we should reach this error path.
+ */
+ASSERT_UNREACHABLE();
+return -ENODATA;
+}
+
+d->shared_info.mfn = page_to_mfn(pg);
+d->shared_info.virt = __map_domain_page_global(pg);
 
 clear_page(d->shared_info.virt);
-share_xen_page_with_guest(mfn_to_page(d->shared_info.mfn), d, SHARE_rw);
 
 return 0;
 }
 
 void free_shared_info(struct domain *d)
 {
+struct page_info *pg;
+
 if ( !d->shared_info.virt )
 return;
 
-free_xenheap_page(d->shared_info.virt);
+unmap_domain_page_global(d->shared_info.virt);
 d->shared_info.virt = NULL;
+
+pg = mfn_to_page(d->shared_info.mfn);
+
+put_page_alloc_ref(pg);
+put_page_and_type(pg);
+}
+
+void dump_shared_info(struct domain *d)
+{
+domain_lock(d);
+
+if ( d->shared_info.virt )
+printk("Shared Info: %"PRI_mfn"\n", mfn_x(d->shared_info.mfn));
+
+domain_unlock(d);
 }
 
 /*
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index e86e2bfab0..a17422284d 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1325,6 +1325,9 @@ void evtchn_destroy(struct domain *d)
 {
 unsigned int i;
 
+/* This must be done before shared_info is freed */
+BUG_ON(!d->shared_info.virt);
+
 /* After this barrier no new event-channel allocations can occur. */
 BUG_ON(!d->is_dying);
 spin_barrier(&d->event_lock);
diff --git a/xen/common/time.c b/xen/common/time.c
index 58fa9abc40..ada02faf07 100644
--- a/xen/common/time.c
+++ b/xen/common/time.c
@@ -99,6