Re: [Xen-devel] [PATCH 4/5] x86/pv: Drop support for paging out the LDT

2018-01-18 Thread Jan Beulich
>>> On 18.01.18 at 12:00,  wrote:
> On 18/01/18 10:57, Andrew Cooper wrote:
>> On 18/01/18 10:38, George Dunlap wrote:
>>> On Thu, Jan 18, 2018 at 7:59 AM, Jan Beulich  wrote:
>>> On 12.01.18 at 19:37,  wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1942,11 +1942,8 @@ int domain_relinquish_resources(struct domain *d)
>  {
>  for_each_vcpu ( d, v )
>  {
> -/*
> - * Relinquish GDT mappings. No need for explicit unmapping of
> - * the LDT as it automatically gets squashed with the guest
> - * mappings.
> - */
> +/* Relinquish GDT/LDT mappings. */
> +pv_destroy_ldt(v);
>  pv_destroy_gdt(v);
 The domain is dead at this point, so the order doesn't matter much,
 but strictly speaking you should destroy the GDT before destroying
 the LDT (just like LDT _loads_ always need to come _after_ GDT
 adjustments).
>> By that logic, I've got it the correct way round (which is the way I
>> intended).  Allocation and freeing of the LDT is nested within the scope
>> of the GDT.

Ah, I see how I didn't properly express what I meant. The idea
behind the remark was that the GDT might still have a reference
to the LDT, which would become stale with the LDT gone. The
better thing to compare with would be construction of an LDT
versus insertion of the respective descriptor into the GDT.

 Everything else here looks fine, but the initial comment may need
 further discussion. For example we may want to consider a
 two-stage phasing out of the feature, with a couple of years in
 between: Make the functionality dependent upon a default-off
 command line option for the time being, and issue a bright warning
 when someone actually enables it (telling them to tell us).
>>> One of the problems we have is that people seem to wait for 2-3 years
>>> after a release has been made to start updating to it.  So we'd have
>>> to leave such a warning for probably 5 years minimum.
>> In almost any other case, I'd agree, and would be the first to suggest this.
>>
>> However, this patch is strictly necessary for a more complete XPTI,
>> which is how I stumbled onto the issue in my KAISER series to begin
>> with.  It is the only codepath where we ever poke at a remote critical
>> data structure, which is why
> 
> Sorry - sent too early.
> 
> ... which is why isolating the LDT in a per-cpu mapping doesn't work in
> combination with this algorithm.

Right, in the context of that series I can see it likely becoming
unavoidable to remove this functionality (aiui it could be kept,
but would become more complicated). Not having heard back
on the incompleteness discussion on stage 1 yet, I can't really
conclude at this point whether we will actually _need_ (not
just want) this full series making things per-CPU (taken together
with there so far being no promise of it helping to recover
performance).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/5] x86/pv: Drop support for paging out the LDT

2018-01-18 Thread Andrew Cooper
On 18/01/18 10:57, Andrew Cooper wrote:
> On 18/01/18 10:38, George Dunlap wrote:
>> On Thu, Jan 18, 2018 at 7:59 AM, Jan Beulich  wrote:
>> On 12.01.18 at 19:37,  wrote:
 Windows is the only OS which pages out kernel datastructures, so chances are
 good that this is a vestigial remnant of the PV Windows XP experiment.
>>> This is based on what? How do you know there are no other OSes
>>> doing so, including perhaps ones which none of us has ever heard
>>> of?
> The chances of there being a production OS ported to PV that we've never
> heard of is 0.
>
>>>  The diffstat of the change here is certainly nice, but as always
>>> with something being removed from the ABI I'm rather hesitant.
>>>
 --- a/xen/arch/x86/domain.c
 +++ b/xen/arch/x86/domain.c
 @@ -1942,11 +1942,8 @@ int domain_relinquish_resources(struct domain *d)
  {
  for_each_vcpu ( d, v )
  {
 -/*
 - * Relinquish GDT mappings. No need for explicit unmapping of
 - * the LDT as it automatically gets squashed with the guest
 - * mappings.
 - */
 +/* Relinquish GDT/LDT mappings. */
 +pv_destroy_ldt(v);
  pv_destroy_gdt(v);
>>> The domain is dead at this point, so the order doesn't matter much,
>>> but strictly speaking you should destroy the GDT before destroying
>>> the LDT (just like LDT _loads_ always need to come _after_ GDT
>>> adjustments).
> By that logic, I've got it the correct way round (which is the way I
> intended).  Allocation and freeing of the LDT is nested within the scope
> of the GDT.
>
>>> Everything else here looks fine, but the initial comment may need
>>> further discussion. For example we may want to consider a
>>> two-stage phasing out of the feature, with a couple of years in
>>> between: Make the functionality dependent upon a default-off
>>> command line option for the time being, and issue a bright warning
>>> when someone actually enables it (telling them to tell us).
>> One of the problems we have is that people seem to wait for 2-3 years
>> after a release has been made to start updating to it.  So we'd have
>> to leave such a warning for probably 5 years minimum.
> In almost any other case, I'd agree, and would be the first to suggest this.
>
> However, this patch is strictly necessary for a more complete XPTI,
> which is how I stumbled onto the issue in my KAISER series to begin
> with.  It is the only codepath where we ever poke at a remote critical
> data structure, which is why

Sorry - sent too early.

... which is why isolating the LDT in a per-cpu mapping doesn't work in
combination with this algorithm.

~Andrew

>
> Also as noted in the commit message, it is broken even in the case you
> wanted to sensibly page the LDT, which further backs up the expectation
> that it was only for Windows XP, and has never been encountered on a
> production system.
>
> ~Andrew



Re: [Xen-devel] [PATCH 4/5] x86/pv: Drop support for paging out the LDT

2018-01-18 Thread Andrew Cooper
On 18/01/18 10:38, George Dunlap wrote:
> On Thu, Jan 18, 2018 at 7:59 AM, Jan Beulich  wrote:
> On 12.01.18 at 19:37,  wrote:
>>> Windows is the only OS which pages out kernel datastructures, so chances are
>>> good that this is a vestigial remnant of the PV Windows XP experiment.
>> This is based on what? How do you know there are no other OSes
>> doing so, including perhaps ones which none of us has ever heard
>> of?

The chances of there being a production OS ported to PV that we've never
heard of is 0.

>>  The diffstat of the change here is certainly nice, but as always
>> with something being removed from the ABI I'm rather hesitant.
>>
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -1942,11 +1942,8 @@ int domain_relinquish_resources(struct domain *d)
>>>  {
>>>  for_each_vcpu ( d, v )
>>>  {
>>> -/*
>>> - * Relinquish GDT mappings. No need for explicit unmapping of
>>> - * the LDT as it automatically gets squashed with the guest
>>> - * mappings.
>>> - */
>>> +/* Relinquish GDT/LDT mappings. */
>>> +pv_destroy_ldt(v);
>>>  pv_destroy_gdt(v);
>> The domain is dead at this point, so the order doesn't matter much,
>> but strictly speaking you should destroy the GDT before destroying
>> the LDT (just like LDT _loads_ always need to come _after_ GDT
>> adjustments).

By that logic, I've got it the correct way round (which is the way I
intended).  Allocation and freeing of the LDT is nested within the scope
of the GDT.

>>
>> Everything else here looks fine, but the initial comment may need
>> further discussion. For example we may want to consider a
>> two-stage phasing out of the feature, with a couple of years in
>> between: Make the functionality dependent upon a default-off
>> command line option for the time being, and issue a bright warning
>> when someone actually enables it (telling them to tell us).
> One of the problems we have is that people seem to wait for 2-3 years
> after a release has been made to start updating to it.  So we'd have
> to leave such a warning for probably 5 years minimum.

In almost any other case, I'd agree, and would be the first to suggest this.

However, this patch is strictly necessary for a more complete XPTI,
which is how I stumbled onto the issue in my KAISER series to begin
with.  It is the only codepath where we ever poke at a remote critical
data structure, which is why

Also as noted in the commit message, it is broken even in the case you
wanted to sensibly page the LDT, which further backs up the expectation
that it was only for Windows XP, and has never been encountered on a
production system.

~Andrew


Re: [Xen-devel] [PATCH 4/5] x86/pv: Drop support for paging out the LDT

2018-01-18 Thread Jan Beulich
>>> On 18.01.18 at 11:38,  wrote:
> On Thu, Jan 18, 2018 at 7:59 AM, Jan Beulich  wrote:
> On 12.01.18 at 19:37,  wrote:
>>> --- a/xen/arch/x86/domain.c
>>> +++ b/xen/arch/x86/domain.c
>>> @@ -1942,11 +1942,8 @@ int domain_relinquish_resources(struct domain *d)
>>>  {
>>>  for_each_vcpu ( d, v )
>>>  {
>>> -/*
>>> - * Relinquish GDT mappings. No need for explicit unmapping of
>>> - * the LDT as it automatically gets squashed with the guest
>>> - * mappings.
>>> - */
>>> +/* Relinquish GDT/LDT mappings. */
>>> +pv_destroy_ldt(v);
>>>  pv_destroy_gdt(v);
>>
>> The domain is dead at this point, so the order doesn't matter much,
>> but strictly speaking you should destroy the GDT before destroying
>> the LDT (just like LDT _loads_ always need to come _after_ GDT
>> adjustments).
>>
>> Everything else here looks fine, but the initial comment may need
>> further discussion. For example we may want to consider a
>> two-stage phasing out of the feature, with a couple of years in
>> between: Make the functionality dependent upon a default-off
>> command line option for the time being, and issue a bright warning
>> when someone actually enables it (telling them to tell us).
> 
> One of the problems we have is that people seem to wait for 2-3 years
> after a release has been made to start updating to it.  So we'd have
> to leave such a warning for probably 5 years minimum.

That's a reasonable time frame imo for phasing out something that's
a de-facto part of an ABI.

Jan



Re: [Xen-devel] [PATCH 4/5] x86/pv: Drop support for paging out the LDT

2018-01-18 Thread George Dunlap
On Thu, Jan 18, 2018 at 7:59 AM, Jan Beulich  wrote:
 On 12.01.18 at 19:37,  wrote:
>> Windows is the only OS which pages out kernel datastructures, so chances are
>> good that this is a vestigial remnant of the PV Windows XP experiment.
>
> This is based on what? How do you know there are no other OSes
> doing so, including perhaps ones which none of us has ever heard
> of? The diffstat of the change here is certainly nice, but as always
> with something being removed from the ABI I'm rather hesitant.
>
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1942,11 +1942,8 @@ int domain_relinquish_resources(struct domain *d)
>>  {
>>  for_each_vcpu ( d, v )
>>  {
>> -/*
>> - * Relinquish GDT mappings. No need for explicit unmapping of
>> - * the LDT as it automatically gets squashed with the guest
>> - * mappings.
>> - */
>> +/* Relinquish GDT/LDT mappings. */
>> +pv_destroy_ldt(v);
>>  pv_destroy_gdt(v);
>
> The domain is dead at this point, so the order doesn't matter much,
> but strictly speaking you should destroy the GDT before destroying
> the LDT (just like LDT _loads_ always need to come _after_ GDT
> adjustments).
>
> Everything else here looks fine, but the initial comment may need
> further discussion. For example we may want to consider a
> two-stage phasing out of the feature, with a couple of years in
> between: Make the functionality dependent upon a default-off
> command line option for the time being, and issue a bright warning
> when someone actually enables it (telling them to tell us).

One of the problems we have is that people seem to wait for 2-3 years
after a release has been made to start updating to it.  So we'd have
to leave such a warning for probably 5 years minimum.

 -George


Re: [Xen-devel] [PATCH 4/5] x86/pv: Drop support for paging out the LDT

2018-01-17 Thread Jan Beulich
>>> On 12.01.18 at 19:37,  wrote:
> Windows is the only OS which pages out kernel datastructures, so chances are
> good that this is a vestigial remnant of the PV Windows XP experiment.

This is based on what? How do you know there are no other OSes
doing so, including perhaps ones which none of us has ever heard
of? The diffstat of the change here is certainly nice, but as always
with something being removed from the ABI I'm rather hesitant.

> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1942,11 +1942,8 @@ int domain_relinquish_resources(struct domain *d)
>  {
>  for_each_vcpu ( d, v )
>  {
> -/*
> - * Relinquish GDT mappings. No need for explicit unmapping of
> - * the LDT as it automatically gets squashed with the guest
> - * mappings.
> - */
> +/* Relinquish GDT/LDT mappings. */
> +pv_destroy_ldt(v);
>  pv_destroy_gdt(v);

The domain is dead at this point, so the order doesn't matter much,
but strictly speaking you should destroy the GDT before destroying
the LDT (just like LDT _loads_ always need to come _after_ GDT
adjustments).

Everything else here looks fine, but the initial comment may need
further discussion. For example we may want to consider a
two-stage phasing out of the feature, with a couple of years in
between: Make the functionality dependent upon a default-off
command line option for the time being, and issue a bright warning
when someone actually enables it (telling them to tell us).

Jan



Re: [Xen-devel] [PATCH 4/5] x86/pv: Drop support for paging out the LDT

2018-01-12 Thread Doug Goldstein
On 1/12/18 12:37 PM, Andrew Cooper wrote:
> Windows is the only OS which pages out kernel datastructures, so chances are
> good that this is a vestigial remnant of the PV Windows XP experiment.
> Furthermore the implementation is incomplete; it only functions for a present
> => not-present transition, rather than a present => read/write transition.
> 
> The for_each_vcpu() is one scalability limitation for PV guests, which can't
> reasonably be altered to be continuable.
> 
> One side effect of dropping paging-out support is that the LDT (like the
> GDT) is now only ever modified in current context, allowing us to drop
> shadow_ldt_mapcnt and shadow_ldt_lock from struct vcpu.
> 
> Another side effect is that the LDT no longer automatically cleans itself up
> on domain destruction.  Cover this by explicitly releasing the LDT frames at
> the same time as the GDT frames.
> 
> Finally, leave some asserts around to confirm the expected behaviour of all
> the functions playing with PGT_seg_desc_page references.
> 
> Signed-off-by: Andrew Cooper 

Makes sense to me and the code looks good.

Reviewed-by: Doug Goldstein 

-- 
Doug Goldstein




[Xen-devel] [PATCH 4/5] x86/pv: Drop support for paging out the LDT

2018-01-12 Thread Andrew Cooper
Windows is the only OS which pages out kernel datastructures, so chances are
good that this is a vestigial remnant of the PV Windows XP experiment.
Furthermore the implementation is incomplete; it only functions for a present
=> not-present transition, rather than a present => read/write transition.

The for_each_vcpu() is one scalability limitation for PV guests, which can't
reasonably be altered to be continuable.

One side effect of dropping paging-out support is that the LDT (like the
GDT) is now only ever modified in current context, allowing us to drop
shadow_ldt_mapcnt and shadow_ldt_lock from struct vcpu.

Another side effect is that the LDT no longer automatically cleans itself up
on domain destruction.  Cover this by explicitly releasing the LDT frames at
the same time as the GDT frames.

Finally, leave some asserts around to confirm the expected behaviour of all
the functions playing with PGT_seg_desc_page references.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
---
 xen/arch/x86/domain.c   |  7 ++-
 xen/arch/x86/mm.c   | 17 -
 xen/arch/x86/pv/descriptor-tables.c | 20 ++--
 xen/arch/x86/pv/domain.c|  2 --
 xen/arch/x86/pv/mm.c|  3 ---
 xen/include/asm-x86/domain.h|  4 
 6 files changed, 8 insertions(+), 45 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index da1bf1a..2b7bc5b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1942,11 +1942,8 @@ int domain_relinquish_resources(struct domain *d)
 {
 for_each_vcpu ( d, v )
 {
-/*
- * Relinquish GDT mappings. No need for explicit unmapping of
- * the LDT as it automatically gets squashed with the guest
- * mappings.
- */
+/* Relinquish GDT/LDT mappings. */
+pv_destroy_ldt(v);
 pv_destroy_gdt(v);
 }
 }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 14cfa93..15a9334 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1152,7 +1152,6 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
 unsigned long pfn = l1e_get_pfn(l1e);
 struct page_info *page;
 struct domain*pg_owner;
-struct vcpu  *v;
 
 if ( !(l1e_get_flags(l1e) & _PAGE_PRESENT) || is_iomem_page(_mfn(pfn)) )
 return;
@@ -1188,25 +1187,9 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
  */
 if ( (l1e_get_flags(l1e) & _PAGE_RW) &&
  ((l1e_owner == pg_owner) || !paging_mode_external(pg_owner)) )
-{
 put_page_and_type(page);
-}
 else
-{
-/* We expect this is rare so we blow the entire shadow LDT. */
-if ( unlikely(((page->u.inuse.type_info & PGT_type_mask) ==
-   PGT_seg_desc_page)) &&
- unlikely(((page->u.inuse.type_info & PGT_count_mask) != 0)) &&
- (l1e_owner == pg_owner) )
-{
-for_each_vcpu ( pg_owner, v )
-{
-if ( pv_destroy_ldt(v) )
-flush_tlb_mask(v->vcpu_dirty_cpumask);
-}
-}
 put_page(page);
-}
 }
 
 
diff --git a/xen/arch/x86/pv/descriptor-tables.c b/xen/arch/x86/pv/descriptor-tables.c
index b418bbb..77f9851 100644
--- a/xen/arch/x86/pv/descriptor-tables.c
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -37,18 +37,12 @@
  */
 bool pv_destroy_ldt(struct vcpu *v)
 {
-l1_pgentry_t *pl1e;
+l1_pgentry_t *pl1e = pv_ldt_ptes(v);
 unsigned int i, mappings_dropped = 0;
 struct page_info *page;
 
 ASSERT(!in_irq());
-
-spin_lock(&v->arch.pv_vcpu.shadow_ldt_lock);
-
-if ( v->arch.pv_vcpu.shadow_ldt_mapcnt == 0 )
-goto out;
-
-pl1e = pv_ldt_ptes(v);
+ASSERT(v == current || cpumask_empty(v->vcpu_dirty_cpumask));
 
 for ( i = 0; i < 16; i++ )
 {
@@ -64,12 +58,6 @@ bool pv_destroy_ldt(struct vcpu *v)
 put_page_and_type(page);
 }
 
-ASSERT(v->arch.pv_vcpu.shadow_ldt_mapcnt == mappings_dropped);
-v->arch.pv_vcpu.shadow_ldt_mapcnt = 0;
-
- out:
-spin_unlock(&v->arch.pv_vcpu.shadow_ldt_lock);
-
 return mappings_dropped;
 }
 
@@ -80,6 +68,8 @@ void pv_destroy_gdt(struct vcpu *v)
 l1_pgentry_t zero_l1e = l1e_from_mfn(zero_mfn, __PAGE_HYPERVISOR_RO);
 unsigned int i;
 
+ASSERT(v == current || cpumask_empty(v->vcpu_dirty_cpumask));
+
 v->arch.pv_vcpu.gdt_ents = 0;
 for ( i = 0; i < FIRST_RESERVED_GDT_PAGE; i++ )
 {
@@ -100,6 +90,8 @@ long pv_set_gdt(struct vcpu *v, unsigned long *frames, unsigned int entries)
 l1_pgentry_t *pl1e;
 unsigned int i, nr_frames = DIV_ROUND_UP(entries, 512);
 
+ASSERT(v == current || cpumask_empty(v->vcpu_dirty_cpumask));
+
 if ( entries > FIRST_RESERVED_GDT_ENTRY )
 return -EINVAL;
 
diff --git