Re: [Xen-devel] [for-4.8][PATCH v2 17/23] xen/arm: p2m: Introduce p2m_set_entry and __p2m_set_entry

2016-09-21 Thread Stefano Stabellini
On Thu, 15 Sep 2016, Julien Grall wrote:
> The ARM architecture mandates the use of a break-before-make sequence
> when changing translation entries if the page table is shared between
> multiple CPUs whenever a valid entry is replaced by another valid entry
> (see D4.7.1 in ARM DDI 0487A.j for more details).
> 
> The break-before-make sequence can be divided into the following steps:
> 1) Invalidate the old entry in the page table
> 2) Issue a TLB invalidation instruction for the address associated
> with this entry
> 3) Write the new entry
> 
> The current P2M code implemented in apply_one_level does not respect
> this sequence and may break coherency on some processors.
> 
> Adapting the current implementation to use the break-before-make
> sequence would imply some code duplication and more TLB invalidations
> than necessary. For instance, if we are replacing a 4KB page and the
> current mapping in the P2M is using a 1GB superpage, the following steps
> will happen:
> 1) Shatter the 1GB superpage into a series of 2MB superpages
> 2) Shatter the 2MB superpage into a series of 4KB pages
> 3) Replace the 4KB page
> 
> As the current implementation shatters while descending and installs
> the mapping, Xen would need to issue 3 TLB invalidation instructions,
> which is clearly inefficient.
> 
> Furthermore, all the operations which modify the page table are using
> the same skeleton. It is more complicated to maintain different code paths
> than having a generic function that sets an entry and takes care of the
> break-before-make sequence.
> 
> The new implementation is based on the x86 EPT one which, I think,
> fits the break-before-make sequence quite well whilst keeping
> the code simple.
> 
> The main function of the new implementation is __p2m_set_entry. It will
> only work on mappings that are aligned to a block entry in the page table
> (i.e. 1GB, 2MB, 4KB when using a 4KB granularity).
> 
> Another function, p2m_set_entry, is provided to break down a region
> into mappings that are aligned to a block entry, or 4KB when memaccess
> is enabled.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> Changes in v2:
> - fix typos in the commit message
> - don't use the default access when shattering the superpage
> - remove duplicated comment
> - export p2m_set_entry
> - s/todo/nr/
> - iommu flush
> - update the stats when removing/adding a mapping
> - update doc
> - fix p2m_split_superpage
> - don't try to allocate intermediate page table when removing an
> entry
> - the TLB flush is not necessary when only permissions are
> changed (e.g. memaccess).
> ---
>  xen/arch/arm/p2m.c | 374 +
>  xen/include/asm-arm/p2m.h  |  11 ++
>  xen/include/asm-arm/page.h |   4 +
>  3 files changed, 389 insertions(+)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 3ca2e90..5f7aef0 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -783,6 +783,380 @@ static void p2m_put_l3_page(const lpae_t pte)
>  }
>  }
>  
> +/* Free lpae sub-tree behind an entry */
> +static void p2m_free_entry(struct p2m_domain *p2m,
> +                           lpae_t entry, unsigned int level)
> +{
> +    unsigned int i;
> +    lpae_t *table;
> +    mfn_t mfn;
> +
> +    /* Nothing to do if the entry is invalid. */
> +    if ( !p2m_valid(entry) )
> +        return;
> +
> +/* Nothing to do but updating the states if the entry is a super-page. */
   ^ stats

Aside from this small typo, everything else is good:

Reviewed-by: Stefano Stabellini 


> +    if ( p2m_is_superpage(entry, level) )
> +    {
> +        p2m->stats.mappings[level]--;
> +        return;
> +    }
> +
> +    if ( level == 3 )
> +    {
> +        p2m->stats.mappings[level]--;
> +        p2m_put_l3_page(entry);
> +        return;
> +    }
> +
> +    table = map_domain_page(_mfn(entry.p2m.base));
> +    for ( i = 0; i < LPAE_ENTRIES; i++ )
> +        p2m_free_entry(p2m, *(table + i), level + 1);
> +
> +    unmap_domain_page(table);
> +
> +    /*
> +     * Make sure all the references in the TLB have been removed before
> +     * freeing the intermediate page table.
> +     * XXX: Should we defer the free of the page table to avoid the
> +     * flush?
> +     */
> +    if ( p2m->need_flush )
> +        p2m_flush_tlb_sync(p2m);
> +
> +    mfn = _mfn(entry.p2m.base);
> +    ASSERT(mfn_valid(mfn_x(mfn)));
> +
> +    free_domheap_page(mfn_to_page(mfn_x(mfn)));
> +}
> +
> +static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
> +                                unsigned int level, unsigned int target,
> +                                const unsigned int *offsets)
> +{
> +    struct page_info *page;
> +    unsigned int i;
> +    lpae_t pte, *table;
> +    bool rv = true;

[Xen-devel] [for-4.8][PATCH v2 17/23] xen/arm: p2m: Introduce p2m_set_entry and __p2m_set_entry

2016-09-15 Thread Julien Grall
The ARM architecture mandates the use of a break-before-make sequence
when changing translation entries if the page table is shared between
multiple CPUs whenever a valid entry is replaced by another valid entry
(see D4.7.1 in ARM DDI 0487A.j for more details).

The break-before-make sequence can be divided into the following steps:
1) Invalidate the old entry in the page table
2) Issue a TLB invalidation instruction for the address associated
with this entry
3) Write the new entry
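
For illustration only (not part of the patch), the sequence boils down to
something like the C sketch below; bbm_replace() and invalidate_tlb_by_ipa()
are made-up names, while write_pte() is the usual Xen ARM helper for updating
a page-table entry.

/* Break-before-make sketch -- illustrative only. */
static void bbm_replace(lpae_t *slot, lpae_t new_pte, paddr_t ipa)
{
    lpae_t empty;

    memset(&empty, 0, sizeof(empty));  /* valid bit clear => invalid entry */

    write_pte(slot, empty);            /* 1) break: invalidate the old entry */
    invalidate_tlb_by_ipa(ipa);        /* 2) flush the TLB for this address  */
    write_pte(slot, new_pte);          /* 3) make: write the new entry       */
}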

The current P2M code implemented in apply_one_level does not respect
this sequence and may break coherency on some processors.

Adapting the current implementation to use the break-before-make
sequence would imply some code duplication and more TLB invalidations
than necessary. For instance, if we are replacing a 4KB page and the
current mapping in the P2M is using a 1GB superpage, the following steps
will happen:
1) Shatter the 1GB superpage into a series of 2MB superpages
2) Shatter the 2MB superpage into a series of 4KB pages
3) Replace the 4KB page

As the current implementation shatters while descending and installs
the mapping, Xen would need to issue 3 TLB invalidation instructions,
which is clearly inefficient.

Furthermore, all the operations which modify the page table are using
the same skeleton. It is more complicated to maintain different code paths
than having a generic function that sets an entry and takes care of the
break-before-make sequence.

The new implementation is based on the x86 EPT one which, I think,
fits the break-before-make sequence quite well whilst keeping
the code simple.

The main function of the new implementation is __p2m_set_entry. It will
only work on mappings that are aligned to a block entry in the page table
(i.e. 1GB, 2MB, 4KB when using a 4KB granularity).

Another function, p2m_set_entry, is provided to break down a region
into mappings that are aligned to a block entry, or 4KB when memaccess
is enabled.
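
As a rough sketch of that break-down (illustrative only; set_one_entry() is a
stand-in for __p2m_set_entry and the *_ORDER page-count constants are spelled
out here for a 4KB granularity), the range is repeatedly mapped with the
largest block size that the alignment of the guest frame, the machine frame
and the remaining length allows:

#define THIRD_ORDER   0   /* 4KB */
#define SECOND_ORDER  9   /* 2MB */
#define FIRST_ORDER   18  /* 1GB */

/* Illustrative sketch of the region break-down -- not the patch's code. */
static int set_range(struct p2m_domain *p2m, gfn_t sgfn, unsigned long nr,
                     mfn_t smfn, bool mem_access_enabled)
{
    while ( nr )
    {
        unsigned long mask = gfn_x(sgfn) | mfn_x(smfn) | nr;
        unsigned int order;

        if ( mem_access_enabled )                       /* always 4KB by 4KB */
            order = THIRD_ORDER;
        else if ( !(mask & ((1UL << FIRST_ORDER) - 1)) )
            order = FIRST_ORDER;
        else if ( !(mask & ((1UL << SECOND_ORDER) - 1)) )
            order = SECOND_ORDER;
        else
            order = THIRD_ORDER;

        set_one_entry(p2m, sgfn, order, smfn);          /* block-aligned call */

        sgfn = gfn_add(sgfn, 1UL << order);
        smfn = mfn_add(smfn, 1UL << order);
        nr -= 1UL << order;
    }

    return 0;
}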

Signed-off-by: Julien Grall 

---
Changes in v2:
- fix typos in the commit message
- don't use the default access when shattering the superpage
- remove duplicated comment
- export p2m_set_entry
- s/todo/nr/
- iommu flush
- update the stats when removing/adding a mapping
- update doc
- fix p2m_split_superpage
- don't try to allocate intermediate page table when removing an
entry
- the TLB flush is not necessary when only permissions are
changed (e.g. memaccess).
---
 xen/arch/arm/p2m.c | 374 +
 xen/include/asm-arm/p2m.h  |  11 ++
 xen/include/asm-arm/page.h |   4 +
 3 files changed, 389 insertions(+)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 3ca2e90..5f7aef0 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -783,6 +783,380 @@ static void p2m_put_l3_page(const lpae_t pte)
 }
 }
 
+/* Free lpae sub-tree behind an entry */
+static void p2m_free_entry(struct p2m_domain *p2m,
+                           lpae_t entry, unsigned int level)
+{
+    unsigned int i;
+    lpae_t *table;
+    mfn_t mfn;
+
+    /* Nothing to do if the entry is invalid. */
+    if ( !p2m_valid(entry) )
+        return;
+
+    /* Nothing to do but updating the states if the entry is a super-page. */
+    if ( p2m_is_superpage(entry, level) )
+    {
+        p2m->stats.mappings[level]--;
+        return;
+    }
+
+    if ( level == 3 )
+    {
+        p2m->stats.mappings[level]--;
+        p2m_put_l3_page(entry);
+        return;
+    }
+
+    table = map_domain_page(_mfn(entry.p2m.base));
+    for ( i = 0; i < LPAE_ENTRIES; i++ )
+        p2m_free_entry(p2m, *(table + i), level + 1);
+
+    unmap_domain_page(table);
+
+    /*
+     * Make sure all the references in the TLB have been removed before
+     * freeing the intermediate page table.
+     * XXX: Should we defer the free of the page table to avoid the
+     * flush?
+     */
+    if ( p2m->need_flush )
+        p2m_flush_tlb_sync(p2m);
+
+    mfn = _mfn(entry.p2m.base);
+    ASSERT(mfn_valid(mfn_x(mfn)));
+
+    free_domheap_page(mfn_to_page(mfn_x(mfn)));
+}
+
+static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
+                                unsigned int level, unsigned int target,
+                                const unsigned int *offsets)
+{
+    struct page_info *page;
+    unsigned int i;
+    lpae_t pte, *table;
+    bool rv = true;
+
+    /* Convenience aliases */
+    mfn_t mfn = _mfn(entry->p2m.base);
+    unsigned int next_level = level + 1;
+    unsigned int level_order = level_orders[next_level];
+
+    /*
+     * This should only be called with target != level and the entry is
+     * a superpage.
+     */
+    ASSERT(level < target);
+    ASSERT(p2m_is_superpage(*entry, level));
+
+    page = alloc_domheap_page(NULL, 0);
+    if ( !page )
+        return false;
+
+
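
The patch is truncated here in the archive. As a rough sketch of how such a
split typically continues (illustrative only, reusing the names declared
above rather than quoting the patch): the newly allocated table is filled
with next-level copies of the superpage entry, each advanced by one
next-level block, and the mapping counts are moved down a level.

/* Illustrative continuation -- not the patch's exact code. */
table = __map_domain_page(page);

for ( i = 0; i < LPAE_ENTRIES; i++ )
{
    pte = *entry;                                   /* keep the permissions */
    pte.p2m.base = mfn_x(mfn) + (i << level_order); /* step by one child block */

    if ( next_level == 3 )
        pte.p2m.table = 1;                          /* level 3 uses the table bit */

    write_pte(table + i, pte);
}

p2m->stats.mappings[level]--;
p2m->stats.mappings[next_level] += LPAE_ENTRIES;

unmap_domain_page(table);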