Re: [Xen-devel] Writable page tables questions

2015-01-08 Thread Tim Deegan
At 09:55 + on 06 Jan (1420534536), Ian Campbell wrote:
> The tlb flushes involved in the above are reasonably expensive, IIRC Xen
> flip flopped a bit (years ago now) on whether it is worthwhile doing
> this or not, which is why I'm not sure if it still does or not.

The current writable pagetables code for PV guests emulates the
write and validates the resulting PTE.  If the new PTE passes
validation, Xen applies the update, without ever making the page
actually writable to the guest itself.

The code is in xen/arch/x86/mm.c, as ptwr_*
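
As a rough illustration of the emulate-and-validate idea described above, here is a minimal sketch in C. It is a toy model, not the ptwr_* code: the frame table, the ownership rule and every function name are assumptions made up for the example. The only hardware facts it relies on are that bit 0 of an x86 PTE is the present bit and bit 1 is the read/write bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names come from Xen. */
#define PTE_PRESENT  0x1ULL          /* x86 PTE bit 0 */
#define PTE_RW       0x2ULL          /* x86 PTE bit 1 */
#define PTE_FRAME(p) ((p) >> 12)     /* frame number held in the PTE */
#define NR_FRAMES    8
#define PT_ENTRIES   512

enum frame_type { FRAME_NORMAL, FRAME_PAGETABLE };

struct frame_info {
    int owner;                 /* domain id that owns the frame */
    enum frame_type type;
};

static struct frame_info frames[NR_FRAMES];

/* Would this PTE be acceptable in a page table owned by `dom`? */
static bool pte_is_valid(int dom, uint64_t pte)
{
    if (!(pte & PTE_PRESENT))
        return true;                        /* not-present entries map nothing */
    uint64_t mfn = PTE_FRAME(pte);
    if (mfn >= NR_FRAMES || frames[mfn].owner != dom)
        return false;                       /* somebody else's memory */
    if (frames[mfn].type == FRAME_PAGETABLE && (pte & PTE_RW))
        return false;                       /* page tables stay read-only */
    return true;
}

/*
 * Emulate a faulting guest write to slot `idx` of a page-table page.
 * The hypervisor performs the store itself; the guest's mapping of the
 * page is never made writable.
 */
static bool emulate_pt_write(int dom, uint64_t *pt, unsigned idx,
                             uint64_t new_pte)
{
    assert(idx < PT_ENTRIES);
    if (!pte_is_valid(dom, new_pte))
        return false;          /* rejected: the old entry stays in place */
    pt[idx] = new_pte;
    return true;
}
```

In this model a write that would map a frame owned by another domain, or map a page-table frame read/write, is refused and the table is left unchanged.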

Cheers,

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Writable page tables questions

2015-01-08 Thread Ian Campbell
On Thu, 2015-01-08 at 12:19 +0100, Tim Deegan wrote:
> At 09:55 + on 06 Jan (1420534536), Ian Campbell wrote:
> > The tlb flushes involved in the above are reasonably expensive, IIRC Xen
> > flip flopped a bit (years ago now) on whether it is worthwhile doing
> > this or not, which is why I'm not sure if it still does or not.
>
> The current writable pagetables code for PV guests emulates the
> write and validates the resulting PTE.  If it passes validation, it
> updates it, without ever making the page actually writable to the
> guest itself.

Indeed, it seems like the mode I was on about was removed 9 years ago:

commit 228f081e08474febb96ee694f6d1b3d6d7465052
Author: kfraser@localhost.localdomain kfraser@localhost.localdomain
Date:   Fri Aug 11 16:07:22 2006 +0100

[XEN] Remove batched writable pagetable logic.

Benchmarks show it provides little or no benefit (except
on synthetic benchmarks). Also it is complicated and
likely to hinder efforts to reduce lockign granularity.

Signed-off-by: Keir Fraser k...@xensource.com

$ git describe --contains 228f081e08474febb96ee694f6d1b3d6d7465052
3.0.3-branched~459

So in 3.0.3 apparently.

Ian.




Re: [Xen-devel] Writable page tables questions

2015-01-06 Thread Ian Campbell
On Mon, 2015-01-05 at 17:28 +, Andrew Cooper wrote:
> On 04/01/2015 17:17, Junji Zhi wrote:
> > Hi,
> >
> > I'm Junji, a newbie in Xen and hoping I can contribute to the
> > community one day. I have a few questions regarding the writable page
> > tables, while reading The Definitive Guide to the Xen Hypervisor by
> > David Chisnall:
> >
> > 1. Writable page tables is one Xen memory assist technique, applied to
> > paravirtualized guests ONLY. HVM does not apply. Correct?
> >
> > 2. According to the book, when a guest wants to modify its page table,
> > it triggers a trap into the hypervisor and it does a few steps:
> >
> > (1) it invalidates a PTE that points to the page containing the page
> > table. Is my understanding correct?
> >
> > Q: What does invalidate really mean here? Does it mean simply
> > flipping a bit in the PTE of the page table, or removing the PTE
> > completely?

At least clearing the present bit; what happens to the other bits in the
PTE is up to the implementation, I think.

> > Does it also need to invalidate the TLB entry?

Yes, I think so, else the CPU might subsequently use a stale mapping.

> > (2) then the control goes back to the guest and it can write/read the
> > page table now.
> >
> > (3) The book's words pasted: When an address referenced by the newly
> > invalidated page directory entry is referenced (read or write), a page
> > fault occurs.
> >
> > Q: The description of step (3) is confusing. What does it mean by an
> > address referenced by the newly invalidated page directory entry is
> > referenced? Does it mean the case when the guest code is accessing a
> > virtual address that needs to search the invalidated page table for
> > translation?

Yes, it means when something tries to access memory which would have
been mapped by the PT page which was removed in (1).

> I do not have the Chisnall book to hand at the moment, so cannot comment
> as to the exact text in it.
>
> However, looking at the code as it exists today,
> XENFEAT_writable_page_tables (there is a typo in the ABI) is strictly
> only offered to HVM guests, and not to PV guests.

XENFEAT_writable_page_tables is different from out of sync PT updates,
which is what Junji (and the book) seems to be referring to.

I don't know if modern Xen still does this for PV (I think it still does
for shadow mode HVM under at least some circumstances) but at one
point in time (presumably when the book was written) it used to be that
Xen would handle an emulated write to a r/o page table page by:
  * unhooking it from the higher level PTs which referenced it and
    flushing TLBs
  * mapping the PT page itself r/w (contrary to Xen's usual invariant
    that it be mapped r/o)

At which point any subsequent writes to the now out-of-sync PT page can
just happen without trapping. This is safe because after the unhook the
PT is not part of any cr3 and the invariant is not violated (the guest
doesn't really know this is happening, for all it knows all writes are
still being emulated).

At some point something would try to access the memory which would be
mapped by the out of sync PT page and Xen would, in the page fault
handler:
  * make all the mappings r/o again (+ tlb flush)
  * validate all the entries in the page
  * rehook it into the higher level PTs which should reference it

At which point the mappings are available again and Xen's invariants are
preserved.
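
The unhook / free-write / resync cycle described above can be sketched as a small state machine in C. This is a toy model written for illustration, not the removed Xen code: the structure, the function names and the stand-in validity rule (a present entry must name a frame below some limit) are all assumptions. The flush counter is only there to show where the expensive TLB flushes fall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; every name here is invented for the sketch. */
#define OOS_PRESENT 0x1ULL           /* x86 PTE bit 0 */
#define OOS_ENTRIES 512

struct oos_pt {
    uint64_t entries[OOS_ENTRIES];
    uint64_t *parent_slot;   /* higher-level entry that points at this page */
    uint64_t saved_parent;   /* parent entry as it was before unhooking */
    bool guest_writable;     /* is the guest's mapping of this page r/w? */
    unsigned tlb_flushes;    /* counts flushes, to show where the cost is */
};

/* Stand-in for the real audit: a present entry must name a frame < limit. */
static bool oos_entry_ok(uint64_t pte, uint64_t frame_limit)
{
    return !(pte & OOS_PRESENT) || (pte >> 12) < frame_limit;
}

/* First write fault on a read-only page-table page. */
static void oos_unhook(struct oos_pt *pt)
{
    assert(pt->parent_slot != 0);
    pt->saved_parent = *pt->parent_slot;
    *pt->parent_slot &= ~OOS_PRESENT;   /* detach from the active tree */
    pt->tlb_flushes++;                  /* stale translations must go */
    pt->guest_writable = true;          /* later writes no longer trap */
}

/* Fault on an address that the detached page used to map. */
static bool oos_resync(struct oos_pt *pt, uint64_t frame_limit)
{
    pt->guest_writable = false;         /* back to read-only ... */
    pt->tlb_flushes++;                  /* ... and flush the r/w mapping */
    for (unsigned i = 0; i < OOS_ENTRIES; i++)
        if (!oos_entry_ok(pt->entries[i], frame_limit))
            return false;               /* leave the page unhooked */
    *pt->parent_slot = pt->saved_parent;    /* rehook */
    return true;
}
```

Each cycle costs two flushes in this model however many entries the guest wrote in between, which is the trade-off the batching was meant to exploit.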

The tlb flushes involved in the above are reasonably expensive, IIRC Xen
flip flopped a bit (years ago now) on whether it is worthwhile doing
this or not, which is why I'm not sure if it still does or not.

This is all different from the XENFEAT_writable_page_tables that you talk
about, which is where the guest is informed that it is not obliged to
make the regular mappings r/o in the first place, i.e. to ignore Xen's
invariant completely.

Ian.




Re: [Xen-devel] Writable page tables questions

2015-01-05 Thread Andrew Cooper
On 04/01/2015 17:17, Junji Zhi wrote:
> Hi,
>
> I'm Junji, a newbie in Xen and hoping I can contribute to the
> community one day. I have a few questions regarding the writable page
> tables, while reading The Definitive Guide to the Xen Hypervisor by
> David Chisnall:
>
> 1. Writable page tables is one Xen memory assist technique, applied to
> paravirtualized guests ONLY. HVM does not apply. Correct?
>
> 2. According to the book, when a guest wants to modify its page table,
> it triggers a trap into the hypervisor and it does a few steps:
>
> (1) it invalidates a PTE that points to the page containing the page
> table. Is my understanding correct?
>
> Q: What does invalidate really mean here? Does it mean simply
> flipping a bit in the PTE of the page table, or removing the PTE
> completely? Does it also need to invalidate the TLB entry?
>
> (2) then the control goes back to the guest and it can write/read the
> page table now.
>
> (3) The book's words pasted: When an address referenced by the newly
> invalidated page directory entry is referenced (read or write), a page
> fault occurs.
>
> Q: The description of step (3) is confusing. What does it mean by an
> address referenced by the newly invalidated page directory entry is
> referenced? Does it mean the case when the guest code is accessing a
> virtual address that needs to search the invalidated page table for
> translation?

I do not have the Chisnall book to hand at the moment, so cannot comment
as to the exact text in it.

However, looking at the code as it exists today,
XENFEAT_writable_page_tables (there is a typo in the ABI) is strictly
only offered to HVM guests, and not to PV guests.

PV guests must, under all circumstances, have every pagetable that is
reachable from any cr3 mapped read-only.  Any ability to write to an
active pagetable without an audit from Xen would be a security issue, as
a guest could give itself access to frames which belonged to Xen or
other guests.

Updating an individual PTE can be done either by writing directly to it,
in which case Xen will trap, emulate and audit the attempt, or by using
an appropriate hypercall, which is more efficient as no emulation is
required.  A PV guest is required to perform its own TLB management when
necessary (again, by hypercall or trap and emulate).

Updating pagetables in general can either be done by updating each PTE
individually, or by constructing a new pagetable from scratch, pinning
it (via hypercall), which performs all the auditing at once, then
introducing it into the active set of pagetables.

An example might be:
1) Write all 512 entries into a regular page
2) Unmap the page (taking its refcount to 0, to permit a typechange)
3) Pin the page as a specific type of pagetable (each level of
pagetable has a different type, for refcounting purposes)
4) Use a PTE write or hypercall to introduce this new pagetable into the
active set.

The important points are that nothing can ever be changed in the active
set of pagetables without an audit by Xen, but the cost of the audit can
be amortised by constructing pagetables separately in a regular page first.
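
The four-step example above can be sketched in C roughly as follows. This is a toy model under stated assumptions, not Xen's interface: the page structure, the type enum, the reference-count rule and the validity check (a present entry must name a frame below some limit) are all invented for the illustration. What it tries to show is that the whole page is audited once, at pin time, and that only an audited page can be hooked into the active set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; names are invented and do not match Xen's interfaces. */
#define PIN_PRESENT 0x1ULL           /* x86 PTE bit 0 */
#define PIN_ENTRIES 512

enum pin_type { PIN_NONE, PIN_WRITABLE, PIN_L1_TABLE };

struct pin_page {
    uint64_t entries[PIN_ENTRIES];
    enum pin_type type;
    unsigned type_refs;      /* live uses of the page under its current type */
};

/* Step 2: drop the writable mapping that was used to fill the page in
 * step 1; once the count reaches zero the page may change type. */
static void pin_unmap(struct pin_page *pg)
{
    assert(pg->type_refs > 0);
    if (--pg->type_refs == 0)
        pg->type = PIN_NONE;
}

/* Step 3: audit all entries in one pass, then fix the type. */
static bool pin_as_l1(struct pin_page *pg, uint64_t frame_limit)
{
    if (pg->type_refs != 0)
        return false;                    /* still in use under another type */
    for (unsigned i = 0; i < PIN_ENTRIES; i++) {
        uint64_t pte = pg->entries[i];
        if ((pte & PIN_PRESENT) && (pte >> 12) >= frame_limit)
            return false;                /* one bad entry fails the whole pin */
    }
    pg->type = PIN_L1_TABLE;
    pg->type_refs = 1;                   /* the pin itself holds a reference */
    return true;
}

/* Step 4: hook the pinned page into a higher-level table. */
static bool pin_hook(uint64_t *l2_slot, const struct pin_page *pg,
                     uint64_t mfn)
{
    if (pg->type != PIN_L1_TABLE)
        return false;                    /* only audited pages may go live */
    *l2_slot = (mfn << 12) | PIN_PRESENT;
    return true;
}
```

In this model, pinning fails while the page is still mapped writable, which mirrors the point of step 2: the type can only change once the old references are gone.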

I hope this helps to clarify the situation.

~Andrew




[Xen-devel] Writable page tables questions

2015-01-04 Thread Junji Zhi

Hi,

I'm Junji, a newbie in Xen and hoping I can contribute to the community 
one day. I have a few questions regarding the writable page tables, 
while reading The Definitive Guide to the Xen Hypervisor by David Chisnall:


1. Writable page tables is one Xen memory assist technique, applied to 
paravirtualized guests ONLY. HVM does not apply. Correct?


2. According to the book, when a guest wants to modify its page table, 
it triggers a trap into the hypervisor and it does a few steps:


(1) it invalidates a PTE that points to the page containing the page 
table. Is my understanding correct?


Q: What does invalidate really mean here? Does it mean simply flipping 
a bit in the PTE of the page table, or removing the PTE completely? Does 
it also need to invalidate the TLB entry?


(2) then the control goes back to the guest and it can write/read the 
page table now.


(3) The book's words pasted: When an address referenced by the newly 
invalidated page directory entry is referenced (read or write), a page 
fault occurs. 


Q: The description of step (3) is confusing. What does it mean by an
address referenced by the newly invalidated page directory entry is
referenced? Does it mean the case when the guest code is accessing a
virtual address that needs to search the invalidated page table for
translation?



Thanks, I really appreciate any comments or responses.
Junji
