At the moment, exec ignores the high bits in each address,
for efficiency.
This is incorrect: devices can do full 64 bit DMA; only the
CPU is limited by the target address space.
Resolving such addresses can actually corrupt the pagetables,
so using full 64 bit addresses is called for.
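
To illustrate why dropping high bits is unsafe, here is a minimal sketch.
It assumes that "ignoring" amounts to masking the address down to a fixed
width before the lookup; ASSUMED_PHYS_BITS, truncated_key() and aliases()
are hypothetical names made up for this example, not identifiers from
exec.c.

```c
#include <stdint.h>

/* Hypothetical width, for illustration only; not a real exec.c constant. */
#define ASSUMED_PHYS_BITS 36

/* Lookup key when the high bits are ignored: everything at or above
 * ASSUMED_PHYS_BITS is dropped. */
static inline uint64_t truncated_key(uint64_t addr)
{
    return addr & ((UINT64_C(1) << ASSUMED_PHYS_BITS) - 1);
}

/* Returns 1 if two distinct addresses resolve to the same key once the
 * high bits are dropped, 0 otherwise. */
static inline int aliases(uint64_t a, uint64_t b)
{
    return a != b && truncated_key(a) == truncated_key(b);
}
```

Under that assumption, a device DMA to (1ULL << 36) | 0x1000 would be
resolved as if it targeted 0x1000, which is the kind of mix-up that full
64 bit lookups avoid.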

However, using full 64 bit addresses was measured to cost a 12%
performance hit on a microbenchmark.
To solve this, teach pagetables to skip bits at any level,
not just the lowest level.
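
The idea of skipping bits at any level can be sketched roughly as below.
This is a simplified illustration, not code from the series: the 16 bit
index, the 4 bit levels, and the names Entry, Section, lookup(),
build_demo() and demo_section() are all assumptions made for the example.
Each entry carries a skip count saying how many levels to descend when its
pointer is followed; skip == 0 marks a leaf. Since skipped index bits are
never compared during the walk, the sketch also assumes a final range
check at the leaf.

```c
#include <stdint.h>

#define LEVEL_BITS 4                     /* index bits consumed per level */
#define LEVEL_SIZE (1u << LEVEL_BITS)
#define LEVELS     4                     /* 16 bit index in this sketch */
#define NIL        UINT32_MAX            /* "no child node" marker */
#define UNASSIGNED 0u                    /* section returned on a miss */

typedef struct { uint32_t skip; uint32_t ptr; } Entry;   /* skip 0 = leaf */
typedef struct { uint64_t base; uint64_t size; } Section;

static Entry nodes[3][LEVEL_SIZE];
static const Section sections[3] = {
    { 0, 0 },          /* UNASSIGNED: empty range */
    { 0x1230, 0x10 },  /* section 1: indexes 0x1230..0x123f */
    { 0x1300, 0x10 },  /* section 2: indexes 0x1300..0x130f */
};

/* Walk the tree, descending lp.skip levels at each step. */
static const Section *lookup(Entry lp, uint64_t index)
{
    int level = LEVELS;

    while (lp.skip != 0) {
        level -= (int)lp.skip;
        if (level < 0 || lp.ptr == NIL) {
            return &sections[UNASSIGNED];
        }
        lp = nodes[lp.ptr][(index >> (level * LEVEL_BITS)) & (LEVEL_SIZE - 1)];
    }
    /* Skipped bits were never looked at, so verify the hit. */
    if (index >= sections[lp.ptr].base &&
        index - sections[lp.ptr].base < sections[lp.ptr].size) {
        return &sections[lp.ptr];
    }
    return &sections[UNASSIGNED];
}

/* Build a compressed tree: the root skips the top level (both sections
 * share it), and each used slot of node 0 skips one more level. */
static Entry build_demo(void)
{
    for (unsigned i = 0; i < LEVEL_SIZE; i++) {
        nodes[0][i] = (Entry){ 1, NIL };  /* empty slot */
        nodes[1][i] = (Entry){ 0, 1 };    /* leaves for section 1 */
        nodes[2][i] = (Entry){ 0, 2 };    /* leaves for section 2 */
    }
    nodes[0][2] = (Entry){ 2, 1 };        /* 0x?2?x: skip a level, node 1 */
    nodes[0][3] = (Entry){ 2, 2 };        /* 0x?3?x: skip a level, node 2 */
    return (Entry){ 2, 0 };               /* root: skip the top level */
}

/* Convenience wrapper: returns the section number for an index. */
static int demo_section(uint64_t index)
{
    return (int)(lookup(build_demo(), index) - sections);
}
```

In this toy tree, index 0x1234 is resolved by visiting two nodes instead
of four. An index such as 0x9234 follows the same path but differs in a
skipped bit, so the range check at the leaf rejects it.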

This solves the performance problem (only one line of code changed on the data
path).  In fact we even gain a bit of speed:

Before:
portio-no-eventfd:pci-io 3225
After:
portio-no-eventfd:pci-io 3123

Changes from v2:
    lots of bugfixes; if you read v1 you'll have to re-read,
    although the basic algorithm is still the same
    minor tweaks suggested by Eric Blake

Michael S. Tsirkin (5):
  exec: replace leaf with skip
  exec: extend skip field to 6 bit, page entry to 32 bit
  exec: pass hw address to phys_page_find
  exec: memory radix tree page level compression
  exec: reduce L2_PAGE_SIZE

Paolo Bonzini (2):
  split definitions for exec.c and translate-all.c radix trees
  exec: make address spaces 64-bit wide

 translate-all.h |   7 ---
 exec.c          | 135 +++++++++++++++++++++++++++++++++++++++++++++-----------
 translate-all.c |  32 ++++++++------
 3 files changed, 127 insertions(+), 47 deletions(-)

-- 
MST

