At the moment, exec ignores the high bits in each address, for efficiency. This is incorrect: devices can do full 64-bit DMA; it is only the CPU that is limited by the target address space. Resolving such addresses can actually corrupt the pagetables, so using full 64-bit addresses is called for.
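To illustrate the problem, here is a minimal sketch (not the actual exec.c code) of how dropping the high bits makes a device address alias a CPU-reachable one. The 32-bit width and the names TARGET_PHYS_BITS, truncated_key and full_key are assumptions made up for this example only.

```c
#include <stdint.h>

/* Hypothetical width of the CPU-visible physical address space. */
#define TARGET_PHYS_BITS 32

/* Lookup key when the high bits are ignored (the problematic behaviour). */
static inline uint64_t truncated_key(uint64_t addr)
{
    return addr & ((UINT64_C(1) << TARGET_PHYS_BITS) - 1);
}

/* Lookup key when the full 64-bit address is honoured. */
static inline uint64_t full_key(uint64_t addr)
{
    return addr;
}
```

With these definitions, a DMA address of 0x100001000 and a CPU address of 0x1000 yield the same truncated key, so a lookup for the device address lands on the entry for the low address. With the full key the two stay distinct.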
However, using full 64-bit addresses was clocked at a 12% performance hit on a microbenchmark. To solve this, teach pagetables to skip bits at any level, not just the lowest level. This solves the performance problem (only one line of code changed on the data path).

In fact we even gain a bit of speed:

Before:
portio-no-eventfd:pci-io 3225
After:
portio-no-eventfd:pci-io 3123

Changes from v2:
    - lots of bugfixes
      if you read v1 you'll have to re-read, although the basic
      algorithm is still the same
    - minor tweaks suggested by Eric Blake

Michael S. Tsirkin (5):
  exec: replace leaf with skip
  exec: extend skip field to 6 bit, page entry to 32 bit
  exec: pass hw address to phys_page_find
  exec: memory radix tree page level compression
  exec: reduce L2_PAGE_SIZE

Paolo Bonzini (2):
  split definitions for exec.c and translate-all.c radix trees
  exec: make address spaces 64-bit wide

 translate-all.h |   7 ---
 exec.c          | 135 +++++++++++++++++++++++++++++++++++++++++++++----------
 translate-all.c |  32 ++++++++------
 3 files changed, 127 insertions(+), 47 deletions(-)

-- 
MST