I was looking at a reported bug about semihosting when the MPU
is being used on Arm CPUs:
https://gitlab.com/qemu-project/qemu/-/issues/3292
which turns out to be because in cpu_memory_rw_debug() instead
of passing in the vaddr we're given to get_phys_page_debug() we
first round it down to a target-page, then translate it, then
add the page offset back in. This doesn't work if the target
is dealing with memory protection/validity at a granularity
smaller than a target-page, as the Arm MPU does.

I thought this was a bit odd, because the Arm page-table
walk code will happily translate a non-page-aligned address
for you, so I had a bit of a dig into our implementations and
callers, and we seem to be inconsistent about whether you're
supposed to handle non-page-aligned addresses here. (By "handle"
I mean both "can take a non-page-aligned input" and also "the
resulting physaddr includes the bits for the offset within the
page and has not been rounded down to a page-base address".)

Implementations that don't handle non-aligned addresses:
 * alpha, microblaze, sparc, x86, riscv, s390x
   (but riscv only because it explicitly masks out the low bits
   just before returning the physaddr!)
Implementations that do handle non-aligned addresses:
 * arm, avr, hppa, loongarch, m68k, mips, openrisc, rx, sh4,
   tricore, xtensa
Implementations that were too complex for me to determine the
answer with a quick look:
 * ppc

And at the callsites we have:

Callers that naturally always use page aligned addresses:
 * hw/i386/vapic.c : always uses page-aligned addresses
 * target/sparc/mmu_helper.c: uses in dump_mmu() are always page aligned

Callers that assume non-aligned addresses are handled:
 * hw/xtensa/sim.c,xtfpga.c : we translate ELF program header p_addrs
 * target/xtensa/xtensa-semi.c
   (these two only work because xtensa is in the "handles" this group)

Callers that do some kind of "work around getting the page base",
and mostly-work with both:
 * monitor/hmp-cmds.c: in hmp_gva2gpa we round to a page base then
   add back in the offset within the page
 * plugins/api.c: ORs in the low bits, so can cope with both "get
   exact answer" and "gives page base"
 * target/s390x/helper.c: s390_cpu_get_phys_addr_debug is a wrapper
   implementing "give actual page" in terms of cpu_get_phys_page_debug
   by the usual "mask and put the page offset back in" approach
 * physmem.c: cpu_memory_rw_debug rounds to a page base then adds
   back in the offset within the page afterwards

Looking at this, my feeling is that we should:
 * define that these APIs must handle non-page-aligned addresses
 * fix the six or seven implementations that don't (either "properly" or
   by adding in the "mask to page base then OR back in the page offset"
   code to their get_phys_page_debug functions)
 * clean up the callsites to not work around the lack of handling
   of non-page-aligned addresses
 * add/extend the get_phys_page_debug API so that a caller can get
   the lg_page_size
 * fix the cpu_memory_rw_debug vs Arm MPU bug by having it use that
   lg_page_size to determine how much memory to access at once rather
   than assuming it can read to the end of the page

Opinions ?

thanks
-- PMM

Reply via email to