On 15 December 2014 at 13:16, Paolo Bonzini <pbonz...@redhat.com> wrote:
> If not, it should not need any change to the memory API; you can do it
> entirely within cputlb.c, roughly the same as the handling of
> TLB_NOTDIRTY. It also marks pages as I/O, but only internally within TCG.
Speaking of TLB_NOTDIRTY, I just wrote up a summary of how that works
for a private email, so I figured I might as well send it here too so
it's in the qemu-devel mail archives; it's probably not new information
to anybody involved in this immediate conversation.

How we arrange to throw away cached translations when the guest writes
to that part of memory:

* we have two data structures effectively tracking dirty status:
  (1) there are a set of bitmaps which track different kinds of
      dirtiness (the DIRTY_MEMORY_*); the functions for manipulating
      these are mostly in ram_addr.h. One of the bitmaps is for
      DIRTY_MEMORY_CODE.
  (2) where we have an entry in the QEMU TLB for a page which is backed
      by host RAM, we may set the TLB_NOTDIRTY bit in the addr_write
      TLB entry field (TLB_NOTDIRTY is one of several low order bits
      that can be set in what is otherwise a page-aligned virtual
      address in the TLB structure. TLB_MMIO is another, indicating
      that the entry is not RAM at all.)
  TLB entries come and go, but the bitmaps cover all of physical RAM.
  When a TLB entry is present then the NOTDIRTY flag should be just a
  cache for "at least one of the dirty bitmaps says this page is not
  dirty".

* when we generate code we call tlb_protect_code() (from
  tb_alloc_page()): this calls cpu_physical_memory_reset_dirty(), which
  both updates the dirty bitmap data structure (marking the region as
  clean in the DIRTY_MEMORY_CODE bitmap) and also calls
  cpu_tlb_reset_dirty_all() to OR in the TLB_NOTDIRTY flag for any
  present TLB entries in the range

* when we add an entry to the TLB, tlb_set_page() will OR in the
  TLB_NOTDIRTY bit if the bitmap says this is clean memory, so the two
  structures stay in sync

* tlb_set_page() also calls memory_region_section_get_iotlb() to get an
  iotlb entry for this RAM, which is what will be used on the slow
  path. For RAM this will be io_mem_notdirty.

* if the guest attempts a read, we don't do anything special because
  this uses addr_read, not addr_write

* for a guest write, the generated code will look at addr_write; it
  takes the fast path if the low order bits are clear (indicating dirty
  host RAM). Otherwise we take the slow path (clean RAM, MMIO, nothing
  present, etc etc).

* we then follow the slow path without special casing RAM, which means
  we'll use the iotlb entry set up when the TLB entry was populated,
  which is io_mem_notdirty.

* notdirty_mem_write() will invalidate the cached TBs if the
  DIRTY_MEMORY_CODE bitmap says this memory is clean, and do the access
  the slow way. We then mark the TLB entry as dirty by calling
  tlb_set_dirty, so next time we'll take the fast path. (There's an
  optimisation wrinkle here: tb_invalidate_phys_page_fast() is
  complicated because it tries to avoid simply nuking every TB in the
  page. So it might need to keep accesses on the slow path. It only
  calls tlb_unprotect_code_phys() to update the DIRTY_MEMORY_CODE
  bitmap if every TB on the page has been invalidated. This is why
  notdirty_mem_write()'s call to tlb_set_dirty() is conditional.)

* writes to already-dirty memory can take the fast path, which just
  writes to the host RAM without calling out or checking any dirty
  bits.

Note that for linux-user mode the mechanism is totally different,
because we don't have a softmmu TLB data structure; instead we use
mprotect to write-protect the page, and then in the SIGSEGV handler we
may throw away cached TBs before un-write-protecting it.

-- PMM
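
As a rough illustration of the fast-path/slow-path decision described
above, here is a minimal, self-contained sketch. It is not QEMU code:
the type and helper names (toy_tlb_entry, toy_store_byte,
page_is_clean_for_code, invalidate_tbs_on_page) and the flag bit values
are invented for the example, and the real notdirty_mem_write() /
tb_invalidate_phys_page_fast() logic is considerably more involved.

/* Toy model only, NOT actual QEMU code: how a softmmu store might
 * consult low-order flag bits cached in addr_write.  Bit values are
 * placeholders, not QEMU's real definitions. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TLB_NOTDIRTY  (1 << 0)   /* page still has translated code (clean) */
#define TLB_MMIO      (1 << 1)   /* page is not RAM at all                 */
#define TLB_FLAG_MASK (TLB_NOTDIRTY | TLB_MMIO)

typedef struct {
    uintptr_t addr_write;   /* page-aligned address plus low-order flags */
    uint8_t  *host_ram;     /* direct pointer to backing host RAM        */
} toy_tlb_entry;

/* Stand-in for the DIRTY_MEMORY_CODE bitmap entry for this page. */
static bool page_is_clean_for_code = true;

static void invalidate_tbs_on_page(void)
{
    /* In QEMU this is tb_invalidate_phys_page_fast(); here we pretend
     * every TB on the page gets thrown away. */
    printf("throwing away cached translations for this page\n");
}

static void toy_store_byte(toy_tlb_entry *e, uintptr_t vaddr, uint8_t val)
{
    if ((e->addr_write & TLB_FLAG_MASK) == 0) {
        /* Fast path: plain dirty host RAM, write directly. */
        e->host_ram[vaddr & 0xfff] = val;
        return;
    }
    if (e->addr_write & TLB_MMIO) {
        /* Would dispatch to the device's write callback here. */
        return;
    }
    /* TLB_NOTDIRTY slow path, loosely modelled on notdirty_mem_write(). */
    if (page_is_clean_for_code) {
        invalidate_tbs_on_page();
        page_is_clean_for_code = false;   /* bitmap now says "dirty"      */
    }
    e->host_ram[vaddr & 0xfff] = val;     /* do the access the slow way   */
    /* Since this toy invalidated every TB on the page, future writes may
     * take the fast path (the real code only does this conditionally). */
    e->addr_write &= ~(uintptr_t)TLB_NOTDIRTY;
}

int main(void)
{
    static uint8_t ram[4096];
    toy_tlb_entry e = { .addr_write = TLB_NOTDIRTY, .host_ram = ram };

    toy_store_byte(&e, 0x10, 0xab);   /* slow path: invalidates TBs       */
    toy_store_byte(&e, 0x11, 0xcd);   /* fast path: page already dirty    */
    return 0;
}

The only point of the sketch is the shape of the check: a store is
cheap exactly when none of the low-order flag bits are set in
addr_write, and any set flag forces it through the I/O-style slow path.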