This is the second iteration of the patch series. The relevant changes
from v1 are listed at the bottom of this cover letter.
This patch series provides an infrastructure for atomic instruction
implementation in QEMU, paving the way for TCG multi-threading. The
adopted design does not rely on host atomic instructions and is
intended as a 'legacy' solution for translating guest atomic
instructions.

The underlying idea is to provide new TCG instructions that guarantee
atomicity for some memory accesses or, in general, a way to define
memory transactions. More specifically, a new pair of TCG instructions
is implemented, qemu_ldlink_i32 and qemu_stcond_i32, which behave as
LoadLink and StoreConditional primitives (only the 32-bit variant is
implemented).

To achieve this, a new bitmap is added to the ram_list structure
(always unique) which flags all memory pages that cannot be accessed
directly through the fast path, due to previous exclusive operations.
This new bitmap is coupled with a new TLB flag which forces slow-path
execution. All stores performed by other vCPUs between an LL/SC pair
to the same (protected) address will make the subsequent
StoreConditional fail.

In theory, the provided implementation of TCG
LoadLink/StoreConditional can be used to properly handle atomic
instructions on any architecture.

The new slow path is implemented such that:
- the LoadLink behaves as a normal load slow path, except that it
  clears the dirty flag in the bitmap. The TLB entries created from
  now on will force the slow path. To ensure this, we flush the TLB
  cache of the other vCPUs.
- the StoreConditional behaves as a normal store slow path, except
  that it checks the state of the dirty bitmap and returns 0 or 1
  depending on whether the StoreConditional succeeded (0 when no vCPU
  has touched the same memory in the meantime).

All write accesses that are forced to follow the 'legacy' slow path
set the accessed memory page to dirty.

In this series only the ARM ldrex/strex instructions are implemented,
for ARM and i386 hosts.
The code was tested with bare-metal test cases and by booting Linux,
using upstream QEMU.

Changes from v1:
- The ram bitmap is no longer reversed: 1 = dirty, 0 = exclusive
- The way the offset used to access the bitmap is calculated has been
  improved and fixed
- Setting a page as dirty now requires a vCPU to target the protected
  address, not just any address in the page
- Addressed comments from Richard Henderson to improve the logic in
  softmmu_template.h and to simplify the method generation through
  softmmu_llsc_template.h
- Added an initial implementation of qemu_{ldlink,stcond}_i32 for
  tcg/i386

This work has been sponsored by Huawei Technologies Duesseldorf GmbH.

Alvise Rigo (7):
  bitmap: Add bitmap_one_extend operation
  exec: Add new exclusive bitmap to ram_list
  Add new TLB_EXCL flag
  softmmu: Add helpers for a new slow-path
  tcg-op: create new TCG qemu_ldlink and qemu_stcond instructions
  target-arm: translate: implement qemu_ldlink and qemu_stcond ops
  target-i386: translate: implement qemu_ldlink and qemu_stcond ops

 cputlb.c                |  21 +++++-
 exec.c                  |   7 +-
 include/exec/cpu-all.h  |   2 +
 include/exec/cpu-defs.h |   4 +
 include/exec/memory.h   |   3 +-
 include/exec/ram_addr.h |  19 +++++
 include/qemu/bitmap.h   |   9 +++
 softmmu_llsc_template.h | 155 +++++++++++++++++++++++++++++++++++++++
 softmmu_template.h      | 191 +++++++++++++++++++++++++++++++-----------------
 target-arm/translate.c  |  87 +++++++++++++++++++++-
 tcg/arm/tcg-target.c    | 121 +++++++++++++++++++++++-------
 tcg/i386/tcg-target.c   | 136 ++++++++++++++++++++++++++++------
 tcg/tcg-be-ldst.h       |   1 +
 tcg/tcg-op.c            |  23 ++++++
 tcg/tcg-op.h            |   3 +
 tcg/tcg-opc.h           |   4 +
 tcg/tcg.c               |   2 +
 tcg/tcg.h               |  20 +++++
 18 files changed, 684 insertions(+), 124 deletions(-)
 create mode 100644 softmmu_llsc_template.h

--
2.4.3