On Sat, Apr 25, 2026 at 6:44 AM Pierrick Bouvier <[email protected]> wrote:
>
> On 4/24/2026 8:15 AM, Jim Shu wrote:
> > On Fri, Apr 24, 2026 at 2:25 AM Pierrick Bouvier
> > <[email protected]> wrote:
> >>
> >> On 4/22/2026 11:16 PM, Jim Shu wrote:
> >>> On Thu, Apr 23, 2026 at 12:01 AM Pierrick Bouvier
> >>> <[email protected]> wrote:
> >>>
> >>> Hi Pierrick,
> >>>
> >>> Thanks for discussing the design!
> >>>
> >>>>
> >>>> Hi Jim,
> >>>>
> >>>> On 4/21/2026 9:29 AM, Jim Shu wrote:
> >>>>> Note: v1 title is "accel/tcg: Pass the access_type to IOMMUMemoryRegion"
> >>>>>
> >>>>> Incoming security protection devices feature more complex
> >>>>> IOMMUMemoryRegion implementation in the CPU path than ARM MPC device.
> >>>>> For example, RISC-V wgChecker [1] may permit the access with only
> >>>>> RO/WO permissions. Consequently, the IOMMUMemoryRegion could return
> >>>>> different sections for read & write access.
> >>>>>
> >>>>
> >>>> The consequence is not directly implied for me.
> >>>>
> >>>> Why having a single translation and page protection handle read/write
> >>>> access is not enough here?
> >>>>
> >>>> What is the (real world) use case to return different physical memory
> >>>> locations depending on read/write access for a given virtual address?
> >>>
> >>> When permission is granted, the wgChecker device returns the original
> >>> region of the address, called the downstream region
> >>> When permission is denied, the wgChecker device returns a special
> >>> MemoryRegion for error handling, called the blocked_io region).
> >>> This region may trigger an IRQ and a bus error. Additionally, memory
> >>> accesses to this region will result in 'read 0, write ignored' rather
> >>> than triggering exceptions from the MMU or MPU.
> >>>
> >>> Consequently, implementing RWX permission checks on security
> >>> protection devices may return different regions (either downstream or
> >>> blocked_io) depending on the access type.
> >>> This design is similar to the existing ARM MPC device
> >>> (hw/misc/tz-mpc.c). The primary difference is that the MPC only checks
> >>> the physical address rather than RWX permissions, meaning it returns
> >>> either the downstream or blocked_io region for a single address
> >>> regardless of the access type.
> >>>
> >>
> >> It is indeed, a big difference. Translation with MPC device still
> >> results in a single address space being chosen.
> >>
> >>> I also considered an alternative design where CPUTLBEntryFull stores
> >>> both the successful permission and the blocked_io region of the IOMMU
> >>> region.
> >>> In this scenario, the slow-path code utilizing CPUTLBEntryFull would
> >>> check permissions and return the blocked_io region if access is denied
> >>>
> >>> However, the address_space_translate_iommu() implementation supports
> >>> recursive IOMMU region translation.
> >>> if multiple IOMMU regions are encountered during a single address
> >>> translation, storing only the first region's permissions is
> >>> insufficient.
> >>> Ultimately, we still face a situation where RWX permissions might
> >>> return different regions separately (e.g., downstream,
> >>> blocked_io_iommu1, and blocked_io_iommu2).
> >>>
> >>
> >> The "blocked_io_iommu_X" is a consequence of the current choice of
> >> implementation, but I still don't see why it's an absolute necessity.
> >>
> >
> > I am open to removing it if the community agrees. I believe platforms
> > with the ARM MPC do not use multiple IOMMU regions when translating a
> > single address.
> >
>
> I am not expert on the topic, but it seems MPC is mostly used with
> microcontrollers with trustzone and is much more limited in scope than
> WorldChecker. WorldChecker seems to be a mix of MMU + IOMMU + something
> like Arm Granule Protection Table in a single external component.
>
> I am probably biased by Arm "World" implementation of this (out of MPC),
> where CPU and SMMU play this role, instead of an external component.
>
> Going back to MPC, the translation is still deterministic and not based
> on type of access. From what I understand, MPC can statically forbid
> accesses to specific blocks, simply based on original address.
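
To make the difference concrete, here is a minimal standalone C sketch of the two behaviours. It is not QEMU code, and every type and function name in it is hypothetical. It only contrasts an address-only filter, as described for the MPC, with a filter that also looks at the access type, as a wgChecker-style device would:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two regions a filter can return. */
typedef enum { REGION_DOWNSTREAM, REGION_BLOCKED_IO } Region;
typedef enum { ACCESS_READ, ACCESS_WRITE } Access;

#define PERM_R 0x1u
#define PERM_W 0x2u

/* MPC-like: the result depends only on which block the address is in. */
static Region mpc_like_translate(const bool *block_allowed, size_t nblocks,
                                 uint64_t addr, unsigned block_shift)
{
    size_t idx = (size_t)(addr >> block_shift);

    assert(idx < nblocks);
    return block_allowed[idx] ? REGION_DOWNSTREAM : REGION_BLOCKED_IO;
}

/* wgChecker-like: the same address may resolve differently per access. */
static Region wg_like_translate(const uint8_t *block_perm, size_t nblocks,
                                uint64_t addr, unsigned block_shift,
                                Access access)
{
    size_t idx = (size_t)(addr >> block_shift);
    uint8_t need = (access == ACCESS_READ) ? PERM_R : PERM_W;

    assert(idx < nblocks);
    return (block_perm[idx] & need) ? REGION_DOWNSTREAM : REGION_BLOCKED_IO;
}
```

With a read-only block, wg_like_translate() returns REGION_DOWNSTREAM for a read and REGION_BLOCKED_IO for a write to the same address. A single cached translation per address cannot represent that case.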

ARMv9-A GPC verifies permissions within the CPU MMU and IOMMU. It
performs checks when a memory transaction travels from the source (such
as a CPU or DMA) toward the system bus. In contrast, the RISC-V
wgChecker is positioned directly in front of peripheral devices. It
performs checks when transactions travel from the bus toward the target
devices. In this regard, the ARMv8-A TZASC [1] and ARMv8-M MPC are
architecturally similar to the RISC-V wgChecker.

We can monitor all memory transactions in the SoC by implementing
security filters at either every transaction source or every
transaction target. Both designs are reasonable and represent
different approaches to SoC security architecture.

ARMv8-A TZASC [1] is also a target-side filter and includes RW
permission checks. While the TZASC device is not supported in QEMU
now, an implementation of it could also leverage these generic code
changes. Target-side filtering with RW permission checks is not a
RISC-V-only design.

[1] https://developer.arm.com/documentation/ddi0504/c/introduction/about-the-tzc-400/tzc-400-example-system

>
> >> An alternative would be to treat all accesses as MMIO, but I guess your
> >> goal here is to optimize the code path where we access RAM?
> >>
> >
> > If the downstream region is a memory region, it will update the TLB
> > flags and vaddr in addr_idx for the successful permissions of IOMMU
> > region.
> > To explain further: In the alternative approach, we would obtain the
> > downstream region, IOMMU permissions, and the blocked_io region from
> > the IOMMU translation function (which would require an additional
> > return value or API to get the blocked_io region). The TLB entry will
> > handle the downstream region as a non-IOMMU region but will only
> > update the TLB flags for the IOMMU permissions; it will also store the
> > permissions and the blocked_io region. The slow-path code checks the
> > IOMMU permissions.
> > If permission is denied, it will perform the
> > transaction on the blocked_io region instead.
> > We can remove the lazy IOMMU translation if using an alternative approach.
> >
>
> What I was suggesting in my question was to force any access to a
> MemoryRegion handled by a wgChecker to go through a read or write
> callback, and to do the actual check there.
>
> For instance, MPC redirects blocked calls to:
> static const MemoryRegionOps tz_mpc_mem_blocked_ops = {
>      .read_with_attrs = tz_mpc_mem_blocked_read,
>      .write_with_attrs = tz_mpc_mem_blocked_write,
>
> So I was suggesting something like:
> static const MemoryRegionOps wgchecker_ops = {
>      .read_with_attrs = wgchecker_mem_read,
>      .write_with_attrs = wgchecker_mem_write,
>
> And redirect all read/write to those callbacks, thus turning the whole
> range into an MMIO range. Thus my original question to understand if
> your main concern is runtime performance or not.

Yes, runtime performance is also critical to us. We have positioned the
wgChecker in front of the DRAM [2] to partition the memory for each
world. If every DRAM access goes through the slow path, Linux becomes
too slow to boot.

[2] Add WG support to virt machine:
https://patchew.org/QEMU/[email protected]/[email protected]/

> >> Why can't you reuse existing page protection mechanism for this,
> >> authorizing read or write? WorldChecker just seems to be an additional
> >> check on top of existing translation. The fact it's an "additional"
> >> device is a design/implementation details, and could simply be part of
> >> page protection mechanism. It might require some additional plumbing
> >> when an interrupt is raised though, to redirect this correctly instead
> >> of signaling CPU like it would normally do.
> >>
> >> The fact is does different action based on read/write attribute does not
> >> really fit very well with existing implementation, as you noticed.
> >> And I wonder if it's worth changing the global design just to optimize this
> >> (optional) use case for RISC-V.
> >>
> >
> > In the SoC architecture, the wgCheckers are positioned in front of
> > devices (either memory or MMIO). When the CPU or DMA sends
> > transactions to the device, the device's wgChecker first performs a
> > permission check to determine if the transaction should be forwarded.
> > If we move this check to the architecture's tlb_fill() function, how
> > would the CPU code determine if transactions were sent to the device
> > (or to the upstream region of the wgChecker)? The CPU code would need
> > to be aware of the memory tree hierarchy to do that, which I believe
> > is more difficult to implement.
> >
>
> The memory hierarchy has to be known anyway by the component dealing
> with that. Currently, TLB is responsible for it, iterating through the
> different regions. However, it could be wgChecker code that does that,
> or tlb_fill() for RISC-V.
>
> My only concern with existing design is that it pushes all that on
> generic TLB code, for a feature that is optional and only used by
> RISC-V, but still impacting all architectures.
>

I don't think the current address_space/flatview APIs can determine
whether a PA will hit an IOMMUMemoryRegion and return that specific
region. To perform this check in tlb_fill() or in the wgChecker device,
I would need to add a new API that finds the correct wgChecker instance
so that its permission check can be performed. Thus, this approach
still requires modifying the generic code to add the new address_space
API.

I can try my best to minimize the changes to the generic TLB code, but
it is impossible to support wgChecker without any modification of
generic code. The original MPC patchset also added CPU-side IOMMU
region support to address_space_translate_for_iotlb() [3] and added the
TCGIOMMUNotifier.

My v1 patchset has fewer modifications to the generic TLB code, but it
relies on ping-pong TLB entries to handle RW permissions separately.
That approach is a little tricky and may not be easy to maintain. In
the v2 patchset, I tried to formalize how a TLB entry handles an IOMMU
region with RW permissions by using lazy IOMMU translation.

[3] https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg00666.html

> It's not a no, I don't have any authority on this part of codebase, but
> just a question to understand why it can't be solved another way that
> would be RISC-V specific.

As I mentioned above, ARMv8-A TZASC also has a similar design.

>
> > In the current QEMU architecture, when a security protection device
> > sits in front of a device, placing it within the memory tree seems to
> > be the most suitable approach, similar to the ARM MPC device. The
> > address_space*() API can check if a transaction will be forwarded to
> > the device and perform the permission check via the IOMMU region. Both
> > the CPU-side address_space_translate_for_iotlb() and the DMA-side
> > address_space_translate_iommu() handle this. The only missing
> > component is that the CPU TLB cannot handle RWX permission checks for
> > IOMMU regions within the memory tree.
> >
>
> I understand this and what you want.
> I'm really open to it if anyone else has feedback on this, and help
> decide if it's worth changing generic TLB code for this use case.
>
> >
> >
> >
> >>>>
> >>>> I'm not against the goal of this series, but trying to understand why it
> >>>> went in this direction, which complexify this already complex code path.
> >>>>
> >>>>> To support such IOMMUMemoryRegion behavior in the CPU path, the design
> >>>>> of IOMMU translation must be updated:
> >>>>>
> >>>>> 1. address_space_translate*() must now pass the access_type to
> >>>>>      IOMMUMemoryRegion.
> >>>>> 2. Since IOMMU translation results are too complex to be fully stored
> >>>>>      in the CPU TLB. we will defer the translation until the actual access
> >>>>>      occurs. Also, TLB is allowed to store the untranslated IOMMU
> >>>>>      region.
> >>>>>
> >>>>> To implement deferred IOMMU translation, this patchset introduces the
> >>>>> following changes:
> >>>>>
> >>>>> 1. tlb_set_page_full() no longer translates the IOMMU region
> >>>>>      immediately. Instead, it stores the untranslated region directly in
> >>>>>      the TLB. A new slow-path flag, TLB_IOMMU, is introduced to force
> >>>>>      access into the slow path when a region has not yet been translated
> >>>>>      in the TLB entry.
> >>>>>
> >>>>> 2. When the CPU utilizes a TLB entry in the slow path, it should perform
> >>>>>      the lazy IOMMU translation of the access_type first. The resulting
> >>>>>      translated region and access type are stored in CPUTLBEntryFull.
> >>>>>      Since the slow path always performs lazy translation first, we can
> >>>>>      switch the CPUTLBEntryFull content to the correct access type before
> >>>>>      use.
> >>>>>
> >>>>> 3. To accelerate memory access in the fast path, lazy translation can
> >>>>>      update the addend of the CPUTLBEntry when translating the region to a
> >>>>>      host memory region. We restrict the IOMMU region to have a single
> >>>>>      non-zero 'addend' across all permissions. If a second 'addend' is
> >>>>>      present for a CPUTLBEntry, QEMU will trigger an assertion. This
> >>>>>      limitation is sufficient for security devices, as their "secondary"
> >>>>>      region is typically an IO region used to emulate device error
> >>>>>      handling when access is rejected.
> >>>>>
> >>>>> 4. To support non-slow TLB flags, lazy translation can update the TLB
> >>>>>      flags in the 'addr_idx' of the CPUTLBEntry. Lazy translation only
> >>>>>      updates the flags for the permissions specified in @prot. This
> >>>>>      ensures that each access_type of a translated region to maintains
> >>>>>      independent TLB flags. For example, TLB_DIRTY of memory region will
> >>>>>      not be "polluted" from other permission that translated to different
> >>>>>      region.
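
For readers skimming the thread, the four steps quoted above can be modeled in a few lines of standalone C. This is only a simplified sketch and not the patch code: all names (ModelTlbEntry, model_lazy_translate(), and so on) are hypothetical, the per-access needs_slow_path[] array merely stands in for the TLB_IOMMU flag and the per-permission flags in 'addr_idx', and the example policy assumes a device that allows reads and blocks writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { MODEL_READ = 0, MODEL_WRITE = 1, MODEL_NUM_ACCESS } ModelAccess;

/* Result of one translation: host RAM (with an addend) or an IO region. */
typedef struct {
    bool is_ram;
    uintptr_t addend;
} ModelTarget;

/* Hypothetical stand-in for the IOMMU region's translate hook. */
typedef ModelTarget (*ModelTranslateFn)(ModelAccess access);

typedef struct {
    bool needs_slow_path[MODEL_NUM_ACCESS]; /* step 1: forced slow path */
    uintptr_t addend;                       /* step 3: single addend    */
    ModelTranslateFn translate;             /* untranslated region      */
    ModelTarget current;                    /* step 2: last translation */
    ModelAccess current_access;
} ModelTlbEntry;

/* Step 1: store the untranslated region, force every access to be slow. */
static void model_tlb_fill(ModelTlbEntry *e, ModelTranslateFn fn)
{
    e->translate = fn;
    e->addend = 0;
    e->current = (ModelTarget){ false, 0 };
    e->current_access = MODEL_READ;
    for (int i = 0; i < MODEL_NUM_ACCESS; i++) {
        e->needs_slow_path[i] = true;
    }
}

/* Steps 2-4: translate for this access type and update the entry. */
static ModelTarget model_lazy_translate(ModelTlbEntry *e, ModelAccess access)
{
    ModelTarget t = e->translate(access);

    e->current = t;
    e->current_access = access;
    if (t.is_ram) {
        /* Only one non-zero addend is allowed across all permissions. */
        assert(e->addend == 0 || e->addend == t.addend);
        e->addend = t.addend;
        /* Only the flag of the translated access type is relaxed. */
        e->needs_slow_path[access] = false;
    }
    return t;
}

/* Example policy (assumption): reads reach RAM, writes hit blocked IO. */
static ModelTarget model_read_only_policy(ModelAccess access)
{
    if (access == MODEL_READ) {
        return (ModelTarget){ true, 0x1000 };
    }
    return (ModelTarget){ false, 0 };
}
```

In this model, a read translated through model_read_only_policy() can use the fast path afterwards, while a write to the same entry keeps going through the slow path and lands on the IO region.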
> >>>>>
> >>>>> Both RISC-V wgChecker [1] and RISC-V IOPMP [2] devices require this
> >>>>> feature.
> >>>>>
> >>>>> [1] RISC-V WG:
> >>>>> https://patchew.org/QEMU/[email protected]/
> >>>>> [2] RISC-V IOPMP:
> >>>>> https://patchew.org/QEMU/[email protected]/
> >>>>>
> >>>>> Changed since v1:
> >>>>>     - Remove the ping-pong TLB entry behavior. Instead, defer the IOMMU
> >>>>>       translation until actual access in the CPU path. Provide a IOMMU
> >>>>>       lazy translation function with the special handling of 'addend'
> >>>>>       and 'addr_idx' fields of CPUTLBEntry.
> >>>>>     - Fix the checkpatch warning.
> >>>>>
> >>>>>
> >>>>> Jim Shu (5):
> >>>>>     accel/tcg: Pass access_type as an argument of tlb_set_page*()
> >>>>>     accel/tcg: address_space_translate*() will pass the correct iommu_flags
> >>>>>     accel/tcg: Provide early AS translate function
> >>>>>     accel/tcg: Add IOMMU lazy translation function
> >>>>>     accel/tcg: Support IOMMU lazy translation in CPU TLB
> >>>>>
> >>>>>    accel/tcg/cputlb.c                   | 247 +++++++++++++++++++++++++--
> >>>>>    include/accel/tcg/iommu.h            |  17 +-
> >>>>>    include/exec/cputlb.h                |  11 +-
> >>>>>    include/exec/tlb-flags.h             |   4 +-
> >>>>>    include/hw/core/cpu.h                |  15 ++
> >>>>>    system/physmem.c                     |  60 ++++++-
> >>>>>    target/alpha/helper.c                |   2 +-
> >>>>>    target/avr/helper.c                  |   3 +-
> >>>>>    target/hppa/mem_helper.c             |   1 -
> >>>>>    target/i386/tcg/system/excp_helper.c |   3 +-
> >>>>>    target/loongarch/tcg/tlb_helper.c    |   2 +-
> >>>>>    target/m68k/helper.c                 |  10 +-
> >>>>>    target/microblaze/helper.c           |   8 +-
> >>>>>    target/mips/tcg/system/tlb_helper.c  |   4 +-
> >>>>>    target/or1k/mmu.c                    |   2 +-
> >>>>>    target/ppc/mmu_helper.c              |   2 +-
> >>>>>    target/riscv/cpu_helper.c            |   2 +-
> >>>>>    target/rx/cpu.c                      |   3 +-
> >>>>>    target/s390x/tcg/excp_helper.c       |   2 +-
> >>>>>    target/sh4/helper.c                  |   3 +-
> >>>>>    target/sparc/mmu_helper.c            |   6 +-
> >>>>>    target/tricore/helper.c              |   2 +-
> >>>>>    target/xtensa/helper.c               |   3 +-
> >>>>>    23 files changed, 354 insertions(+), 58 deletions(-)
> >>>>>
> >>>>
> >>>
> >>> Regards,
> >>> Jim
> >>
>
> Regards,
> Pierrick

Regards,
Jim
