Richard Henderson <richard.hender...@linaro.org> writes:
> On 9/22/22 14:58, Alex Bennée wrote: >> We currently have hacks across the hw/ to reference current_cpu to >> work out what the current accessing CPU is. This breaks in some cases >> including using gdbstub to access HW state. As we have MemTxAttrs to >> describe details about the access lets extend it to mention if this is >> a CPU access and which one it is. >> There are a number of places we need to fix up including: >> CPU helpers directly calling address_space_*() fns >> models in hw/ fishing the data out of current_cpu >> I'll start addressing some of these in following patches. >> Signed-off-by: Alex Bennée <alex.ben...@linaro.org> >> --- >> v2 >> - use separate field cpu_index >> - bool for requester_is_cpu >> --- >> include/exec/memattrs.h | 4 ++++ >> accel/tcg/cputlb.c | 22 ++++++++++++++++------ >> hw/core/cpu-sysemu.c | 17 +++++++++++++---- >> 3 files changed, 33 insertions(+), 10 deletions(-) >> diff --git a/include/exec/memattrs.h b/include/exec/memattrs.h >> index 9fb98bc1ef..e83a993c21 100644 >> --- a/include/exec/memattrs.h >> +++ b/include/exec/memattrs.h >> @@ -43,6 +43,10 @@ typedef struct MemTxAttrs { >> * (see MEMTX_ACCESS_ERROR). >> */ >> unsigned int memory:1; >> + /* Requester is CPU (or as CPU, e.g. debug) */ >> + bool requester_is_cpu:1; >> + /* cpu_index (if requester_is_cpu) */ >> + unsigned int cpu_index:16; >> /* Requester ID (for MSI for example) */ >> unsigned int requester_id:16; > > I'm not keen on adding another field like this. Hmm I thought it was unavoidable from Edgar's comment: "CPU's can also have a Master-IDs (Requester IDs) which are unrelated to the Clusters CPU index. This is used for example in the Xilinx ZynqMP and Xilinx Versal and the XMPU (Memory Protection Units). Anyway, I think this approach is an improvement from the current state but would rather see a new separate field from requester_id. Overloading requester_id will break some of our use-cases (in the Xilinx tree)..." Of course we don't have to care about external use cases but it seemed to indicate we might need both. > I don't think it addresses Peter's point about unique identifiers on > e.g. the MSI bus. But addressing that is surely an problem for any > host/pci bridge that supports PCI. Because we're already talking about > two different busses -- PCI, and the one between the cpu and the > bridge. We can return to overloading requester_id with a enum to indicate the type of bus. > What bounds our max number of cpus at the moment? We use "int" in > struct CPUCore, but that's almost certainly for convenience. > > target/s390x/cpu.h:#define S390_MAX_CPUS 248 > hw/i386/pc_piix.c: m->max_cpus = HVM_MAX_VCPUS; > > hw/i386/pc_q35.c: m->max_cpus = 288; > > hw/arm/virt.c: mc->max_cpus = 512; > > hw/arm/sbsa-ref.c: mc->max_cpus = 512; > > hw/i386/microvm.c: mc->max_cpus = 288; > > hw/ppc/spapr.c: mc->max_cpus = INT32_MAX; > > > Most of these are nicely bounded, but HVM_MAX_VCPUS is a magic number > from Xen, and ppc appears to be prepared for 31 bits worth of cpus. >From 5642e4513e (spapr.c: do not use MachineClass::max_cpus to limit CPUs) I think it is being a little optimistic. Even with the beefiest hosts you start to see diminishing returns by ~12 vCPUs and it won't take long before each extra vCPU just slows you down. > > >> @@ -1340,8 +1340,13 @@ static uint64_t io_readx(CPUArchState *env, >> CPUIOTLBEntry *iotlbentry, >> uint64_t val; >> bool locked = false; >> MemTxResult r; >> + MemTxAttrs attrs = iotlbentry->attrs; >> - section = iotlb_to_section(cpu, iotlbentry->addr, >> iotlbentry->attrs); >> + /* encode the accessing CPU */ >> + attrs.requester_is_cpu = 1; >> + attrs.cpu_index = cpu->cpu_index; > > > As I said before, we cannot set these generically, or > MEMTXATTRS_UNSPECIFIED means nothing. Furthermore, they should be set > at the point we create the tlb entry, not while we're reading it. > Thus this must be done by each target's tlb_fill function. I was confused by the last comment because I forgot the TLBs are not shared between cores. So I can just bang: MemTxAttrs attrs = { .cpu_index = cs->cpu_index }; into arm_cpu_tlb_fill and be done with it? Or only when we know it is an IOTLB being filled? > >> @@ -51,13 +51,22 @@ hwaddr cpu_get_phys_page_attrs_debug(CPUState *cpu, >> vaddr addr, >> MemTxAttrs *attrs) >> { >> CPUClass *cc = CPU_GET_CLASS(cpu); >> + MemTxAttrs local = { }; >> + hwaddr res; >> if (cc->sysemu_ops->get_phys_page_attrs_debug) { >> - return cc->sysemu_ops->get_phys_page_attrs_debug(cpu, addr, attrs); >> + res = cc->sysemu_ops->get_phys_page_attrs_debug(cpu, addr, &local); >> + } else { >> + /* Fallback for CPUs which don't implement the _attrs_ hook */ >> + local = MEMTXATTRS_UNSPECIFIED; >> + res = cc->sysemu_ops->get_phys_page_debug(cpu, addr); >> } >> - /* Fallback for CPUs which don't implement the _attrs_ hook */ >> - *attrs = MEMTXATTRS_UNSPECIFIED; >> - return cc->sysemu_ops->get_phys_page_debug(cpu, addr); >> + >> + /* debug access is treated as though it came from the CPU */ >> + local.requester_is_cpu = 1; >> + local.cpu_index = cpu->cpu_index; >> + *attrs = local; >> + return res; > > Again, I don't think it makes any sense to have .unspecified set, and > anything else non-zero. > > > r~ -- Alex Bennée