[PATCH] target/riscv: Validate the mode in write_vstvec
Base on the riscv-privileged spec, vstvec substitutes for the usual stvec. Therefore, the encoding of the MODE should also be restricted to 0 and 1. Signed-off-by: Jiayi Li --- target/riscv/csr.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/target/riscv/csr.c b/target/riscv/csr.c index 432c59dc66..f9229d92ab 100644 --- a/target/riscv/csr.c +++ b/target/riscv/csr.c @@ -3791,7 +3791,12 @@ static RISCVException read_vstvec(CPURISCVState *env, int csrno, static RISCVException write_vstvec(CPURISCVState *env, int csrno, target_ulong val) { -env->vstvec = val; +/* bits [1:0] encode mode; 0 = direct, 1 = vectored, 2 >= reserved */ +if ((val & 3) < 2) { +env->vstvec = val; +} else { +qemu_log_mask(LOG_UNIMP, "CSR_VSTVEC: reserved mode not supported\n"); +} return RISCV_EXCP_NONE; } -- 2.25.1
Re: [PATCH v2 1/1] memory tier: consolidate the initialization of memory tiers
Hi, Jack, "Ho-Ren (Jack) Chuang" writes: I suggest you to merge the [0/1] with the change log here. [0/1] describes why do we need the patch. The below text describes some details. Just don't use "---" to separate them. We need both parts in the final commit message. > If we simply move the set_node_memory_tier() from memory_tier_init() > to late_initcall(), it will result in HMAT not registering > the mt_adistance_algorithm callback function, because > set_node_memory_tier() is not performed during the memory tiering > initialization phase, leading to a lack of correct default_dram > information. > > Therefore, we introduced a nodemask to pass the information of the > default DRAM nodes. The reason for not choosing to reuse > default_dram_type->nodes is that it is not clean enough. So in the end, > we use a __initdata variable, which is a variable that is released once > initialization is complete, including both CPU and memory nodes for HMAT > to iterate through. > > Besides, since default_dram_type may be checked/used during the > initialization process of HMAT and drivers, it is better to keep the > allocation of default_dram_type in memory_tier_init(). Why do we need it? IIRC, we have deleted its usage in hmat.c. > Signed-off-by: Ho-Ren (Jack) Chuang > Suggested-by: Jonathan Cameron > --- > drivers/acpi/numa/hmat.c | 5 +-- > include/linux/memory-tiers.h | 2 ++ > mm/memory-tiers.c| 59 +++- > 3 files changed, 28 insertions(+), 38 deletions(-) > > diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c > index 2c8ccc91ebe6..a2f9e7a4b479 100644 > --- a/drivers/acpi/numa/hmat.c > +++ b/drivers/acpi/numa/hmat.c > @@ -940,10 +940,7 @@ static int hmat_set_default_dram_perf(void) > struct memory_target *target; > struct access_coordinate *attrs; > > - if (!default_dram_type) > - return -EIO; > - > - for_each_node_mask(nid, default_dram_type->nodes) { > + for_each_node_mask(nid, default_dram_nodes) { > pxm = node_to_pxm(nid); > target = find_mem_target(pxm); > if (!target) > diff --git a/include/linux/memory-tiers.h b/include/linux/memory-tiers.h > index 0d70788558f4..fa61ad9c4d75 100644 > --- a/include/linux/memory-tiers.h > +++ b/include/linux/memory-tiers.h > @@ -38,6 +38,7 @@ struct access_coordinate; > #ifdef CONFIG_NUMA > extern bool numa_demotion_enabled; > extern struct memory_dev_type *default_dram_type; Can we remove the above line? > +extern nodemask_t default_dram_nodes __initdata; We don't need to use __initdata in variable declaration. > struct memory_dev_type *alloc_memory_type(int adistance); > void put_memory_type(struct memory_dev_type *memtype); > void init_node_memory_type(int node, struct memory_dev_type *default_type); > @@ -76,6 +77,7 @@ static inline bool node_is_toptier(int node) > > #define numa_demotion_enabledfalse > #define default_dram_typeNULL > +#define default_dram_nodes NODE_MASK_NONE Should we use after "default_dram_nodes"? > /* > * CONFIG_NUMA implementation returns non NULL error. > */ > diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c > index 6632102bd5c9..a19a90c3ad36 100644 > --- a/mm/memory-tiers.c > +++ b/mm/memory-tiers.c > @@ -43,6 +43,7 @@ static LIST_HEAD(memory_tiers); > static LIST_HEAD(default_memory_types); > static struct node_memory_type_map node_memory_types[MAX_NUMNODES]; > struct memory_dev_type *default_dram_type; > +nodemask_t default_dram_nodes __initdata = NODE_MASK_NONE; > > static const struct bus_type memory_tier_subsys = { > .name = "memory_tiering", > @@ -671,28 +672,38 @@ EXPORT_SYMBOL_GPL(mt_put_memory_types); > > /* > * This is invoked via `late_initcall()` to initialize memory tiers for > - * CPU-less memory nodes after driver initialization, which is > - * expected to provide `adistance` algorithms. > + * memory nodes, both with and without CPUs. After the initialization of > + * firmware and devices, adistance algorithms are expected to be provided. > */ > static int __init memory_tier_late_init(void) > { > int nid; > + struct memory_tier *memtier; > > + get_online_mems(); > guard(mutex)(_tier_lock); > + /* > + * Look at all the existing and uninitialized N_MEMORY nodes and > + * add them to default memory tier or to a tier if we already have > + * memory types assigned. > + */ If the memory type of the node has been assigned, we will skip it in the following code. So, I think that we need to revise the comments. > for_each_node_state(nid, N_MEMORY) { > /* > - * Some device drivers may have initialized memory tiers > - * between `memory_tier_init()` and `memory_tier_late_init()`, > - * potentially bringing online memory nodes and > - * configuring memory tiers. Exclude them here. > + * Some device
Re: [PATCH 2/2] target/i386: drop AMD machine check bits from Intel CPUID
On Fri, Jun 28, 2024 at 03:23:11PM +0200, Paolo Bonzini wrote: > Date: Fri, 28 Jun 2024 15:23:11 +0200 > From: Paolo Bonzini > Subject: Re: [PATCH 2/2] target/i386: drop AMD machine check bits from > Intel CPUID > > Il ven 28 giu 2024, 10:32 Xiaoyao Li ha scritto: > > > On 6/27/2024 10:06 PM, Paolo Bonzini wrote: > > > The recent addition of the SUCCOR bit to kvm_arch_get_supported_cpuid() > > > causes the bit to be visible when "-cpu host" VMs are started on Intel > > > processors. > > > > > > While this should in principle be harmless, it's not tidy and we don't > > > even know for sure that it doesn't cause any guest OS to take unexpected > > > paths. Since x86_cpu_get_supported_feature_word() can return different > > > different values depending on the guest, adjust it to hide the SUCCOR > > > > superfluous different > > > > > bit if the guest has non-AMD vendor. > > > > It seems to adjust it based on vendor in kvm_arch_get_supported_cpuid() > > is better than in x86_cpu_get_supported_feature_word(). Otherwise > > kvm_arch_get_supported_cpuid() still returns "risky" value for Intel VMs. > > > > But the cpuid bit is only invalid for Intel *guest* vendor, not host. It is > not a problem to have it if you run on Intel host but have a guest model > with AMD vendor. > > I will check if there are other callers of kvm_arch_get_supported_cpuid(), > or callers of x86_cpu_get_supported_feature_word() with NULL cpu, that > might care about the difference. Another example is CPUID_EXT3_TOPOEXT, though it's a no_autoenable_flags, it can be set by "-cpu host,+topoext" on Intel platforms. For this case, we have recognized that that the host/max CPU should only contain vender specific features, and I think it would be hard to expand such a rule afterwards, especially since there's other x86 vender like zhaoxin who implement a subset of Intel/AMD: https://lore.kernel.org/qemu-devel/d4c0dae5-b9d5-4deb-b300-78492ab11...@zhaoxin.com/#t What about a new flag "host_bare_metal_check" in FeatureWordInfo? Then if a feature is marked as "host_bare_metal_check", in addition to the current checks in x86_cpu_get_supported_feature_word(), bare-metal CPUID check is also needed (by host_cpuid()) for "host" CPU. -Zhao
[PATCH 6/6] target/riscv: Enable RV32 CPU support in RV64 QEMU
From: TANG Tiancheng Add gdb XML files and adjust CPU initialization to allow running RV32 CPUs in RV64 QEMU. Signed-off-by: TANG Tiancheng Reviewed-by: Liu Zhiwei --- configs/targets/riscv64-softmmu.mak | 2 +- target/riscv/cpu.c | 17 + 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/configs/targets/riscv64-softmmu.mak b/configs/targets/riscv64-softmmu.mak index f688ffa7bc..5c1abb4b51 100644 --- a/configs/targets/riscv64-softmmu.mak +++ b/configs/targets/riscv64-softmmu.mak @@ -1,6 +1,6 @@ TARGET_ARCH=riscv64 TARGET_BASE_ARCH=riscv TARGET_SUPPORTS_MTTCG=y -TARGET_XML_FILES= gdb-xml/riscv-64bit-cpu.xml gdb-xml/riscv-32bit-fpu.xml gdb-xml/riscv-64bit-fpu.xml gdb-xml/riscv-64bit-virtual.xml +TARGET_XML_FILES= gdb-xml/riscv-64bit-cpu.xml gdb-xml/riscv-32bit-fpu.xml gdb-xml/riscv-64bit-fpu.xml gdb-xml/riscv-64bit-virtual.xml gdb-xml/riscv-32bit-cpu.xml gdb-xml/riscv-32bit-virtual.xml # needed by boot.c TARGET_NEED_FDT=y diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 69a08e8c2c..58165901a2 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -630,8 +630,10 @@ static void rv64e_bare_cpu_init(Object *obj) riscv_cpu_set_misa_ext(env, RVE); } -#else /* !TARGET_RISCV64 */ +#endif /* !TARGET_RISCV64 */ +#if defined(TARGET_RISCV32) || \ +(defined(TARGET_RISCV64) && !defined(CONFIG_USER_ONLY)) static void rv32_base_cpu_init(Object *obj) { RISCVCPU *cpu = RISCV_CPU(obj); @@ -2544,6 +2546,13 @@ static const TypeInfo riscv_cpu_type_infos[] = { #if defined(TARGET_RISCV32) DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_ANY, MXL_RV32, riscv_any_cpu_init), DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_MAX, MXL_RV32, riscv_max_cpu_init), +#elif defined(TARGET_RISCV64) +DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_ANY, MXL_RV64, riscv_any_cpu_init), +DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_MAX, MXL_RV64, riscv_max_cpu_init), +#endif + +#if defined(TARGET_RISCV32) || \ +(defined(TARGET_RISCV64) && !defined(CONFIG_USER_ONLY)) DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_BASE32,MXL_RV32, rv32_base_cpu_init), DEFINE_VENDOR_CPU(TYPE_RISCV_CPU_IBEX, MXL_RV32, rv32_ibex_cpu_init), DEFINE_VENDOR_CPU(TYPE_RISCV_CPU_SIFIVE_E31, MXL_RV32, rv32_sifive_e_cpu_init), @@ -2551,9 +2560,9 @@ static const TypeInfo riscv_cpu_type_infos[] = { DEFINE_VENDOR_CPU(TYPE_RISCV_CPU_SIFIVE_U34, MXL_RV32, rv32_sifive_u_cpu_init), DEFINE_BARE_CPU(TYPE_RISCV_CPU_RV32I,MXL_RV32, rv32i_bare_cpu_init), DEFINE_BARE_CPU(TYPE_RISCV_CPU_RV32E,MXL_RV32, rv32e_bare_cpu_init), -#elif defined(TARGET_RISCV64) -DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_ANY, MXL_RV64, riscv_any_cpu_init), -DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_MAX, MXL_RV64, riscv_max_cpu_init), +#endif + +#if defined(TARGET_RISCV64) DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_BASE64,MXL_RV64, rv64_base_cpu_init), DEFINE_VENDOR_CPU(TYPE_RISCV_CPU_SIFIVE_E51, MXL_RV64, rv64_sifive_e_cpu_init), DEFINE_VENDOR_CPU(TYPE_RISCV_CPU_SIFIVE_U54, MXL_RV64, rv64_sifive_u_cpu_init), -- 2.43.0
[PATCH 5/6] target/riscv: Correct mcause/scause bit width for RV32 in RV64 QEMU
From: TANG Tiancheng Ensure mcause high bit is correctly set by using 32-bit width for RV32 mode and 64-bit width for RV64 mode. Signed-off-by: TANG Tiancheng Reviewed-by: Liu Zhiwei --- target/riscv/cpu_helper.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c index 1af83a0a36..88a187c10a 100644 --- a/target/riscv/cpu_helper.c +++ b/target/riscv/cpu_helper.c @@ -1673,6 +1673,8 @@ void riscv_cpu_do_interrupt(CPUState *cs) target_ulong tinst = 0; target_ulong htval = 0; target_ulong mtval2 = 0; +int sxlen = 0; +int mxlen = 0; if (!async) { /* set tval to badaddr for traps with address information */ @@ -1799,7 +1801,8 @@ void riscv_cpu_do_interrupt(CPUState *cs) s = set_field(s, MSTATUS_SPP, env->priv); s = set_field(s, MSTATUS_SIE, 0); env->mstatus = s; -env->scause = cause | ((target_ulong)async << (TARGET_LONG_BITS - 1)); +sxlen = 16UL << riscv_cpu_sxl(env); +env->scause = cause | ((target_ulong)async << (sxlen - 1)); env->sepc = env->pc; env->stval = tval; env->htval = htval; @@ -1830,7 +1833,8 @@ void riscv_cpu_do_interrupt(CPUState *cs) s = set_field(s, MSTATUS_MPP, env->priv); s = set_field(s, MSTATUS_MIE, 0); env->mstatus = s; -env->mcause = cause | ~(((target_ulong)-1) >> async); +mxlen = 16UL << riscv_cpu_mxl(env); +env->mcause = cause | ((target_ulong)async << (mxlen - 1)); env->mepc = env->pc; env->mtval = tval; env->mtval2 = mtval2; -- 2.43.0
[PATCH 4/6] target/riscv: Detect sxl to set bit width for RV32 in RV64
From: TANG Tiancheng Ensure correct bit width based on sxl when running RV32 on RV64 QEMU. This is required as MMU address translations run in S-mode. Signed-off-by: TANG Tiancheng Reviewed-by: Liu Zhiwei --- target/riscv/cpu_helper.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c index 6709622dd3..1af83a0a36 100644 --- a/target/riscv/cpu_helper.c +++ b/target/riscv/cpu_helper.c @@ -887,12 +887,14 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical, CPUState *cs = env_cpu(env); int va_bits = PGSHIFT + levels * ptidxbits + widened; +int sxlen = 16UL << riscv_cpu_sxl(env); +int sxlen_bytes = sxlen / 8; if (first_stage == true) { target_ulong mask, masked_msbs; -if (TARGET_LONG_BITS > (va_bits - 1)) { -mask = (1L << (TARGET_LONG_BITS - (va_bits - 1))) - 1; +if (sxlen > (va_bits - 1)) { +mask = (1L << (sxlen - (va_bits - 1))) - 1; } else { mask = 0; } @@ -961,7 +963,7 @@ restart: int pmp_prot; int pmp_ret = get_physical_address_pmp(env, _prot, pte_addr, - sizeof(target_ulong), + sxlen_bytes, MMU_DATA_LOAD, PRV_S); if (pmp_ret != TRANSLATE_SUCCESS) { return TRANSLATE_PMP_FAIL; @@ -1113,7 +1115,7 @@ restart: * it is no longer valid and we must re-walk the page table. */ MemoryRegion *mr; -hwaddr l = sizeof(target_ulong), addr1; +hwaddr l = sxlen_bytes, addr1; mr = address_space_translate(cs->as, pte_addr, , , false, MEMTXATTRS_UNSPECIFIED); if (memory_region_is_ram(mr)) { @@ -1126,6 +1128,11 @@ restart: *pte_pa = pte = updated_pte; #else target_ulong old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte); +if (riscv_cpu_sxl(env) == MXL_RV32) { +old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte); +} else { +old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte); +} if (old_pte != pte) { goto restart; } -- 2.43.0
[PATCH 3/6] target/riscv: Correct SXL return value for RV32 in RV64 QEMU
From: TANG Tiancheng Ensure that riscv_cpu_sxl returns MXL_RV32 when runningRV32 in an RV64 QEMU. Signed-off-by: TANG Tiancheng Fixes: 05e6ca5e156 ("target/riscv: Ignore reserved bits in PTE for RV64") Reviewed-by: Liu Zhiwei --- target/riscv/cpu.h | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index 6fe0d712b4..36a712044a 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -668,8 +668,11 @@ static inline RISCVMXL riscv_cpu_sxl(CPURISCVState *env) #ifdef CONFIG_USER_ONLY return env->misa_mxl; #else -return get_field(env->mstatus, MSTATUS64_SXL); +if (env->misa_mxl != MXL_RV32) { +return get_field(env->mstatus, MSTATUS64_SXL); +} #endif +return MXL_RV32; } #endif -- 2.43.0
[PATCH 2/6] target/riscv: Adjust PMP size for no-MMU RV64 QEMU running RV32
From: TANG Tiancheng Ensure pmp_size is correctly determined using mxl for RV32 in RV64 QEMU. Signed-off-by: TANG Tiancheng Reviewed-by: Liu Zhiwei --- target/riscv/pmp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c index 9eea397e72..f65aa3dba7 100644 --- a/target/riscv/pmp.c +++ b/target/riscv/pmp.c @@ -326,7 +326,7 @@ bool pmp_hart_has_privs(CPURISCVState *env, hwaddr addr, */ pmp_size = -(addr | TARGET_PAGE_MASK); } else { -pmp_size = sizeof(target_ulong); +pmp_size = 2UL << riscv_cpu_mxl(env); } } else { pmp_size = size; -- 2.43.0
[PATCH 1/6] target/riscv: Add fw_dynamic_info32 for booting RV32 OpenSBI
From: TANG Tiancheng RV32 OpenSBI need a fw_dynamic_info parameter with 32-bit fields instead of target_ulong. In RV64 QEMU, target_ulong is 64. So it is not right for booting RV32 OpenSBI. We create a fw_dynmaic_info32 struct for this purpose. Signed-off-by: TANG Tiancheng Reviewed-by: Liu Zhiwei --- hw/riscv/boot.c | 35 ++--- hw/riscv/sifive_u.c | 3 ++- include/hw/riscv/boot.h | 4 +++- include/hw/riscv/boot_opensbi.h | 29 +++ 4 files changed, 57 insertions(+), 14 deletions(-) diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c index 47281ca853..1a2c1ff9e0 100644 --- a/hw/riscv/boot.c +++ b/hw/riscv/boot.c @@ -342,27 +342,33 @@ void riscv_load_fdt(hwaddr fdt_addr, void *fdt) rom_ptr_for_as(_space_memory, fdt_addr, fdtsize)); } -void riscv_rom_copy_firmware_info(MachineState *machine, hwaddr rom_base, - hwaddr rom_size, uint32_t reset_vec_size, +void riscv_rom_copy_firmware_info(MachineState *machine, + RISCVHartArrayState *harts, + hwaddr rom_base, hwaddr rom_size, + uint32_t reset_vec_size, uint64_t kernel_entry) { +struct fw_dynamic_info32 dinfo32; struct fw_dynamic_info dinfo; size_t dinfo_len; -if (sizeof(dinfo.magic) == 4) { -dinfo.magic = cpu_to_le32(FW_DYNAMIC_INFO_MAGIC_VALUE); -dinfo.version = cpu_to_le32(FW_DYNAMIC_INFO_VERSION); -dinfo.next_mode = cpu_to_le32(FW_DYNAMIC_INFO_NEXT_MODE_S); -dinfo.next_addr = cpu_to_le32(kernel_entry); +if (riscv_is_32bit(harts)) { +dinfo32.magic = cpu_to_le32(FW_DYNAMIC_INFO_MAGIC_VALUE); +dinfo32.version = cpu_to_le32(FW_DYNAMIC_INFO_VERSION); +dinfo32.next_mode = cpu_to_le32(FW_DYNAMIC_INFO_NEXT_MODE_S); +dinfo32.next_addr = cpu_to_le32(kernel_entry); +dinfo32.options = 0; +dinfo32.boot_hart = 0; +dinfo_len = sizeof(dinfo32); } else { dinfo.magic = cpu_to_le64(FW_DYNAMIC_INFO_MAGIC_VALUE); dinfo.version = cpu_to_le64(FW_DYNAMIC_INFO_VERSION); dinfo.next_mode = cpu_to_le64(FW_DYNAMIC_INFO_NEXT_MODE_S); dinfo.next_addr = cpu_to_le64(kernel_entry); +dinfo.options = 0; +dinfo.boot_hart = 0; +dinfo_len = sizeof(dinfo); } -dinfo.options = 0; -dinfo.boot_hart = 0; -dinfo_len = sizeof(dinfo); /** * copy the dynamic firmware info. This information is specific to @@ -374,7 +380,10 @@ void riscv_rom_copy_firmware_info(MachineState *machine, hwaddr rom_base, exit(1); } -rom_add_blob_fixed_as("mrom.finfo", , dinfo_len, +rom_add_blob_fixed_as("mrom.finfo", + riscv_is_32bit(harts) ? + (void *) : (void *), + dinfo_len, rom_base + reset_vec_size, _space_memory); } @@ -430,7 +439,9 @@ void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState *harts } rom_add_blob_fixed_as("mrom.reset", reset_vec, sizeof(reset_vec), rom_base, _space_memory); -riscv_rom_copy_firmware_info(machine, rom_base, rom_size, sizeof(reset_vec), +riscv_rom_copy_firmware_info(machine, harts, + rom_base, rom_size, + sizeof(reset_vec), kernel_entry); } diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c index af5f923f54..5010c3eadb 100644 --- a/hw/riscv/sifive_u.c +++ b/hw/riscv/sifive_u.c @@ -646,7 +646,8 @@ static void sifive_u_machine_init(MachineState *machine) rom_add_blob_fixed_as("mrom.reset", reset_vec, sizeof(reset_vec), memmap[SIFIVE_U_DEV_MROM].base, _space_memory); -riscv_rom_copy_firmware_info(machine, memmap[SIFIVE_U_DEV_MROM].base, +riscv_rom_copy_firmware_info(machine, >soc.u_cpus, + memmap[SIFIVE_U_DEV_MROM].base, memmap[SIFIVE_U_DEV_MROM].size, sizeof(reset_vec), kernel_entry); diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h index a2e4ae9cb0..806256d23f 100644 --- a/include/hw/riscv/boot.h +++ b/include/hw/riscv/boot.h @@ -56,7 +56,9 @@ void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState *harts hwaddr rom_base, hwaddr rom_size, uint64_t kernel_entry, uint64_t fdt_load_addr); -void riscv_rom_copy_firmware_info(MachineState *machine, hwaddr rom_base, +void riscv_rom_copy_firmware_info(MachineState *machine, + RISCVHartArrayState *harts, +
[PATCH 0/6] target/riscv: Expose RV32 cpu to RV64 QEMU
From: TANG Tiancheng This patch set aims to expose 32-bit RISC-V cpu to RV64 QEMU. Thus qemu-system-riscv64 can directly boot a RV32 Linux. This patch set has been tested with 6.9.0 Linux Image. - Run RV64 QEMU with RV32 CPU qemu-system-riscv64 -cpu rv32 -M virt -nographic \ -kernel Image \ -append "root=/dev/vda ro console=ttyS0" \ -drive file=rootfs.ext2,format=raw,id=hd0 \ -device virtio-blk-device,drive=hd0 -netdev user,id=net0 \ -device virtio-net-device,netdev=net0 OpenSBI v1.4 QEMU emulator version 9.0.50 (v9.0.0-1132-g7799dc2e3b) [0.00] Linux version 6.9.0 (developer@11109ca35736) (riscv32-unknown-linux-gnu-gcc (gc891d8dc23e-dirty) 13.2.0, GNU ld (GNU Binutils) 2.42) #3 SMP Fri May 31 08:42:15 UTC 2024 [0.00] random: crng init done [0.00] OF: fdt: Ignoring memory range 0x8000 - 0x8040 [0.00] Machine model: riscv-virtio,qemu [0.00] SBI specification v2.0 detected [0.00] SBI implementation ID=0x1 Version=0x10004 [0.00] SBI TIME extension detected [0.00] SBI IPI extension detected [0.00] SBI RFENCE extension detected [0.00] SBI SRST extension detected [0.00] SBI DBCN extension detected [0.00] efi: UEFI not found. [0.00] OF: reserved mem: 0x8000..0x8003 (256 KiB) nomap non-reusable mmode_resv1@8000 [0.00] OF: reserved mem: 0x8004..0x8004 (64 KiB) nomap non-reusable mmode_resv0@8004 [0.00] Zone ranges: [0.00] Normal [mem 0x8040-0x87ff] [0.00] Movable zone start for each node [0.00] Early memory node ranges [0.00] node 0: [mem 0x8040-0x87ff] [0.00] Initmem setup node 0 [mem 0x8040-0x87ff] [0.00] On node 0, zone Normal: 1024 pages in unavailable ranges [0.00] SBI HSM extension detected [0.00] riscv: base ISA extensions acdfhim [0.00] riscv: ELF capabilities acdfim [0.00] percpu: Embedded 17 pages/cpu s37728 r8192 d23712 u69632 [0.00] Kernel command line: root=/dev/vda ro console=ttyS0 [0.00] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear) [0.00] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear) [0.00] Built 1 zonelists, mobility grouping on. Total pages: 31465 [0.00] mem auto-init: stack:all(zero), heap alloc:off, heap free:off [0.00] Virtual kernel memory layout: [0.00] fixmap : 0x9c80 - 0x9d00 (8192 kB) [0.00] pci io : 0x9d00 - 0x9e00 ( 16 MB) [0.00] vmemmap : 0x9e00 - 0xa000 ( 32 MB) [0.00] vmalloc : 0xa000 - 0xc000 ( 512 MB) [0.00] lowmem : 0xc000 - 0xc7c0 ( 124 MB) [0.00] Memory: 95700K/126976K available (9090K kernel code, 8845K rwdata, 4096K rodata, 4231K init, 341K bss, 31276K reserved, 0K cma-reserved) ... Welcome to Buildroot buildroot login: root # cat /proc/cpuinfo processor : 0 hart: 0 isa : rv32imafdch_zicbom_zicboz_zicntr_zicsr_zifencei_zihintntl_zihintpause_zihpm_zfa_zba_zbb_zbc_zbs_sstc mmu : sv32 TANG Tiancheng (6): target/riscv: Add fw_dynamic_info32 for booting RV32 OpenSBI target/riscv: Adjust PMP size for no-MMU RV64 QEMU running RV32 target/riscv: Correct SXL return value for RV32 in RV64 QEMU target/riscv: Detect sxl to set bit width for RV32 in RV64 target/riscv: Correct mcause/scause bit width for RV32 in RV64 QEMU target/riscv: Enable RV32 CPU support in RV64 QEMU configs/targets/riscv64-softmmu.mak | 2 +- hw/riscv/boot.c | 35 +++-- hw/riscv/sifive_u.c | 3 ++- include/hw/riscv/boot.h | 4 +++- include/hw/riscv/boot_opensbi.h | 29 target/riscv/cpu.c | 17 ++ target/riscv/cpu.h | 5 - target/riscv/cpu_helper.c | 23 ++- target/riscv/pmp.c | 2 +- 9 files changed, 93 insertions(+), 27 deletions(-) -- 2.43.0
RE: [PATCH] hw/misc/bcm2835_thermal: Handle invalid address accesses gracefully
Hi, zheyu > -Original Message- > From: qemu-devel-bounces+yaoxt.fnst=fujitsu@nongnu.org > On Behalf Of Zheyu > Ma > Sent: Sunday, June 30, 2024 11:14 PM > To: Peter Maydell ; Philippe Mathieu-Daudé > > Cc: Zheyu Ma ; qemu-...@nongnu.org; > qemu-devel@nongnu.org > Subject: [PATCH] hw/misc/bcm2835_thermal: Handle invalid address accesses > gracefully > > This commit handles invalid address accesses gracefully in both read and write > functions. Instead of asserting and aborting, it logs an error message and > returns > a neutral value for read operations and does nothing for write operations. > > Error log: > ERROR:hw/misc/bcm2835_thermal.c:55:bcm2835_thermal_read: code should not > be reached > Bail out! ERROR:hw/misc/bcm2835_thermal.c:55:bcm2835_thermal_read: code > should not be reached > Aborted > > Reproducer: > cat << EOF | qemu-system-aarch64 -display \ > none -machine accel=qtest, -m 512M -machine raspi3b -m 1G -qtest stdio > readw 0x3f212003 > EOF > > Signed-off-by: Zheyu Ma > --- > hw/misc/bcm2835_thermal.c | 12 > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/hw/misc/bcm2835_thermal.c b/hw/misc/bcm2835_thermal.c > index ee7816b8a5..5c2a429d58 100644 > --- a/hw/misc/bcm2835_thermal.c > +++ b/hw/misc/bcm2835_thermal.c > @@ -51,8 +51,10 @@ static uint64_t bcm2835_thermal_read(void *opaque, > hwaddr addr, unsigned size) > val = FIELD_DP32(bcm2835_thermal_temp2adc(25), STAT, VALID, true); > break; > default: > -/* MemoryRegionOps are aligned, so this can not happen. */ > -g_assert_not_reached(); > +qemu_log_mask(LOG_GUEST_ERROR, > + "bcm2835_thermal_read: invalid address 0x%" > + HWADDR_PRIx "\n", addr); > +val = 0; > } > return val; > } > @@ -72,8 +74,10 @@ static void bcm2835_thermal_write(void *opaque, hwaddr > addr, > __func__, value, addr); > break; > default: > -/* MemoryRegionOps are aligned, so this can not happen. */ > -g_assert_not_reached(); > +qemu_log_mask(LOG_GUEST_ERROR, > + "bcm2835_thermal_write: invalid address 0x%" > + HWADDR_PRIx "\n", addr); > +break; > } > } the default branch will never be reached in normal access, so I think this modification is not needed. > > -- > 2.34.1 >
RE: [PATCH] hw/display/tcx: Fix out-of-bounds access in tcx_blit_writel
Hi, zheyu > -Original Message- > From: qemu-devel-bounces+yaoxt.fnst=fujitsu@nongnu.org > On Behalf Of Zheyu > Ma > Sent: Sunday, June 30, 2024 9:04 PM > To: Mark Cave-Ayland > Cc: Zheyu Ma ; qemu-devel@nongnu.org > Subject: [PATCH] hw/display/tcx: Fix out-of-bounds access in tcx_blit_writel > > This patch addresses a potential out-of-bounds memory access issue in the > tcx_blit_writel function. It adds bounds checking to ensure that memory > accesses do not exceed the allocated VRAM size. If an out-of-bounds access > is detected, an error is logged using qemu_log_mask. > > ASAN log: > ==2960379==ERROR: AddressSanitizer: SEGV on unknown address > 0x7f524752fd01 (pc 0x7f525c2c4881 bp 0x7ffdaf87bfd0 sp 0x7ffdaf87b788 T0) > ==2960379==The signal is caused by a READ memory access. > #0 0x7f525c2c4881 in memcpy > string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:222 > #1 0x55aa782bd5b1 in __asan_memcpy > llvm/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3 > #2 0x55aa7854dedd in tcx_blit_writel hw/display/tcx.c:590:13 > > Reproducer: > cat << EOF | qemu-system-sparc -display none \ > -machine accel=qtest, -m 512M -machine LX -m 256 -qtest stdio > writel 0x562e98c4 0x3d92fd01 > EOF > > Signed-off-by: Zheyu Ma > --- > hw/display/tcx.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/hw/display/tcx.c b/hw/display/tcx.c > index 99507e7638..af43bea7f2 100644 > --- a/hw/display/tcx.c > +++ b/hw/display/tcx.c > @@ -33,6 +33,7 @@ > #include "migration/vmstate.h" > #include "qemu/error-report.h" > #include "qemu/module.h" > +#include "qemu/log.h" > #include "qom/object.h" > > #define TCX_ROM_FILE "QEMU,tcx.bin" > @@ -577,6 +578,14 @@ static void tcx_blit_writel(void *opaque, hwaddr addr, > addr = (addr >> 3) & 0xf; > adsr = val & 0xff; > len = ((val >> 24) & 0x1f) + 1; > + > +if (addr + len > s->vram_size || adsr + len > s->vram_size) { if adsr == 0xff, this condition may be always true, thus the branch “if (adsr == 0xff) {” will be never reached. > +qemu_log_mask(LOG_GUEST_ERROR, > + "%s: VRAM access out of bounds. addr: 0x%lx, adsr: > 0x%x, len: %u\n", > + __func__, addr, adsr, len); > +return; > +} > + > if (adsr == 0xff) { > memset(>vram[addr], s->tmpblit, len); > if (s->depth == 24) { > -- > 2.34.1 >
[PATCH 2/5] target/i386: Convert cc_op_live to a function
Assert that op is known. Signed-off-by: Richard Henderson --- target/i386/tcg/translate.c | 56 +++- target/i386/tcg/decode-new.c.inc | 2 +- target/i386/tcg/emit.c.inc | 6 ++-- 3 files changed, 39 insertions(+), 25 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 95bad55bf4..e5a8aaf793 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -290,26 +290,38 @@ enum { }; /* Bit set if the global variable is live after setting CC_OP to X. */ -static const uint8_t cc_op_live[CC_OP_NB] = { -[CC_OP_DYNAMIC] = USES_CC_DST | USES_CC_SRC | USES_CC_SRC2, -[CC_OP_EFLAGS] = USES_CC_SRC, -[CC_OP_MULB ... CC_OP_MULQ] = USES_CC_DST | USES_CC_SRC, -[CC_OP_ADDB ... CC_OP_ADDQ] = USES_CC_DST | USES_CC_SRC, -[CC_OP_ADCB ... CC_OP_ADCQ] = USES_CC_DST | USES_CC_SRC | USES_CC_SRC2, -[CC_OP_SUBB ... CC_OP_SUBQ] = USES_CC_DST | USES_CC_SRC | USES_CC_SRCT, -[CC_OP_SBBB ... CC_OP_SBBQ] = USES_CC_DST | USES_CC_SRC | USES_CC_SRC2, -[CC_OP_LOGICB ... CC_OP_LOGICQ] = USES_CC_DST, -[CC_OP_INCB ... CC_OP_INCQ] = USES_CC_DST | USES_CC_SRC, -[CC_OP_DECB ... CC_OP_DECQ] = USES_CC_DST | USES_CC_SRC, -[CC_OP_SHLB ... CC_OP_SHLQ] = USES_CC_DST | USES_CC_SRC, -[CC_OP_SARB ... CC_OP_SARQ] = USES_CC_DST | USES_CC_SRC, -[CC_OP_BMILGB ... CC_OP_BMILGQ] = USES_CC_DST | USES_CC_SRC, -[CC_OP_ADCX] = USES_CC_DST | USES_CC_SRC, -[CC_OP_ADOX] = USES_CC_SRC | USES_CC_SRC2, -[CC_OP_ADCOX] = USES_CC_DST | USES_CC_SRC | USES_CC_SRC2, -[CC_OP_CLR] = 0, -[CC_OP_POPCNT] = USES_CC_DST, -}; +static uint8_t cc_op_live(CCOp op) +{ +switch (op) { +case CC_OP_CLR: +return 0; +case CC_OP_EFLAGS: +return USES_CC_SRC; +case CC_OP_POPCNT: +case CC_OP_LOGICB ... CC_OP_LOGICQ: +return USES_CC_DST; +case CC_OP_MULB ... CC_OP_MULQ: +case CC_OP_ADDB ... CC_OP_ADDQ: +case CC_OP_INCB ... CC_OP_INCQ: +case CC_OP_DECB ... CC_OP_DECQ: +case CC_OP_SHLB ... CC_OP_SHLQ: +case CC_OP_SARB ... CC_OP_SARQ: +case CC_OP_BMILGB ... CC_OP_BMILGQ: +case CC_OP_ADCX: +return USES_CC_DST | USES_CC_SRC; +case CC_OP_ADOX: +return USES_CC_SRC | USES_CC_SRC2; +case CC_OP_SUBB ... CC_OP_SUBQ: +return USES_CC_DST | USES_CC_SRC | USES_CC_SRCT; +case CC_OP_DYNAMIC: +case CC_OP_ADCB ... CC_OP_ADCQ: +case CC_OP_SBBB ... CC_OP_SBBQ: +case CC_OP_ADCOX: +return USES_CC_DST | USES_CC_SRC | USES_CC_SRC2; +default: +g_assert_not_reached(); +} +} static void set_cc_op_1(DisasContext *s, CCOp op, bool dirty) { @@ -320,7 +332,7 @@ static void set_cc_op_1(DisasContext *s, CCOp op, bool dirty) } /* Discard CC computation that will no longer be used. */ -dead = cc_op_live[s->cc_op] & ~cc_op_live[op]; +dead = cc_op_live(s->cc_op) & ~cc_op_live(op); if (dead & USES_CC_DST) { tcg_gen_discard_tl(cpu_cc_dst); } @@ -816,7 +828,7 @@ static void gen_mov_eflags(DisasContext *s, TCGv reg) src2 = cpu_cc_src2; /* Take care to not read values that are not live. */ -live = cc_op_live[s->cc_op] & ~USES_CC_SRCT; +live = cc_op_live(s->cc_op) & ~USES_CC_SRCT; dead = live ^ (USES_CC_DST | USES_CC_SRC | USES_CC_SRC2); if (dead) { TCGv zero = tcg_constant_tl(0); diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc index 0d846c32c2..ba4b8d16f9 100644 --- a/target/i386/tcg/decode-new.c.inc +++ b/target/i386/tcg/decode-new.c.inc @@ -2792,7 +2792,7 @@ static void disas_insn(DisasContext *s, CPUState *cpu) tcg_gen_mov_i32(cpu_cc_op, decode.cc_op_dynamic); } set_cc_op(s, decode.cc_op); -cc_live = cc_op_live[decode.cc_op]; +cc_live = cc_op_live(decode.cc_op); } else { cc_live = 0; } diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc index fc7477833b..38b399783e 100644 --- a/target/i386/tcg/emit.c.inc +++ b/target/i386/tcg/emit.c.inc @@ -3531,18 +3531,20 @@ static void gen_shift_dynamic_flags(DisasContext *s, X86DecodedInsn *decode, TCG { TCGv_i32 count32 = tcg_temp_new_i32(); TCGv_i32 old_cc_op; +uint8_t live; decode->cc_op = CC_OP_DYNAMIC; decode->cc_op_dynamic = tcg_temp_new_i32(); assert(decode->cc_dst == s->T0); -if (cc_op_live[s->cc_op] & USES_CC_DST) { +live = cc_op_live(s->cc_op); +if (live & USES_CC_DST) { decode->cc_dst = tcg_temp_new(); tcg_gen_movcond_tl(TCG_COND_EQ, decode->cc_dst, count, tcg_constant_tl(0), cpu_cc_dst, s->T0); } -if (cc_op_live[s->cc_op] & USES_CC_SRC) { +if (live & USES_CC_SRC) { tcg_gen_movcond_tl(TCG_COND_EQ, decode->cc_src, count, tcg_constant_tl(0), cpu_cc_src, decode->cc_src); } -- 2.34.1
[PATCH 3/5] target/i386: Rearrange CCOp
Define CC_OP_{FIRST,LAST}_BWLQ. Remove CC_OP_NB. Give the first few enumerators explicit integer constants. Move CC_OP_POPCNT up in the enumeration; remove unused CC_OP_POPCNT*__ placeholders. Align the BWLQ enumerators. This will be used to simplify ((op - CC_OP_*B) & 3). Signed-off-by: Richard Henderson --- target/i386/cpu.h | 44 1 file changed, 24 insertions(+), 20 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 29daf37048..df4272fdae 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1270,14 +1270,27 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w, * are only needed for conditional branches. */ typedef enum { -CC_OP_DYNAMIC, /* must use dynamic code to get cc_op */ -CC_OP_EFLAGS, /* all cc are explicitly computed, CC_SRC = flags */ -CC_OP_ADCX, /* CC_DST = C, CC_SRC = rest. */ -CC_OP_ADOX, /* CC_SRC2 = O, CC_SRC = rest. */ -CC_OP_ADCOX, /* CC_DST = C, CC_SRC2 = O, CC_SRC = rest. */ -CC_OP_CLR, /* Z and P set, all other flags clear. */ +CC_OP_DYNAMIC = 0, /* must use dynamic code to get cc_op */ +CC_OP_EFLAGS = 1, /* all cc are explicitly computed, CC_SRC = flags */ +CC_OP_ADCX = 2,/* CC_DST = C, CC_SRC = rest. */ +CC_OP_ADOX = 3,/* CC_SRC2 = O, CC_SRC = rest. */ +CC_OP_ADCOX = 4, /* CC_DST = C, CC_SRC2 = O, CC_SRC = rest. */ +CC_OP_CLR = 5, /* Z and P set, all other flags clear. */ -CC_OP_MULB, /* modify all flags, C, O = (CC_SRC != 0) */ +/* + * Z via CC_DST, all other flags clear. + * Treat CC_OP_POPCNT like the other BWLQ ops in making the low bits + * equal MO_TL; this gives a value of either 6 or 7. + */ +#ifdef TARGET_X86_64 +CC_OP_POPCNT = 7, +#else +CC_OP_POPCNT = 6, +#endif + +#define CC_OP_FIRST_BWLQ CC_OP_POPCNT + +CC_OP_MULB = 8, /* modify all flags, C, O = (CC_SRC != 0) */ CC_OP_MULW, CC_OP_MULL, CC_OP_MULQ, @@ -1332,20 +1345,11 @@ typedef enum { CC_OP_BMILGL, CC_OP_BMILGQ, -/* - * Note that only CC_OP_POPCNT (i.e. the one with MO_TL size) - * is used or implemented, because the translation needs - * to zero-extend CC_DST anyway. - */ -CC_OP_POPCNTB__, /* Z via CC_DST, all other flags clear. */ -CC_OP_POPCNTW__, -CC_OP_POPCNTL__, -CC_OP_POPCNTQ__, -CC_OP_POPCNT = sizeof(target_ulong) == 8 ? CC_OP_POPCNTQ__ : CC_OP_POPCNTL__, - -CC_OP_NB, +#define CC_OP_LAST_BWLQ CC_OP_BMILGQ } CCOp; -QEMU_BUILD_BUG_ON(CC_OP_NB >= 128); + +/* See X86DecodedInsn.cc_op, using int8_t. */ +QEMU_BUILD_BUG_ON(CC_OP_LAST_BWLQ > INT8_MAX); typedef struct SegmentCache { uint32_t selector; -- 2.34.1
[PATCH 0/5] target/i386: CCOp cleanups
While debugging #2413, I spent quite a bit of time trying to work out if the CCOp value was incorrect. I think the following is a worthwhile cleanup, isolating potential problems to asserts. r~ Richard Henderson (5): target/i386: Tidy cc_op_str usage target/i386: Convert cc_op_live to a function target/i386: Rearrange CCOp target/i386: Remove default in cc_op_live target/i386: Introduce cc_op_size target/i386/cpu.h| 44 + target/i386/cpu-dump.c | 17 --- target/i386/tcg/translate.c | 84 +++- target/i386/tcg/decode-new.c.inc | 2 +- target/i386/tcg/emit.c.inc | 9 ++-- 5 files changed, 93 insertions(+), 63 deletions(-) -- 2.34.1
[PATCH 4/5] target/i386: Remove default in cc_op_live
Now that CC_OP_NB is gone, push the assert after the switch. This will allow -Wswitch to diagnose missing entries. Signed-off-by: Richard Henderson --- target/i386/tcg/translate.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index e5a8aaf793..e675afca47 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -318,9 +318,8 @@ static uint8_t cc_op_live(CCOp op) case CC_OP_SBBB ... CC_OP_SBBQ: case CC_OP_ADCOX: return USES_CC_DST | USES_CC_SRC | USES_CC_SRC2; -default: -g_assert_not_reached(); } +g_assert_not_reached(); } static void set_cc_op_1(DisasContext *s, CCOp op, bool dirty) -- 2.34.1
[PATCH 5/5] target/i386: Introduce cc_op_size
Replace arithmetic on cc_op with a helper function. Assert that the op has a size and that it is valid for the configuration. Signed-off-by: Richard Henderson --- target/i386/tcg/translate.c | 29 ++--- target/i386/tcg/emit.c.inc | 3 ++- 2 files changed, 20 insertions(+), 12 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index e675afca47..e98bed0805 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -322,6 +322,16 @@ static uint8_t cc_op_live(CCOp op) g_assert_not_reached(); } +static MemOp cc_op_size(CCOp op) +{ +MemOp size = op & 3; + +assert(op >= CC_OP_FIRST_BWLQ && op <= CC_OP_LAST_BWLQ); +assert(size <= MO_TL); + +return size; +} + static void set_cc_op_1(DisasContext *s, CCOp op, bool dirty) { int dead; @@ -884,7 +894,7 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv reg) switch (s->cc_op) { case CC_OP_SUBB ... CC_OP_SUBQ: /* (DATA_TYPE)CC_SRCT < (DATA_TYPE)CC_SRC */ -size = s->cc_op - CC_OP_SUBB; +size = cc_op_size(s->cc_op); gen_ext_tl(s->cc_srcT, s->cc_srcT, size, false); gen_ext_tl(cpu_cc_src, cpu_cc_src, size, false); return (CCPrepare) { .cond = TCG_COND_LTU, .reg = s->cc_srcT, @@ -892,7 +902,7 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv reg) case CC_OP_ADDB ... CC_OP_ADDQ: /* (DATA_TYPE)CC_DST < (DATA_TYPE)CC_SRC */ -size = s->cc_op - CC_OP_ADDB; +size = cc_op_size(s->cc_op); gen_ext_tl(cpu_cc_dst, cpu_cc_dst, size, false); gen_ext_tl(cpu_cc_src, cpu_cc_src, size, false); return (CCPrepare) { .cond = TCG_COND_LTU, .reg = cpu_cc_dst, @@ -910,7 +920,7 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv reg) case CC_OP_SHLB ... CC_OP_SHLQ: /* (CC_SRC >> (DATA_BITS - 1)) & 1 */ -size = s->cc_op - CC_OP_SHLB; +size = cc_op_size(s->cc_op); return gen_prepare_sign_nz(cpu_cc_src, size); case CC_OP_MULB ... CC_OP_MULQ: @@ -918,7 +928,7 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv reg) .reg = cpu_cc_src }; case CC_OP_BMILGB ... CC_OP_BMILGQ: -size = s->cc_op - CC_OP_BMILGB; +size = cc_op_size(s->cc_op); gen_ext_tl(cpu_cc_src, cpu_cc_src, size, false); return (CCPrepare) { .cond = TCG_COND_EQ, .reg = cpu_cc_src }; @@ -972,10 +982,7 @@ static CCPrepare gen_prepare_eflags_s(DisasContext *s, TCGv reg) case CC_OP_POPCNT: return (CCPrepare) { .cond = TCG_COND_NEVER }; default: -{ -MemOp size = (s->cc_op - CC_OP_ADDB) & 3; -return gen_prepare_sign_nz(cpu_cc_dst, size); -} +return gen_prepare_sign_nz(cpu_cc_dst, cc_op_size(s->cc_op)); } } @@ -1016,7 +1023,7 @@ static CCPrepare gen_prepare_eflags_z(DisasContext *s, TCGv reg) return (CCPrepare) { .cond = TCG_COND_ALWAYS }; default: { -MemOp size = (s->cc_op - CC_OP_ADDB) & 3; +MemOp size = cc_op_size(s->cc_op); if (size == MO_TL) { return (CCPrepare) { .cond = TCG_COND_EQ, .reg = cpu_cc_dst }; } else { @@ -1042,7 +1049,7 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, TCGv reg) switch (s->cc_op) { case CC_OP_SUBB ... CC_OP_SUBQ: /* We optimize relational operators for the cmp/jcc case. */ -size = s->cc_op - CC_OP_SUBB; +size = cc_op_size(s->cc_op); switch (jcc_op) { case JCC_BE: gen_ext_tl(s->cc_srcT, s->cc_srcT, size, false); @@ -3176,7 +3183,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b) CC_DST alone, setting CC_SRC, and using a CC_OP_SAR of the same width. */ tcg_gen_mov_tl(cpu_cc_src, s->tmp4); -set_cc_op(s, ((s->cc_op - CC_OP_MULB) & 3) + CC_OP_SARB); +set_cc_op(s, CC_OP_SARB + cc_op_size(s->cc_op)); break; default: /* Otherwise, generate EFLAGS and replace the C bit. */ diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc index 38b399783e..e9d5d196ce 100644 --- a/target/i386/tcg/emit.c.inc +++ b/target/i386/tcg/emit.c.inc @@ -3106,7 +3106,8 @@ static bool gen_eflags_adcox(DisasContext *s, X86DecodedInsn *decode, bool want_ * bit, we might as well fish CF out of EFLAGS and save a shift. */ if (want_carry && (!need_flags || s->cc_op == CC_OP_SHLB + MO_TL)) { -tcg_gen_shri_tl(decode->cc_dst, cpu_cc_src, (8 << (s->cc_op - CC_OP_SHLB)) - 1); +MemOp size = cc_op_size(s->cc_op); +tcg_gen_shri_tl(decode->cc_dst, cpu_cc_src, (8 << size) - 1); got_cf = true; } gen_mov_eflags(s, decode->cc_src); -- 2.34.1
[PATCH 1/5] target/i386: Tidy cc_op_str usage
Make const. Use the read-only strings directly; do not copy them into an on-stack buffer with snprintf. Allow for holes in the cc_op_str array, now present with CC_OP_POPCNT. Fixes: 460231ad369 ("target/i386: give CC_OP_POPCNT low bits corresponding to MO_TL") Signed-off-by: Richard Henderson --- target/i386/cpu-dump.c | 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/target/i386/cpu-dump.c b/target/i386/cpu-dump.c index 3bb8e44091..dc6723aede 100644 --- a/target/i386/cpu-dump.c +++ b/target/i386/cpu-dump.c @@ -27,7 +27,7 @@ /***/ /* x86 debug */ -static const char *cc_op_str[CC_OP_NB] = { +static const char * const cc_op_str[] = { [CC_OP_DYNAMIC] = "DYNAMIC", [CC_OP_EFLAGS] = "EFLAGS", @@ -347,7 +347,6 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, int flags) X86CPU *cpu = X86_CPU(cs); CPUX86State *env = >env; int eflags, i, nb; -char cc_op_name[32]; static const char *seg_name[6] = { "ES", "CS", "SS", "DS", "FS", "GS" }; eflags = cpu_compute_eflags(env); @@ -456,10 +455,16 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, int flags) env->dr[6], env->dr[7]); } if (flags & CPU_DUMP_CCOP) { -if ((unsigned)env->cc_op < CC_OP_NB) -snprintf(cc_op_name, sizeof(cc_op_name), "%s", cc_op_str[env->cc_op]); -else -snprintf(cc_op_name, sizeof(cc_op_name), "[%d]", env->cc_op); +const char *cc_op_name = NULL; +char cc_op_buf[32]; + +if ((unsigned)env->cc_op < ARRAY_SIZE(cc_op_str)) { +cc_op_name = cc_op_str[env->cc_op]; +} +if (cc_op_name == NULL) { +snprintf(cc_op_buf, sizeof(cc_op_buf), "[%d]", env->cc_op); +cc_op_name = cc_op_buf; +} #ifdef TARGET_X86_64 if (env->hflags & HF_CS64_MASK) { qemu_fprintf(f, "CCS=%016" PRIx64 " CCD=%016" PRIx64 " CCO=%s\n", -- 2.34.1
[PATCH] tcg/optimize: Fix TCG_COND_TST* simplification of setcond2
Fix a typo in the argument movement. Cc: qemu-sta...@nongnu.org Fixes: ceb9ee06b71 ("tcg/optimize: Handle TCG_COND_TST{EQ,NE}") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2413 Signed-off-by: Richard Henderson --- tcg/optimize.c | 2 +- tests/tcg/x86_64/test-2413.c | 30 ++ 2 files changed, 31 insertions(+), 1 deletion(-) create mode 100644 tests/tcg/x86_64/test-2413.c diff --git a/tcg/optimize.c b/tcg/optimize.c index 8886f7037a..ba16ec27e2 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -2384,7 +2384,7 @@ static bool fold_setcond2(OptContext *ctx, TCGOp *op) case TCG_COND_TSTEQ: case TCG_COND_TSTNE: -if (arg_is_const_val(op->args[2], 0)) { +if (arg_is_const_val(op->args[3], 0)) { goto do_setcond_high; } if (arg_is_const_val(op->args[4], 0)) { diff --git a/tests/tcg/x86_64/test-2413.c b/tests/tcg/x86_64/test-2413.c new file mode 100644 index 00..a0e4d25093 --- /dev/null +++ b/tests/tcg/x86_64/test-2413.c @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* Copyright 2024 Linaro, Ltd. */ +/* See https://gitlab.com/qemu-project/qemu/-/issues/2413 */ + +#include + +void test(unsigned long *a, unsigned long *d, unsigned long c) +{ +asm("xorl %%eax, %%eax\n\t" +"xorl %%edx, %%edx\n\t" +"testb $0x20, %%cl\n\t" +"sete %%al\n\t" +"setne %%dl\n\t" +"shll %%cl, %%eax\n\t" +"shll %%cl, %%edx\n\t" +: "=a"(*a), "=d"(*d) +: "c"(c)); +} + +int main(void) +{ +long a, c, d; + +for (c = 0; c < 64; c++) { +test(, , c); +assert(a == (c & 0x20 ? 0 : 1u << (c & 0x1f))); +assert(d == (c & 0x20 ? 1u << (c & 0x1f) : 0)); +} +return 0; +} -- 2.34.1
[PATCH v2 1/2] vfio/display: Fix potential memleak of edid info
EDID related device region info is leaked in vfio_display_edid_init() error path and VFIODisplay destroying path. Fixes: 08479114b0de ("vfio/display: add edid support.") Signed-off-by: Zhenzhong Duan --- hw/vfio/display.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/vfio/display.c b/hw/vfio/display.c index 661e921616..9c57fd3888 100644 --- a/hw/vfio/display.c +++ b/hw/vfio/display.c @@ -171,7 +171,9 @@ static void vfio_display_edid_init(VFIOPCIDevice *vdev) err: trace_vfio_display_edid_write_error(); +g_free(dpy->edid_info); g_free(dpy->edid_regs); +dpy->edid_info = NULL; dpy->edid_regs = NULL; return; } @@ -182,6 +184,7 @@ static void vfio_display_edid_exit(VFIODisplay *dpy) return; } +g_free(dpy->edid_info); g_free(dpy->edid_regs); g_free(dpy->edid_blob); timer_free(dpy->edid_link_timer); -- 2.34.1
[PATCH v2 0/2] Misc fixes on vfio display
Hi, This is trying to address an issue Cédric found. See https://www.mail-archive.com/qemu-devel@nongnu.org/msg1043142.html While looking into it, also found a potential memory leak. I'm sorry that I didn't find how to test this fix, because it looks a GFX card is needed. Any idea on how to test or help test are quite appreciated. Thanks Zhenzhong v2: - set dpy->edid_info to NULL in vfio_display_edid_init() err path (Marc-André) - remove a wrongly added g_free(*info) in vfio_get_dev_region_info() (Marc-André) - add R-B on patch2 Zhenzhong Duan (2): vfio/display: Fix potential memleak of edid info vfio/display: Fix vfio_display_edid_init() error path hw/vfio/display.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) -- 2.34.1
[PATCH v2 2/2] vfio/display: Fix vfio_display_edid_init() error path
vfio_display_edid_init() can fail for many reasons and return silently. It would be good to report the error. Old mdev driver may not support vfio edid region and we allow to go through in this case. vfio_display_edid_update() isn't changed because it can be called at runtime when UI changes (i.e. window resize). Fixes: 08479114b0de ("vfio/display: add edid support.") Suggested-by: Cédric Le Goater Signed-off-by: Zhenzhong Duan Reviewed-by: Marc-André Lureau --- hw/vfio/display.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/hw/vfio/display.c b/hw/vfio/display.c index 9c57fd3888..ea87830fe0 100644 --- a/hw/vfio/display.c +++ b/hw/vfio/display.c @@ -124,7 +124,7 @@ static void vfio_display_edid_ui_info(void *opaque, uint32_t idx, } } -static void vfio_display_edid_init(VFIOPCIDevice *vdev) +static bool vfio_display_edid_init(VFIOPCIDevice *vdev, Error **errp) { VFIODisplay *dpy = vdev->dpy; int fd = vdev->vbasedev.fd; @@ -135,7 +135,8 @@ static void vfio_display_edid_init(VFIOPCIDevice *vdev) VFIO_REGION_SUBTYPE_GFX_EDID, >edid_info); if (ret) { -return; +/* Failed to get GFX edid info, allow to go through without edid. */ +return true; } trace_vfio_display_edid_available(); @@ -167,15 +168,16 @@ static void vfio_display_edid_init(VFIOPCIDevice *vdev) vfio_display_edid_link_up, vdev); vfio_display_edid_update(vdev, true, 0, 0); -return; +return true; err: +error_setg(errp, "vfio: failed to read GFX edid field"); trace_vfio_display_edid_write_error(); g_free(dpy->edid_info); g_free(dpy->edid_regs); dpy->edid_info = NULL; dpy->edid_regs = NULL; -return; +return false; } static void vfio_display_edid_exit(VFIODisplay *dpy) @@ -368,8 +370,7 @@ static bool vfio_display_dmabuf_init(VFIOPCIDevice *vdev, Error **errp) return false; } } -vfio_display_edid_init(vdev); -return true; +return vfio_display_edid_init(vdev, errp); } static void vfio_display_dmabuf_exit(VFIODisplay *dpy) -- 2.34.1
Re: [PULL 00/16] Trivial patches for 2024-06-30
On 6/30/24 09:53, Michael Tokarev wrote: The following changes since commit 3665dd6bb9043bef181c91e2dce9e1efff47ed51: Merge tag 'for-upstream' ofhttps://gitlab.com/bonzini/qemu into staging (2024-06-28 16:09:38 -0700) are available in the Git repository at: https://gitlab.com/mjt0k/qemu.git tags/pull-trivial-patches for you to fetch changes up to f22855dffdbc2906f744b5bcfea869cbb66b8fb2: hw/core/loader: gunzip(): fix memory leak on error path (2024-06-30 19:51:44 +0300) trivial patches for 2024-06-30 Applied, thanks. Please update https://wiki.qemu.org/ChangeLog/9.1 as appropriate. r~
Re: [PATCH v3 1/4] hw/intc: Remove loongarch_ipi.c
Hi Philippe, On 2024/6/27 下午9:02, Philippe Mathieu-Daudé wrote: On 27/6/24 04:44, gaosong wrote: 在 2024/6/26 下午8:10, Philippe Mathieu-Daudé 写道: Hi Bibo, On 26/6/24 06:11, maobibo wrote: On 2024/6/5 上午10:15, Jiaxun Yang wrote: It was missed out in previous commit. Fixes: b4a12dfc2132 ("hw/intc/loongarch_ipi: Rename as loongson_ipi") Signed-off-by: Jiaxun Yang --- hw/intc/loongarch_ipi.c | 347 1 file changed, 347 deletions(-) -static void loongarch_ipi_realize(DeviceState *dev, Error **errp) -{ - LoongArchIPI *s = LOONGARCH_IPI(dev); - SysBusDevice *sbd = SYS_BUS_DEVICE(dev); - int i; - - if (s->num_cpu == 0) { - error_setg(errp, "num-cpu must be at least 1"); - return; - } - - memory_region_init_io(>ipi_iocsr_mem, OBJECT(dev), _ipi_ops, - s, "loongarch_ipi_iocsr", 0x48); - - /* loongarch_ipi_iocsr performs re-entrant IO through ipi_send */ - s->ipi_iocsr_mem.disable_reentrancy_guard = true; - - sysbus_init_mmio(sbd, >ipi_iocsr_mem); - - memory_region_init_io(>ipi64_iocsr_mem, OBJECT(dev), - _ipi64_ops, - s, "loongarch_ipi64_iocsr", 0x118); - sysbus_init_mmio(sbd, >ipi64_iocsr_mem); It is different with existing implementation. With hw/intc/loongson_ipi.c, every vcpu has one ipi_mmio_mem, however on loongarch ipi machine, there is no ipi_mmio_mem memory region. So if machine has 256 vcpus, there will be 256 ipi_mmio_mem memory regions. In function sysbus_init_mmio(), memory region can not exceed QDEV_MAX_MMIO (32). With so many memory regions, it slows down memory region search speed also. void sysbus_init_mmio(SysBusDevice *dev, MemoryRegion *memory) { int n; assert(dev->num_mmio < QDEV_MAX_MMIO); n = dev->num_mmio++; dev->mmio[n].addr = -1; dev->mmio[n].memory = memory; } Can we revert this patch? We want to do production usable by real users rather than show pure technology. Since commit b4a12dfc2132 this file is not built/tested anymore: -specific_ss.add(when: 'CONFIG_LOONGARCH_IPI', if_true: files('loongarch_ipi.c')) +specific_ss.add(when: 'CONFIG_LOONGSON_IPI', if_true: files('loongson_ipi.c')) We don't want to maintain dead code. Hi, Philippe It is commmit 49eba52a5 that causes Loongarch to fail to start. What bibao means is that LoongArch and mips do not share "lloongson_ipi.c". This avoids mutual influence. My understanding of the next sentence is as follows. Nowadays, most of the open source operating systems in China use the latest QEMU. e.g. OpenEuler/OpenAnolis/OpenCloudOS, etc. These operating systems have a large number of real users. so we need to maintain the stability of the LoongArch architecture of the QEMU community as much as possible. This will reduce maintenance costs. I'm glad there is a such large number of users :) Therefore, we would like to restore the 'loongarch_ipi.c' file. what do you think? My preference on "reducing maintenance cost" is code reuse instead of duplication. Before reverting, lets try to fix the issue. I suggested a v2: https://lore.kernel.org/qemu-devel/20240627125819.62779-2-phi...@linaro.org Sorry for late reply. How about split loongson_ipi.c into loongson_ipi_base.c/loongson_ipi_loongson.c/loongson_ipi_loongarch.c, File loongson_ipi_base.c contains the common code, loongson_ipi_xxx.c contains arch specific. Soon we will submit irqchip in kernel function, it will be much different for architectures, since ioctl command is different for two architectures to save/restore ipi registers. Regards Bibo Mao That said, both current patch and the suggested fix pass our Avocado CI test suite (running tests/avocado/machine_loongarch.py). Is your use case not covered? Could you expand the CI tests so we don't hit this problem again? (Also we could reproduce and fix more easily). Thanks, Phil.
Re: [PATCH 1/2] vfio/display: Fix potential memleak of edid info
Hi, On 6/29/2024 8:15 PM, Marc-André Lureau wrote: Hi On Fri, Jun 28, 2024 at 1:32 PM Zhenzhong Duan wrote: EDID related device region info is leaked in three paths: 1. In vfio_get_dev_region_info(), when edid info isn't find, the last device region info is leaked. 2. In vfio_display_edid_init() error path, edid info is leaked. 3. In VFIODisplay destroying path, edid info is leaked. Fixes: 08479114b0de ("vfio/display: add edid support.") Signed-off-by: Zhenzhong Duan --- hw/vfio/display.c | 2 ++ hw/vfio/helpers.c | 1 + 2 files changed, 3 insertions(+) diff --git a/hw/vfio/display.c b/hw/vfio/display.c index 661e921616..5926bd6628 100644 --- a/hw/vfio/display.c +++ b/hw/vfio/display.c @@ -171,6 +171,7 @@ static void vfio_display_edid_init(VFIOPCIDevice *vdev) err: trace_vfio_display_edid_write_error(); + g_free(dpy->edid_info); It would be better to set it to NULL. Will do. g_free(dpy->edid_regs); dpy->edid_regs = NULL; return; @@ -182,6 +183,7 @@ static void vfio_display_edid_exit(VFIODisplay *dpy) return; } + g_free(dpy->edid_info); g_free(dpy->edid_regs); g_free(dpy->edid_blob); timer_free(dpy->edid_link_timer); diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index b14edd46ed..3dd32b26a4 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -586,6 +586,7 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type, g_free(*info); } + g_free(*info); This seems incorrect, it is freed at the end of the loop above if it didn't retun. Good catch! Will remove it. Thanks Zhenzhong *info = NULL; return -ENODEV; } -- 2.34.1 -- Marc-André Lureau
[RFC PATCH] target/ppc: Inline most of dcbz helper
This is an RFC patch, not finished, just to show the idea and test this approach. I'm not sure it's correct but I'm sure it can be improved so comments are requested. The test case I've used came out of a discussion about very slow access to VRAM of a graphics card passed through with vfio the reason for which is still not clear but it was already known that dcbz is often used by MacOS and AmigaOS for clearing memory and to avoid reading values about to be overwritten which is faster on real CPU but was found to be slower on QEMU. The optimised copy routines were posted here: https://www.amigans.net/modules/newbb/viewtopic.php?post_id=149123#forumpost149123 and the rest of it I've written to make it a test case is here: http://zero.eik.bme.hu/~balaton/qemu/vramcopy.tar.xz Replace the body of has_altivec() with just "return false". Sorry for only giving pieces but the code posted above has a copyright that does not allow me to include it in the test. This is not measuring VRAM access now just memory copy but shows the effect of dcbz. I've got these results with this patch: Linux user master: Linux user patch: byte loop: 2.2 sec byte loop: 2.2 sec memcpy: 2.19 secmemcpy: 2.19 sec copyToVRAMNoAltivec: 1.7 seccopyToVRAMNoAltivec: 1.71 sec copyToVRAMAltivec: 2.13 sec copyToVRAMAltivec: 2.12 sec copyFromVRAMNoAltivec: 5.11 sec copyFromVRAMNoAltivec: 2.79 sec copyFromVRAMAltivec: 5.87 sec copyFromVRAMAltivec: 3.26 sec Linux system master:Linux system patch: byte loop: 5.86 sec byte loop: 5.9 sec memcpy: 5.45 secmemcpy: 5.47 sec copyToVRAMNoAltivec: 2.51 sec copyToVRAMNoAltivec: 2.53 sec copyToVRAMAltivec: 3.84 sec copyToVRAMAltivec: 3.85 sec copyFromVRAMNoAltivec: 6.11 sec copyFromVRAMNoAltivec: 3.92 sec copyFromVRAMAltivec: 7.22 sec copyFromVRAMAltivec: 5.51 sec It could probably be further optimised with using vector instuctions (dcbz_size is between 32 and 128) or by eliminating the check left in the helper for 970 but I don't know how to do those. (Also the series that convert AltiVec to use 128 bit access may help but I haven't tested that, only trying to optimise dcbz here,) Signed-off-by: BALATON Zoltan --- target/ppc/helper.h | 1 + target/ppc/mem_helper.c | 14 ++ target/ppc/translate.c | 34 -- 3 files changed, 43 insertions(+), 6 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 76b8f25c77..e49681c25b 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -46,6 +46,7 @@ DEF_HELPER_FLAGS_3(stmw, TCG_CALL_NO_WG, void, env, tl, i32) DEF_HELPER_4(lsw, void, env, tl, i32, i32) DEF_HELPER_5(lswx, void, env, tl, i32, i32, i32) DEF_HELPER_FLAGS_4(stsw, TCG_CALL_NO_WG, void, env, tl, i32, i32) +DEF_HELPER_FLAGS_2(dcbz_size, TCG_CALL_NO_WG_SE, tl, env, i32) DEF_HELPER_FLAGS_3(dcbz, TCG_CALL_NO_WG, void, env, tl, i32) DEF_HELPER_FLAGS_3(dcbzep, TCG_CALL_NO_WG, void, env, tl, i32) DEF_HELPER_FLAGS_2(icbi, TCG_CALL_NO_WG, void, env, tl) diff --git a/target/ppc/mem_helper.c b/target/ppc/mem_helper.c index f88155ad45..b06cb2d00e 100644 --- a/target/ppc/mem_helper.c +++ b/target/ppc/mem_helper.c @@ -270,6 +270,20 @@ void helper_stsw(CPUPPCState *env, target_ulong addr, uint32_t nb, } } +target_ulong helper_dcbz_size(CPUPPCState *env, uint32_t opcode) +{ +target_ulong dcbz_size = env->dcache_line_size; + +#if defined(TARGET_PPC64) +/* Check for dcbz vs dcbzl on 970 */ +if (env->excp_model == POWERPC_EXCP_970 && +!(opcode & 0x0020) && ((env->spr[SPR_970_HID5] >> 7) & 0x3) == 1) { +dcbz_size = 32; +} +#endif +return dcbz_size; +} + static void dcbz_common(CPUPPCState *env, target_ulong addr, uint32_t opcode, bool epid, uintptr_t retaddr) { diff --git a/target/ppc/translate.c b/target/ppc/translate.c index 0bc16d7251..49221b8303 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -4445,14 +4445,36 @@ static void gen_dcblc(DisasContext *ctx) /* dcbz */ static void gen_dcbz(DisasContext *ctx) { -TCGv tcgv_addr; -TCGv_i32 tcgv_op; +TCGv addr, mask, dcbz_size, t0; +TCGv_i32 op = tcg_constant_i32(ctx->opcode & 0x03FF000); +TCGv_i64 z64 = tcg_constant_i64(0); +TCGv_i128 z128 = tcg_temp_new_i128(); +TCGLabel *l; + +addr = tcg_temp_new(); +mask = tcg_temp_new(); +dcbz_size = tcg_temp_new(); +t0 = tcg_temp_new(); +l = gen_new_label(); gen_set_access_type(ctx, ACCESS_CACHE); -tcgv_addr = tcg_temp_new(); -tcgv_op = tcg_constant_i32(ctx->opcode & 0x03FF000); -gen_addr_reg_index(ctx, tcgv_addr); -gen_helper_dcbz(tcg_env, tcgv_addr, tcgv_op); +gen_helper_dcbz_size(dcbz_size, tcg_env, op); +tcg_gen_mov_tl(mask, dcbz_size); +tcg_gen_subi_tl(mask, mask, 1); +tcg_gen_not_tl(mask, mask); +
RE: [PATCH] hw/usb: Fix memory leak in musb_reset()
> -Original Message- > From: qemu-devel-bounces+yaoxt.fnst=fujitsu@nongnu.org > On Behalf Of Zheyu > Ma > Sent: Monday, July 1, 2024 12:32 AM > Cc: Zheyu Ma ; qemu-devel@nongnu.org > Subject: [PATCH] hw/usb: Fix memory leak in musb_reset() > > The musb_reset function was causing a memory leak by not properly freeing > the memory associated with USBPacket instances before reinitializing them. > This commit addresses the memory leak by adding calls to usb_packet_cleanup > for each USBPacket instance before reinitializing them with usb_packet_init. > > Asan log: > > =2970623==ERROR: LeakSanitizer: detected memory leaks > Direct leak of 256 byte(s) in 16 object(s) allocated from: > #0 0x561e20629c3d in malloc > llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 > #1 0x7fee91885738 in g_malloc > (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) > #2 0x561e21b4d0e1 in usb_packet_init hw/usb/core.c:531:5 > #3 0x561e21c5016b in musb_reset hw/usb/hcd-musb.c:372:9 > #4 0x561e21c502a9 in musb_init hw/usb/hcd-musb.c:385:5 > #5 0x561e21c893ef in tusb6010_realize hw/usb/tusb6010.c:827:15 > #6 0x561e23443355 in device_set_realized hw/core/qdev.c:510:13 > #7 0x561e2346ac1b in property_set_bool qom/object.c:2354:5 > #8 0x561e23463895 in object_property_set qom/object.c:1463:5 > #9 0x561e23477909 in object_property_set_qobject qom/qom-qobject.c:28:10 > #10 0x561e234645ed in object_property_set_bool qom/object.c:1533:15 > #11 0x561e2343c830 in qdev_realize hw/core/qdev.c:291:12 > #12 0x561e2343c874 in qdev_realize_and_unref hw/core/qdev.c:298:11 > #13 0x561e20ad5091 in sysbus_realize_and_unref hw/core/sysbus.c:261:12 > #14 0x561e22553283 in n8x0_usb_setup hw/arm/nseries.c:800:5 > #15 0x561e2254e99b in n8x0_init hw/arm/nseries.c:1356:5 > #16 0x561e22561170 in n810_init hw/arm/nseries.c:1418:5 > > Signed-off-by: Zheyu Ma > --- > hw/usb/hcd-musb.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c > index 6dca373cb1..0300aeaec6 100644 > --- a/hw/usb/hcd-musb.c > +++ b/hw/usb/hcd-musb.c > @@ -368,6 +368,8 @@ void musb_reset(MUSBState *s) > s->ep[i].maxp[1] = 0x40; > s->ep[i].musb = s; > s->ep[i].epnum = i; > +usb_packet_cleanup(>ep[i].packey[0].p); > +usb_packet_cleanup(>ep[i].packey[1].p); > usb_packet_init(>ep[i].packey[0].p); > usb_packet_init(>ep[i].packey[1].p); > } > -- > 2.34.1 > Reviewed-by: Xingtao Yao
Re: [PULL 0/1] ufs queue
On 6/29/24 20:52, Jeuk Kim wrote: From: Jeuk Kim The following changes since commit 3665dd6bb9043bef181c91e2dce9e1efff47ed51: Merge tag 'for-upstream' ofhttps://gitlab.com/bonzini/qemu into staging (2024-06-28 16:09:38 -0700) are available in the Git repository at: https://gitlab.com/jeuk20.kim/qemu.git tags/pull-ufs-20240630 for you to fetch changes up to e12b11f6f29272ee31ccde6b0db1a10139e87083: hw/ufs: Fix potential bugs in MMIO read|write (2024-06-30 12:44:32 +0900) hw/ufs: fix coverity issue Applied, thanks. Please update https://wiki.qemu.org/ChangeLog/9.1 as appropriate. r~
[RFC V1 2/6] migration: VMSTATE_FD
Define VMSTATE_FD for declaring a file descriptor field in a VMStateDescription. Signed-off-by: Steve Sistare --- include/migration/vmstate.h | 9 + migration/vmstate-types.c | 32 2 files changed, 41 insertions(+) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index f313f2f..a1dfab4 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -230,6 +230,7 @@ extern const VMStateInfo vmstate_info_uint8; extern const VMStateInfo vmstate_info_uint16; extern const VMStateInfo vmstate_info_uint32; extern const VMStateInfo vmstate_info_uint64; +extern const VMStateInfo vmstate_info_fd; /** Put this in the stream when migrating a null pointer.*/ #define VMS_NULLPTR_MARKER (0x30U) /* '0' */ @@ -902,6 +903,9 @@ extern const VMStateInfo vmstate_info_qlist; #define VMSTATE_UINT64_V(_f, _s, _v) \ VMSTATE_SINGLE(_f, _s, _v, vmstate_info_uint64, uint64_t) +#define VMSTATE_FD_V(_f, _s, _v) \ +VMSTATE_SINGLE(_f, _s, _v, vmstate_info_fd, int32_t) + #ifdef CONFIG_LINUX #define VMSTATE_U8_V(_f, _s, _v) \ @@ -936,6 +940,9 @@ extern const VMStateInfo vmstate_info_qlist; #define VMSTATE_UINT64(_f, _s)\ VMSTATE_UINT64_V(_f, _s, 0) +#define VMSTATE_FD(_f, _s)\ +VMSTATE_FD_V(_f, _s, 0) + #ifdef CONFIG_LINUX #define VMSTATE_U8(_f, _s) \ @@ -1009,6 +1016,8 @@ extern const VMStateInfo vmstate_info_qlist; #define VMSTATE_UINT64_TEST(_f, _s, _t) \ VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint64, uint64_t) +#define VMSTATE_FD_TEST(_f, _s, _t) \ +VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_fd, int32_t) #define VMSTATE_TIMER_PTR_TEST(_f, _s, _test) \ VMSTATE_POINTER_TEST(_f, _s, _test, vmstate_info_timer, QEMUTimer *) diff --git a/migration/vmstate-types.c b/migration/vmstate-types.c index e83bfcc..6e45a4a 100644 --- a/migration/vmstate-types.c +++ b/migration/vmstate-types.c @@ -314,6 +314,38 @@ const VMStateInfo vmstate_info_uint64 = { .put = put_uint64, }; +/* File descriptor communicated via SCM_RIGHTS */ + +static int get_fd(QEMUFile *f, void *pv, size_t size, + const VMStateField *field) +{ +int32_t *v = pv; +qemu_get_sbe32s(f, v); +if (*v < 0) { +return 0; +} +*v = qemu_file_get_fd(f); +return 0; +} + +static int put_fd(QEMUFile *f, void *pv, size_t size, + const VMStateField *field, JSONWriter *vmdesc) +{ +int32_t *v = pv; + +qemu_put_sbe32s(f, v); +if (*v < 0) { +return 0; +} +return qemu_file_put_fd(f, *v); +} + +const VMStateInfo vmstate_info_fd = { +.name = "fd", +.get = get_fd, +.put = put_fd, +}; + static int get_nullptr(QEMUFile *f, void *pv, size_t size, const VMStateField *field) -- 1.8.3.1
[RFC V1 6/6] migration: cpr-transfer mode
Add the cpr-transfer migration mode. Usage: qemu-system-$arch -machine anon-alloc=memfd ... start new QEMU with "-incoming -cpr-uri " Issue commands to old QEMU: migrate_set_parameter mode cpr-transfer migrate_set_parameter cpr-uri migrate -d The migrate command stops the VM, saves CPR state to uri-2, saves normal migration state to uri-1, and old QEMU enters the postmigrate state. The user starts new QEMU on the same host as old QEMU, with the same arguments as old QEMU, plus the -incoming option. Guest RAM is preserved in place, albeit with new virtual addresses in new QEMU. This mode requires a second migration channel, specified by the cpr-uri migration property on the outgoing side, and by the cpr-uri QEMU command-line option on the incoming side. The channel must be a type, such as unix socket, that supports SCM_RIGHTS. Memory-backend objects must have the share=on attribute, but memory-backend-epc is not supported. The VM must be started with the '-machine anon-alloc=memfd' option, which allows anonymous memory to be transferred in place to the new process. The memfds are kept open by sending the descriptors to new QEMU via the cpr-uri, which must support SCM_RIGHTS, and they are mmap'd in new QEMU. Signed-off-by: Steve Sistare --- migration/cpr.c | 9 - migration/migration.c | 37 + migration/ram.c | 1 + migration/vmstate-types.c | 5 +++-- qapi/migration.json | 26 +- stubs/vmstate.c | 7 +++ 6 files changed, 81 insertions(+), 4 deletions(-) diff --git a/migration/cpr.c b/migration/cpr.c index 50c130c..7ac01a9 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -58,7 +58,7 @@ static const VMStateDescription vmstate_cpr_fd = { VMSTATE_UINT32(namelen, CprFd), VMSTATE_VBUFFER_ALLOC_UINT32(name, CprFd, 0, NULL, namelen), VMSTATE_INT32(id, CprFd), -VMSTATE_INT32(fd, CprFd), +VMSTATE_FD(fd, CprFd), VMSTATE_END_OF_LIST() } }; @@ -172,6 +172,8 @@ int cpr_state_save(Error **errp) if (mode == MIG_MODE_CPR_EXEC) { f = cpr_exec_output(errp); +} else if (mode == MIG_MODE_CPR_TRANSFER) { +f = cpr_transfer_output(migrate_cpr_uri(), errp); } else { return 0; } @@ -209,6 +211,11 @@ int cpr_state_load(Error **errp) */ if (cpr_exec_has_state()) { f = cpr_exec_input(errp); +if (cpr_uri) { +warn_report("ignoring cpr-uri option for migration mode cpr-exec"); +} +} else if (cpr_uri) { +f = cpr_transfer_input(cpr_uri, errp); } else { return 0; } diff --git a/migration/migration.c b/migration/migration.c index a4a020e..65a36a6 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -77,6 +77,7 @@ static NotifierWithReturnList migration_state_notifiers[MIG_MODE__MAX] = { NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_NORMAL), NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_REBOOT), NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_EXEC), +NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_TRANSFER), }; /* Messages sent on the return path from destination to source */ @@ -205,6 +206,12 @@ migration_channels_and_transport_compatible(MigrationAddress *addr, return false; } +if (migrate_mode() == MIG_MODE_CPR_TRANSFER && +addr->transport == MIGRATION_ADDRESS_TYPE_FILE) { +error_setg(errp, "Migration requires streamable transport (eg unix)"); +return false; +} + return true; } @@ -1697,6 +1704,7 @@ bool migrate_mode_is_cpr(MigrationState *s) { MigMode mode = s->parameters.mode; return mode == MIG_MODE_CPR_REBOOT || + mode == MIG_MODE_CPR_TRANSFER || mode == MIG_MODE_CPR_EXEC; } @@ -2038,6 +2046,12 @@ static bool migrate_prepare(MigrationState *s, bool resume, Error **errp) return false; } +if (migrate_mode() == MIG_MODE_CPR_TRANSFER && +!s->parameters.cpr_uri) { +error_setg(errp, "cpr-transfer mode requires setting cpr-uri"); +return false; +} + if (migration_is_blocked(errp)) { return false; } @@ -2144,6 +2158,29 @@ void qmp_migrate(const char *uri, bool has_channels, goto out; } +/* + * For cpr-transfer mode, the target first reads CPR state, which cannot + * complete until cpr_state_save above finishes, then the target creates + * the migration channel and listens. We must wait for the channel to + * be created before connecting to it. + * + * This implementation of waiting is a hack. It restricts the channel + * type, and will loop forever if the target dies. It should be defined + * as a main-loop event that calls connect on the back end. + */ +if (s->parameters.mode == MIG_MODE_CPR_TRANSFER) { +
[RFC V1 3/6] migration: cpr-transfer save and load
Add functions to create a QEMUFile based on a unix URI, for saving or loading, for use by cpr-transfer mode to preserve CPR state. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 3 ++ migration/cpr-transfer.c | 81 migration/meson.build| 1 + 3 files changed, 85 insertions(+) create mode 100644 migration/cpr-transfer.c diff --git a/include/migration/cpr.h b/include/migration/cpr.h index c6c60f8..9aae94c 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -32,4 +32,7 @@ bool cpr_exec_has_state(void); void cpr_exec_unpersist_state(void); void cpr_mig_init(void); void cpr_unpreserve_fds(void); + +QEMUFile *cpr_transfer_output(const char *uri, Error **errp); +QEMUFile *cpr_transfer_input(const char *uri, Error **errp); #endif diff --git a/migration/cpr-transfer.c b/migration/cpr-transfer.c new file mode 100644 index 000..fb9ecd8 --- /dev/null +++ b/migration/cpr-transfer.c @@ -0,0 +1,81 @@ +/* + * Copyright (c) 2022, 2024 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "io/channel-file.h" +#include "io/channel-socket.h" +#include "io/net-listener.h" +#include "migration/cpr.h" +#include "migration/migration.h" +#include "migration/savevm.h" +#include "migration/qemu-file.h" +#include "migration/vmstate.h" + +QEMUFile *cpr_transfer_output(const char *uri, Error **errp) +{ +g_autoptr(MigrationChannel) channel = NULL; +QIOChannel *ioc; + +if (!migrate_uri_parse(uri, , errp)) { +return NULL; +} + +if (channel->addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET && +channel->addr->u.socket.type == SOCKET_ADDRESS_TYPE_UNIX) { + +QIOChannelSocket *sioc = qio_channel_socket_new(); +SocketAddress *saddr = >addr->u.socket; + +if (qio_channel_socket_connect_sync(sioc, saddr, errp)) { +object_unref(OBJECT(sioc)); +return NULL; +} +ioc = QIO_CHANNEL(sioc); + +} else { +error_setg(errp, "bad cpr-uri %s; must be unix:", uri); +return NULL; +} + +qio_channel_set_name(ioc, "cpr-out"); +return qemu_file_new_output(ioc); +} + +QEMUFile *cpr_transfer_input(const char *uri, Error **errp) +{ +g_autoptr(MigrationChannel) channel = NULL; +QIOChannel *ioc; + +if (!migrate_uri_parse(uri, , errp)) { +return NULL; +} + +if (channel->addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET && +channel->addr->u.socket.type == SOCKET_ADDRESS_TYPE_UNIX) { + +QIOChannelSocket *sioc; +SocketAddress *saddr = >addr->u.socket; +QIONetListener *listener = qio_net_listener_new(); + +qio_net_listener_set_name(listener, "cpr-socket-listener"); +if (qio_net_listener_open_sync(listener, saddr, 1, errp) < 0) { +object_unref(OBJECT(listener)); +return NULL; +} + +sioc = qio_net_listener_wait_client(listener); +ioc = QIO_CHANNEL(sioc); + +} else { +error_setg(errp, "bad cpr-uri %s; must be unix:", uri); +return NULL; +} + +qio_channel_set_name(ioc, "cpr-in"); +return qemu_file_new_input(ioc); +} diff --git a/migration/meson.build b/migration/meson.build index dd1d315..f722980 100644 --- a/migration/meson.build +++ b/migration/meson.build @@ -15,6 +15,7 @@ system_ss.add(files( 'channel-block.c', 'cpr.c', 'cpr-exec.c', + 'cpr-transfer.c', 'dirtyrate.c', 'exec.c', 'fd.c', -- 1.8.3.1
[RFC V1 1/6] migration: SCM_RIGHTS for QEMUFile
Define functions to put/get file descriptors to/from a QEMUFile, for qio channels that support SCM_RIGHTS. Maintain ordering such that put(A), put(fd), put(B) followed by get(A), get(fd), get(B) always succeeds. Other get orderings may succeed but are not guaranteed. Signed-off-by: Steve Sistare --- migration/qemu-file.c | 83 +++--- migration/qemu-file.h | 2 ++ migration/trace-events | 2 ++ 3 files changed, 83 insertions(+), 4 deletions(-) diff --git a/migration/qemu-file.c b/migration/qemu-file.c index b6d2f58..424c27d 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -37,6 +37,11 @@ #define IO_BUF_SIZE 32768 #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64) +typedef struct FdEntry { +QTAILQ_ENTRY(FdEntry) entry; +int fd; +} FdEntry; + struct QEMUFile { QIOChannel *ioc; bool is_writable; @@ -51,6 +56,9 @@ struct QEMUFile { int last_error; Error *last_error_obj; + +bool fd_pass; +QTAILQ_HEAD(, FdEntry) fds; }; /* @@ -109,6 +117,8 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable) object_ref(ioc); f->ioc = ioc; f->is_writable = is_writable; +f->fd_pass = qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS); +QTAILQ_INIT(>fds); return f; } @@ -310,6 +320,10 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f) int len; int pending; Error *local_error = NULL; +g_autofree int *fds = NULL; +size_t nfd = 0; +int **pfds = f->fd_pass ? : NULL; +size_t *pnfd = f->fd_pass ? : NULL; assert(!qemu_file_is_writable(f)); @@ -325,10 +339,9 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f) } do { -len = qio_channel_read(f->ioc, - (char *)f->buf + pending, - IO_BUF_SIZE - pending, - _error); +struct iovec iov = { f->buf + pending, IO_BUF_SIZE - pending }; +len = qio_channel_readv_full(f->ioc, , 1, pfds, pnfd, 0, + _error); if (len == QIO_CHANNEL_ERR_BLOCK) { if (qemu_in_coroutine()) { qio_channel_yield(f->ioc, G_IO_IN); @@ -348,9 +361,65 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f) qemu_file_set_error_obj(f, len, local_error); } +for (int i = 0; i < nfd; i++) { +FdEntry *fde = g_new0(FdEntry, 1); +fde->fd = fds[i]; +QTAILQ_INSERT_TAIL(>fds, fde, entry); +} + return len; } +int qemu_file_put_fd(QEMUFile *f, int fd) +{ +int ret = 0; +QIOChannel *ioc = qemu_file_get_ioc(f); +Error *err = NULL; +struct iovec iov = { (void *)" ", 1 }; + +/* + * Send a dummy byte so qemu_fill_buffer on the receiving side does not + * fail with a len=0 error. Flush first to maintain ordering wrt other + * data. + */ + +qemu_fflush(f); +if (qio_channel_writev_full(ioc, , 1, , 1, 0, ) < 1) { +error_report_err(error_copy(err)); +qemu_file_set_error_obj(f, -EIO, err); +ret = -1; +} +trace_qemu_file_put_fd(f->ioc->name, fd, ret); +return 0; +} + +int qemu_file_get_fd(QEMUFile *f) +{ +int fd = -1; +FdEntry *fde; + +if (!f->fd_pass) { +Error *err = NULL; +error_setg(, "%s does not support fd passing", f->ioc->name); +error_report_err(error_copy(err)); +qemu_file_set_error_obj(f, -EIO, err); +goto out; +} + +/* Force the dummy byte and its fd passenger to appear. */ +qemu_peek_byte(f, 0); + +fde = QTAILQ_FIRST(>fds); +if (fde) { +qemu_get_byte(f); /* Drop the dummy byte */ +fd = fde->fd; +QTAILQ_REMOVE(>fds, fde, entry); +} +out: +trace_qemu_file_get_fd(f->ioc->name, fd); +return fd; +} + /** Closes the file * * Returns negative error value if any error happened on previous operations or @@ -361,11 +430,17 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f) */ int qemu_fclose(QEMUFile *f) { +FdEntry *fde, *next; int ret = qemu_fflush(f); int ret2 = qio_channel_close(f->ioc, NULL); if (ret >= 0) { ret = ret2; } +QTAILQ_FOREACH_SAFE(fde, >fds, entry, next) { +warn_report("qemu_fclose: received fd %d was never claimed", fde->fd); +close(fde->fd); +g_free(fde); +} g_clear_pointer(>ioc, object_unref); error_free(f->last_error_obj); g_free(f); diff --git a/migration/qemu-file.h b/migration/qemu-file.h index 11c2120..3e47a20 100644 --- a/migration/qemu-file.h +++ b/migration/qemu-file.h @@ -79,5 +79,7 @@ size_t qemu_get_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen, off_t pos); QIOChannel *qemu_file_get_ioc(QEMUFile *file); +int qemu_file_put_fd(QEMUFile *f, int fd); +int qemu_file_get_fd(QEMUFile *f);
[RFC V1 5/6] migration: cpr-uri option
Define the cpr-uri QEMU command-line option to specify the URI from which CPR vmstate is loaded for cpr-transfer mode. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 1 + migration/cpr.c | 7 +++ qemu-options.hx | 8 system/vl.c | 3 +++ 4 files changed, 19 insertions(+) diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 9aae94c..bc62493 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -22,6 +22,7 @@ int cpr_find_fd(const char *name, int id); int cpr_walk_fd(cpr_walk_fd_cb cb); void cpr_resave_fd(const char *name, int id, int fd); +void cpr_set_cpr_uri(const char *uri); int cpr_state_save(Error **errp); int cpr_state_load(Error **errp); diff --git a/migration/cpr.c b/migration/cpr.c index f756c15..50c130c 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -157,6 +157,13 @@ static const VMStateDescription vmstate_cpr_state = { }; /*/ +static char *cpr_uri; + +void cpr_set_cpr_uri(const char *uri) +{ +cpr_uri = g_strdup(uri); +} + int cpr_state_save(Error **errp) { int ret; diff --git a/qemu-options.hx b/qemu-options.hx index 595b693..a6b5253 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -4776,6 +4776,14 @@ SRST ERST +DEF("cpr-uri", HAS_ARG, QEMU_OPTION_cpr_uri, \ +"-cpr-uri unix:socketpath\n", +QEMU_ARCH_ALL) +SRST +``-cpr-uri unix:socketpath`` +URI for incoming CPR state, for the cpr-transfer migration mode. +ERST + DEF("incoming", HAS_ARG, QEMU_OPTION_incoming, \ "-incoming tcp:[host]:port[,to=maxport][,ipv4=on|off][,ipv6=on|off]\n" \ "-incoming rdma:host:port[,ipv4=on|off][,ipv6=on|off]\n" \ diff --git a/system/vl.c b/system/vl.c index 6521ee3..32015ac 100644 --- a/system/vl.c +++ b/system/vl.c @@ -3483,6 +3483,9 @@ void qemu_init(int argc, char **argv) exit(1); } break; +case QEMU_OPTION_cpr_uri: +cpr_set_cpr_uri(optarg); +break; case QEMU_OPTION_incoming: if (!incoming) { runstate_set(RUN_STATE_INMIGRATE); -- 1.8.3.1
[RFC V1 0/6] Live update: cpr-transfer
What? This patch series adds the live migration cpr-transfer mode, which allows the user to transfer a guest to a new QEMU instance on the same host. It is identical to cpr-exec in most respects, except as described below. The new user-visible interfaces are: * cpr-transfer (MigMode migration parameter) * cpr-uri (migration parameter) * cpr-uri (command-line argument) In this mode, the user starts new QEMU on the same host as old QEMU, with the same arguments as old QEMU, plus the -incoming and the -cpr-uri options. The user issues the migrate command to old QEMU, which stops the VM, saves state to the migration channels, and enters the postmigrate state. Execution resumes in new QEMU. This mode requires a second migration channel, specified by the cpr-uri migration property on the outgoing side, and by the cpr-uri QEMU command-line option on the incoming side. The channel must be a type, such as unix socket, that supports SCM_RIGHTS. This series depends on the series "Live update: cpr-exec mode". Why? cpr-transfer offers the same benefits as cpr-exec mode, but with a model for launching new QEMU that may be more natural for some management packages. How? The file descriptors are kept open by sending them to new QEMU via the cpr-uri, which must support SCM_RIGHTS. Example: In this example, we simply restart the same version of QEMU, but in a real scenario one would use a new QEMU binary path in terminal 2. Terminal 1: start old QEMU # qemu-kvm -monitor stdio -object memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on -m 4G -machine anon-alloc=memfd ... Terminal 2: start new QEMU # qemu-kvm ... -incoming unix:vm.sock -cpr-uri unix:cpr.sock Terminal 1: QEMU 9.1.50 monitor - type 'help' for more information (qemu) info status VM status: running (qemu) migrate_set_parameter mode cpr-transfer (qemu) migrate_set_parameter cpr-uri unix:cpr.sock (qemu) migrate -d unix:vm.sock (qemu) info status VM status: paused (postmigrate) Terminal 2: QEMU 9.1.50 monitor - type 'help' for more information (qemu) info status VM status: running Steve Sistare (6): migration: SCM_RIGHTS for QEMUFile migration: VMSTATE_FD migration: cpr-transfer save and load migration: cpr-uri parameter migration: cpr-uri option migration: cpr-transfer mode include/migration/cpr.h| 4 ++ include/migration/vmstate.h| 9 + migration/cpr-transfer.c | 81 + migration/cpr.c| 16 +++- migration/meson.build | 1 + migration/migration-hmp-cmds.c | 10 + migration/migration.c | 37 +++ migration/options.c| 29 +++ migration/options.h| 1 + migration/qemu-file.c | 83 -- migration/qemu-file.h | 2 + migration/ram.c| 1 + migration/trace-events | 2 + migration/vmstate-types.c | 33 + qapi/migration.json| 38 ++- qemu-options.hx| 8 stubs/vmstate.c| 7 system/vl.c| 3 ++ 18 files changed, 359 insertions(+), 6 deletions(-) create mode 100644 migration/cpr-transfer.c -- 1.8.3.1
[RFC V1 4/6] migration: cpr-uri parameter
Define the cpr-uri migration parameter to specify the URI to which CPR vmstate is saved for cpr-transfer mode. Signed-off-by: Steve Sistare --- migration/migration-hmp-cmds.c | 10 ++ migration/options.c| 29 + migration/options.h| 1 + qapi/migration.json| 12 4 files changed, 52 insertions(+) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 16a4b00..4ede831 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -371,6 +371,11 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) params->direct_io ? "on" : "off"); } +assert(params->cpr_uri); +monitor_printf(mon, "%s: '%s'\n", +MigrationParameter_str(MIGRATION_PARAMETER_CPR_URI), +params->cpr_uri); + assert(params->has_cpr_exec_command); monitor_print_cpr_exec_command(mon, params->cpr_exec_command); } @@ -650,6 +655,11 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_direct_io = true; visit_type_bool(v, param, >direct_io, ); break; +case MIGRATION_PARAMETER_CPR_URI: +p->cpr_uri = g_new0(StrOrNull, 1); +p->cpr_uri->type = QTYPE_QSTRING; +visit_type_str(v, param, >cpr_uri->u.s, ); +break; case MIGRATION_PARAMETER_CPR_EXEC_COMMAND: { g_autofree char **strv = g_strsplit(valuestr ?: "", " ", -1); strList **tail = >cpr_exec_command; diff --git a/migration/options.c b/migration/options.c index b8d5f72..7526f9f 100644 --- a/migration/options.c +++ b/migration/options.c @@ -163,6 +163,8 @@ Property migration_properties[] = { DEFINE_PROP_ZERO_PAGE_DETECTION("zero-page-detection", MigrationState, parameters.zero_page_detection, ZERO_PAGE_DETECTION_MULTIFD), +DEFINE_PROP_STRING("cpr-uri", MigrationState, + parameters.cpr_uri), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -848,6 +850,13 @@ ZeroPageDetection migrate_zero_page_detection(void) return s->parameters.zero_page_detection; } +const char *migrate_cpr_uri(void) +{ +MigrationState *s = migrate_get_current(); + +return s->parameters.cpr_uri; +} + /* parameters helpers */ AnnounceParameters *migrate_announce_params(void) @@ -931,6 +940,7 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->zero_page_detection = s->parameters.zero_page_detection; params->has_direct_io = true; params->direct_io = s->parameters.direct_io; +params->cpr_uri = g_strdup(s->parameters.cpr_uri); params->has_cpr_exec_command = true; params->cpr_exec_command = QAPI_CLONE(strList, s->parameters.cpr_exec_command); @@ -967,6 +977,7 @@ void migrate_params_init(MigrationParameters *params) params->has_mode = true; params->has_zero_page_detection = true; params->has_direct_io = true; +params->cpr_uri = g_strdup(""); params->has_cpr_exec_command = true; } @@ -1257,9 +1268,15 @@ static void migrate_params_test_apply(MigrateSetParameters *params, dest->direct_io = params->direct_io; } +if (params->cpr_uri) { +assert(params->cpr_uri->type == QTYPE_QSTRING); +dest->cpr_uri = params->cpr_uri->u.s; +} + if (params->has_cpr_exec_command) { dest->cpr_exec_command = params->cpr_exec_command; } + } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1390,6 +1407,12 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) s->parameters.direct_io = params->direct_io; } +if (params->cpr_uri) { +g_free(s->parameters.cpr_uri); +assert(params->cpr_uri->type == QTYPE_QSTRING); +s->parameters.cpr_uri = g_strdup(params->cpr_uri->u.s); +} + if (params->has_cpr_exec_command) { qapi_free_strList(s->parameters.cpr_exec_command); s->parameters.cpr_exec_command = @@ -1421,6 +1444,12 @@ void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) params->tls_authz->u.s = strdup(""); } +if (params->cpr_uri && params->cpr_uri->type == QTYPE_QNULL) { +qobject_unref(params->cpr_uri->u.n); +params->cpr_uri->type = QTYPE_QSTRING; +params->cpr_uri->u.s = strdup(""); +} + migrate_params_test_apply(params, ); if (!migrate_params_check(, errp)) { diff --git a/migration/options.h b/migration/options.h index a239702..7492fcf 100644 --- a/migration/options.h +++ b/migration/options.h @@ -85,6 +85,7 @@ const char *migrate_tls_creds(void); const char *migrate_tls_hostname(void); uint64_t migrate_xbzrle_cache_size(void); ZeroPageDetection migrate_zero_page_detection(void);
[PATCH V2 05/11] physmem: preserve ram blocks for cpr
Save the memfd for anonymous ramblocks in CPR state, along with a name that uniquely identifies it. The block's idstr is not yet set, so it cannot be used for this purpose. Find the saved memfd in new QEMU when creating a block. QEMU hard-codes the length of some internally-created blocks, so to guard against that length changing, use lseek to get the actual length of an incoming memfd. Signed-off-by: Steve Sistare --- system/physmem.c | 25 - 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/system/physmem.c b/system/physmem.c index efe95ff..e37352e 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -73,6 +73,7 @@ #include "qapi/qapi-types-migration.h" #include "migration/options.h" +#include "migration/cpr.h" #include "migration/vmstate.h" #include "qemu/range.h" @@ -1641,6 +1642,19 @@ void qemu_ram_unset_idstr(RAMBlock *block) } } +static char *cpr_name(RAMBlock *block) +{ +MemoryRegion *mr = block->mr; +const char *mr_name = memory_region_name(mr); +g_autofree char *id = mr->dev ? qdev_get_dev_path(mr->dev) : NULL; + +if (id) { +return g_strdup_printf("%s/%s", id, mr_name); +} else { +return g_strdup(mr_name); +} +} + size_t qemu_ram_pagesize(RAMBlock *rb) { return rb->page_size; @@ -1836,13 +1850,17 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) } else if (new_block->flags & RAM_SHARED) { size_t max_length = new_block->max_length; MemoryRegion *mr = new_block->mr; -const char *name = memory_region_name(mr); +g_autofree char *name = cpr_name(new_block); new_block->mr->align = QEMU_VMALLOC_ALIGN; +new_block->fd = cpr_find_fd(name, 0); if (new_block->fd == -1) { new_block->fd = qemu_memfd_create(name, max_length + mr->align, 0, 0, 0, errp); +cpr_save_fd(name, 0, new_block->fd); +} else { +new_block->max_length = lseek(new_block->fd, 0, SEEK_END); } if (new_block->fd >= 0) { @@ -1852,6 +1870,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) false, 0, errp); } if (!new_block->host) { +cpr_delete_fd(name, 0); qemu_mutex_unlock_ramlist(); return; } @@ -2162,6 +2181,8 @@ static void reclaim_ramblock(RAMBlock *block) void qemu_ram_free(RAMBlock *block) { +g_autofree char *name = NULL; + if (!block) { return; } @@ -2172,6 +2193,8 @@ void qemu_ram_free(RAMBlock *block) } qemu_mutex_lock_ramlist(); +name = cpr_name(block); +cpr_delete_fd(name, 0); QLIST_REMOVE_RCU(block, next); ram_list.mru_block = NULL; /* Write list before version */ -- 1.8.3.1
[PATCH V2 10/11] migration: cpr-exec save and load
To preserve CPR state across exec, create a QEMUFile based on a memfd, and keep the memfd open across exec. Save the value of the memfd in an environment variable so post-exec QEMU can find it. These new functions are called in a subsequent patch. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 5 +++ migration/cpr-exec.c| 95 + migration/meson.build | 1 + 3 files changed, 101 insertions(+) create mode 100644 migration/cpr-exec.c diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 42b4019..76d6ccb 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -25,4 +25,9 @@ void cpr_resave_fd(const char *name, int id, int fd); int cpr_state_save(Error **errp); int cpr_state_load(Error **errp); +QEMUFile *cpr_exec_output(Error **errp); +QEMUFile *cpr_exec_input(Error **errp); +void cpr_exec_persist_state(QEMUFile *f); +bool cpr_exec_has_state(void); +void cpr_exec_unpersist_state(void); #endif diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c new file mode 100644 index 000..5c40457 --- /dev/null +++ b/migration/cpr-exec.c @@ -0,0 +1,95 @@ +/* + * Copyright (c) 2021-2024 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu/cutils.h" +#include "qemu/memfd.h" +#include "qapi/error.h" +#include "io/channel-file.h" +#include "io/channel-socket.h" +#include "migration/cpr.h" +#include "migration/qemu-file.h" +#include "migration/misc.h" +#include "migration/vmstate.h" +#include "sysemu/runstate.h" + +#define CPR_EXEC_STATE_NAME "QEMU_CPR_EXEC_STATE" + +static QEMUFile *qemu_file_new_fd_input(int fd, const char *name) +{ +g_autoptr(QIOChannelFile) fioc = qio_channel_file_new_fd(fd); +QIOChannel *ioc = QIO_CHANNEL(fioc); +qio_channel_set_name(ioc, name); +return qemu_file_new_input(ioc); +} + +static QEMUFile *qemu_file_new_fd_output(int fd, const char *name) +{ +g_autoptr(QIOChannelFile) fioc = qio_channel_file_new_fd(fd); +QIOChannel *ioc = QIO_CHANNEL(fioc); +qio_channel_set_name(ioc, name); +return qemu_file_new_output(ioc); +} + +void cpr_exec_persist_state(QEMUFile *f) +{ +QIOChannelFile *fioc = QIO_CHANNEL_FILE(qemu_file_get_ioc(f)); +int mfd = dup(fioc->fd); +char val[16]; + +/* Remember mfd in environment for post-exec load */ +qemu_clear_cloexec(mfd); +snprintf(val, sizeof(val), "%d", mfd); +g_setenv(CPR_EXEC_STATE_NAME, val, 1); +} + +static int cpr_exec_find_state(void) +{ +const char *val = g_getenv(CPR_EXEC_STATE_NAME); +int mfd; + +assert(val); +g_unsetenv(CPR_EXEC_STATE_NAME); +assert(!qemu_strtoi(val, NULL, 10, )); +return mfd; +} + +bool cpr_exec_has_state(void) +{ +return g_getenv(CPR_EXEC_STATE_NAME) != NULL; +} + +void cpr_exec_unpersist_state(void) +{ +int mfd; +const char *val = g_getenv(CPR_EXEC_STATE_NAME); + +g_unsetenv(CPR_EXEC_STATE_NAME); +assert(val); +assert(!qemu_strtoi(val, NULL, 10, )); +close(mfd); +} + +QEMUFile *cpr_exec_output(Error **errp) +{ +int mfd = memfd_create(CPR_EXEC_STATE_NAME, 0); + +if (mfd < 0) { +error_setg_errno(errp, errno, "memfd_create failed"); +return NULL; +} + +return qemu_file_new_fd_output(mfd, CPR_EXEC_STATE_NAME); +} + +QEMUFile *cpr_exec_input(Error **errp) +{ +int mfd = cpr_exec_find_state(); + +lseek(mfd, 0, SEEK_SET); +return qemu_file_new_fd_input(mfd, CPR_EXEC_STATE_NAME); +} diff --git a/migration/meson.build b/migration/meson.build index 87feb4c..dd1d315 100644 --- a/migration/meson.build +++ b/migration/meson.build @@ -14,6 +14,7 @@ system_ss.add(files( 'channel.c', 'channel-block.c', 'cpr.c', + 'cpr-exec.c', 'dirtyrate.c', 'exec.c', 'fd.c', -- 1.8.3.1
[PATCH V2 11/11] migration: cpr-exec mode
Add the cpr-exec migration mode. Usage: qemu-system-$arch -machine anon-alloc=memfd ... migrate_set_parameter mode cpr-exec migrate_set_parameter cpr-exec-command \ ... -incoming \ migrate -d The migrate command stops the VM, saves state to uri-1, directly exec's a new version of QEMU on the same host, replacing the original process while retaining its PID, and loads state from uri-1. Guest RAM is preserved in place, albeit with new virtual addresses. The new QEMU process is started by exec'ing the command specified by the @cpr-exec-command parameter. The first word of the command is the binary, and the remaining words are its arguments. The command may be a direct invocation of new QEMU, or may be a non-QEMU command that exec's the new QEMU binary. This mode creates a second migration channel that is not visible to the user. At the start of migration, old QEMU saves CPR state to the second channel, and at the end of migration, it tells the main loop to call cpr_exec. New QEMU loads CPR state early, before objects are created. Because old QEMU terminates when new QEMU starts, one cannot stream data between the two, so uri-1 must be a type, such as a file, that accepts all data before old QEMU exits. Otherwise, old QEMU may quietly block writing to the channel. Memory-backend objects must have the share=on attribute, but memory-backend-epc is not supported. The VM must be started with the '-machine anon-alloc=memfd' option, which allows anonymous memory to be transferred in place to the new process. The memfds are kept open across exec by clearing the close-on-exec flag, their values are saved in CPR state, and they are mmap'd in new qemu. Note that the anon-alloc option is not related to memory-backend-memfd. Later patches add support for memory-backend-memfd. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 2 ++ migration/cpr-exec.c| 85 + migration/cpr.c | 37 ++--- migration/migration.c | 14 ++-- migration/ram.c | 2 ++ qapi/migration.json | 24 +- 6 files changed, 157 insertions(+), 7 deletions(-) diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 76d6ccb..c6c60f8 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -30,4 +30,6 @@ QEMUFile *cpr_exec_input(Error **errp); void cpr_exec_persist_state(QEMUFile *f); bool cpr_exec_has_state(void); void cpr_exec_unpersist_state(void); +void cpr_mig_init(void); +void cpr_unpreserve_fds(void); #endif diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c index 5c40457..fd65435 100644 --- a/migration/cpr-exec.c +++ b/migration/cpr-exec.c @@ -11,8 +11,11 @@ #include "qapi/error.h" #include "io/channel-file.h" #include "io/channel-socket.h" +#include "block/block-global-state.h" +#include "qemu/main-loop.h" #include "migration/cpr.h" #include "migration/qemu-file.h" +#include "migration/migration.h" #include "migration/misc.h" #include "migration/vmstate.h" #include "sysemu/runstate.h" @@ -93,3 +96,85 @@ QEMUFile *cpr_exec_input(Error **errp) lseek(mfd, 0, SEEK_SET); return qemu_file_new_fd_input(mfd, CPR_EXEC_STATE_NAME); } + +static int preserve_fd(int fd) +{ +qemu_clear_cloexec(fd); +return 0; +} + +static int unpreserve_fd(int fd) +{ +qemu_set_cloexec(fd); +return 0; +} + +static void cpr_preserve_fds(void) +{ +cpr_walk_fd(preserve_fd); +} + +void cpr_unpreserve_fds(void) +{ +cpr_walk_fd(unpreserve_fd); +} + +static void cpr_exec(char **argv) +{ +MigrationState *s = migrate_get_current(); +Error *err = NULL; + +/* + * Clear the close-on-exec flag for all preserved fd's. We cannot do so + * earlier because they should not persist across miscellaneous fork and + * exec calls that are performed during normal operation. + */ +cpr_preserve_fds(); + +execvp(argv[0], argv); + +cpr_unpreserve_fds(); + +error_setg_errno(, errno, "execvp %s failed", argv[0]); +error_report_err(error_copy(err)); +migrate_set_state(>state, s->state, MIGRATION_STATUS_FAILED); +migrate_set_error(s, err); + +migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL); + +if (s->block_inactive) { +Error *local_err = NULL; +bdrv_activate_all(_err); +if (local_err) { +error_report_err(local_err); +return; +} else { +s->block_inactive = false; +} +} + +if (runstate_is_live(s->vm_old_state)) { +vm_start(); +} +} + +static int cpr_exec_notifier(NotifierWithReturn *notifier, MigrationEvent *e, + Error **errp) +{ +MigrationState *s = migrate_get_current(); + +if (e->type == MIG_EVENT_PRECOPY_DONE) { +assert(s->state == MIGRATION_STATUS_COMPLETED); +qemu_system_exec_request(cpr_exec, s->parameters.cpr_exec_command); +} else if (e->type ==
[PATCH V2 03/11] migration: save cpr mode
Save the mode in CPR state, so the user does not need to explicitly specify it for the target. Modify migrate_mode() so it returns the incoming mode on the target. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 7 +++ migration/cpr.c | 23 ++- migration/migration.c | 1 + migration/options.c | 9 +++-- 4 files changed, 37 insertions(+), 3 deletions(-) diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 8e7e705..42b4019 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -8,6 +8,13 @@ #ifndef MIGRATION_CPR_H #define MIGRATION_CPR_H +#include "qapi/qapi-types-migration.h" + +#define MIG_MODE_NONE MIG_MODE__MAX + +MigMode cpr_get_incoming_mode(void); +void cpr_set_incoming_mode(MigMode mode); + typedef int (*cpr_walk_fd_cb)(int fd); void cpr_save_fd(const char *name, int id, int fd); void cpr_delete_fd(const char *name, int id); diff --git a/migration/cpr.c b/migration/cpr.c index 313e74e..1c296c6 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -21,10 +21,23 @@ typedef QLIST_HEAD(CprFdList, CprFd) CprFdList; typedef struct CprState { +MigMode mode; CprFdList fds; } CprState; -static CprState cpr_state; +static CprState cpr_state = { +.mode = MIG_MODE_NONE, +}; + +MigMode cpr_get_incoming_mode(void) +{ +return cpr_state.mode; +} + +void cpr_set_incoming_mode(MigMode mode) +{ +cpr_state.mode = mode; +} // @@ -124,11 +137,19 @@ void cpr_resave_fd(const char *name, int id, int fd) /*/ #define CPR_STATE "CprState" +static int cpr_state_presave(void *opaque) +{ +cpr_state.mode = migrate_mode(); +return 0; +} + static const VMStateDescription vmstate_cpr_state = { .name = CPR_STATE, .version_id = 1, .minimum_version_id = 1, +.pre_save = cpr_state_presave, .fields = (VMStateField[]) { +VMSTATE_UINT32(mode, CprState), VMSTATE_QLIST_V(fds, CprState, 1, vmstate_cpr_fd, CprFd, next), VMSTATE_END_OF_LIST() } diff --git a/migration/migration.c b/migration/migration.c index e394ad7..0f47765 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -411,6 +411,7 @@ void migration_incoming_state_destroy(void) mis->postcopy_qemufile_dst = NULL; } +cpr_set_incoming_mode(MIG_MODE_NONE); yank_unregister_instance(MIGRATION_YANK_INSTANCE); } diff --git a/migration/options.c b/migration/options.c index 645f550..305397a 100644 --- a/migration/options.c +++ b/migration/options.c @@ -22,6 +22,7 @@ #include "qapi/qmp/qnull.h" #include "sysemu/runstate.h" #include "migration/colo.h" +#include "migration/cpr.h" #include "migration/misc.h" #include "migration.h" #include "migration-stats.h" @@ -758,8 +759,12 @@ uint64_t migrate_max_postcopy_bandwidth(void) MigMode migrate_mode(void) { -MigrationState *s = migrate_get_current(); -MigMode mode = s->parameters.mode; +MigMode mode = cpr_get_incoming_mode(); + +if (mode == MIG_MODE_NONE) { +MigrationState *s = migrate_get_current(); +mode = s->parameters.mode; +} assert(mode >= 0 && mode < MIG_MODE__MAX); return mode; -- 1.8.3.1
[PATCH V2 04/11] migration: stop vm earlier for cpr
Stop the vm earlier for cpr, to guarantee consistent device state when CPR state is saved. Signed-off-by: Steve Sistare --- migration/migration.c | 22 +- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 0f47765..8a8e927 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2077,6 +2077,7 @@ void qmp_migrate(const char *uri, bool has_channels, MigrationState *s = migrate_get_current(); g_autoptr(MigrationChannel) channel = NULL; MigrationAddress *addr = NULL; +bool stopped = false; /* * Having preliminary checks for uri and channel @@ -2120,6 +2121,15 @@ void qmp_migrate(const char *uri, bool has_channels, } } +if (migrate_mode_is_cpr(s)) { +int ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE); +if (ret < 0) { +error_setg(_err, "migration_stop_vm failed, error %d", -ret); +goto out; +} +stopped = true; +} + if (cpr_state_save(_err)) { goto out; } @@ -2155,6 +2165,9 @@ out: } migrate_fd_error(s, local_err); error_propagate(errp, local_err); +if (stopped && runstate_is_live(s->vm_old_state)) { +vm_start(); +} return; } } @@ -3738,7 +3751,6 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) Error *local_err = NULL; uint64_t rate_limit; bool resume = (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP); -int ret; /* * If there's a previous error, free it and prepare for another one. @@ -3810,14 +3822,6 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) return; } -if (migrate_mode_is_cpr(s)) { -ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE); -if (ret < 0) { -error_setg(_err, "migration_stop_vm failed, error %d", -ret); -goto fail; -} -} - if (migrate_background_snapshot()) { qemu_thread_create(>thread, "mig/snapshot", bg_migration_thread, s, QEMU_THREAD_JOINABLE); -- 1.8.3.1
[PATCH V2 08/11] vl: helper to request exec
Add a qemu_system_exec_request() hook that causes the main loop to exit and exec a command using the specified arguments. This will be used during CPR to exec a new version of QEMU. Signed-off-by: Steve Sistare --- include/sysemu/runstate.h | 3 +++ system/runstate.c | 29 + 2 files changed, 32 insertions(+) diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h index 0117d24..cb669cf 100644 --- a/include/sysemu/runstate.h +++ b/include/sysemu/runstate.h @@ -80,6 +80,8 @@ typedef enum WakeupReason { QEMU_WAKEUP_REASON_OTHER, } WakeupReason; +typedef void (*qemu_exec_func)(char **exec_argv); + void qemu_system_reset_request(ShutdownCause reason); void qemu_system_suspend_request(void); void qemu_register_suspend_notifier(Notifier *notifier); @@ -91,6 +93,7 @@ void qemu_register_wakeup_support(void); void qemu_system_shutdown_request_with_code(ShutdownCause reason, int exit_code); void qemu_system_shutdown_request(ShutdownCause reason); +void qemu_system_exec_request(qemu_exec_func func, const strList *args); void qemu_system_powerdown_request(void); void qemu_register_powerdown_notifier(Notifier *notifier); void qemu_register_shutdown_notifier(Notifier *notifier); diff --git a/system/runstate.c b/system/runstate.c index ec32e27..afc56e4 100644 --- a/system/runstate.c +++ b/system/runstate.c @@ -40,6 +40,7 @@ #include "qapi/error.h" #include "qapi/qapi-commands-run-state.h" #include "qapi/qapi-events-run-state.h" +#include "qapi/type-helpers.h" #include "qemu/accel.h" #include "qemu/error-report.h" #include "qemu/job.h" @@ -400,6 +401,8 @@ static NotifierList wakeup_notifiers = static NotifierList shutdown_notifiers = NOTIFIER_LIST_INITIALIZER(shutdown_notifiers); static uint32_t wakeup_reason_mask = ~(1 << QEMU_WAKEUP_REASON_NONE); +qemu_exec_func exec_func; +static char **exec_argv; ShutdownCause qemu_shutdown_requested_get(void) { @@ -416,6 +419,11 @@ static int qemu_shutdown_requested(void) return qatomic_xchg(_requested, SHUTDOWN_CAUSE_NONE); } +static int qemu_exec_requested(void) +{ +return exec_argv != NULL; +} + static void qemu_kill_report(void) { if (!qtest_driver() && shutdown_signal) { @@ -693,6 +701,23 @@ void qemu_system_shutdown_request(ShutdownCause reason) qemu_notify_event(); } +static void qemu_system_exec(void) +{ +exec_func(exec_argv); + +/* exec failed */ +g_strfreev(exec_argv); +exec_argv = NULL; +exec_func = NULL; +} + +void qemu_system_exec_request(qemu_exec_func func, const strList *args) +{ +exec_func = func; +exec_argv = strv_from_str_list(args); +qemu_notify_event(); +} + static void qemu_system_powerdown(void) { qapi_event_send_powerdown(); @@ -739,6 +764,10 @@ static bool main_loop_should_exit(int *status) if (qemu_suspend_requested()) { qemu_system_suspend(); } +if (qemu_exec_requested()) { +qemu_system_exec(); +return false; +} request = qemu_shutdown_requested(); if (request) { qemu_kill_report(); -- 1.8.3.1
[PATCH V2 01/11] machine: alloc-anon option
Allocate anonymous memory using mmap MAP_ANON or memfd_create depending on the value of the anon-alloc machine property. This affects memory-backend-ram objects, guest RAM created with the global -m option but without an associated memory-backend object and without the -mem-path option, and various memory regions such as ROMs that are allocated when devices are created. This option does not affect memory-backend-file, memory-backend-memfd, or memory-backend-epc objects. The memfd option is intended to support new migration modes, in which the memory region can be transferred in place to a new QEMU process, by sending the memfd file descriptor to the process. Memory contents are preserved, and if the mode also transfers device descriptors, then pages that are locked in memory for DMA remain locked. This behavior is a pre-requisite for supporting vfio, vdpa, and iommufd devices with the new modes. To access the same memory in the old and new QEMU processes, the memory must be mapped shared. Therefore, the implementation always sets RAM_SHARED if alloc-anon=memfd, except for memory-backend-ram, where the user must explicitly specify the share option. In lieu of defining a new RAM flag, at the lowest level the implementation uses RAM_SHARED with fd=-1 as the condition for calling memfd_create. Signed-off-by: Steve Sistare --- hw/core/machine.c | 24 include/hw/boards.h | 1 + qapi/machine.json | 14 ++ qemu-options.hx | 13 + system/memory.c | 12 +--- system/physmem.c| 38 +- system/trace-events | 3 +++ 7 files changed, 101 insertions(+), 4 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 655d75c..7ca2ad0 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -454,6 +454,20 @@ static void machine_set_mem_merge(Object *obj, bool value, Error **errp) ms->mem_merge = value; } +static int machine_get_anon_alloc(Object *obj, Error **errp) +{ +MachineState *ms = MACHINE(obj); + +return ms->anon_alloc; +} + +static void machine_set_anon_alloc(Object *obj, int value, Error **errp) +{ +MachineState *ms = MACHINE(obj); + +ms->anon_alloc = value; +} + static bool machine_get_usb(Object *obj, Error **errp) { MachineState *ms = MACHINE(obj); @@ -1066,6 +1080,11 @@ static void machine_class_init(ObjectClass *oc, void *data) object_class_property_set_description(oc, "mem-merge", "Enable/disable memory merge support"); +object_class_property_add_enum(oc, "anon-alloc", "AnonAllocOption", + _lookup, + machine_get_anon_alloc, + machine_set_anon_alloc); + object_class_property_add_bool(oc, "usb", machine_get_usb, machine_set_usb); object_class_property_set_description(oc, "usb", @@ -1416,6 +1435,11 @@ static bool create_default_memdev(MachineState *ms, const char *path, Error **er if (!object_property_set_int(obj, "size", ms->ram_size, errp)) { goto out; } +if (!object_property_set_bool(obj, "share", + ms->anon_alloc == ANON_ALLOC_OPTION_MEMFD, + errp)) { +goto out; +} object_property_add_child(object_get_objects_root(), mc->default_ram_id, obj); /* Ensure backend's memory region name is equal to mc->default_ram_id */ diff --git a/include/hw/boards.h b/include/hw/boards.h index 73ad319..77f16ad 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -383,6 +383,7 @@ struct MachineState { bool enable_graphics; ConfidentialGuestSupport *cgs; HostMemoryBackend *memdev; +AnonAllocOption anon_alloc; /* * convenience alias to ram_memdev_id backend memory region * or to numa container memory region diff --git a/qapi/machine.json b/qapi/machine.json index 2fd3e9c..9173953 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -1881,3 +1881,17 @@ { 'command': 'x-query-interrupt-controllers', 'returns': 'HumanReadableText', 'features': [ 'unstable' ]} + +## +# @AnonAllocOption: +# +# An enumeration of the options for allocating anonymous guest memory. +# +# @mmap: allocate using mmap MAP_ANON +# +# @memfd: allocate using memfd_create +# +# Since: 9.1 +## +{ 'enum': 'AnonAllocOption', + 'data': [ 'mmap', 'memfd' ] } diff --git a/qemu-options.hx b/qemu-options.hx index 8ca7f34..595b693 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -38,6 +38,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \ "nvdimm=on|off controls NVDIMM support (default=off)\n" "memory-encryption=@var{} memory encryption object to use (default=none)\n" "hmat=on|off controls ACPI HMAT support (default=off)\n" +"anon-alloc=mmap|memfd allocate anonymous guest RAM using mmap
[PATCH V2 07/11] oslib: qemu_clear_cloexec
Define qemu_clear_cloexec, analogous to qemu_set_cloexec. Signed-off-by: Steve Sistare Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Marc-André Lureau Reviewed-by: Fabiano Rosas --- include/qemu/osdep.h | 9 + util/oslib-posix.c | 9 + util/oslib-win32.c | 4 3 files changed, 22 insertions(+) diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h index 191916f..6ebf192 100644 --- a/include/qemu/osdep.h +++ b/include/qemu/osdep.h @@ -662,6 +662,15 @@ ssize_t qemu_write_full(int fd, const void *buf, size_t count) void qemu_set_cloexec(int fd); +/* + * Clear FD_CLOEXEC for a descriptor. + * + * The caller must guarantee that no other fork+exec's occur before the + * exec that is intended to inherit this descriptor, eg by suspending CPUs + * and blocking monitor commands. + */ +void qemu_clear_cloexec(int fd); + /* Return a dynamically allocated directory path that is appropriate for storing * local state. * diff --git a/util/oslib-posix.c b/util/oslib-posix.c index e764416..614c3e5 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -272,6 +272,15 @@ int qemu_socketpair(int domain, int type, int protocol, int sv[2]) return ret; } +void qemu_clear_cloexec(int fd) +{ +int f; +f = fcntl(fd, F_GETFD); +assert(f != -1); +f = fcntl(fd, F_SETFD, f & ~FD_CLOEXEC); +assert(f != -1); +} + char * qemu_get_local_state_dir(void) { diff --git a/util/oslib-win32.c b/util/oslib-win32.c index b623830..c3e969a 100644 --- a/util/oslib-win32.c +++ b/util/oslib-win32.c @@ -222,6 +222,10 @@ void qemu_set_cloexec(int fd) { } +void qemu_clear_cloexec(int fd) +{ +} + int qemu_get_thread_id(void) { return GetCurrentThreadId(); -- 1.8.3.1
[PATCH V2 09/11] migration: cpr-exec-command parameter
Create the cpr-exec-command migration parameter, defined as a list of strings. It will be used for cpr-exec migration mode in a subsequent patch, and contains forward references to cpr-exec mode in the qapi doc. No functional change, except that cpr-exec-command is shown by the 'info migrate' command. Signed-off-by: Steve Sistare --- hmp-commands.hx| 2 +- migration/migration-hmp-cmds.c | 25 + migration/options.c| 14 ++ qapi/migration.json| 21 ++--- 4 files changed, 58 insertions(+), 4 deletions(-) diff --git a/hmp-commands.hx b/hmp-commands.hx index 06746f0..0e04eac 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1009,7 +1009,7 @@ ERST { .name = "migrate_set_parameter", -.args_type = "parameter:s,value:s", +.args_type = "parameter:s,value:S", .params = "parameter value", .help = "Set the parameter for migration", .cmd= hmp_migrate_set_parameter, diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 7d608d2..16a4b00 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -229,6 +229,18 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict) qapi_free_MigrationCapabilityStatusList(caps); } +static void monitor_print_cpr_exec_command(Monitor *mon, strList *args) +{ +monitor_printf(mon, "%s:", +MigrationParameter_str(MIGRATION_PARAMETER_CPR_EXEC_COMMAND)); + +while (args) { +monitor_printf(mon, " %s", args->value); +args = args->next; +} +monitor_printf(mon, "\n"); +} + void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) { MigrationParameters *params; @@ -358,6 +370,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) MIGRATION_PARAMETER_DIRECT_IO), params->direct_io ? "on" : "off"); } + +assert(params->has_cpr_exec_command); +monitor_print_cpr_exec_command(mon, params->cpr_exec_command); } qapi_free_MigrationParameters(params); @@ -635,6 +650,16 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_direct_io = true; visit_type_bool(v, param, >direct_io, ); break; +case MIGRATION_PARAMETER_CPR_EXEC_COMMAND: { +g_autofree char **strv = g_strsplit(valuestr ?: "", " ", -1); +strList **tail = >cpr_exec_command; + +for (int i = 0; strv[i]; i++) { +QAPI_LIST_APPEND(tail, strv[i]); +} +p->has_cpr_exec_command = true; +break; +} default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 305397a..b8d5f72 100644 --- a/migration/options.c +++ b/migration/options.c @@ -931,6 +931,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->zero_page_detection = s->parameters.zero_page_detection; params->has_direct_io = true; params->direct_io = s->parameters.direct_io; +params->has_cpr_exec_command = true; +params->cpr_exec_command = QAPI_CLONE(strList, + s->parameters.cpr_exec_command); return params; } @@ -964,6 +967,7 @@ void migrate_params_init(MigrationParameters *params) params->has_mode = true; params->has_zero_page_detection = true; params->has_direct_io = true; +params->has_cpr_exec_command = true; } /* @@ -1252,6 +1256,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_direct_io) { dest->direct_io = params->direct_io; } + +if (params->has_cpr_exec_command) { +dest->cpr_exec_command = params->cpr_exec_command; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1381,6 +1389,12 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) if (params->has_direct_io) { s->parameters.direct_io = params->direct_io; } + +if (params->has_cpr_exec_command) { +qapi_free_strList(s->parameters.cpr_exec_command); +s->parameters.cpr_exec_command = +QAPI_CLONE(strList, params->cpr_exec_command); +} } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/qapi/migration.json b/qapi/migration.json index 0f24206..20092d2 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -829,6 +829,10 @@ # only has effect if the @mapped-ram capability is enabled. # (Since 9.1) # +# @cpr-exec-command: Command to start the new QEMU process when @mode +# is @cpr-exec. The first list element is the program's filename, +# the remainder its arguments. (Since 9.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @@ -854,7 +858,8 @@ 'vcpu-dirty-limit',
[PATCH V2 00/11] Live update: cpr-exec
What? This patch series adds the live migration cpr-exec mode, which allows the user to update QEMU with minimal guest pause time, by preserving guest RAM in place, albeit with new virtual addresses in new QEMU, and by preserving device file descriptors. The new user-visible interfaces are: * cpr-exec (MigMode migration parameter) * cpr-exec-command (migration parameter) * anon-alloc (command-line option for -machine) The user sets the mode parameter before invoking the migrate command. In this mode, the user issues the migrate command to old QEMU, which stops the VM and saves state to the migration channels. Old QEMU then exec's new QEMU, replacing the original process while retaining its PID. The user specifies the command to exec new QEMU in the migration parameter cpr-exec-command. The command must pass all old QEMU arguments to new QEMU, plus the -incoming option. Execution resumes in new QEMU. Memory-backend objects must have the share=on attribute, but memory-backend-epc is not supported. The VM must be started with the '-machine anon-alloc=memfd' option, which allows anonymous memory to be transferred in place to the new process. Why? This mode has less impact on the guest than any other method of updating in place. The pause time is much lower, because devices need not be torn down and recreated, DMA does not need to be drained and quiesced, and minimal state is copied to new QEMU. Further, there are no constraints on the guest. By contrast, cpr-reboot mode requires the guest to support S3 suspend-to-ram, and suspending plus resuming vfio devices adds multiple seconds to the guest pause time. Lastly, there is no loss of connectivity to the guest, because chardev descriptors remain open and connected. These benefits all derive from the core design principle of this mode, which is preserving open descriptors. This approach is very general and can be used to support a wide variety of devices that do not have hardware support for live migration, including but not limited to: vfio, chardev, vhost, vdpa, and iommufd. Some devices need new kernel software interfaces to allow a descriptor to be used in a process that did not originally open it. In a containerized QEMU environment, cpr-exec reuses an existing QEMU container and its assigned resources. By contrast, consider a design in which a new container is created on the same host as the target of the CPR operation. Resources must be reserved for the new container, while the old container still reserves resources until the operation completes. Avoiding over commitment requires extra work in the management layer. This is one reason why a cloud provider may prefer cpr-exec. A second reason is that the container may include agents with their own connections to the outside world, and such connections remain intact if the container is reused. How? All memory that is mapped by the guest is preserved in place. Indeed, it must be, because it may be the target of DMA requests, which are not quiesced during cpr-exec. All such memory must be mmap'able in new QEMU. This is easy for named memory-backend objects, as long as they are mapped shared, because they are visible in the file system in both old and new QEMU. Anonymous memory must be allocated using memfd_create rather than MAP_ANON, so the memfd's can be sent to new QEMU. Pages that were locked in memory for DMA in old QEMU remain locked in new QEMU, because the descriptor of the device that locked them remains open. cpr-exec preserves descriptors across exec by clearing the CLOEXEC flag, and by sending the unique name and value of each descriptor to new QEMU via CPR state. For device descriptors, new QEMU reuses the descriptor when creating the device, rather than opening it again. The same holds for chardevs. For memfd descriptors, new QEMU mmap's the preserved memfd when a ramblock is created. CPR state cannot be sent over the normal migration channel, because devices and backends are created prior to reading the channel, so this mode sends CPR state over a second migration channel that is not visible to the user. New QEMU reads the second channel prior to creating devices or backends. The exec itself is trivial. After writing to the migration channels, the migration code calls a new main-loop hook to perform the exec. Example: In this example, we simply restart the same version of QEMU, but in a real scenario one would use a new QEMU binary path in cpr-exec-command. # qemu-kvm -monitor stdio -object memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on -m 4G -machine anon-alloc=memfd ... QEMU 9.1.50 monitor - type 'help' for more information (qemu) info status VM status: running (qemu) migrate_set_parameter mode cpr-exec (qemu) migrate_set_parameter cpr-exec-command qemu-kvm ... -incoming file:vm.state (qemu) migrate -d file:vm.state (qemu) QEMU 9.1.50 monitor - type 'help' for more information (qemu) info status VM status:
[PATCH V2 06/11] migration: fix mismatched GPAs during cpr
For new cpr modes, ramblock_is_ignored will always be true, because the memory is preserved in place rather than copied. However, for an ignored block, parse_ramblock currently requires that the received address of the block must match the address of the statically initialized region on the target. This fails for a PCI rom block, because the memory region address is set when the guest writes to a BAR on the source, which does not occur on the target, causing a "Mismatched GPAs" error during cpr migration. To fix, unconditionally set the target's address to the source's address if the target region does not have an address yet. Signed-off-by: Steve Sistare Reviewed-by: Fabiano Rosas --- include/exec/memory.h | 12 migration/ram.c | 15 +-- system/memory.c | 10 -- 3 files changed, 29 insertions(+), 8 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index c26ede3..227169e 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -811,6 +811,7 @@ struct MemoryRegion { bool ram_device; bool enabled; bool warning_printed; /* For reservations */ +bool has_addr; uint8_t vga_logging_count; MemoryRegion *alias; hwaddr alias_offset; @@ -2408,6 +2409,17 @@ void memory_region_set_enabled(MemoryRegion *mr, bool enabled); void memory_region_set_address(MemoryRegion *mr, hwaddr addr); /* + * memory_region_set_address_only: set the address of a region. + * + * Same as memory_region_set_address, but without causing transaction side + * effects. + * + * @mr: the region to be updated + * @addr: new address, relative to container region + */ +void memory_region_set_address_only(MemoryRegion *mr, hwaddr addr); + +/* * memory_region_set_size: dynamically update the size of a region. * * Dynamically updates the size of a region. diff --git a/migration/ram.c b/migration/ram.c index edec1a2..eaf3151 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -4059,12 +4059,15 @@ static int parse_ramblock(QEMUFile *f, RAMBlock *block, ram_addr_t length) } if (migrate_ignore_shared()) { hwaddr addr = qemu_get_be64(f); -if (migrate_ram_is_ignored(block) && -block->mr->addr != addr) { -error_report("Mismatched GPAs for block %s " - "%" PRId64 "!= %" PRId64, block->idstr, - (uint64_t)addr, (uint64_t)block->mr->addr); -return -EINVAL; +if (migrate_ram_is_ignored(block)) { +if (!block->mr->has_addr) { +memory_region_set_address_only(block->mr, addr); +} else if (block->mr->addr != addr) { +error_report("Mismatched GPAs for block %s " + "%" PRId64 "!= %" PRId64, block->idstr, + (uint64_t)addr, (uint64_t)block->mr->addr); +return -EINVAL; +} } } ret = rdma_block_notification_handle(f, block->idstr); diff --git a/system/memory.c b/system/memory.c index 28a837d..b7548bf 100644 --- a/system/memory.c +++ b/system/memory.c @@ -2655,7 +2655,7 @@ static void memory_region_add_subregion_common(MemoryRegion *mr, for (alias = subregion->alias; alias; alias = alias->alias) { alias->mapped_via_alias++; } -subregion->addr = offset; +memory_region_set_address_only(subregion, offset); memory_region_update_container_subregions(subregion); } @@ -2735,10 +2735,16 @@ static void memory_region_readd_subregion(MemoryRegion *mr) } } +void memory_region_set_address_only(MemoryRegion *mr, hwaddr addr) +{ +mr->addr = addr; +mr->has_addr = true; +} + void memory_region_set_address(MemoryRegion *mr, hwaddr addr) { if (addr != mr->addr) { -mr->addr = addr; +memory_region_set_address_only(mr, addr); memory_region_readd_subregion(mr); } } -- 1.8.3.1
[PATCH V2 02/11] migration: cpr-state
CPR must save state that is needed after QEMU is restarted, when devices are realized. Thus the extra state cannot be saved in the migration stream, as objects must already exist before that stream can be loaded. Instead, define auxilliary state structures and vmstate descriptions, not associated with any registered object, and serialize the aux state to a cpr-specific stream in cpr_state_save. Deserialize in cpr_state_load after QEMU restarts, before devices are realized. Provide accessors for clients to register file descriptors for saving. The mechanism for passing the fd's to the new process will be specific to each migration mode, and added in subsequent patches. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 21 ++ migration/cpr.c | 188 migration/meson.build | 1 + migration/migration.c | 6 ++ migration/trace-events | 5 ++ system/vl.c | 3 + 6 files changed, 224 insertions(+) create mode 100644 include/migration/cpr.h create mode 100644 migration/cpr.c diff --git a/include/migration/cpr.h b/include/migration/cpr.h new file mode 100644 index 000..8e7e705 --- /dev/null +++ b/include/migration/cpr.h @@ -0,0 +1,21 @@ +/* + * Copyright (c) 2021, 2024 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef MIGRATION_CPR_H +#define MIGRATION_CPR_H + +typedef int (*cpr_walk_fd_cb)(int fd); +void cpr_save_fd(const char *name, int id, int fd); +void cpr_delete_fd(const char *name, int id); +int cpr_find_fd(const char *name, int id); +int cpr_walk_fd(cpr_walk_fd_cb cb); +void cpr_resave_fd(const char *name, int id, int fd); + +int cpr_state_save(Error **errp); +int cpr_state_load(Error **errp); + +#endif diff --git a/migration/cpr.c b/migration/cpr.c new file mode 100644 index 000..313e74e --- /dev/null +++ b/migration/cpr.c @@ -0,0 +1,188 @@ +/* + * Copyright (c) 2021-2024 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "migration/cpr.h" +#include "migration/misc.h" +#include "migration/qemu-file.h" +#include "migration/savevm.h" +#include "migration/vmstate.h" +#include "sysemu/runstate.h" +#include "trace.h" + +/*/ +/* cpr state container for all information to be saved. */ + +typedef QLIST_HEAD(CprFdList, CprFd) CprFdList; + +typedef struct CprState { +CprFdList fds; +} CprState; + +static CprState cpr_state; + +// + +typedef struct CprFd { +char *name; +unsigned int namelen; +int id; +int fd; +QLIST_ENTRY(CprFd) next; +} CprFd; + +static const VMStateDescription vmstate_cpr_fd = { +.name = "cpr fd", +.version_id = 1, +.minimum_version_id = 1, +.fields = (VMStateField[]) { +VMSTATE_UINT32(namelen, CprFd), +VMSTATE_VBUFFER_ALLOC_UINT32(name, CprFd, 0, NULL, namelen), +VMSTATE_INT32(id, CprFd), +VMSTATE_INT32(fd, CprFd), +VMSTATE_END_OF_LIST() +} +}; + +void cpr_save_fd(const char *name, int id, int fd) +{ +CprFd *elem = g_new0(CprFd, 1); + +trace_cpr_save_fd(name, id, fd); +elem->name = g_strdup(name); +elem->namelen = strlen(name) + 1; +elem->id = id; +elem->fd = fd; +QLIST_INSERT_HEAD(_state.fds, elem, next); +} + +static CprFd *find_fd(CprFdList *head, const char *name, int id) +{ +CprFd *elem; + +QLIST_FOREACH(elem, head, next) { +if (!strcmp(elem->name, name) && elem->id == id) { +return elem; +} +} +return NULL; +} + +void cpr_delete_fd(const char *name, int id) +{ +CprFd *elem = find_fd(_state.fds, name, id); + +if (elem) { +QLIST_REMOVE(elem, next); +g_free(elem->name); +g_free(elem); +} + +trace_cpr_delete_fd(name, id); +} + +int cpr_find_fd(const char *name, int id) +{ +CprFd *elem = find_fd(_state.fds, name, id); +int fd = elem ? elem->fd : -1; + +trace_cpr_find_fd(name, id, fd); +return fd; +} + +int cpr_walk_fd(cpr_walk_fd_cb cb) +{ +CprFd *elem; + +QLIST_FOREACH(elem, _state.fds, next) { +if (elem->fd >= 0 && cb(elem->fd)) { +return 1; +} +} +return 0; +} + +void cpr_resave_fd(const char *name, int id, int fd) +{ +CprFd *elem = find_fd(_state.fds, name, id); +int old_fd = elem ? elem->fd : -1; + +if (old_fd < 0) { +cpr_save_fd(name, id, fd); +} else if (old_fd != fd) { +error_setg(_fatal, + "internal error: cpr fd '%s' id %d value %d " + "already saved with a different value %d", + name, id,
[PATCH v4 03/14] tests/tcg/aarch64: Drop -fno-tree-loop-distribute-patterns
This option is not supported by clang, and is not required in order to get sve code generation with gcc 12. Signed-off-by: Richard Henderson --- tests/tcg/aarch64/Makefile.softmmu-target | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/tcg/aarch64/Makefile.softmmu-target b/tests/tcg/aarch64/Makefile.softmmu-target index 39d3f961c5..dd6d595830 100644 --- a/tests/tcg/aarch64/Makefile.softmmu-target +++ b/tests/tcg/aarch64/Makefile.softmmu-target @@ -39,7 +39,7 @@ memory: CFLAGS+=-DCHECK_UNALIGNED=1 memory-sve: memory.c $(LINK_SCRIPT) $(CRT_OBJS) $(MINILIB_OBJS) $(CC) $(CFLAGS) $(EXTRA_CFLAGS) $< -o $@ $(LDFLAGS) -memory-sve: CFLAGS+=-DCHECK_UNALIGNED=1 -march=armv8.1-a+sve -O3 -fno-tree-loop-distribute-patterns +memory-sve: CFLAGS+=-DCHECK_UNALIGNED=1 -march=armv8.1-a+sve -O3 TESTS+=memory-sve -- 2.34.1
[PATCH v4 13/14] tests/tcg/arm: Use vmrs/vmsr instead of mcr/mrc
Clang 14 generates /home/rth/qemu/src/tests/tcg/arm/fcvt.c:431:9: error: invalid operand for instruction asm("mrc p10, 7, r1, cr1, cr0, 0\n\t" ^ :1:6: note: instantiated into assembly here mrc p10, 7, r1, cr1, cr0, 0 ^ /home/rth/qemu/src/tests/tcg/arm/fcvt.c:432:32: error: invalid operand for instruction "orr r1, r1, %[flags]\n\t" ^ :3:6: note: instantiated into assembly here mcr p10, 7, r1, cr1, cr0, 0 ^ This is perhaps a clang bug, but using the neon mnemonic is clearer. Signed-off-by: Richard Henderson --- tests/tcg/arm/fcvt.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/tests/tcg/arm/fcvt.c b/tests/tcg/arm/fcvt.c index d8c61cd29f..ecebbb0247 100644 --- a/tests/tcg/arm/fcvt.c +++ b/tests/tcg/arm/fcvt.c @@ -427,10 +427,9 @@ int main(int argc, char *argv[argc]) /* And now with ARM alternative FP16 */ #if defined(__arm__) -/* See glibc sysdeps/arm/fpu_control.h */ -asm("mrc p10, 7, r1, cr1, cr0, 0\n\t" +asm("vmrs r1, fpscr\n\t" "orr r1, r1, %[flags]\n\t" -"mcr p10, 7, r1, cr1, cr0, 0\n\t" +"vmsr fpscr, r1" : /* no output */ : [flags] "n" (1 << 26) : "r1" ); #else asm("mrs x1, fpcr\n\t" -- 2.34.1
[PATCH v4 04/14] tests/tcg/aarch64: Explicitly specify register width
From: Akihiko Odaki clang version 18.1.6 assumes a register is 64-bit by default and complains if a 32-bit value is given. Explicitly specify register width when passing a 32-bit value. Signed-off-by: Akihiko Odaki Reviewed-by: Philippe Mathieu-Daudé Message-Id: <20240627-tcg-v2-3-1690a8133...@daynix.com> --- tests/tcg/aarch64/bti-1.c | 6 +++--- tests/tcg/aarch64/bti-3.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/tests/tcg/aarch64/bti-1.c b/tests/tcg/aarch64/bti-1.c index 99a879af23..1fada8108d 100644 --- a/tests/tcg/aarch64/bti-1.c +++ b/tests/tcg/aarch64/bti-1.c @@ -17,15 +17,15 @@ static void skip2_sigill(int sig, siginfo_t *info, ucontext_t *uc) #define BTI_JC"hint #38" #define BTYPE_1(DEST) \ -asm("mov %0,#1; adr x16, 1f; br x16; 1: " DEST "; mov %0,#0" \ +asm("mov %w0,#1; adr x16, 1f; br x16; 1: " DEST "; mov %w0,#0" \ : "=r"(skipped) : : "x16") #define BTYPE_2(DEST) \ -asm("mov %0,#1; adr x16, 1f; blr x16; 1: " DEST "; mov %0,#0" \ +asm("mov %w0,#1; adr x16, 1f; blr x16; 1: " DEST "; mov %w0,#0" \ : "=r"(skipped) : : "x16", "x30") #define BTYPE_3(DEST) \ -asm("mov %0,#1; adr x15, 1f; br x15; 1: " DEST "; mov %0,#0" \ +asm("mov %w0,#1; adr x15, 1f; br x15; 1: " DEST "; mov %w0,#0" \ : "=r"(skipped) : : "x15") #define TEST(WHICH, DEST, EXPECT) \ diff --git a/tests/tcg/aarch64/bti-3.c b/tests/tcg/aarch64/bti-3.c index 8c534c09d7..6a3bd037bc 100644 --- a/tests/tcg/aarch64/bti-3.c +++ b/tests/tcg/aarch64/bti-3.c @@ -11,15 +11,15 @@ static void skip2_sigill(int sig, siginfo_t *info, ucontext_t *uc) } #define BTYPE_1() \ -asm("mov %0,#1; adr x16, 1f; br x16; 1: hint #25; mov %0,#0" \ +asm("mov %w0,#1; adr x16, 1f; br x16; 1: hint #25; mov %w0,#0" \ : "=r"(skipped) : : "x16", "x30") #define BTYPE_2() \ -asm("mov %0,#1; adr x16, 1f; blr x16; 1: hint #25; mov %0,#0" \ +asm("mov %w0,#1; adr x16, 1f; blr x16; 1: hint #25; mov %w0,#0" \ : "=r"(skipped) : : "x16", "x30") #define BTYPE_3() \ -asm("mov %0,#1; adr x15, 1f; br x15; 1: hint #25; mov %0,#0" \ +asm("mov %w0,#1; adr x15, 1f; br x15; 1: hint #25; mov %w0,#0" \ : "=r"(skipped) : : "x15", "x30") #define TEST(WHICH, EXPECT) \ -- 2.34.1
[PATCH v4 10/14] tests/tcg/arm: Use -fno-integrated-as for test-arm-iwmmxt
Clang does not support IWMXT instructions. Fall back to the external assembler. Signed-off-by: Richard Henderson --- tests/tcg/arm/Makefile.target | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/tests/tcg/arm/Makefile.target b/tests/tcg/arm/Makefile.target index 0a1965fce7..95f891bf8c 100644 --- a/tests/tcg/arm/Makefile.target +++ b/tests/tcg/arm/Makefile.target @@ -8,6 +8,11 @@ ARM_SRC=$(SRC_PATH)/tests/tcg/arm # Set search path for all sources VPATH += $(ARM_SRC) +config-cc.mak: Makefile + $(quiet-@)( \ + $(call cc-option,-fno-integrated-as, CROSS_CC_HAS_FNIA)) 3> config-cc.mak +-include config-cc.mak + float_madds: CFLAGS+=-mfpu=neon-vfpv4 # Basic Hello World @@ -17,7 +22,8 @@ hello-arm: LDFLAGS+=-nostdlib # IWMXT floating point extensions ARM_TESTS += test-arm-iwmmxt -test-arm-iwmmxt: CFLAGS+=-marm -march=iwmmxt -mabi=aapcs -mfpu=fpv4-sp-d16 +# Clang assembler does not support IWMXT, so use the external assembler. +test-arm-iwmmxt: CFLAGS += -marm -march=iwmmxt -mabi=aapcs -mfpu=fpv4-sp-d16 $(CROSS_CC_HAS_FNIA) test-arm-iwmmxt: test-arm-iwmmxt.S $(CC) $(CFLAGS) $< -o $@ $(LDFLAGS) -- 2.34.1
[PATCH v4 12/14] tests/tcg/arm: Use -march and -mfpu for fcvt
Clang requires the architecture to be set properly in order to assemble the half-precision instructions. Signed-off-by: Richard Henderson --- tests/tcg/arm/Makefile.target | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/tcg/arm/Makefile.target b/tests/tcg/arm/Makefile.target index 95f891bf8c..8e287191af 100644 --- a/tests/tcg/arm/Makefile.target +++ b/tests/tcg/arm/Makefile.target @@ -29,8 +29,8 @@ test-arm-iwmmxt: test-arm-iwmmxt.S # Float-convert Tests ARM_TESTS += fcvt -fcvt: LDFLAGS+=-lm -# fcvt: CFLAGS+=-march=armv8.2-a+fp16 -mfpu=neon-fp-armv8 +fcvt: LDFLAGS += -lm +fcvt: CFLAGS += -march=armv8.2-a+fp16 -mfpu=neon-fp-armv8 run-fcvt: fcvt $(call run-test,fcvt,$(QEMU) $<) $(call diff-out,fcvt,$(ARM_SRC)/fcvt.ref) -- 2.34.1
[PATCH v4 05/14] tests/tcg/aarch64: Fix irg operand type
From: Akihiko Odaki irg expects 64-bit integers. Passing a 32-bit integer results in compilation failure with clang version 18.1.6. Signed-off-by: Akihiko Odaki Message-Id: <20240627-tcg-v2-4-1690a8133...@daynix.com> --- tests/tcg/aarch64/mte-1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/tcg/aarch64/mte-1.c b/tests/tcg/aarch64/mte-1.c index 88dcd617ad..146cad4a04 100644 --- a/tests/tcg/aarch64/mte-1.c +++ b/tests/tcg/aarch64/mte-1.c @@ -15,7 +15,7 @@ int main(int ac, char **av) enable_mte(PR_MTE_TCF_NONE); p0 = alloc_mte_mem(sizeof(*p0)); -asm("irg %0,%1,%2" : "=r"(p1) : "r"(p0), "r"(1)); +asm("irg %0,%1,%2" : "=r"(p1) : "r"(p0), "r"(1l)); assert(p1 != p0); asm("subp %0,%1,%2" : "=r"(c) : "r"(p0), "r"(p1)); assert(c == 0); -- 2.34.1
[PATCH v4 01/14] tests/tcg/minilib: Constify digits in print_num
This avoids a memcpy to the stack when compiled with clang. Since we don't enable optimization, nor provide memcpy, this results in an undefined symbol error at link time. Signed-off-by: Richard Henderson --- tests/tcg/minilib/printf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/tcg/minilib/printf.c b/tests/tcg/minilib/printf.c index 10472b4f58..fb0189c2bb 100644 --- a/tests/tcg/minilib/printf.c +++ b/tests/tcg/minilib/printf.c @@ -27,7 +27,7 @@ static void print_str(char *s) static void print_num(unsigned long long value, int base) { -char digits[] = "0123456789abcdef"; +static const char digits[] = "0123456789abcdef"; char buf[32]; int i = sizeof(buf) - 2, j; -- 2.34.1
[PATCH v4 06/14] tests/tcg/aarch64: Do not use x constraint
From: Akihiko Odaki clang version 18.1.6 does not support x constraint for AArch64. Use w instead. Signed-off-by: Akihiko Odaki Message-Id: <20240627-tcg-v2-5-1690a8133...@daynix.com> --- tests/tcg/arm/fcvt.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/tests/tcg/arm/fcvt.c b/tests/tcg/arm/fcvt.c index 7ac47b564e..f631197287 100644 --- a/tests/tcg/arm/fcvt.c +++ b/tests/tcg/arm/fcvt.c @@ -126,7 +126,7 @@ static void convert_single_to_half(void) asm("vcvtb.f16.f32 %0, %1" : "=t" (output) : "x" (input)); #else uint16_t output; -asm("fcvt %h0, %s1" : "=w" (output) : "x" (input)); +asm("fcvt %h0, %s1" : "=w" (output) : "w" (input)); #endif print_half_number(i, output); } @@ -149,7 +149,7 @@ static void convert_single_to_double(void) #if defined(__arm__) asm("vcvt.f64.f32 %P0, %1" : "=w" (output) : "t" (input)); #else -asm("fcvt %d0, %s1" : "=w" (output) : "x" (input)); +asm("fcvt %d0, %s1" : "=w" (output) : "w" (input)); #endif print_double_number(i, output); } @@ -244,7 +244,7 @@ static void convert_double_to_half(void) /* asm("vcvtb.f16.f64 %0, %P1" : "=t" (output) : "x" (input)); */ output = input; #else -asm("fcvt %h0, %d1" : "=w" (output) : "x" (input)); +asm("fcvt %h0, %d1" : "=w" (output) : "w" (input)); #endif print_half_number(i, output); } @@ -267,7 +267,7 @@ static void convert_double_to_single(void) #if defined(__arm__) asm("vcvt.f32.f64 %0, %P1" : "=w" (output) : "x" (input)); #else -asm("fcvt %s0, %d1" : "=w" (output) : "x" (input)); +asm("fcvt %s0, %d1" : "=w" (output) : "w" (input)); #endif print_single_number(i, output); @@ -335,7 +335,7 @@ static void convert_half_to_double(void) /* asm("vcvtb.f64.f16 %P0, %1" : "=w" (output) : "t" (input)); */ output = input; #else -asm("fcvt %d0, %h1" : "=w" (output) : "x" (input)); +asm("fcvt %d0, %h1" : "=w" (output) : "w" (input)); #endif print_double_number(i, output); } @@ -357,7 +357,7 @@ static void convert_half_to_single(void) #if defined(__arm__) asm("vcvtb.f32.f16 %0, %1" : "=w" (output) : "x" ((uint32_t)input)); #else -asm("fcvt %s0, %h1" : "=w" (output) : "x" (input)); +asm("fcvt %s0, %h1" : "=w" (output) : "w" (input)); #endif print_single_number(i, output); } @@ -380,7 +380,7 @@ static void convert_half_to_integer(void) /* asm("vcvt.s32.f16 %0, %1" : "=t" (output) : "t" (input)); v8.2*/ output = input; #else -asm("fcvt %s0, %h1" : "=w" (output) : "x" (input)); +asm("fcvt %s0, %h1" : "=w" (output) : "w" (input)); #endif print_int64(i, output); } -- 2.34.1
[PATCH v4 00/14] test/tcg: Clang build fixes for arm/aarch64
Supercedes: 20240629-tcg-v3-0-fa57918bd...@daynix.com ("[PATCH v3 0/7] tests/tcg/aarch64: Fix inline assemblies for clang") On top of Akihiko's patches for aarch64, additional changes are required for arm, both as a host and as a guest. r~ Akihiko Odaki (5): tests/tcg/aarch64: Explicitly specify register width tests/tcg/aarch64: Fix irg operand type tests/tcg/aarch64: Do not use x constraint tests/tcg/arm: Fix fcvt result messages tests/tcg/arm: Manually register allocate half-precision numbers Richard Henderson (9): tests/tcg/minilib: Constify digits in print_num tests/tcg: Adjust variable defintion from cc-option tests/tcg/aarch64: Drop -fno-tree-loop-distribute-patterns tests/tcg/aarch64: Add -fno-integrated-as for sme tests/tcg/arm: Drop -N from LDFLAGS tests/tcg/arm: Use -fno-integrated-as for test-arm-iwmmxt tests/tcg/arm: Use -march and -mfpu for fcvt tests/tcg/arm: Use vmrs/vmsr instead of mcr/mrc linux-user/main: Suppress out-of-range comparison warning for clang linux-user/main.c | 1 + tests/tcg/aarch64/bti-1.c | 6 +- tests/tcg/aarch64/bti-3.c | 6 +- tests/tcg/aarch64/mte-1.c | 2 +- tests/tcg/arm/fcvt.c | 28 +- tests/tcg/minilib/printf.c| 2 +- tests/tcg/Makefile.target | 2 +- tests/tcg/aarch64/Makefile.softmmu-target | 4 +- tests/tcg/aarch64/Makefile.target | 18 +- tests/tcg/aarch64/fcvt.ref| 604 +++--- tests/tcg/arm/Makefile.softmmu-target | 4 +- tests/tcg/arm/Makefile.target | 12 +- tests/tcg/arm/fcvt.ref| 604 +++--- 13 files changed, 653 insertions(+), 640 deletions(-) -- 2.34.1
[PATCH v4 02/14] tests/tcg: Adjust variable defintion from cc-option
Define the variable to the compiler flag used, not "y". This avoids replication of the compiler flag itself. Signed-off-by: Richard Henderson --- tests/tcg/Makefile.target | 2 +- tests/tcg/aarch64/Makefile.softmmu-target | 2 +- tests/tcg/aarch64/Makefile.target | 15 --- 3 files changed, 10 insertions(+), 9 deletions(-) diff --git a/tests/tcg/Makefile.target b/tests/tcg/Makefile.target index f21be50d3b..cb8cfeb6da 100644 --- a/tests/tcg/Makefile.target +++ b/tests/tcg/Makefile.target @@ -49,7 +49,7 @@ quiet-command = $(call quiet-@,$2,$3)$1 cc-test = $(CC) -Werror $1 -c -o /dev/null -xc /dev/null >/dev/null 2>&1 cc-option = if $(call cc-test, $1); then \ -echo "$(TARGET_PREFIX)$1 detected" && echo "$(strip $2)=y" >&3; else \ +echo "$(TARGET_PREFIX)$1 detected" && echo "$(strip $2)=$(strip $1)" >&3; else \ echo "$(TARGET_PREFIX)$1 not detected"; fi # $1 = test name, $2 = cmd, $3 = desc diff --git a/tests/tcg/aarch64/Makefile.softmmu-target b/tests/tcg/aarch64/Makefile.softmmu-target index 4b03ef602e..39d3f961c5 100644 --- a/tests/tcg/aarch64/Makefile.softmmu-target +++ b/tests/tcg/aarch64/Makefile.softmmu-target @@ -81,7 +81,7 @@ run-memory-replay: memory-replay run-memory-record EXTRA_RUNS+=run-memory-replay ifneq ($(CROSS_CC_HAS_ARMV8_3),) -pauth-3: CFLAGS += -march=armv8.3-a +pauth-3: CFLAGS += $(CROSS_CC_HAS_ARMV8_3) else pauth-3: $(call skip-test, "BUILD of $@", "missing compiler support") diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index 70d728ae9a..e6d5e008a8 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -32,17 +32,17 @@ config-cc.mak: Makefile ifneq ($(CROSS_CC_HAS_ARMV8_2),) AARCH64_TESTS += dcpop -dcpop: CFLAGS += -march=armv8.2-a +dcpop: CFLAGS += $(CROSS_CC_HAS_ARMV8_2) endif ifneq ($(CROSS_CC_HAS_ARMV8_5),) AARCH64_TESTS += dcpodp -dcpodp: CFLAGS += -march=armv8.5-a +dcpodp: CFLAGS += $(CROSS_CC_HAS_ARMV8_5) endif # Pauth Tests ifneq ($(CROSS_CC_HAS_ARMV8_3),) AARCH64_TESTS += pauth-1 pauth-2 pauth-4 pauth-5 -pauth-%: CFLAGS += -march=armv8.3-a +pauth-%: CFLAGS += $(CROSS_CC_HAS_ARMV8_3) run-pauth-1: QEMU_OPTS += -cpu max run-pauth-2: QEMU_OPTS += -cpu max # Choose a cpu with FEAT_Pauth but without FEAT_FPAC for pauth-[45]. @@ -54,7 +54,7 @@ endif # bti-1 tests the elf notes, so we require special compiler support. ifneq ($(CROSS_CC_HAS_ARMV8_BTI),) AARCH64_TESTS += bti-1 bti-3 -bti-1 bti-3: CFLAGS += -fno-stack-protector -mbranch-protection=standard +bti-1 bti-3: CFLAGS += -fno-stack-protector $(CROSS_CC_HAS_ARMV8_BTI) bti-1 bti-3: LDFLAGS += -nostdlib endif # bti-2 tests PROT_BTI, so no special compiler support required. @@ -63,12 +63,13 @@ AARCH64_TESTS += bti-2 # MTE Tests ifneq ($(CROSS_CC_HAS_ARMV8_MTE),) AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7 -mte-%: CFLAGS += -march=armv8.5-a+memtag +mte-%: CFLAGS += $(CROSS_CC_HAS_ARMV8_MTE) endif # SME Tests ifneq ($(CROSS_AS_HAS_ARMV9_SME),) AARCH64_TESTS += sme-outprod1 sme-smopa-1 sme-smopa-2 +sme-outprod1 sme-smopa-1 sme-smopa-2: CFLAGS += $(CROSS_AS_HAS_ARMV9_SME) endif # System Registers Tests @@ -98,7 +99,7 @@ TESTS += sha512-vector ifneq ($(CROSS_CC_HAS_SVE),) # SVE ioctl test AARCH64_TESTS += sve-ioctls -sve-ioctls: CFLAGS+=-march=armv8.1-a+sve +sve-ioctls: CFLAGS += $(CROSS_CC_HAS_SVE) sha512-sve: CFLAGS=-O3 -march=armv8.1-a+sve sha512-sve: sha512.c @@ -133,7 +134,7 @@ endif ifneq ($(CROSS_CC_HAS_SVE2),) AARCH64_TESTS += test-826 -test-826: CFLAGS+=-march=armv8.1-a+sve2 +test-826: CFLAGS += $(CROSS_CC_HAS_SVE2) endif TESTS += $(AARCH64_TESTS) -- 2.34.1
[PATCH v4 14/14] linux-user/main: Suppress out-of-range comparison warning for clang
For arm32 host and arm64 guest we get .../main.c:851:32: error: result of comparison of constant 70368744177664 with expression of type 'unsigned long' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (TASK_UNMAPPED_BASE < reserved_va) { ~~ ^ ~~~ We already disable -Wtype-limits here, for this exact comparison, but that is not enough for clang. Disable -Wtautological-compare as well, which is a superset. GCC ignores the unknown warning flag. Signed-off-by: Richard Henderson --- linux-user/main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/linux-user/main.c b/linux-user/main.c index 94c99a1366..7d3cf45fa9 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -843,6 +843,7 @@ int main(int argc, char **argv, char **envp) */ #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wtype-limits" +#pragma GCC diagnostic ignored "-Wtautological-compare" /* * Select an initial value for task_unmapped_base that is in range. -- 2.34.1
[PATCH v4 11/14] tests/tcg/arm: Manually register allocate half-precision numbers
From: Akihiko Odaki Clang does not allow specifying an integer as the value of a single precision register. Explicitly move value from a general register. Signed-off-by: Akihiko Odaki [rth: Use one single inline asm block.] Signed-off-by: Richard Henderson --- tests/tcg/arm/fcvt.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tests/tcg/arm/fcvt.c b/tests/tcg/arm/fcvt.c index 157790e679..d8c61cd29f 100644 --- a/tests/tcg/arm/fcvt.c +++ b/tests/tcg/arm/fcvt.c @@ -355,7 +355,12 @@ static void convert_half_to_single(void) print_half_number(i, input); #if defined(__arm__) -asm("vcvtb.f32.f16 %0, %1" : "=w" (output) : "x" ((uint32_t)input)); +/* + * Clang refuses to allocate an integer to a fp register. + * Perform the move from a general register by hand. + */ +asm("vmov %0, %1\n\t" +"vcvtb.f32.f16 %0, %0" : "=w" (output) : "r" (input)); #else asm("fcvt %s0, %h1" : "=w" (output) : "w" (input)); #endif -- 2.34.1
[PATCH v4 09/14] tests/tcg/arm: Drop -N from LDFLAGS
This is redudant with a linker script, and is not supported by clang. Signed-off-by: Richard Henderson --- tests/tcg/arm/Makefile.softmmu-target | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/tcg/arm/Makefile.softmmu-target b/tests/tcg/arm/Makefile.softmmu-target index 39e01ce49d..547063c08c 100644 --- a/tests/tcg/arm/Makefile.softmmu-target +++ b/tests/tcg/arm/Makefile.softmmu-target @@ -13,7 +13,7 @@ VPATH += $(ARM_SRC) test-armv6m-undef: test-armv6m-undef.S $(CC) -mcpu=cortex-m0 -mfloat-abi=soft \ -Wl,--build-id=none -x assembler-with-cpp \ - $< -o $@ -nostdlib -N -static \ + $< -o $@ -nostdlib -static \ -T $(ARM_SRC)/$@.ld run-test-armv6m-undef: QEMU_OPTS=-semihosting-config enable=on,target=native,chardev=output -M microbit -kernel @@ -30,7 +30,7 @@ CRT_PATH=$(ARM_SRC) LINK_SCRIPT=$(ARM_SRC)/kernel.ld LDFLAGS=-Wl,-T$(LINK_SCRIPT) CFLAGS+=-nostdlib -ggdb -O0 $(MINILIB_INC) -LDFLAGS+=-static -nostdlib -N $(CRT_OBJS) $(MINILIB_OBJS) -lgcc +LDFLAGS+=-static -nostdlib $(CRT_OBJS) $(MINILIB_OBJS) -lgcc # building head blobs .PRECIOUS: $(CRT_OBJS) -- 2.34.1
[PATCH v4 08/14] tests/tcg/arm: Fix fcvt result messages
From: Akihiko Odaki The test cases for "converting double-precision to single-precision" emits float but the result variable was typed as uint32_t and corrupted the printed values. Propertly type it as float. Signed-off-by: Akihiko Odaki Fixes: 8ec8a55e3fc9 ("tests/tcg/arm: add fcvt test cases for AArch32/64") Message-Id: <20240627-tcg-v2-1-1690a8133...@daynix.com> [rth: Update arm ref file as well] Signed-off-by: Richard Henderson --- tests/tcg/arm/fcvt.c | 2 +- tests/tcg/aarch64/fcvt.ref | 604 ++--- tests/tcg/arm/fcvt.ref | 604 ++--- 3 files changed, 605 insertions(+), 605 deletions(-) diff --git a/tests/tcg/arm/fcvt.c b/tests/tcg/arm/fcvt.c index f631197287..157790e679 100644 --- a/tests/tcg/arm/fcvt.c +++ b/tests/tcg/arm/fcvt.c @@ -258,7 +258,7 @@ static void convert_double_to_single(void) for (i = 0; i < ARRAY_SIZE(double_numbers); ++i) { double input = double_numbers[i].d; -uint32_t output; +float output; feclearexcept(FE_ALL_EXCEPT); diff --git a/tests/tcg/aarch64/fcvt.ref b/tests/tcg/aarch64/fcvt.ref index e7af24dc58..2726b41063 100644 --- a/tests/tcg/aarch64/fcvt.ref +++ b/tests/tcg/aarch64/fcvt.ref @@ -211,45 +211,45 @@ Converting double-precision to half-precision 40 HALF: 0x7f00 (0x1 => INVALID) Converting double-precision to single-precision 00 DOUBLE: nan / 0x007ff4 (0 => OK) -00 SINGLE: 2.145386496000e+09 / 0x4effc000 (0x1 => INVALID) +00 SINGLE: nan / 0x7fe0 (0x1 => INVALID) 01 DOUBLE: -nan / 0x00fff8 (0 => OK) -01 SINGLE: 4.290772992000e+09 / 0x4f7fc000 (0 => OK) +01 SINGLE: -nan / 0xffc0 (0 => OK) 02 DOUBLE: -inf / 0x00fff0 (0 => OK) -02 SINGLE: 4.286578688000e+09 / 0x4f7f8000 (0 => OK) +02 SINGLE: -inf / 0xff80 (0 => OK) 03 DOUBLE: -1.79769313486231570815e+308 / 0x00ffef (0 => OK) -03 SINGLE: 4.286578688000e+09 / 0x4f7f8000 (0x14 => OVERFLOW INEXACT ) +03 SINGLE: -inf / 0xff80 (0x14 => OVERFLOW INEXACT ) 04 DOUBLE: -3.40282346638528859812e+38 / 0x00c7efe000 (0 => OK) -04 SINGLE: 4.286578688000e+09 / 0x4f7f8000 (0x10 =>INEXACT ) +04 SINGLE: -3.40282346638528859812e+38 / 0xff7f (0 => OK) 05 DOUBLE: -3.40282346638528859812e+38 / 0x00c7efe000 (0 => OK) -05 SINGLE: 4.286578688000e+09 / 0x4f7f8000 (0x10 =>INEXACT ) +05 SINGLE: -3.40282346638528859812e+38 / 0xff7f (0 => OK) 06 DOUBLE: -1.11107529e+31 / 0x00c661874b135ff654 (0 => OK) -06 SINGLE: 4.077664768000e+09 / 0x4f730c3a (0x10 =>INEXACT ) +06 SINGLE: -1.1114769645909791e+31 / 0xf30c3a59 (0x10 =>INEXACT ) 07 DOUBLE: -1.11099085e+30 / 0x00c62c0bab523323b9 (0 => OK) -07 SINGLE: 4.04962432e+09 / 0x4f71605d (0x10 =>INEXACT ) +07 SINGLE: -1.1113258488635273e+30 / 0xf1605d5b (0x10 =>INEXACT ) 08 DOUBLE: -2.e+00 / 0x00c000 (0 => OK) -08 SINGLE: 3.221225472000e+09 / 0x4f40 (0 => OK) +08 SINGLE: -2.e+00 / 0xc000 (0 => OK) 09 DOUBLE: -1.e+00 / 0x00bff0 (0 => OK) -09 SINGLE: 3.212836864000e+09 / 0x4f3f8000 (0 => OK) +09 SINGLE: -1.e+00 / 0xbf80 (0 => OK) 10 DOUBLE: -2.22507385850720138309e-308 / 0x008010 (0 => OK) -10 SINGLE: 2.147483648000e+09 / 0x4f00 (0x18 => UNDERFLOW INEXACT ) +10 SINGLE: -0.e+00 / 0x8000 (0x18 => UNDERFLOW INEXACT ) 11 DOUBLE: -1.17549435082228750797e-38 / 0x00b810 (0 => OK) -11 SINGLE: 2.155872256000e+09 / 0x4f008000 (0 => OK) +11 SINGLE: -1.17549435082228750797e-38 / 0x8080 (0 => OK) 12 DOUBLE: 0.e+00 / (0 => OK) 12 SINGLE: 0.e+00 / 00 (0 => OK) 13 DOUBLE: 1.17549435082228750797e-38 / 0x003810 (0 => OK) -13 SINGLE: 8.38860800e+06 / 0x4b00 (0 => OK) +13 SINGLE: 1.17549435082228750797e-38 / 0x0080 (0 => OK) 14 DOUBLE: 2.9802322400013061e-08 / 0x003e60001c5f68 (0 => OK) -14 SINGLE: 8.55638016e+08 / 0x4e4c (0x10 =>INEXACT ) +14 SINGLE: 2.98023223876953125000e-08 / 0x3300 (0x10 =>INEXACT ) 15 DOUBLE: 5.960460015661e-08 / 0x003e6e6cb2fa82 (0 => OK) -15 SINGLE: 8.64026624e+08 / 0x4e4e (0x10 =>INEXACT ) +15 SINGLE: 5.96045985901128005935e-08 / 0x3373 (0x10 =>INEXACT ) 16 DOUBLE: 6.097559994299e-05 / 0x003f0ff801a9af58a1 (0 => OK) -16 SINGLE: 9.47896320e+08 / 0x4e61ff00 (0x10 =>INEXACT ) +16 SINGLE: 6.09755988989491015673e-05 / 0x387fc00d (0x10 =>INEXACT ) 17 DOUBLE: 6.103520013665e-05 / 0x003f10c06a1ef5 (0 => OK) -17 SINGLE: 9.47912704e+08 / 0x4e62 (0x10 =>INEXACT ) +17 SINGLE:
[PATCH v4 07/14] tests/tcg/aarch64: Add -fno-integrated-as for sme
The only use of SME is inline assembly. Both gcc and clang only support SME with very recent releases; by deferring detection to the assembler we get better test coverage. Signed-off-by: Richard Henderson --- tests/tcg/aarch64/Makefile.target | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index e6d5e008a8..8817ac262f 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -20,6 +20,7 @@ run-fcvt: fcvt config-cc.mak: Makefile $(quiet-@)( \ + fnia=`$(call cc-test,-fno-integrated-as) && echo -fno-integrated-as`; \ $(call cc-option,-march=armv8.1-a+sve, CROSS_CC_HAS_SVE); \ $(call cc-option,-march=armv8.1-a+sve2, CROSS_CC_HAS_SVE2); \ $(call cc-option,-march=armv8.2-a, CROSS_CC_HAS_ARMV8_2); \ @@ -27,7 +28,7 @@ config-cc.mak: Makefile $(call cc-option,-march=armv8.5-a, CROSS_CC_HAS_ARMV8_5); \ $(call cc-option,-mbranch-protection=standard, CROSS_CC_HAS_ARMV8_BTI); \ $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE); \ - $(call cc-option,-Wa$(COMMA)-march=armv9-a+sme, CROSS_AS_HAS_ARMV9_SME)) 3> config-cc.mak + $(call cc-option,-Wa$(COMMA)-march=armv9-a+sme $$fnia, CROSS_AS_HAS_ARMV9_SME)) 3> config-cc.mak -include config-cc.mak ifneq ($(CROSS_CC_HAS_ARMV8_2),) -- 2.34.1
Re: [PATCH v2 3/3] target/ppc : Update VSX storage access insns to use tcg_gen_qemu _ld/st_i128.
On 6/30/24 05:01, Chinmay Rath wrote: @@ -2175,13 +2179,13 @@ static bool do_lstxv(DisasContext *ctx, int ra, TCGv displ, int rt, bool store, bool paired) { TCGv ea; -TCGv_i64 xt; +TCGv_i128 data; MemOp mop; int rt1, rt2; -xt = tcg_temp_new_i64(); +data = tcg_temp_new_i128(); -mop = DEF_MEMOP(MO_UQ); +mop = DEF_MEMOP(MO_128 | MO_ATOM_IFALIGN_PAIR); gen_set_access_type(ctx, ACCESS_INT); ea = do_ea_calc(ctx, ra, displ); @@ -2195,32 +2199,20 @@ static bool do_lstxv(DisasContext *ctx, int ra, TCGv displ, } if (store) { -get_cpu_vsr(xt, rt1, !ctx->le_mode); -tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop); -gen_addr_add(ctx, ea, ea, 8); -get_cpu_vsr(xt, rt1, ctx->le_mode); -tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop); +get_vsr_full(data, rt1); +tcg_gen_qemu_st_i128(data, ea, ctx->mem_idx, mop); if (paired) { gen_addr_add(ctx, ea, ea, 8); The increment needs updating to 16. -get_cpu_vsr(xt, rt2, !ctx->le_mode); -tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop); -gen_addr_add(ctx, ea, ea, 8); -get_cpu_vsr(xt, rt2, ctx->le_mode); -tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop); +get_vsr_full(data, rt2); +tcg_gen_qemu_st_i128(data, ea, ctx->mem_idx, mop); } } else { -tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop); -set_cpu_vsr(rt1, xt, !ctx->le_mode); -gen_addr_add(ctx, ea, ea, 8); -tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop); -set_cpu_vsr(rt1, xt, ctx->le_mode); +tcg_gen_qemu_ld_i128(data, ea, ctx->mem_idx, mop); +set_vsr_full(rt1, data); if (paired) { gen_addr_add(ctx, ea, ea, 8); Likewise. With those fixed, Reviewed-by: Richard Henderson r~
[PATCH] hw/char/pl011: ensure UARTIBRD register is 16-bit
The PL011 TRM says that "The 16-bit integer is written to the Integer Baud Rate Register, UARTIBRD". Updated the handling of the UARTIBRD register to ensure only 16-bit values are written to it. ASAN log: ==2973125==ERROR: AddressSanitizer: FPE on unknown address 0x55f72629b348 (pc 0x55f72629b348 bp 0x7fffa24d0e00 sp 0x7fffa24d0d60 T0) #0 0x55f72629b348 in pl011_get_baudrate hw/char/pl011.c:255:17 #1 0x55f726298d94 in pl011_trace_baudrate_change hw/char/pl011.c:260:33 #2 0x55f726296fc8 in pl011_write hw/char/pl011.c:378:9 Reproducer: cat << EOF | qemu-system-aarch64 -display \ none -machine accel=qtest, -m 512M -machine realview-pb-a8 -qtest stdio writeq 0x1000b024 0xf800 EOF Signed-off-by: Zheyu Ma --- hw/char/pl011.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/char/pl011.c b/hw/char/pl011.c index 8753b84a84..f962786e2a 100644 --- a/hw/char/pl011.c +++ b/hw/char/pl011.c @@ -374,7 +374,7 @@ static void pl011_write(void *opaque, hwaddr offset, s->ilpr = value; break; case 9: /* UARTIBRD */ -s->ibrd = value; +s->ibrd = value & 0x; pl011_trace_baudrate_change(s); break; case 10: /* UARTFBRD */ -- 2.34.1
Re: [PATCH v2 2/3] target/ppc: Update VMX storage access insns to use tcg_gen_qemu_ld/st_i128.
On 6/30/24 05:01, Chinmay Rath wrote: Updated instructions {l, st}vx to use tcg_gen_qemu_ld/st_i128, instead of using 64 bits loads/stores in succession. Introduced functions {get, set}_avr_full in vmx-impl.c.inc to facilitate the above, and potential future usage. Suggested-by: Richard Henderson Signed-off-by: Chinmay Rath --- target/ppc/translate/vmx-impl.c.inc | 42 ++--- 1 file changed, 20 insertions(+), 22 deletions(-) Reviewed-by: Richard Henderson r~
Re: [PATCH v2 1/3] target/ppc: Move get/set_avr64 functions to vmx-impl.c.inc.
On 6/30/24 05:01, Chinmay Rath wrote: Those functions are used to ld/st data to and from Altivec registers, in 64 bits chunks, and are only used in vmx-impl.c.inc file, hence the clean-up movement. Signed-off-by: Chinmay Rath --- target/ppc/translate.c | 10 -- target/ppc/translate/vmx-impl.c.inc | 10 ++ 2 files changed, 10 insertions(+), 10 deletions(-) Reviewed-by: Richard Henderson r~
[PULL 15/16] vl.c: select_machine(): add selected machine type to error message
From: Vladimir Sementsov-Ogievskiy Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- system/vl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/system/vl.c b/system/vl.c index 92fc29c193..bdd2f6ecf6 100644 --- a/system/vl.c +++ b/system/vl.c @@ -1674,7 +1674,7 @@ static MachineClass *select_machine(QDict *qdict, Error **errp) machine_class = find_machine(machine_type, machines); qdict_del(qdict, "type"); if (!machine_class) { -error_setg(errp, "unsupported machine type"); +error_setg(errp, "unsupported machine type: \"%s\"", optarg); } } else { machine_class = find_default_machine(machines); -- 2.39.2
[PULL 05/16] monitor: Remove obsolete stubs
From: Philippe Mathieu-Daudé hmp_info_roms() was removed in commit dd98234c05 ("qapi: introduce x-query-roms QMP command"), hmp_info_numa() in commit 1b8ae799d8 ("qapi: introduce x-query-numa QMP command"), hmp_info_ramblock() in commit ca411b7c8a ("qapi: introduce x-query-ramblock QMP command") and hmp_info_irq() in commit 91f2fa7045 ("qapi: introduce x-query-irq QMP command"). Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Daniel P. Berrangé Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- include/hw/loader.h | 1 - include/monitor/hmp.h | 3 --- 2 files changed, 4 deletions(-) diff --git a/include/hw/loader.h b/include/hw/loader.h index 8685e27334..9844c5e3cf 100644 --- a/include/hw/loader.h +++ b/include/hw/loader.h @@ -338,7 +338,6 @@ void *rom_ptr(hwaddr addr, size_t size); * rom_ptr(). */ void *rom_ptr_for_as(AddressSpace *as, hwaddr addr, size_t size); -void hmp_info_roms(Monitor *mon, const QDict *qdict); #define rom_add_file_fixed(_f, _a, _i) \ rom_add_file(_f, NULL, _a, _i, false, NULL, NULL) diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h index 954f3c83ad..ae116d9804 100644 --- a/include/monitor/hmp.h +++ b/include/monitor/hmp.h @@ -35,7 +35,6 @@ void hmp_info_cpus(Monitor *mon, const QDict *qdict); void hmp_info_vnc(Monitor *mon, const QDict *qdict); void hmp_info_spice(Monitor *mon, const QDict *qdict); void hmp_info_balloon(Monitor *mon, const QDict *qdict); -void hmp_info_irq(Monitor *mon, const QDict *qdict); void hmp_info_pic(Monitor *mon, const QDict *qdict); void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); @@ -102,7 +101,6 @@ void hmp_chardev_send_break(Monitor *mon, const QDict *qdict); void hmp_object_add(Monitor *mon, const QDict *qdict); void hmp_object_del(Monitor *mon, const QDict *qdict); void hmp_info_memdev(Monitor *mon, const QDict *qdict); -void hmp_info_numa(Monitor *mon, const QDict *qdict); void hmp_info_memory_devices(Monitor *mon, const QDict *qdict); void hmp_qom_list(Monitor *mon, const QDict *qdict); void hmp_qom_get(Monitor *mon, const QDict *qdict); @@ -141,7 +139,6 @@ void hmp_rocker_ports(Monitor *mon, const QDict *qdict); void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict *qdict); void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict); void hmp_info_dump(Monitor *mon, const QDict *qdict); -void hmp_info_ramblock(Monitor *mon, const QDict *qdict); void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict); void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict); void hmp_info_memory_size_summary(Monitor *mon, const QDict *qdict); -- 2.39.2
[PULL 00/16] Trivial patches for 2024-06-30
The following changes since commit 3665dd6bb9043bef181c91e2dce9e1efff47ed51: Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-06-28 16:09:38 -0700) are available in the Git repository at: https://gitlab.com/mjt0k/qemu.git tags/pull-trivial-patches for you to fetch changes up to f22855dffdbc2906f744b5bcfea869cbb66b8fb2: hw/core/loader: gunzip(): fix memory leak on error path (2024-06-30 19:51:44 +0300) trivial patches for 2024-06-30 It's been some time since the previous trivial-patches pullreq, and the queue grew a bit Dr. David Alan Gilbert (4): linux-user: cris: Remove unused struct 'rt_signal_frame' linux-user: sparc: Remove unused struct 'target_mc_fq' hw/arm/bcm2836: Remove unusued struct 'BCM283XClass' net/can: Remove unused struct 'CanBusState' Hyeongtak Ji (1): docs/cxl: fix some typos Martin Joerg (1): hmp-commands-info.hx: Add missing info command for stats subcommand Matheus Tavares Bernardino (1): cpu: fix memleak of 'halt_cond' and 'thread' Philippe Mathieu-Daudé (1): monitor: Remove obsolete stubs Thomas Huth (1): docs/system/devices/usb: Replace the non-existing "qemu" binary Trent Huber (1): os-posix: Expand setrlimit() syscall compatibility Vladimir Sementsov-Ogievskiy (4): vl.c: select_machine(): use ERRP_GUARD instead of error propagation vl.c: select_machine(): use g_autoptr vl.c: select_machine(): add selected machine type to error message hw/core/loader: gunzip(): fix memory leak on error path Zide Chen (2): vl: Allow multiple -overcommit commands target/i386: Advertise MWAIT iff host supports accel/tcg/tcg-accel-ops-rr.c | 1 + docs/system/devices/cxl.rst | 6 +++--- docs/system/devices/usb.rst | 2 +- hmp-commands-info.hx | 2 +- hw/arm/bcm2836.c | 12 hw/core/cpu-common.c | 3 +++ hw/core/loader.c | 1 + include/hw/loader.h | 1 - include/monitor/hmp.h| 3 --- linux-user/cris/signal.c | 8 linux-user/sparc/signal.c| 5 - net/can/can_host.c | 6 -- os-posix.c | 4 system/vl.c | 21 ++--- target/i386/host-cpu.c | 12 target/i386/kvm/kvm-cpu.c| 11 +-- 16 files changed, 33 insertions(+), 65 deletions(-)
[PULL 13/16] vl.c: select_machine(): use ERRP_GUARD instead of error propagation
From: Vladimir Sementsov-Ogievskiy Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- system/vl.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/system/vl.c b/system/vl.c index 4dc862652f..fda93d150c 100644 --- a/system/vl.c +++ b/system/vl.c @@ -1665,28 +1665,28 @@ static const QEMUOption *lookup_opt(int argc, char **argv, static MachineClass *select_machine(QDict *qdict, Error **errp) { +ERRP_GUARD(); const char *machine_type = qdict_get_try_str(qdict, "type"); GSList *machines = object_class_get_list(TYPE_MACHINE, false); -MachineClass *machine_class; -Error *local_err = NULL; +MachineClass *machine_class = NULL; if (machine_type) { machine_class = find_machine(machine_type, machines); qdict_del(qdict, "type"); if (!machine_class) { -error_setg(_err, "unsupported machine type"); +error_setg(errp, "unsupported machine type"); } } else { machine_class = find_default_machine(machines); if (!machine_class) { -error_setg(_err, "No machine specified, and there is no default"); +error_setg(errp, "No machine specified, and there is no default"); } } g_slist_free(machines); -if (local_err) { -error_append_hint(_err, "Use -machine help to list supported machines\n"); -error_propagate(errp, local_err); +if (!machine_class) { +error_append_hint(errp, + "Use -machine help to list supported machines\n"); } return machine_class; } -- 2.39.2
[PULL 03/16] vl: Allow multiple -overcommit commands
From: Zide Chen Both cpu-pm and mem-lock are related to system resource overcommit, but they are separate from each other, in terms of how they are realized, and of course, they are applied to different system resources. It's tempting to use separate command lines to specify their behavior. e.g., in the following example, the cpu-pm command is quietly overwritten, and it's not easy to notice it without careful inspection. --overcommit mem-lock=on --overcommit cpu-pm=on Fixes: c8c9dc42b7ca ("Remove the deprecated -realtime option") Suggested-by: Thomas Huth Signed-off-by: Zide Chen Reviewed-by: Thomas Huth Reviewed-by: Zhao Liu Reviewed-by: Igor Mammedov Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- system/vl.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/system/vl.c b/system/vl.c index cfcb674425..4dc862652f 100644 --- a/system/vl.c +++ b/system/vl.c @@ -3546,8 +3546,8 @@ void qemu_init(int argc, char **argv) if (!opts) { exit(1); } -enable_mlock = qemu_opt_get_bool(opts, "mem-lock", false); -enable_cpu_pm = qemu_opt_get_bool(opts, "cpu-pm", false); +enable_mlock = qemu_opt_get_bool(opts, "mem-lock", enable_mlock); +enable_cpu_pm = qemu_opt_get_bool(opts, "cpu-pm", enable_cpu_pm); break; case QEMU_OPTION_compat: { -- 2.39.2
[PULL 10/16] os-posix: Expand setrlimit() syscall compatibility
From: Trent Huber Darwin uses a subtly different version of the setrlimit() syscall as described in the COMPATIBILITY section of the macOS man page. The value of the rlim_cur member has been adjusted accordingly for Darwin-based systems. Signed-off-by: Trent Huber Tested-by: Philippe Mathieu-Daudé Reviewed-by: Daniel P. Berrangé Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- os-posix.c | 4 1 file changed, 4 insertions(+) diff --git a/os-posix.c b/os-posix.c index a4284e2c07..43f9a43f3f 100644 --- a/os-posix.c +++ b/os-posix.c @@ -270,7 +270,11 @@ void os_setup_limits(void) return; } +#ifdef CONFIG_DARWIN +nofile.rlim_cur = OPEN_MAX < nofile.rlim_max ? OPEN_MAX : nofile.rlim_max; +#else nofile.rlim_cur = nofile.rlim_max; +#endif if (setrlimit(RLIMIT_NOFILE, ) < 0) { warn_report("unable to set NOFILE limit: %s", strerror(errno)); -- 2.39.2
[PULL 08/16] hw/arm/bcm2836: Remove unusued struct 'BCM283XClass'
From: "Dr. David Alan Gilbert" This struct has been unused since Commit f932093ae165 ("hw/arm/bcm2836: Split out common part of BCM283X classes") Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- hw/arm/bcm2836.c | 12 1 file changed, 12 deletions(-) diff --git a/hw/arm/bcm2836.c b/hw/arm/bcm2836.c index db191661f2..40a379bc36 100644 --- a/hw/arm/bcm2836.c +++ b/hw/arm/bcm2836.c @@ -18,18 +18,6 @@ #include "target/arm/cpu-qom.h" #include "target/arm/gtimer.h" -struct BCM283XClass { -/*< private >*/ -DeviceClass parent_class; -/*< public >*/ -const char *name; -const char *cpu_type; -unsigned core_count; -hwaddr peri_base; /* Peripheral base address seen by the CPU */ -hwaddr ctrl_base; /* Interrupt controller and mailboxes etc. */ -int clusterid; -}; - static Property bcm2836_enabled_cores_property = DEFINE_PROP_UINT32("enabled-cpus", BCM283XBaseState, enabled_cpus, 0); -- 2.39.2
[PULL 04/16] target/i386: Advertise MWAIT iff host supports
From: Zide Chen host_cpu_realizefn() sets CPUID_EXT_MONITOR without consulting host/KVM capabilities. This may cause problems: - If MWAIT/MONITOR is not available on the host, advertising this feature to the guest and executing MWAIT/MONITOR from the guest triggers #UD and the guest doesn't boot. This is because typically #UD takes priority over VM-Exit interception checks and KVM doesn't emulate MONITOR/MWAIT on #UD. - If KVM doesn't support KVM_X86_DISABLE_EXITS_MWAIT, MWAIT/MONITOR from the guest are intercepted by KVM, which is not what cpu-pm=on intends to do. In these cases, MWAIT/MONITOR should not be exposed to the guest. The logic in kvm_arch_get_supported_cpuid() to handle CPUID_EXT_MONITOR is correct and sufficient, and we can't set CPUID_EXT_MONITOR after x86_cpu_filter_features(). This was not an issue before commit 662175b91ff ("i386: reorder call to cpu_exec_realizefn") because the feature added in the accel-specific realizefn could be checked against host availability and filtered out. Additionally, it seems not a good idea to handle guest CPUID leaves in host_cpu_realizefn(), and this patch merges host_cpu_enable_cpu_pm() into kvm_cpu_realizefn(). Fixes: f5cc5a5c1686 ("i386: split cpu accelerators from cpu.c, using AccelCPUClass") Fixes: 662175b91ff2 ("i386: reorder call to cpu_exec_realizefn") Signed-off-by: Zide Chen Reviewed-by: Zhao Liu Reviewed-by: Xiaoyao Li Reviewed-by: Igor Mammedov Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- target/i386/host-cpu.c| 12 target/i386/kvm/kvm-cpu.c | 11 +-- 2 files changed, 9 insertions(+), 14 deletions(-) diff --git a/target/i386/host-cpu.c b/target/i386/host-cpu.c index 280e427c01..8b8bf5afec 100644 --- a/target/i386/host-cpu.c +++ b/target/i386/host-cpu.c @@ -42,15 +42,6 @@ static uint32_t host_cpu_phys_bits(void) return host_phys_bits; } -static void host_cpu_enable_cpu_pm(X86CPU *cpu) -{ -CPUX86State *env = >env; - -host_cpuid(5, 0, >mwait.eax, >mwait.ebx, - >mwait.ecx, >mwait.edx); -env->features[FEAT_1_ECX] |= CPUID_EXT_MONITOR; -} - static uint32_t host_cpu_adjust_phys_bits(X86CPU *cpu) { uint32_t host_phys_bits = host_cpu_phys_bits(); @@ -83,9 +74,6 @@ bool host_cpu_realizefn(CPUState *cs, Error **errp) X86CPU *cpu = X86_CPU(cs); CPUX86State *env = >env; -if (cpu->max_features && enable_cpu_pm) { -host_cpu_enable_cpu_pm(cpu); -} if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) { uint32_t phys_bits = host_cpu_adjust_phys_bits(cpu); diff --git a/target/i386/kvm/kvm-cpu.c b/target/i386/kvm/kvm-cpu.c index f9b99b5f50..d57a68a301 100644 --- a/target/i386/kvm/kvm-cpu.c +++ b/target/i386/kvm/kvm-cpu.c @@ -64,8 +64,15 @@ static bool kvm_cpu_realizefn(CPUState *cs, Error **errp) * cpu_common_realizefn() (via xcc->parent_realize) */ if (cpu->max_features) { -if (enable_cpu_pm && kvm_has_waitpkg()) { -env->features[FEAT_7_0_ECX] |= CPUID_7_0_ECX_WAITPKG; +if (enable_cpu_pm) { +if (kvm_has_waitpkg()) { +env->features[FEAT_7_0_ECX] |= CPUID_7_0_ECX_WAITPKG; +} + +if (env->features[FEAT_1_ECX] & CPUID_EXT_MONITOR) { +host_cpuid(5, 0, >mwait.eax, >mwait.ebx, + >mwait.ecx, >mwait.edx); + } } if (cpu->ucode_rev == 0) { cpu->ucode_rev = -- 2.39.2
[PULL 11/16] docs/cxl: fix some typos
From: Hyeongtak Ji This patch corrects minor typographical errors to ensure the ASCII art aligns with the explanations provided. Specifically, it fixes an incorrect root port reference and removes redundant words. Signed-off-by: Hyeongtak Ji Signed-off-by: Michael Tokarev --- docs/system/devices/cxl.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/system/devices/cxl.rst b/docs/system/devices/cxl.rst index 10a0e9bc9f..882b036f5e 100644 --- a/docs/system/devices/cxl.rst +++ b/docs/system/devices/cxl.rst @@ -218,17 +218,17 @@ Notes: A complex configuration here, might be to use the following HDM decoders in HB0. HDM0 routes CFMW0 requests to RP0 and hence part of CXL Type3 0. HDM1 routes CFMW0 requests from a -different region of the CFMW0 PA range to RP2 and hence part +different region of the CFMW0 PA range to RP1 and hence part of CXL Type 3 1. HDM2 routes yet another PA range from within CFMW0 to be interleaved across RP0 and RP1, providing 2 way interleave of part of the memory provided by CXL Type3 0 and CXL Type 3 1. HDM3 routes those interleaved accesses from CFMW1 that target HB0 to RP 0 and another part of the memory of CXL Type 3 0 (as part of a 2 way interleave at the system level -across for example CXL Type3 0 and CXL Type3 2. +across for example CXL Type3 0 and CXL Type3 2). HDM4 is used to enable system wide 4 way interleave across all the present CXL type3 devices, by interleaving those (interleaved) -requests that HB0 receives from from CFMW1 across RP 0 and +requests that HB0 receives from CFMW1 across RP 0 and RP 1 and hence to yet more regions of the memory of the attached Type3 devices. Note this is a representative subset of the full range of possible HDM decoder configurations in this -- 2.39.2
[PULL 16/16] hw/core/loader: gunzip(): fix memory leak on error path
From: Vladimir Sementsov-Ogievskiy We should call inflateEnd() like on success path to cleanup state in s variable. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- hw/core/loader.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/core/loader.c b/hw/core/loader.c index 2f8105d7de..a3bea1e718 100644 --- a/hw/core/loader.c +++ b/hw/core/loader.c @@ -610,6 +610,7 @@ ssize_t gunzip(void *dst, size_t dstlen, uint8_t *src, size_t srclen) r = inflate(, Z_FINISH); if (r != Z_OK && r != Z_STREAM_END) { printf ("Error: inflate() returned %d\n", r); +inflateEnd(); return -1; } dstbytes = s.next_out - (unsigned char *) dst; -- 2.39.2
[PULL 12/16] docs/system/devices/usb: Replace the non-existing "qemu" binary
From: Thomas Huth We don't ship a binary that is simply called "qemu", so we should avoid this in the documentation. Use the configurable binary name via "|qemu_system|" instead. Signed-off-by: Thomas Huth Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- docs/system/devices/usb.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/system/devices/usb.rst b/docs/system/devices/usb.rst index a6ca7b0c37..dc694d23c2 100644 --- a/docs/system/devices/usb.rst +++ b/docs/system/devices/usb.rst @@ -18,7 +18,7 @@ emulation uses less resources (especially CPU). So if your guest supports XHCI (which should be the case for any operating system released around 2010 or later) we recommend using it: -qemu -device qemu-xhci +|qemu_system| -device qemu-xhci XHCI supports USB 1.1, USB 2.0 and USB 3.0 devices, so this is the only controller you need. With only a single USB controller (and -- 2.39.2
[PULL 07/16] linux-user: sparc: Remove unused struct 'target_mc_fq'
From: "Dr. David Alan Gilbert" This struct is unused since Peter's Commit b8ae597f0e6d ("linux-user/sparc: Fix errors in target_ucontext structures") However, hmm, I'm a bit confused since that commit modifies the structure and then removes it, was that intentional? Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- linux-user/sparc/signal.c | 5 - 1 file changed, 5 deletions(-) diff --git a/linux-user/sparc/signal.c b/linux-user/sparc/signal.c index f164b74032..8181b8b92c 100644 --- a/linux-user/sparc/signal.c +++ b/linux-user/sparc/signal.c @@ -546,11 +546,6 @@ void setup_sigtramp(abi_ulong sigtramp_page) typedef abi_ulong target_mc_greg_t; typedef target_mc_greg_t target_mc_gregset_t[SPARC_MC_NGREG]; -struct target_mc_fq { -abi_ulong mcfq_addr; -uint32_t mcfq_insn; -}; - /* * Note the manual 16-alignment; the kernel gets this because it * includes a "long double qregs[16]" in the mcpu_fregs union, -- 2.39.2
[PULL 14/16] vl.c: select_machine(): use g_autoptr
From: Vladimir Sementsov-Ogievskiy Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- system/vl.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/system/vl.c b/system/vl.c index fda93d150c..92fc29c193 100644 --- a/system/vl.c +++ b/system/vl.c @@ -1667,7 +1667,7 @@ static MachineClass *select_machine(QDict *qdict, Error **errp) { ERRP_GUARD(); const char *machine_type = qdict_get_try_str(qdict, "type"); -GSList *machines = object_class_get_list(TYPE_MACHINE, false); +g_autoptr(GSList) machines = object_class_get_list(TYPE_MACHINE, false); MachineClass *machine_class = NULL; if (machine_type) { @@ -1683,7 +1683,6 @@ static MachineClass *select_machine(QDict *qdict, Error **errp) } } -g_slist_free(machines); if (!machine_class) { error_append_hint(errp, "Use -machine help to list supported machines\n"); -- 2.39.2
[PULL 01/16] hmp-commands-info.hx: Add missing info command for stats subcommand
From: Martin Joerg Signed-off-by: Martin Joerg Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- hmp-commands-info.hx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx index cfd4ad5651..c59cd6637b 100644 --- a/hmp-commands-info.hx +++ b/hmp-commands-info.hx @@ -892,7 +892,7 @@ ERST }, SRST - ``stats`` + ``info stats`` Show runtime-collected statistics ERST -- 2.39.2
[PULL 06/16] linux-user: cris: Remove unused struct 'rt_signal_frame'
From: "Dr. David Alan Gilbert" Since 'setup_rt_frame' has never been implemented, this struct is unused. Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Richard Henderson Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- linux-user/cris/signal.c | 8 1 file changed, 8 deletions(-) diff --git a/linux-user/cris/signal.c b/linux-user/cris/signal.c index 4f532b2903..10948bcf30 100644 --- a/linux-user/cris/signal.c +++ b/linux-user/cris/signal.c @@ -35,14 +35,6 @@ struct target_signal_frame { uint16_t retcode[4]; /* Trampoline code. */ }; -struct rt_signal_frame { -siginfo_t *pinfo; -void *puc; -siginfo_t info; -ucontext_t uc; -uint16_t retcode[4]; /* Trampoline code. */ -}; - static void setup_sigcontext(struct target_sigcontext *sc, CPUCRISState *env) { __put_user(env->regs[0], >regs.r0); -- 2.39.2
[PULL 02/16] cpu: fix memleak of 'halt_cond' and 'thread'
From: Matheus Tavares Bernardino Since a4c2735f35 (cpu: move Qemu[Thread|Cond] setup into common code, 2024-05-30) these fields are now allocated at cpu_common_initfn(). So let's make sure we also free them at cpu_common_finalize(). Furthermore, the code also frees these on round robin, but we missed 'halt_cond'. Signed-off-by: Matheus Tavares Bernardino Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Pierrick Bouvier Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- accel/tcg/tcg-accel-ops-rr.c | 1 + hw/core/cpu-common.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c index 84c36c1450..48c38714bd 100644 --- a/accel/tcg/tcg-accel-ops-rr.c +++ b/accel/tcg/tcg-accel-ops-rr.c @@ -329,6 +329,7 @@ void rr_start_vcpu_thread(CPUState *cpu) /* we share the thread, dump spare data */ g_free(cpu->thread); qemu_cond_destroy(cpu->halt_cond); +g_free(cpu->halt_cond); cpu->thread = single_tcg_cpu_thread; cpu->halt_cond = single_tcg_halt_cond; diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c index bf1a7b8892..f131cde2c0 100644 --- a/hw/core/cpu-common.c +++ b/hw/core/cpu-common.c @@ -286,6 +286,9 @@ static void cpu_common_finalize(Object *obj) g_array_free(cpu->gdb_regs, TRUE); qemu_lockcnt_destroy(>in_ioctl_lock); qemu_mutex_destroy(>work_mutex); +qemu_cond_destroy(cpu->halt_cond); +g_free(cpu->halt_cond); +g_free(cpu->thread); } static int64_t cpu_common_get_arch_id(CPUState *cpu) -- 2.39.2
[PULL 09/16] net/can: Remove unused struct 'CanBusState'
From: "Dr. David Alan Gilbert" As far as I can tell this struct has never been used in this file (it is used in can_core.c). Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Michael Tokarev Signed-off-by: Michael Tokarev --- net/can/can_host.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/net/can/can_host.c b/net/can/can_host.c index a3c84028c6..b2fe553f91 100644 --- a/net/can/can_host.c +++ b/net/can/can_host.c @@ -34,12 +34,6 @@ #include "net/can_emu.h" #include "net/can_host.h" -struct CanBusState { -Object object; - -QTAILQ_HEAD(, CanBusClientState) clients; -}; - static void can_host_disconnect(CanHostState *ch) { CanHostClass *chc = CAN_HOST_GET_CLASS(ch); -- 2.39.2
Re: [PATCH v6] virtio-net: Fix network stall at the host side waiting for kick
Thanks for the patch! Yes something to improve: On Sun, Jun 30, 2024 at 02:36:15PM +0800, Wencheng Yang wrote: > From: thomas > > Patch 06b12970174 ("virtio-net: fix network stall under load") > added double-check to test whether the available buffer size > can satisfy the request or not, in case the guest has added > some buffers to the avail ring simultaneously after the first > check. It will be lucky if the available buffer size becomes > okay after the double-check, then the host can send the packet > to the guest. If the buffer size still can't satisfy the request, > even if the guest has added some buffers, viritio-net would > stall at the host side forever. > > The patch checks whether the guest has added some buffers > after last check of avail idx when the available buffers are > not sufficient, if so then recheck the available buffers > in the loop. > > The patch also reverts patch "06b12970174". > > The case below can reproduce the stall. > >Guest 0 > ++ > | iperf | > ---> | server | > Host |++ >++ |... >| iperf | >| client | Guest n >++ |++ > || iperf | > ---> | server | > ++ > > Boot many guests from qemu with virtio network: > qemu ... -netdev tap,id=net_x \ > -device virtio-net-pci-non-transitional,\ > iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x > > Each guest acts as iperf server with commands below: > iperf3 -s -D -i 10 -p 8001 > iperf3 -s -D -i 10 -p 8002 > > The host as iperf client: > iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 4 > iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 4 > > After some time, the host loses connection to the guest, > the guest can send packet to the host, but can't receive > packet from host. > > It's more likely to happen if SWIOTLB is enabled in the guest, > allocating and freeing bounce buffer takes some CPU ticks, > copying from/to bounce buffer takes more CPU ticks, compared > with that there is no bounce buffer in the guest. > Once the rate of producing packets from the host approximates > the rate of receiveing packets in the guest, the guest would > loop in NAPI. > > receive packets--- >| | >v | >free buf virtnet_poll >| | >v | > add buf to avail ring --- >| >| need kick the host? >| NAPI continues >v > receive packets--- >| | >v | >free buf virtnet_poll >| | >v | > add buf to avail ring --- >| >v > ... ... > > On the other hand, the host fetches free buf from avail > ring, if the buf in the avail ring is not enough, the > host notifies the guest the event by writing the avail > idx read from avail ring to the event idx of used ring, > then the host goes to sleep, waiting for the kick signal > from the guest. > > Once the guest finds the host is waiting for kick singal > (in virtqueue_kick_prepare_split()), it kicks the host. > > The host may stall forever at the sequences below: > > HostGuest > --- > fetch buf, send packet receive packet --- > ... ... | > fetch buf, send packet add buf | > ...add buf virtnet_poll > buf not enough avail idx-> add buf | > read avail idx add buf | > add buf --- > receive packet --- > write event idx ... | > waiting for kick add buf virtnet_poll > ... | > --- > no more packet, exit NAPI > > In the first loop of NAPI above, indicated in the range of > virtnet_poll above, the host is sending packets while the > guest is receiving packets and adding buf. > step 1: The buf is not enough, for example, a big packet > needs 5 buf, but the available buf count is 3. > The host read current avail idx. > step 2: The guest adds some buf, then checks whether the > host is waiting for kick signal, not at this time. > The used ring is not
[PATCH] hw/usb: Fix memory leak in musb_reset()
The musb_reset function was causing a memory leak by not properly freeing the memory associated with USBPacket instances before reinitializing them. This commit addresses the memory leak by adding calls to usb_packet_cleanup for each USBPacket instance before reinitializing them with usb_packet_init. Asan log: =2970623==ERROR: LeakSanitizer: detected memory leaks Direct leak of 256 byte(s) in 16 object(s) allocated from: #0 0x561e20629c3d in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7fee91885738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x561e21b4d0e1 in usb_packet_init hw/usb/core.c:531:5 #3 0x561e21c5016b in musb_reset hw/usb/hcd-musb.c:372:9 #4 0x561e21c502a9 in musb_init hw/usb/hcd-musb.c:385:5 #5 0x561e21c893ef in tusb6010_realize hw/usb/tusb6010.c:827:15 #6 0x561e23443355 in device_set_realized hw/core/qdev.c:510:13 #7 0x561e2346ac1b in property_set_bool qom/object.c:2354:5 #8 0x561e23463895 in object_property_set qom/object.c:1463:5 #9 0x561e23477909 in object_property_set_qobject qom/qom-qobject.c:28:10 #10 0x561e234645ed in object_property_set_bool qom/object.c:1533:15 #11 0x561e2343c830 in qdev_realize hw/core/qdev.c:291:12 #12 0x561e2343c874 in qdev_realize_and_unref hw/core/qdev.c:298:11 #13 0x561e20ad5091 in sysbus_realize_and_unref hw/core/sysbus.c:261:12 #14 0x561e22553283 in n8x0_usb_setup hw/arm/nseries.c:800:5 #15 0x561e2254e99b in n8x0_init hw/arm/nseries.c:1356:5 #16 0x561e22561170 in n810_init hw/arm/nseries.c:1418:5 Signed-off-by: Zheyu Ma --- hw/usb/hcd-musb.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c index 6dca373cb1..0300aeaec6 100644 --- a/hw/usb/hcd-musb.c +++ b/hw/usb/hcd-musb.c @@ -368,6 +368,8 @@ void musb_reset(MUSBState *s) s->ep[i].maxp[1] = 0x40; s->ep[i].musb = s; s->ep[i].epnum = i; +usb_packet_cleanup(>ep[i].packey[0].p); +usb_packet_cleanup(>ep[i].packey[1].p); usb_packet_init(>ep[i].packey[0].p); usb_packet_init(>ep[i].packey[1].p); } -- 2.34.1
[PATCH] hw/misc/bcm2835_thermal: Handle invalid address accesses gracefully
This commit handles invalid address accesses gracefully in both read and write functions. Instead of asserting and aborting, it logs an error message and returns a neutral value for read operations and does nothing for write operations. Error log: ERROR:hw/misc/bcm2835_thermal.c:55:bcm2835_thermal_read: code should not be reached Bail out! ERROR:hw/misc/bcm2835_thermal.c:55:bcm2835_thermal_read: code should not be reached Aborted Reproducer: cat << EOF | qemu-system-aarch64 -display \ none -machine accel=qtest, -m 512M -machine raspi3b -m 1G -qtest stdio readw 0x3f212003 EOF Signed-off-by: Zheyu Ma --- hw/misc/bcm2835_thermal.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/hw/misc/bcm2835_thermal.c b/hw/misc/bcm2835_thermal.c index ee7816b8a5..5c2a429d58 100644 --- a/hw/misc/bcm2835_thermal.c +++ b/hw/misc/bcm2835_thermal.c @@ -51,8 +51,10 @@ static uint64_t bcm2835_thermal_read(void *opaque, hwaddr addr, unsigned size) val = FIELD_DP32(bcm2835_thermal_temp2adc(25), STAT, VALID, true); break; default: -/* MemoryRegionOps are aligned, so this can not happen. */ -g_assert_not_reached(); +qemu_log_mask(LOG_GUEST_ERROR, + "bcm2835_thermal_read: invalid address 0x%" + HWADDR_PRIx "\n", addr); +val = 0; } return val; } @@ -72,8 +74,10 @@ static void bcm2835_thermal_write(void *opaque, hwaddr addr, __func__, value, addr); break; default: -/* MemoryRegionOps are aligned, so this can not happen. */ -g_assert_not_reached(); +qemu_log_mask(LOG_GUEST_ERROR, + "bcm2835_thermal_write: invalid address 0x%" + HWADDR_PRIx "\n", addr); +break; } } -- 2.34.1
Re: [PATCH v3 1/2] tests/avocado: update firmware for sbsa-ref
On Thu, 20 Jun 2024 at 12:20, Marcin Juszkiewicz wrote: > > Update firmware to have graphics card memory fix from EDK2 commit > c1d1910be6e04a8b1a73090cf2881fb698947a6e: > > OvmfPkg/QemuVideoDxe: add feature PCD to remap framebuffer W/C > > Some platforms (such as SBSA-QEMU on recent builds of the emulator) only > tolerate misaligned accesses to normal memory, and raise alignment > faults on such accesses to device memory, which is the default for PCIe > MMIO BARs. > > When emulating a PCIe graphics controller, the framebuffer is typically > exposed via a MMIO BAR, while the disposition of the region is closer to > memory (no side effects on reads or writes, except for the changing > picture on the screen; direct random access to any pixel in the image). > > In order to permit the use of such controllers on platforms that only > tolerate these types of accesses for normal memory, it is necessary to > remap the memory. Use the DXE services to set the desired capabilities > and attributes. > > Hide this behavior under a feature PCD so only platforms that really > need it can enable it. (OVMF on x86 has no need for this) > > With this fix enabled we can boot sbsa-ref with more than one cpu core. > This requires an explanation: what does the number of CPU cores have to do with the memory attributes used for the framebuffer? > Signed-off-by: Marcin Juszkiewicz > --- > tests/avocado/machine_aarch64_sbsaref.py | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/tests/avocado/machine_aarch64_sbsaref.py > b/tests/avocado/machine_aarch64_sbsaref.py > index 6bb82f2a03..e854ec6a1a 100644 > --- a/tests/avocado/machine_aarch64_sbsaref.py > +++ b/tests/avocado/machine_aarch64_sbsaref.py > @@ -37,18 +37,18 @@ def fetch_firmware(self): > > Used components: > > -- Trusted Firmware 2.11.0 > -- Tianocore EDK2 stable202405 > -- Tianocore EDK2-platforms commit 4bbd0ed > +- Trusted Firmware v2.11.0 > +- Tianocore EDK2 4d4f569924 > +- Tianocore EDK2-platforms 3f08401 > > """ > > # Secure BootRom (TF-A code) > fs0_xz_url = ( > > "https://artifacts.codelinaro.org/artifactory/linaro-419-sbsa-ref/; > -"20240528-140808/edk2/SBSA_FLASH0.fd.xz" > +"20240619-148232/edk2/SBSA_FLASH0.fd.xz" > ) > -fs0_xz_hash = > "fa6004900b67172914c908b78557fec4d36a5f784f4c3dd08f49adb75e1892a9" > +fs0_xz_hash = > "0c954842a590988f526984de22e21ae0ab9cb351a0c99a8a58e928f0c7359cf7" > tar_xz_path = self.fetch_asset(fs0_xz_url, asset_hash=fs0_xz_hash, >algorithm='sha256') > archive.extract(tar_xz_path, self.workdir) > @@ -57,9 +57,9 @@ def fetch_firmware(self): > # Non-secure rom (UEFI and EFI variables) > fs1_xz_url = ( > > "https://artifacts.codelinaro.org/artifactory/linaro-419-sbsa-ref/; > -"20240528-140808/edk2/SBSA_FLASH1.fd.xz" > +"20240619-148232/edk2/SBSA_FLASH1.fd.xz" > ) > -fs1_xz_hash = > "5f3747d4000bc416d9641e33ff4ac60c3cc8cb74ca51b6e932e58531c62eb6f7" > +fs1_xz_hash = > "c6ec39374c4d79bb9e9cdeeb6db44732d90bb4a334cec92002b3f4b9cac4b5ee" > tar_xz_path = self.fetch_asset(fs1_xz_url, asset_hash=fs1_xz_hash, >algorithm='sha256') > archive.extract(tar_xz_path, self.workdir) > > -- > 2.45.1 >
[PATCH] virtio: Implement Virtio Backend for SD/MMC in QEMU
Add a Virtio backend for SD/MMC devices. Confirmed interoperability with Linux. Signed-off-by: Mikhail Krasheninnikov CC: Matwey Kornilov CC: qemu-bl...@nongnu.org CC: Michael S. Tsirkin --- hw/virtio/Kconfig | 5 + hw/virtio/meson.build | 2 + hw/virtio/virtio-mmc-pci.c | 85 ++ hw/virtio/virtio-mmc.c | 165 hw/virtio/virtio.c | 3 +- include/hw/virtio/virtio-mmc.h | 20 +++ include/standard-headers/linux/virtio_ids.h | 1 + 7 files changed, 280 insertions(+), 1 deletion(-) create mode 100644 hw/virtio/virtio-mmc-pci.c create mode 100644 hw/virtio/virtio-mmc.c create mode 100644 include/hw/virtio/virtio-mmc.h diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig index 92c9cf6c96..e5fa7607c4 100644 --- a/hw/virtio/Kconfig +++ b/hw/virtio/Kconfig @@ -105,3 +105,8 @@ config VHOST_USER_SCMI bool default y depends on VIRTIO && VHOST_USER + +config VIRTIO_MMC +bool +default y +depends on VIRTIO \ No newline at end of file diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build index 47baf00366..1ff095c5bc 100644 --- a/hw/virtio/meson.build +++ b/hw/virtio/meson.build @@ -41,6 +41,7 @@ specific_virtio_ss.add(when: 'CONFIG_VHOST_USER_GPIO', if_true: files('vhost-use specific_virtio_ss.add(when: ['CONFIG_VIRTIO_PCI', 'CONFIG_VHOST_USER_GPIO'], if_true: files('vhost-user-gpio-pci.c')) specific_virtio_ss.add(when: 'CONFIG_VHOST_USER_SCMI', if_true: files('vhost-user-scmi.c')) specific_virtio_ss.add(when: ['CONFIG_VIRTIO_PCI', 'CONFIG_VHOST_USER_SCMI'], if_true: files('vhost-user-scmi-pci.c')) +specific_virtio_ss.add(when: 'CONFIG_VIRTIO_MMC', if_true: files('virtio-mmc.c')) virtio_pci_ss = ss.source_set() virtio_pci_ss.add(when: 'CONFIG_VHOST_VSOCK', if_true: files('vhost-vsock-pci.c')) @@ -68,6 +69,7 @@ virtio_pci_ss.add(when: 'CONFIG_VIRTIO_IOMMU', if_true: files('virtio-iommu-pci. virtio_pci_ss.add(when: 'CONFIG_VIRTIO_MEM', if_true: files('virtio-mem-pci.c')) virtio_pci_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev-pci.c')) virtio_pci_ss.add(when: 'CONFIG_VIRTIO_MD', if_true: files('virtio-md-pci.c')) +virtio_pci_ss.add(when: 'CONFIG_VIRTIO_MMC', if_true: files('virtio-mmc-pci.c')) specific_virtio_ss.add_all(when: 'CONFIG_VIRTIO_PCI', if_true: virtio_pci_ss) diff --git a/hw/virtio/virtio-mmc-pci.c b/hw/virtio/virtio-mmc-pci.c new file mode 100644 index 00..f0ed17d03b --- /dev/null +++ b/hw/virtio/virtio-mmc-pci.c @@ -0,0 +1,85 @@ +#include "qemu/osdep.h" + +#include "hw/virtio/virtio-pci.h" +#include "hw/virtio/virtio-mmc.h" +#include "hw/qdev-properties-system.h" +#include "qemu/typedefs.h" +#include "qapi/error.h" +#include "sysemu/block-backend-global-state.h" + +typedef struct VirtIOMMCPCI VirtIOMMCPCI; + +/* + * virtio-mmc-pci: This extends VirtioPCIProxy. + */ +#define TYPE_VIRTIO_MMC_PCI "virtio-mmc-pci-base" +DECLARE_INSTANCE_CHECKER(VirtIOMMCPCI, VIRTIO_MMC_PCI, + TYPE_VIRTIO_MMC_PCI) + +struct VirtIOMMCPCI { +VirtIOPCIProxy parent_obj; +VirtIOMMC vdev; +BlockBackend *blk; +}; + +static void virtio_mmc_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) +{ +VirtIOMMCPCI *vmmc = VIRTIO_MMC_PCI(vpci_dev); +DeviceState *dev = DEVICE(>vdev); + +if (!vmmc->blk) { +error_setg(errp, "Drive property not set"); +return; +} +VirtIOMMC *vmmc_dev = >vdev; +vmmc_dev->blk = vmmc->blk; +blk_detach_dev(vmmc->blk, DEVICE(vpci_dev)); + +qdev_set_parent_bus(dev, BUS(_dev->bus), errp); + +virtio_pci_force_virtio_1(vpci_dev); +object_property_set_bool(OBJECT(dev), "realized", true, errp); +} + +static Property virtio_mmc_properties[] = { +DEFINE_PROP_DRIVE("drive", VirtIOMMCPCI, blk), +DEFINE_PROP_END_OF_LIST(), +}; + +static void virtio_mmc_pci_class_init(ObjectClass *oc, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(oc); +VirtioPCIClass *virtio_pci_class = VIRTIO_PCI_CLASS(oc); +PCIDeviceClass *pci_device_class = PCI_DEVICE_CLASS(oc); + +device_class_set_props(dc, virtio_mmc_properties); +set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); + +virtio_pci_class->realize = virtio_mmc_pci_realize; + +pci_device_class->revision = VIRTIO_PCI_ABI_VERSION; +pci_device_class->class_id = PCI_CLASS_MEMORY_FLASH; +} + +static void virtio_mmc_pci_instance_init(Object *obj) +{ +VirtIOMMCPCI *dev = VIRTIO_MMC_PCI(obj); + +virtio_instance_init_common(obj, >vdev, sizeof(dev->vdev), +TYPE_VIRTIO_MMC); +} + +static const VirtioPCIDeviceTypeInfo virtio_mmc_pci_info = { +.base_name = TYPE_VIRTIO_MMC_PCI, +.generic_name = "virtio-mmc-pci", +.instance_size = sizeof(VirtIOMMCPCI), +.class_init= virtio_mmc_pci_class_init, +.instance_init = virtio_mmc_pci_instance_init, +}; + +static void
[PATCH] hw/display/tcx: Fix out-of-bounds access in tcx_blit_writel
This patch addresses a potential out-of-bounds memory access issue in the tcx_blit_writel function. It adds bounds checking to ensure that memory accesses do not exceed the allocated VRAM size. If an out-of-bounds access is detected, an error is logged using qemu_log_mask. ASAN log: ==2960379==ERROR: AddressSanitizer: SEGV on unknown address 0x7f524752fd01 (pc 0x7f525c2c4881 bp 0x7ffdaf87bfd0 sp 0x7ffdaf87b788 T0) ==2960379==The signal is caused by a READ memory access. #0 0x7f525c2c4881 in memcpy string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:222 #1 0x55aa782bd5b1 in __asan_memcpy llvm/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3 #2 0x55aa7854dedd in tcx_blit_writel hw/display/tcx.c:590:13 Reproducer: cat << EOF | qemu-system-sparc -display none \ -machine accel=qtest, -m 512M -machine LX -m 256 -qtest stdio writel 0x562e98c4 0x3d92fd01 EOF Signed-off-by: Zheyu Ma --- hw/display/tcx.c | 9 + 1 file changed, 9 insertions(+) diff --git a/hw/display/tcx.c b/hw/display/tcx.c index 99507e7638..af43bea7f2 100644 --- a/hw/display/tcx.c +++ b/hw/display/tcx.c @@ -33,6 +33,7 @@ #include "migration/vmstate.h" #include "qemu/error-report.h" #include "qemu/module.h" +#include "qemu/log.h" #include "qom/object.h" #define TCX_ROM_FILE "QEMU,tcx.bin" @@ -577,6 +578,14 @@ static void tcx_blit_writel(void *opaque, hwaddr addr, addr = (addr >> 3) & 0xf; adsr = val & 0xff; len = ((val >> 24) & 0x1f) + 1; + +if (addr + len > s->vram_size || adsr + len > s->vram_size) { +qemu_log_mask(LOG_GUEST_ERROR, + "%s: VRAM access out of bounds. addr: 0x%lx, adsr: 0x%x, len: %u\n", + __func__, addr, adsr, len); +return; +} + if (adsr == 0xff) { memset(>vram[addr], s->tmpblit, len); if (s->depth == 24) { -- 2.34.1
Question: xen + vhost user
Hi All, I am trying to enable vhost user input with xen hypervisor on i.MX95, using qemu vhost-user-input. But meet " Invalid vring_addr message ". My xen domu cfg: '-chardev', 'socket,path=/tmp/input.sock,id=mouse0', '-device', 'vhost-user-input-pci,chardev=mouse0', Anyone knows what missing? Partial error log: Vhost user message Request: VHOST_USER_SET_VRING_ADDR (9) Flags: 0x1 Size:40 vhost_vring_addr: index: 0 flags: 0 desc_user_addr: 0x889b used_user_addr: 0x889b04c0 avail_user_addr: 0x889b0400 log_guest_addr: 0x444714c0 Setting virtq addresses: vring_desc at (nil) vring_used at (nil) vring_avail at (nil) ** (vhost-user-input:1816): CRITICAL **: 07:20:46.077: Invalid vring_addr message Thanks, Peng. The full vhost user debug log: ./vhost-user-input --socket-path=/tmp/input.sock --evdev-path=/d -path=/dev/input/event1 ./vhost-user-input --socket-path=/tmp/input.sock --evdev- Vhost user message Request: VHOST_USER_GET_FEATURES (1) Flags: 0x1 Size:0 Sending back to guest u64: 0x00017500 Vhost user message Request: VHOST_USER_GET_PROTOCOL_FEATURES (15) Flags: 0x1 Size:0 Vhost user message Request: VHOST_USER_SET_PROTOCOL_FEATURES (16) Flags: 0x1 Size:8 u64: 0x8e2b Vhost user message Request: VHOST_USER_GET_QUEUE_NUM (17) Flags: 0x1 Size:0 Vhost user message Request: VHOST_USER_GET_MAX_MEM_SLOTS (36) Flags: 0x1 Size:0 u64: 0x0020 Vhost user message Request: VHOST_USER_SET_BACKEND_REQ_FD (21) Flags: 0x9 Size:0 Fds: 6 Got backend_fd: 6 Vhost user message Request: VHOST_USER_SET_OWNER (3) Flags: 0x1 Size:0 Vhost user message Request: VHOST_USER_GET_FEATURES (1) Flags: 0x1 Size:0 Sending back to guest u64: 0x00017500 Vhost user message Request: VHOST_USER_SET_VRING_CALL (13) Flags: 0x1 Size:8 Fds: 7 u64: 0x Got call_fd: 7 for vq: 0 Vhost user message Request: VHOST_USER_SET_VRING_ERR (14) Flags: 0x1 Size:8 Fds: 8 u64: 0x Vhost user message Request: VHOST_USER_SET_VRING_CALL (13) Flags: 0x1 Size:8 Fds: 9 u64: 0x0001 Got call_fd: 9 for vq: 1 Vhost user message Request: VHOST_USER_SET_VRING_ERR (14) Flags: 0x1 Size:8 Fds: 10 u64: 0x0001 (XEN) d2v0 Unhandled SMC/HVC: 0x8450 (XEN) d2v0 Unhandled SMC/HVC: 0x8600ff01 (XEN) d2v0: vGICD: RAZ on reserved register offset 0x0c (XEN) d2v0: vGICD: unhandled word write 0x00 to ICACTIVER4 (XEN) d2v0: vGICR: SGI: unhandled word write 0x00 to ICACTIVER0 Vhost user message Request: VHOST_USER_SET_CONFIG (25) Flags: 0x9 Size:148 Vhost user message Request: VHOST_USER_SET_CONFIG (25) Flags: 0x9 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags: 0x1 Size:148 Vhost user message Request: VHOST_USER_GET_CONFIG (24) Flags:
[PATCH v2 3/3] target/ppc : Update VSX storage access insns to use tcg_gen_qemu _ld/st_i128.
Updated many VSX instructions to use tcg_gen_qemu_ld/st_i128, instead of using tcg_gen_qemu_ld/st_i64 consecutively. Introduced functions {get,set}_vsr_full to facilitate the above & for future use. Suggested-by: Richard Henderson Signed-off-by: Chinmay Rath --- target/ppc/translate/vsx-impl.c.inc | 70 + 1 file changed, 31 insertions(+), 39 deletions(-) diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index 26ebf3fedf..b622831a73 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -10,6 +10,16 @@ static inline void set_cpu_vsr(int n, TCGv_i64 src, bool high) tcg_gen_st_i64(src, tcg_env, vsr64_offset(n, high)); } +static inline void get_vsr_full(TCGv_i128 dst, int reg) +{ +tcg_gen_ld_i128(dst, tcg_env, vsr_full_offset(reg)); +} + +static inline void set_vsr_full(int reg, TCGv_i128 src) +{ +tcg_gen_st_i128(src, tcg_env, vsr_full_offset(reg)); +} + static inline TCGv_ptr gen_vsr_ptr(int reg) { TCGv_ptr r = tcg_temp_new_ptr(); @@ -196,20 +206,17 @@ static bool trans_LXVH8X(DisasContext *ctx, arg_LXVH8X *a) static bool trans_LXVB16X(DisasContext *ctx, arg_LXVB16X *a) { TCGv EA; -TCGv_i64 xth, xtl; +TCGv_i128 data; REQUIRE_VSX(ctx); REQUIRE_INSNS_FLAGS2(ctx, ISA300); -xth = tcg_temp_new_i64(); -xtl = tcg_temp_new_i64(); +data = tcg_temp_new_i128(); gen_set_access_type(ctx, ACCESS_INT); EA = do_ea_calc(ctx, a->ra, cpu_gpr[a->rb]); -tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEUQ); -tcg_gen_addi_tl(EA, EA, 8); -tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEUQ); -set_cpu_vsr(a->rt, xth, true); -set_cpu_vsr(a->rt, xtl, false); +tcg_gen_qemu_ld_i128(data, EA, ctx->mem_idx, + MO_BE | MO_128 | MO_ATOM_IFALIGN_PAIR); +set_vsr_full(a->rt, data); return true; } @@ -385,20 +392,17 @@ static bool trans_STXVH8X(DisasContext *ctx, arg_STXVH8X *a) static bool trans_STXVB16X(DisasContext *ctx, arg_STXVB16X *a) { TCGv EA; -TCGv_i64 xsh, xsl; +TCGv_i128 data; REQUIRE_VSX(ctx); REQUIRE_INSNS_FLAGS2(ctx, ISA300); -xsh = tcg_temp_new_i64(); -xsl = tcg_temp_new_i64(); -get_cpu_vsr(xsh, a->rt, true); -get_cpu_vsr(xsl, a->rt, false); +data = tcg_temp_new_i128(); gen_set_access_type(ctx, ACCESS_INT); EA = do_ea_calc(ctx, a->ra, cpu_gpr[a->rb]); -tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_BEUQ); -tcg_gen_addi_tl(EA, EA, 8); -tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_BEUQ); +get_vsr_full(data, a->rt); +tcg_gen_qemu_st_i128(data, EA, ctx->mem_idx, + MO_BE | MO_128 | MO_ATOM_IFALIGN_PAIR); return true; } @@ -2175,13 +2179,13 @@ static bool do_lstxv(DisasContext *ctx, int ra, TCGv displ, int rt, bool store, bool paired) { TCGv ea; -TCGv_i64 xt; +TCGv_i128 data; MemOp mop; int rt1, rt2; -xt = tcg_temp_new_i64(); +data = tcg_temp_new_i128(); -mop = DEF_MEMOP(MO_UQ); +mop = DEF_MEMOP(MO_128 | MO_ATOM_IFALIGN_PAIR); gen_set_access_type(ctx, ACCESS_INT); ea = do_ea_calc(ctx, ra, displ); @@ -2195,32 +2199,20 @@ static bool do_lstxv(DisasContext *ctx, int ra, TCGv displ, } if (store) { -get_cpu_vsr(xt, rt1, !ctx->le_mode); -tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop); -gen_addr_add(ctx, ea, ea, 8); -get_cpu_vsr(xt, rt1, ctx->le_mode); -tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop); +get_vsr_full(data, rt1); +tcg_gen_qemu_st_i128(data, ea, ctx->mem_idx, mop); if (paired) { gen_addr_add(ctx, ea, ea, 8); -get_cpu_vsr(xt, rt2, !ctx->le_mode); -tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop); -gen_addr_add(ctx, ea, ea, 8); -get_cpu_vsr(xt, rt2, ctx->le_mode); -tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop); +get_vsr_full(data, rt2); +tcg_gen_qemu_st_i128(data, ea, ctx->mem_idx, mop); } } else { -tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop); -set_cpu_vsr(rt1, xt, !ctx->le_mode); -gen_addr_add(ctx, ea, ea, 8); -tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop); -set_cpu_vsr(rt1, xt, ctx->le_mode); +tcg_gen_qemu_ld_i128(data, ea, ctx->mem_idx, mop); +set_vsr_full(rt1, data); if (paired) { gen_addr_add(ctx, ea, ea, 8); -tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop); -set_cpu_vsr(rt2, xt, !ctx->le_mode); -gen_addr_add(ctx, ea, ea, 8); -tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop); -set_cpu_vsr(rt2, xt, ctx->le_mode); +tcg_gen_qemu_ld_i128(data, ea, ctx->mem_idx, mop); +set_vsr_full(rt2, data); } } return true; -- 2.39.3
[PATCH v2 2/3] target/ppc: Update VMX storage access insns to use tcg_gen_qemu_ld/st_i128.
Updated instructions {l, st}vx to use tcg_gen_qemu_ld/st_i128, instead of using 64 bits loads/stores in succession. Introduced functions {get, set}_avr_full in vmx-impl.c.inc to facilitate the above, and potential future usage. Suggested-by: Richard Henderson Signed-off-by: Chinmay Rath --- target/ppc/translate/vmx-impl.c.inc | 42 ++--- 1 file changed, 20 insertions(+), 22 deletions(-) diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index a182d2cf81..70d0ad2e71 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -24,25 +24,29 @@ static inline void set_avr64(int regno, TCGv_i64 src, bool high) tcg_gen_st_i64(src, tcg_env, avr64_offset(regno, high)); } +static inline void get_avr_full(TCGv_i128 dst, int regno) +{ +tcg_gen_ld_i128(dst, tcg_env, avr_full_offset(regno)); +} + +static inline void set_avr_full(int regno, TCGv_i128 src) +{ +tcg_gen_st_i128(src, tcg_env, avr_full_offset(regno)); +} + static bool trans_LVX(DisasContext *ctx, arg_X *a) { TCGv EA; -TCGv_i64 avr; +TCGv_i128 avr; REQUIRE_INSNS_FLAGS(ctx, ALTIVEC); REQUIRE_VECTOR(ctx); gen_set_access_type(ctx, ACCESS_INT); -avr = tcg_temp_new_i64(); +avr = tcg_temp_new_i128(); EA = do_ea_calc(ctx, a->ra, cpu_gpr[a->rb]); tcg_gen_andi_tl(EA, EA, ~0xf); -/* - * We only need to swap high and low halves. gen_qemu_ld64_i64 - * does necessary 64-bit byteswap already. - */ -gen_qemu_ld64_i64(ctx, avr, EA); -set_avr64(a->rt, avr, !ctx->le_mode); -tcg_gen_addi_tl(EA, EA, 8); -gen_qemu_ld64_i64(ctx, avr, EA); -set_avr64(a->rt, avr, ctx->le_mode); +tcg_gen_qemu_ld_i128(avr, EA, ctx->mem_idx, + DEF_MEMOP(MO_128 | MO_ATOM_IFALIGN_PAIR)); +set_avr_full(a->rt, avr); return true; } @@ -56,22 +60,16 @@ static bool trans_LVXL(DisasContext *ctx, arg_LVXL *a) static bool trans_STVX(DisasContext *ctx, arg_STVX *a) { TCGv EA; -TCGv_i64 avr; +TCGv_i128 avr; REQUIRE_INSNS_FLAGS(ctx, ALTIVEC); REQUIRE_VECTOR(ctx); gen_set_access_type(ctx, ACCESS_INT); -avr = tcg_temp_new_i64(); +avr = tcg_temp_new_i128(); EA = do_ea_calc(ctx, a->ra, cpu_gpr[a->rb]); tcg_gen_andi_tl(EA, EA, ~0xf); -/* - * We only need to swap high and low halves. gen_qemu_st64_i64 - * does necessary 64-bit byteswap already. - */ -get_avr64(avr, a->rt, !ctx->le_mode); -gen_qemu_st64_i64(ctx, avr, EA); -tcg_gen_addi_tl(EA, EA, 8); -get_avr64(avr, a->rt, ctx->le_mode); -gen_qemu_st64_i64(ctx, avr, EA); +get_avr_full(avr, a->rt); +tcg_gen_qemu_st_i128(avr, EA, ctx->mem_idx, + DEF_MEMOP(MO_128 | MO_ATOM_IFALIGN_PAIR)); return true; } -- 2.39.3
[PATCH v2 0/3] target/ppc: Update vector insns to use 128 bit
Updating a bunch of VMX and VSX storage access instructions to use tcg_gen_qemu_ld/st_i128 instead of using tcg_gen_qemu_ld/st_i64 in succession; as suggested by Richard, in my decodetree patches. Plus some minor clean-ups to facilitate the above in case of VMX insns. Change log: v2 : Applied IFALIGN_PAIR memop changes in patches 2/3 and 3/3, based on review comments by Richard in v1. v1 : https://lore.kernel.org/qemu-devel/20240621114604.868415-1-ra...@linux.ibm.com/ Chinmay Rath (3): target/ppc: Move get/set_avr64 functions to vmx-impl.c.inc. target/ppc: Update VMX storage access insns to use tcg_gen_qemu_ld/st_i128. target/ppc : Update VSX storage access insns to use tcg_gen_qemu _ld/st_i128. target/ppc/translate.c | 10 - target/ppc/translate/vmx-impl.c.inc | 52 - target/ppc/translate/vsx-impl.c.inc | 70 + 3 files changed, 61 insertions(+), 71 deletions(-) -- 2.39.3
[PATCH v2 1/3] target/ppc: Move get/set_avr64 functions to vmx-impl.c.inc.
Those functions are used to ld/st data to and from Altivec registers, in 64 bits chunks, and are only used in vmx-impl.c.inc file, hence the clean-up movement. Signed-off-by: Chinmay Rath --- target/ppc/translate.c | 10 -- target/ppc/translate/vmx-impl.c.inc | 10 ++ 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/target/ppc/translate.c b/target/ppc/translate.c index ad512e1922..f7f2c2db9e 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -6200,16 +6200,6 @@ static inline void set_fpr(int regno, TCGv_i64 src) tcg_gen_st_i64(tcg_constant_i64(0), tcg_env, vsr64_offset(regno, false)); } -static inline void get_avr64(TCGv_i64 dst, int regno, bool high) -{ -tcg_gen_ld_i64(dst, tcg_env, avr64_offset(regno, high)); -} - -static inline void set_avr64(int regno, TCGv_i64 src, bool high) -{ -tcg_gen_st_i64(src, tcg_env, avr64_offset(regno, high)); -} - /* * Helpers for decodetree used by !function for decoding arguments. */ diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index 152bcde0e3..a182d2cf81 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -14,6 +14,16 @@ static inline TCGv_ptr gen_avr_ptr(int reg) return r; } +static inline void get_avr64(TCGv_i64 dst, int regno, bool high) +{ +tcg_gen_ld_i64(dst, tcg_env, avr64_offset(regno, high)); +} + +static inline void set_avr64(int regno, TCGv_i64 src, bool high) +{ +tcg_gen_st_i64(src, tcg_env, avr64_offset(regno, high)); +} + static bool trans_LVX(DisasContext *ctx, arg_X *a) { TCGv EA; -- 2.39.3
[PATCH v6] virtio-net: Fix network stall at the host side waiting for kick
From: thomas Patch 06b12970174 ("virtio-net: fix network stall under load") added double-check to test whether the available buffer size can satisfy the request or not, in case the guest has added some buffers to the avail ring simultaneously after the first check. It will be lucky if the available buffer size becomes okay after the double-check, then the host can send the packet to the guest. If the buffer size still can't satisfy the request, even if the guest has added some buffers, viritio-net would stall at the host side forever. The patch checks whether the guest has added some buffers after last check of avail idx when the available buffers are not sufficient, if so then recheck the available buffers in the loop. The patch also reverts patch "06b12970174". The case below can reproduce the stall. Guest 0 ++ | iperf | ---> | server | Host |++ ++ |... | iperf | | client | Guest n ++ |++ || iperf | ---> | server | ++ Boot many guests from qemu with virtio network: qemu ... -netdev tap,id=net_x \ -device virtio-net-pci-non-transitional,\ iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x Each guest acts as iperf server with commands below: iperf3 -s -D -i 10 -p 8001 iperf3 -s -D -i 10 -p 8002 The host as iperf client: iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 4 iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 4 After some time, the host loses connection to the guest, the guest can send packet to the host, but can't receive packet from host. It's more likely to happen if SWIOTLB is enabled in the guest, allocating and freeing bounce buffer takes some CPU ticks, copying from/to bounce buffer takes more CPU ticks, compared with that there is no bounce buffer in the guest. Once the rate of producing packets from the host approximates the rate of receiveing packets in the guest, the guest would loop in NAPI. receive packets--- | | v | free buf virtnet_poll | | v | add buf to avail ring --- | | need kick the host? | NAPI continues v receive packets--- | | v | free buf virtnet_poll | | v | add buf to avail ring --- | v ... ... On the other hand, the host fetches free buf from avail ring, if the buf in the avail ring is not enough, the host notifies the guest the event by writing the avail idx read from avail ring to the event idx of used ring, then the host goes to sleep, waiting for the kick signal from the guest. Once the guest finds the host is waiting for kick singal (in virtqueue_kick_prepare_split()), it kicks the host. The host may stall forever at the sequences below: HostGuest --- fetch buf, send packet receive packet --- ... ... | fetch buf, send packet add buf | ...add buf virtnet_poll buf not enough avail idx-> add buf | read avail idx add buf | add buf --- receive packet --- write event idx ... | waiting for kick add buf virtnet_poll ... | --- no more packet, exit NAPI In the first loop of NAPI above, indicated in the range of virtnet_poll above, the host is sending packets while the guest is receiving packets and adding buf. step 1: The buf is not enough, for example, a big packet needs 5 buf, but the available buf count is 3. The host read current avail idx. step 2: The guest adds some buf, then checks whether the host is waiting for kick signal, not at this time. The used ring is not empty, the guest continues the second loop of NAPI. step 3: The host writes the avail idx read from avail ring to used ring as event idx via virtio_queue_set_notification(q->rx_vq, 1). step 4: At the end of the second loop of NAPI, recheck whether kick is needed, as the event idx in the used ring