Re: [Qemu-devel] [Qemu-arm] [PATCH 09/15] target/arm: Don't store M profile PRIMASK and FAULTMASK in daif
On Thu, Aug 03, 2017 at 03:05:17PM -0700, Richard Henderson wrote: > On 08/03/2017 08:38 AM, Edgar E. Iglesias wrote: > >> +uint32_t primask; > >> +uint32_t faultmask; > > It seems like these could be booleans? > > I was thinking the same thing until I read the v8m description as a 32-bit > register. This makes qemu match the spec, which has value. > Ah, I didn't check the spec :-)
Re: [Qemu-devel] [Qemu-arm] [PATCH 8/8] target/arm: Implement new do_transaction_failed hook
On Fri, Aug 04, 2017 at 06:20:49PM +0100, Peter Maydell wrote: > Implement the new do_transaction_failed hook for ARM, which should > cause the CPU to take a prefetch abort or data abort. > > Signed-off-by: Peter Maydell Reviewed-by: Edgar E. Iglesias > --- > target/arm/internals.h | 10 ++ > target/arm/cpu.c | 1 + > target/arm/op_helper.c | 43 +++ > 3 files changed, 54 insertions(+) > > diff --git a/target/arm/internals.h b/target/arm/internals.h > index a3adbd8..13bb001 100644 > --- a/target/arm/internals.h > +++ b/target/arm/internals.h > @@ -471,6 +471,16 @@ void arm_cpu_do_unaligned_access(CPUState *cs, vaddr > vaddr, > MMUAccessType access_type, > int mmu_idx, uintptr_t retaddr); > > +/* arm_cpu_do_transaction_failed: handle a memory system error response > + * (eg "no device/memory present at address") by raising an external abort > + * exception > + */ > +void arm_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr, > + vaddr addr, unsigned size, > + MMUAccessType access_type, > + int mmu_idx, MemTxAttrs attrs, > + MemTxResult response, uintptr_t retaddr); > + > /* Call the EL change hook if one has been registered */ > static inline void arm_call_el_change_hook(ARMCPU *cpu) > { > diff --git a/target/arm/cpu.c b/target/arm/cpu.c > index 05c038b..6baede0 100644 > --- a/target/arm/cpu.c > +++ b/target/arm/cpu.c > @@ -1670,6 +1670,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void > *data) > #else > cc->do_interrupt = arm_cpu_do_interrupt; > cc->do_unaligned_access = arm_cpu_do_unaligned_access; > +cc->do_transaction_failed = arm_cpu_do_transaction_failed; > cc->get_phys_page_attrs_debug = arm_cpu_get_phys_page_attrs_debug; > cc->asidx_from_attrs = arm_asidx_from_attrs; > cc->vmsd = &vmstate_arm_cpu; > diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c > index 7eac272..54b6dd8 100644 > --- a/target/arm/op_helper.c > +++ b/target/arm/op_helper.c > @@ -229,6 +229,49 @@ void arm_cpu_do_unaligned_access(CPUState *cs, vaddr > vaddr, > deliver_fault(cpu, vaddr, access_type, fsr, fsc, &fi); > } > > +/* arm_cpu_do_transaction_failed: handle a memory system error response > + * (eg "no device/memory present at address") by raising an external abort > + * exception > + */ > +void arm_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr, > + vaddr addr, unsigned size, > + MMUAccessType access_type, > + int mmu_idx, MemTxAttrs attrs, > + MemTxResult response, uintptr_t retaddr) > +{ > +ARMCPU *cpu = ARM_CPU(cs); > +CPUARMState *env = &cpu->env; > +uint32_t fsr, fsc; > +ARMMMUFaultInfo fi = {}; > +ARMMMUIdx arm_mmu_idx = core_to_arm_mmu_idx(env, mmu_idx); > + > +if (retaddr) { > +/* now we have a real cpu fault */ > +cpu_restore_state(cs, retaddr); > +} > + > +/* The EA bit in syndromes and fault status registers is an > + * IMPDEF classification of external aborts. ARM implementations > + * usually use this to indicate AXI bus Decode error (0) or > + * Slave error (1); in QEMU we follow that. > + */ > +fi.ea = (response != MEMTX_DECODE_ERROR); > + > +/* The fault status register format depends on whether we're using > + * the LPAE long descriptor format, or the short descriptor format. > + */ > +if (arm_s1_regime_using_lpae_format(env, arm_mmu_idx)) { > +/* long descriptor form, STATUS 0b01: synchronous ext abort */ > +fsr = (fi.ea << 12) | (1 << 9) | 0x10; > +} else { > +/* short descriptor form, FSR 0b01000 : synchronous ext abort */ > +fsr = (fi.ea << 12) | 0x8; > +} > +fsc = 0x10; > + > +deliver_fault(cpu, addr, access_type, fsr, fsc, &fi); > +} > + > #endif /* !defined(CONFIG_USER_ONLY) */ > > uint32_t HELPER(add_setq)(CPUARMState *env, uint32_t a, uint32_t b) > -- > 2.7.4 > >
Re: [Qemu-devel] [Qemu-arm] [PATCH 7/8] target/arm: Allow deliver_fault() caller to specify EA bit
On Fri, Aug 04, 2017 at 06:20:48PM +0100, Peter Maydell wrote: > For external aborts, we will want to be able to specify the EA > (external abort type) bit in the syndrome field. Allow callers of > deliver_fault() to do that by adding a field to ARMMMUFaultInfo which > we use when constructing the syndrome values. > > Signed-off-by: Peter Maydell Reviewed-by: Edgar E. Iglesias > --- > target/arm/internals.h | 2 ++ > target/arm/op_helper.c | 10 +- > 2 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/target/arm/internals.h b/target/arm/internals.h > index 1f6efef..a3adbd8 100644 > --- a/target/arm/internals.h > +++ b/target/arm/internals.h > @@ -448,12 +448,14 @@ void arm_handle_psci_call(ARMCPU *cpu); > * @s2addr: Address that caused a fault at stage 2 > * @stage2: True if we faulted at stage 2 > * @s1ptw: True if we faulted at stage 2 while doing a stage 1 page-table > walk > + * @ea: True if we should set the EA (external abort type) bit in syndrome > */ > typedef struct ARMMMUFaultInfo ARMMMUFaultInfo; > struct ARMMMUFaultInfo { > target_ulong s2addr; > bool stage2; > bool s1ptw; > +bool ea; > }; > > /* Do a page table walk and add page to TLB if possible */ > diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c > index aa52a98..7eac272 100644 > --- a/target/arm/op_helper.c > +++ b/target/arm/op_helper.c > @@ -80,7 +80,7 @@ uint32_t HELPER(neon_tbl)(CPUARMState *env, uint32_t ireg, > uint32_t def, > > static inline uint32_t merge_syn_data_abort(uint32_t template_syn, > unsigned int target_el, > -bool same_el, > +bool same_el, bool ea, > bool s1ptw, bool is_write, > int fsc) > { > @@ -99,7 +99,7 @@ static inline uint32_t merge_syn_data_abort(uint32_t > template_syn, > */ > if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) { > syn = syn_data_abort_no_iss(same_el, > -0, 0, s1ptw, is_write, fsc); > +ea, 0, s1ptw, is_write, fsc); > } else { > /* Fields: IL, ISV, SAS, SSE, SRT, SF and AR come from the template > * syndrome created at translation time. > @@ -107,7 +107,7 @@ static inline uint32_t merge_syn_data_abort(uint32_t > template_syn, > */ > syn = syn_data_abort_with_iss(same_el, >0, 0, 0, 0, 0, > - 0, 0, s1ptw, is_write, fsc, > + ea, 0, s1ptw, is_write, fsc, >false); > /* Merge the runtime syndrome with the template syndrome. */ > syn |= template_syn; > @@ -141,11 +141,11 @@ static void deliver_fault(ARMCPU *cpu, vaddr addr, > MMUAccessType access_type, > } > > if (access_type == MMU_INST_FETCH) { > -syn = syn_insn_abort(same_el, 0, fi->s1ptw, fsc); > +syn = syn_insn_abort(same_el, fi->ea, fi->s1ptw, fsc); > exc = EXCP_PREFETCH_ABORT; > } else { > syn = merge_syn_data_abort(env->exception.syndrome, target_el, > - same_el, fi->s1ptw, > + same_el, fi->ea, fi->s1ptw, > access_type == MMU_DATA_STORE, > fsc); > if (access_type == MMU_DATA_STORE > -- > 2.7.4 > >
Re: [Qemu-devel] [Qemu-arm] [PATCH 6/8] target/arm: Factor out fault delivery code
On Fri, Aug 04, 2017 at 06:20:47PM +0100, Peter Maydell wrote: > We currently have some similar code in tlb_fill() and in > arm_cpu_do_unaligned_access() for delivering a data abort or prefetch > abort. We're also going to want to do the same thing to handle > external aborts. Factor out the common code into a new function > deliver_fault(). I found this a bit hard to read but I think it looks OK :-) Acked-by: Edgar E. Iglesias > > Signed-off-by: Peter Maydell > --- > target/arm/op_helper.c | 110 > + > 1 file changed, 57 insertions(+), 53 deletions(-) > > diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c > index 2a85666..aa52a98 100644 > --- a/target/arm/op_helper.c > +++ b/target/arm/op_helper.c > @@ -115,6 +115,51 @@ static inline uint32_t merge_syn_data_abort(uint32_t > template_syn, > return syn; > } > > +static void deliver_fault(ARMCPU *cpu, vaddr addr, MMUAccessType access_type, > + uint32_t fsr, uint32_t fsc, ARMMMUFaultInfo *fi) > +{ > +CPUARMState *env = &cpu->env; > +int target_el; > +bool same_el; > +uint32_t syn, exc; > + > +target_el = exception_target_el(env); > +if (fi->stage2) { > +target_el = 2; > +env->cp15.hpfar_el2 = extract64(fi->s2addr, 12, 47) << 4; > +} > +same_el = (arm_current_el(env) == target_el); > + > +if (fsc == 0x3f) { > +/* Caller doesn't have a long-format fault status code. This > + * should only happen if this fault will never actually be reported > + * to an EL that uses a syndrome register. Check that here. > + * 0x3f is a (currently) reserved FSR code, in case the constructed > + * syndrome does leak into the guest somehow. > + */ > +assert(target_el != 2 && !arm_el_is_aa64(env, target_el)); > +} > + > +if (access_type == MMU_INST_FETCH) { > +syn = syn_insn_abort(same_el, 0, fi->s1ptw, fsc); > +exc = EXCP_PREFETCH_ABORT; > +} else { > +syn = merge_syn_data_abort(env->exception.syndrome, target_el, > + same_el, fi->s1ptw, > + access_type == MMU_DATA_STORE, > + fsc); > +if (access_type == MMU_DATA_STORE > +&& arm_feature(env, ARM_FEATURE_V6)) { > +fsr |= (1 << 11); > +} > +exc = EXCP_DATA_ABORT; > +} > + > +env->exception.vaddress = addr; > +env->exception.fsr = fsr; > +raise_exception(env, exc, syn, target_el); > +} > + > /* try to fill the TLB and return an exception if error. If retaddr is > * NULL, it means that the function was called in C code (i.e. not > * from generated code or from helper.c) > @@ -129,23 +174,13 @@ void tlb_fill(CPUState *cs, target_ulong addr, > MMUAccessType access_type, > ret = arm_tlb_fill(cs, addr, access_type, mmu_idx, &fsr, &fi); > if (unlikely(ret)) { > ARMCPU *cpu = ARM_CPU(cs); > -CPUARMState *env = &cpu->env; > -uint32_t syn, exc, fsc; > -unsigned int target_el; > -bool same_el; > +uint32_t fsc; > > if (retaddr) { > /* now we have a real cpu fault */ > cpu_restore_state(cs, retaddr); > } > > -target_el = exception_target_el(env); > -if (fi.stage2) { > -target_el = 2; > -env->cp15.hpfar_el2 = extract64(fi.s2addr, 12, 47) << 4; > -} > -same_el = arm_current_el(env) == target_el; > - > if (fsr & (1 << 9)) { > /* LPAE format fault status register : bottom 6 bits are > * status code in the same form as needed for syndrome > @@ -153,34 +188,15 @@ void tlb_fill(CPUState *cs, target_ulong addr, > MMUAccessType access_type, > fsc = extract32(fsr, 0, 6); > } else { > /* Short format FSR : this fault will never actually be reported > - * to an EL that uses a syndrome register. Check that here, > - * and use a (currently) reserved FSR code in case the > constructed > - * syndrome does leak into the guest somehow. > + * to an EL that uses a syndrome register. Use a (currently) > + * reserved FSR code in case the constructed syndrome does leak > + * into the guest somehow. deliver_fault will assert that > + * we don't target an EL using the syndrome. > */ > -assert(target_el != 2 && !arm_el_is_aa64(env, target_el)); > fsc = 0x3f; > } > > -/* For insn and data aborts we assume there is no instruction > syndrome > - * information; this is always true for exceptions reported to EL1. > - */ > -if (access_type == MMU_INST_FETCH) { > -syn = syn_insn_abort(same_el, 0, fi.s1ptw, fsc); > -exc = EXCP_PREFETCH_ABORT; > -
Re: [Qemu-devel] [Qemu-arm] [PATCH 5/8] hw/arm: Set ignore_memory_transaction_failures for most ARM boards
On Fri, Aug 04, 2017 at 06:20:46PM +0100, Peter Maydell wrote: > Set the MachineClass flag ignore_memory_transaction_failures > for almost all ARM boards. This means they retain the legacy > behaviour that accesses to unimplemented addresses will RAZ/WI > rather than aborting, when a subsequent commit adds support > for external aborts. > > The exceptions are: > * virt -- we know that guests won't try to prod devices >that we don't describe in the device tree or ACPI tables > * mps2 -- this board was written to use unimplemented-device >for all the ranges with devices we don't yet handle > > New boards should not set the flag, but instead be written > like the mps2. For the Xilinx boards: Reviewed-by: Edgar E. Iglesias > > Signed-off-by: Peter Maydell > --- > hw/arm/aspeed.c | 3 +++ > hw/arm/collie.c | 1 + > hw/arm/cubieboard.c | 1 + > hw/arm/digic_boards.c | 1 + > hw/arm/exynos4_boards.c | 2 ++ > hw/arm/gumstix.c| 2 ++ > hw/arm/highbank.c | 2 ++ > hw/arm/imx25_pdk.c | 1 + > hw/arm/integratorcp.c | 1 + > hw/arm/kzm.c| 1 + > hw/arm/mainstone.c | 1 + > hw/arm/musicpal.c | 1 + > hw/arm/netduino2.c | 1 + > hw/arm/nseries.c| 2 ++ > hw/arm/omap_sx1.c | 2 ++ > hw/arm/palm.c | 1 + > hw/arm/raspi.c | 1 + > hw/arm/realview.c | 4 > hw/arm/sabrelite.c | 1 + > hw/arm/spitz.c | 4 > hw/arm/stellaris.c | 2 ++ > hw/arm/tosa.c | 1 + > hw/arm/versatilepb.c| 2 ++ > hw/arm/vexpress.c | 1 + > hw/arm/xilinx_zynq.c| 1 + > hw/arm/xlnx-ep108.c | 2 ++ > hw/arm/z2.c | 1 + > 27 files changed, 43 insertions(+) > > diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c > index 0c5635f..ab895ad 100644 > --- a/hw/arm/aspeed.c > +++ b/hw/arm/aspeed.c > @@ -270,6 +270,7 @@ static void palmetto_bmc_class_init(ObjectClass *oc, void > *data) > mc->no_floppy = 1; > mc->no_cdrom = 1; > mc->no_parallel = 1; > +mc->ignore_memory_transaction_failures = true; > } > > static const TypeInfo palmetto_bmc_type = { > @@ -302,6 +303,7 @@ static void ast2500_evb_class_init(ObjectClass *oc, void > *data) > mc->no_floppy = 1; > mc->no_cdrom = 1; > mc->no_parallel = 1; > +mc->ignore_memory_transaction_failures = true; > } > > static const TypeInfo ast2500_evb_type = { > @@ -326,6 +328,7 @@ static void romulus_bmc_class_init(ObjectClass *oc, void > *data) > mc->no_floppy = 1; > mc->no_cdrom = 1; > mc->no_parallel = 1; > +mc->ignore_memory_transaction_failures = true; > } > > static const TypeInfo romulus_bmc_type = { > diff --git a/hw/arm/collie.c b/hw/arm/collie.c > index 2e69531..8830192 100644 > --- a/hw/arm/collie.c > +++ b/hw/arm/collie.c > @@ -64,6 +64,7 @@ static void collie_machine_init(MachineClass *mc) > { > mc->desc = "Sharp SL-5500 (Collie) PDA (SA-1110)"; > mc->init = collie_init; > +mc->ignore_memory_transaction_failures = true; > } > > DEFINE_MACHINE("collie", collie_machine_init) > diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c > index b98e1c4..32f1edd 100644 > --- a/hw/arm/cubieboard.c > +++ b/hw/arm/cubieboard.c > @@ -86,6 +86,7 @@ static void cubieboard_machine_init(MachineClass *mc) > mc->init = cubieboard_init; > mc->block_default_type = IF_IDE; > mc->units_per_default_bus = 1; > +mc->ignore_memory_transaction_failures = true; > } > > DEFINE_MACHINE("cubieboard", cubieboard_machine_init) > diff --git a/hw/arm/digic_boards.c b/hw/arm/digic_boards.c > index 520c8e9..9f11dcd 100644 > --- a/hw/arm/digic_boards.c > +++ b/hw/arm/digic_boards.c > @@ -155,6 +155,7 @@ static void canon_a1100_machine_init(MachineClass *mc) > { > mc->desc = "Canon PowerShot A1100 IS"; > mc->init = &canon_a1100_init; > +mc->ignore_memory_transaction_failures = true; > } > > DEFINE_MACHINE("canon-a1100", canon_a1100_machine_init) > diff --git a/hw/arm/exynos4_boards.c b/hw/arm/exynos4_boards.c > index 7c03ed3..f1441ec 100644 > --- a/hw/arm/exynos4_boards.c > +++ b/hw/arm/exynos4_boards.c > @@ -189,6 +189,7 @@ static void nuri_class_init(ObjectClass *oc, void *data) > mc->desc = "Samsung NURI board (Exynos4210)"; > mc->init = nuri_init; > mc->max_cpus = EXYNOS4210_NCPUS; > +mc->ignore_memory_transaction_failures = true; > } > > static const TypeInfo nuri_type = { > @@ -204,6 +205,7 @@ static void smdkc210_class_init(ObjectClass *oc, void > *data) > mc->desc = "Samsung SMDKC210 board (Exynos4210)"; > mc->init = smdkc210_init; > mc->max_cpus = EXYNOS4210_NCPUS; > +mc->ignore_memory_transaction_failures = true; > } > > static const TypeInfo smdkc210_type = { > diff --git a/hw/arm/gumstix.c b/hw/arm/gumstix.c > index d59d9ba..092ce36 100644 > --- a/hw/arm/gumstix.c > +++ b/hw/arm/gumstix.c > @@ -128,6 +128,7 @@ static void connex_class_init(ObjectClass *oc, void *data) > >
Re: [Qemu-devel] [Qemu-arm] [PATCH 4/8] boards.h: Define new flag ignore_memory_transaction_failures
On Fri, Aug 04, 2017 at 06:20:45PM +0100, Peter Maydell wrote: > Define a new MachineClass field ignore_memory_transaction_failures. > If this is flag is true then the CPU will ignore memory transaction > failures which should cause the CPU to take an exception due to an > access to an unassigned physical address; the transaction will > instead return zero (for a read) or be ignored (for a write). This > should be set only by legacy board models which rely on the old > RAZ/WI behaviour for handling devices that QEMU does not yet model. > New board models should instead use "unimplemented-device" for all > memory ranges where the guest will attempt to probe for a device that > QEMU doesn't implement and a stub device is required. > > We need this for ARM boards, where we're about to implement support for > generating external aborts on memory transaction failures. Too many > of our legacy board models rely on the RAZ/WI behaviour and we > would break currently working guests when their "probe for device" > code provoked an external abort rather than a RAZ. > > Signed-off-by: Peter Maydell Reviewed-by: Edgar E. Iglesias > --- > include/hw/boards.h | 11 +++ > include/qom/cpu.h | 7 ++- > qom/cpu.c | 7 +++ > 3 files changed, 24 insertions(+), 1 deletion(-) > > diff --git a/include/hw/boards.h b/include/hw/boards.h > index 3363dd1..7f044d1 100644 > --- a/include/hw/boards.h > +++ b/include/hw/boards.h > @@ -131,6 +131,16 @@ typedef struct { > *size than the target architecture's minimum. (Attempting to create > *such a CPU will fail.) Note that changing this is a migration > *compatibility break for the machine. > + * @ignore_memory_transaction_failures: > + *If this is flag is true then the CPU will ignore memory transaction > + *failures which should cause the CPU to take an exception due to an > + *access to an unassigned physical address; the transaction will instead > + *return zero (for a read) or be ignored (for a write). This should be > + *set only by legacy board models which rely on the old RAZ/WI behaviour > + *for handling devices that QEMU does not yet model. New board models > + *should instead use "unimplemented-device" for all memory ranges where > + *the guest will attempt to probe for a device that QEMU doesn't > + *implement and a stub device is required. > */ > struct MachineClass { > /*< private >*/ > @@ -171,6 +181,7 @@ struct MachineClass { > bool rom_file_has_mr; > int minimum_page_bits; > bool has_hotpluggable_cpus; > +bool ignore_memory_transaction_failures; > int numa_mem_align_shift; > void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes, > int nb_nodes, ram_addr_t size); > diff --git a/include/qom/cpu.h b/include/qom/cpu.h > index fc54d55..8cff86f 100644 > --- a/include/qom/cpu.h > +++ b/include/qom/cpu.h > @@ -311,6 +311,9 @@ struct qemu_work_item; > * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all > changes > *to @trace_dstate). > * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask). > + * @ignore_memory_transaction_failures: Cached copy of the MachineState > + *flag of the same name: allows the board to suppress calling of the > + *CPU do_transaction_failed hook function. > * > * State of one CPU core or thread. > */ > @@ -397,6 +400,8 @@ struct CPUState { > */ > bool throttle_thread_scheduled; > > +bool ignore_memory_transaction_failures; > + > /* Note that this is accessed at the start of every TB via a negative > offset from AREG0. Leave this field at the end so as to make the > (absolute value) offset as small as possible. This reduces code > @@ -853,7 +858,7 @@ static inline void cpu_transaction_failed(CPUState *cpu, > hwaddr physaddr, > { > CPUClass *cc = CPU_GET_CLASS(cpu); > > -if (cc->do_transaction_failed) { > +if (!cpu->ignore_memory_transaction_failures && > cc->do_transaction_failed) { > cc->do_transaction_failed(cpu, physaddr, addr, size, access_type, >mmu_idx, attrs, response, retaddr); > } > diff --git a/qom/cpu.c b/qom/cpu.c > index 4f38db0..d8dcf64 100644 > --- a/qom/cpu.c > +++ b/qom/cpu.c > @@ -29,6 +29,7 @@ > #include "exec/cpu-common.h" > #include "qemu/error-report.h" > #include "sysemu/sysemu.h" > +#include "hw/boards.h" > #include "hw/qdev-properties.h" > #include "trace-root.h" > > @@ -360,6 +361,12 @@ static void cpu_common_parse_features(const char > *typename, char *features, > static void cpu_common_realizefn(DeviceState *dev, Error **errp) > { > CPUState *cpu = CPU(dev); > +Object *machine = qdev_get_machine(); > +ObjectClass *oc = object_get_class(machine); > +MachineClass *mc = MACHINE_CLASS(oc); > + > +cpu->ignore_memory_transaction_failures = > +
Re: [Qemu-devel] [Qemu-arm] [PATCH 3/8] cputlb: Support generating CPU exceptions on memory transaction failures
On Fri, Aug 04, 2017 at 06:20:44PM +0100, Peter Maydell wrote: > Call the new cpu_transaction_failed() hook at the places where > CPU generated code interacts with the memory system: > io_readx() > io_writex() > get_page_addr_code() > > Any access from C code (eg via cpu_physical_memory_rw(), > address_space_rw(), ld/st_*_phys()) will *not* trigger CPU exceptions > via cpu_transaction_failed(). Handling for transactions failures for > this kind of call should be done by using a function which returns a > MemTxResult and treating the failure case appropriately in the > calling code. > > In an ideal world we would not generate CPU exceptions for > instruction fetch failures in get_page_addr_code() but instead wait > until the code translation process tried a load and it failed; > however that change would require too great a restructuring and > redesign to attempt at this point. You're right but onsidering the lack of models for I caches and prefetching, I don't think that matters much... Reviewed-by: Edgar E. Iglesias > > Signed-off-by: Peter Maydell > --- > softmmu_template.h | 4 ++-- > accel/tcg/cputlb.c | 32 ++-- > 2 files changed, 32 insertions(+), 4 deletions(-) > > diff --git a/softmmu_template.h b/softmmu_template.h > index 4a2b665..d756329 100644 > --- a/softmmu_template.h > +++ b/softmmu_template.h > @@ -101,7 +101,7 @@ static inline DATA_TYPE glue(io_read, > SUFFIX)(CPUArchState *env, >uintptr_t retaddr) > { > CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index]; > -return io_readx(env, iotlbentry, addr, retaddr, DATA_SIZE); > +return io_readx(env, iotlbentry, mmu_idx, addr, retaddr, DATA_SIZE); > } > #endif > > @@ -262,7 +262,7 @@ static inline void glue(io_write, SUFFIX)(CPUArchState > *env, >uintptr_t retaddr) > { > CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index]; > -return io_writex(env, iotlbentry, val, addr, retaddr, DATA_SIZE); > +return io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr, > DATA_SIZE); > } > > void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val, > diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c > index 85635ae..e72415a 100644 > --- a/accel/tcg/cputlb.c > +++ b/accel/tcg/cputlb.c > @@ -747,6 +747,7 @@ static inline ram_addr_t > qemu_ram_addr_from_host_nofail(void *ptr) > } > > static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry, > + int mmu_idx, > target_ulong addr, uintptr_t retaddr, int size) > { > CPUState *cpu = ENV_GET_CPU(env); > @@ -754,6 +755,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry > *iotlbentry, > MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs); > uint64_t val; > bool locked = false; > +MemTxResult r; > > physaddr = (physaddr & TARGET_PAGE_MASK) + addr; > cpu->mem_io_pc = retaddr; > @@ -767,7 +769,12 @@ static uint64_t io_readx(CPUArchState *env, > CPUIOTLBEntry *iotlbentry, > qemu_mutex_lock_iothread(); > locked = true; > } > -memory_region_dispatch_read(mr, physaddr, &val, size, iotlbentry->attrs); > +r = memory_region_dispatch_read(mr, physaddr, > +&val, size, iotlbentry->attrs); > +if (r != MEMTX_OK) { > +cpu_transaction_failed(cpu, physaddr, addr, size, MMU_DATA_LOAD, > + mmu_idx, iotlbentry->attrs, r, retaddr); > +} > if (locked) { > qemu_mutex_unlock_iothread(); > } > @@ -776,6 +783,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry > *iotlbentry, > } > > static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry, > + int mmu_idx, >uint64_t val, target_ulong addr, >uintptr_t retaddr, int size) > { > @@ -783,6 +791,7 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry > *iotlbentry, > hwaddr physaddr = iotlbentry->addr; > MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs); > bool locked = false; > +MemTxResult r; > > physaddr = (physaddr & TARGET_PAGE_MASK) + addr; > if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) { > @@ -795,7 +804,12 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry > *iotlbentry, > qemu_mutex_lock_iothread(); > locked = true; > } > -memory_region_dispatch_write(mr, physaddr, val, size, iotlbentry->attrs); > +r = memory_region_dispatch_write(mr, physaddr, > + val, size, iotlbentry->attrs); > +if (r != MEMTX_OK) { > +cpu_transaction_failed(cpu, physaddr, addr, size, MMU_DATA_STORE, > + mmu_idx, iotlbentry->attrs, r, retaddr); > +} > if (locked) { >
Re: [Qemu-devel] [Qemu-arm] [PATCH 2/8] cpu: Define new cpu_transaction_failed() hook
On Fri, Aug 04, 2017 at 06:20:43PM +0100, Peter Maydell wrote: > Currently we have a rather half-baked setup for allowing CPUs to > generate exceptions on accesses to invalid memory: the CPU has a > cpu_unassigned_access() hook which the memory system calls in > unassigned_mem_write() and unassigned_mem_read() if the current_cpu > pointer is non-NULL. This was originally designed before we > implemented the MemTxResult type that allows memory operations to > report a success or failure code, which is why the hook is called > right at the bottom of the memory system. The major problem with > this is that it means that the hook can be called even when the > access was not actually done by the CPU: for instance if the CPU > writes to a DMA engine register which causes the DMA engine to begin > a transaction which has been set up by the guest to operate on > invalid memory then this will casue the CPU to take an exception > incorrectly. Another minor problem is that currently if a device > returns a transaction error then this won't turn into a CPU exception > at all. > > The right way to do this is to have allow the CPU to respond > to memory system transaction failures at the point where the > CPU specific code calls into the memory system. > > Define a new QOM CPU method and utility function > cpu_transaction_failed() which is called in these cases. > The functionality here overlaps with the existing > cpu_unassigned_access() because individual target CPUs will > need some work to convert them to the new system. When this > transition is complete we can remove the old cpu_unassigned_access() > code. BTW, a question. I don't know of any from memory but does any arch have the ability to report the payload that failed for stores? I guess it's easy enough to add that if needed though. > > Signed-off-by: Peter Maydell > --- > include/qom/cpu.h | 21 + > 1 file changed, 21 insertions(+) > > diff --git a/include/qom/cpu.h b/include/qom/cpu.h > index 25eefea..fc54d55 100644 > --- a/include/qom/cpu.h > +++ b/include/qom/cpu.h > @@ -85,8 +85,10 @@ struct TranslationBlock; > * @has_work: Callback for checking if there is work to do. > * @do_interrupt: Callback for interrupt handling. > * @do_unassigned_access: Callback for unassigned access handling. > + * (this is deprecated: new targets should use do_transaction_failed instead) > * @do_unaligned_access: Callback for unaligned access handling, if > * the target defines #ALIGNED_ONLY. > + * @do_transaction_failed: Callback for handling failed memory transactions > * @virtio_is_big_endian: Callback to return %true if a CPU which supports > * runtime configurable endianness is currently big-endian. Non-configurable > * CPUs can use the default implementation of this method. This method should > @@ -153,6 +155,10 @@ typedef struct CPUClass { > void (*do_unaligned_access)(CPUState *cpu, vaddr addr, > MMUAccessType access_type, > int mmu_idx, uintptr_t retaddr); > +void (*do_transaction_failed)(CPUState *cpu, hwaddr physaddr, vaddr addr, > + unsigned size, MMUAccessType access_type, > + int mmu_idx, MemTxAttrs attrs, > + MemTxResult response, uintptr_t retaddr); > bool (*virtio_is_big_endian)(CPUState *cpu); > int (*memory_rw_debug)(CPUState *cpu, vaddr addr, > uint8_t *buf, int len, bool is_write); > @@ -837,6 +843,21 @@ static inline void cpu_unaligned_access(CPUState *cpu, > vaddr addr, > > cc->do_unaligned_access(cpu, addr, access_type, mmu_idx, retaddr); > } > + > +static inline void cpu_transaction_failed(CPUState *cpu, hwaddr physaddr, > + vaddr addr, unsigned size, > + MMUAccessType access_type, > + int mmu_idx, MemTxAttrs attrs, > + MemTxResult response, > + uintptr_t retaddr) > +{ > +CPUClass *cc = CPU_GET_CLASS(cpu); > + > +if (cc->do_transaction_failed) { > +cc->do_transaction_failed(cpu, physaddr, addr, size, access_type, > + mmu_idx, attrs, response, retaddr); > +} > +} > #endif > > #endif /* NEED_CPU_H */ > -- > 2.7.4 > >
Re: [Qemu-devel] [Qemu-arm] [PATCH 2/8] cpu: Define new cpu_transaction_failed() hook
On Fri, Aug 04, 2017 at 06:20:43PM +0100, Peter Maydell wrote: > Currently we have a rather half-baked setup for allowing CPUs to > generate exceptions on accesses to invalid memory: the CPU has a > cpu_unassigned_access() hook which the memory system calls in > unassigned_mem_write() and unassigned_mem_read() if the current_cpu > pointer is non-NULL. This was originally designed before we > implemented the MemTxResult type that allows memory operations to > report a success or failure code, which is why the hook is called > right at the bottom of the memory system. The major problem with > this is that it means that the hook can be called even when the > access was not actually done by the CPU: for instance if the CPU > writes to a DMA engine register which causes the DMA engine to begin > a transaction which has been set up by the guest to operate on > invalid memory then this will casue the CPU to take an exception > incorrectly. Another minor problem is that currently if a device > returns a transaction error then this won't turn into a CPU exception > at all. > > The right way to do this is to have allow the CPU to respond > to memory system transaction failures at the point where the > CPU specific code calls into the memory system. > > Define a new QOM CPU method and utility function > cpu_transaction_failed() which is called in these cases. > The functionality here overlaps with the existing > cpu_unassigned_access() because individual target CPUs will > need some work to convert them to the new system. When this > transition is complete we can remove the old cpu_unassigned_access() > code. > > Signed-off-by: Peter Maydell > > --- > include/qom/cpu.h | 21 + > 1 file changed, 21 insertions(+) > > diff --git a/include/qom/cpu.h b/include/qom/cpu.h > index 25eefea..fc54d55 100644 > --- a/include/qom/cpu.h > +++ b/include/qom/cpu.h > @@ -85,8 +85,10 @@ struct TranslationBlock; > * @has_work: Callback for checking if there is work to do. > * @do_interrupt: Callback for interrupt handling. > * @do_unassigned_access: Callback for unassigned access handling. > + * (this is deprecated: new targets should use do_transaction_failed instead) > * @do_unaligned_access: Callback for unaligned access handling, if > * the target defines #ALIGNED_ONLY. > + * @do_transaction_failed: Callback for handling failed memory transactions Looks OK but I wonder if there you might want to clarify that this is a bus/slave failure and not a failure within the CPU (e.g not an MMU fault). Anyway: Reviewed-by: Edgar E. Iglesias > * @virtio_is_big_endian: Callback to return %true if a CPU which supports > * runtime configurable endianness is currently big-endian. Non-configurable > * CPUs can use the default implementation of this method. This method should > @@ -153,6 +155,10 @@ typedef struct CPUClass { > void (*do_unaligned_access)(CPUState *cpu, vaddr addr, > MMUAccessType access_type, > int mmu_idx, uintptr_t retaddr); > +void (*do_transaction_failed)(CPUState *cpu, hwaddr physaddr, vaddr addr, > + unsigned size, MMUAccessType access_type, > + int mmu_idx, MemTxAttrs attrs, > + MemTxResult response, uintptr_t retaddr); > bool (*virtio_is_big_endian)(CPUState *cpu); > int (*memory_rw_debug)(CPUState *cpu, vaddr addr, > uint8_t *buf, int len, bool is_write); > @@ -837,6 +843,21 @@ static inline void cpu_unaligned_access(CPUState *cpu, > vaddr addr, > > cc->do_unaligned_access(cpu, addr, access_type, mmu_idx, retaddr); > } > + > +static inline void cpu_transaction_failed(CPUState *cpu, hwaddr physaddr, > + vaddr addr, unsigned size, > + MMUAccessType access_type, > + int mmu_idx, MemTxAttrs attrs, > + MemTxResult response, > + uintptr_t retaddr) > +{ > +CPUClass *cc = CPU_GET_CLASS(cpu); > + > +if (cc->do_transaction_failed) { > +cc->do_transaction_failed(cpu, physaddr, addr, size, access_type, > + mmu_idx, attrs, response, retaddr); > +} > +} > #endif > > #endif /* NEED_CPU_H */ > -- > 2.7.4 > >
Re: [Qemu-devel] [Qemu-arm] [PATCH 1/8] memory.h: Move MemTxResult type to memattrs.h
On Fri, Aug 04, 2017 at 06:20:42PM +0100, Peter Maydell wrote: > Move the MemTxResult type to memattrs.h. We're going to want to > use it in cpu/qom.h, which doesn't want to include all of > memory.h. In practice MemTxResult and MemTxAttrs are pretty > closely linked since both are used for the new-style > read_with_attrs and write_with_attrs callbacks, so memattrs.h > is a reasonable home for this rather than creating a whole > new header file for it. > > Signed-off-by: Peter Maydell Reviewed-by: Edgar E. Iglesias > --- > include/exec/memattrs.h | 10 ++ > include/exec/memory.h | 10 -- > 2 files changed, 10 insertions(+), 10 deletions(-) > > diff --git a/include/exec/memattrs.h b/include/exec/memattrs.h > index e601061..d4a1642 100644 > --- a/include/exec/memattrs.h > +++ b/include/exec/memattrs.h > @@ -46,4 +46,14 @@ typedef struct MemTxAttrs { > */ > #define MEMTXATTRS_UNSPECIFIED ((MemTxAttrs) { .unspecified = 1 }) > > +/* New-style MMIO accessors can indicate that the transaction failed. > + * A zero (MEMTX_OK) response means success; anything else is a failure > + * of some kind. The memory subsystem will bitwise-OR together results > + * if it is synthesizing an operation from multiple smaller accesses. > + */ > +#define MEMTX_OK 0 > +#define MEMTX_ERROR (1U << 0) /* device returned an error */ > +#define MEMTX_DECODE_ERROR (1U << 1) /* nothing at that address */ > +typedef uint32_t MemTxResult; > + > #endif > diff --git a/include/exec/memory.h b/include/exec/memory.h > index 400dd44..1dcd312 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -112,16 +112,6 @@ static inline void iommu_notifier_init(IOMMUNotifier *n, > IOMMUNotify fn, > n->end = end; > } > > -/* New-style MMIO accessors can indicate that the transaction failed. > - * A zero (MEMTX_OK) response means success; anything else is a failure > - * of some kind. The memory subsystem will bitwise-OR together results > - * if it is synthesizing an operation from multiple smaller accesses. > - */ > -#define MEMTX_OK 0 > -#define MEMTX_ERROR (1U << 0) /* device returned an error */ > -#define MEMTX_DECODE_ERROR (1U << 1) /* nothing at that address */ > -typedef uint32_t MemTxResult; > - > /* > * Memory region callbacks > */ > -- > 2.7.4 > >
[Qemu-devel] [PATCH for-2.11 2/5] qmp-shell: Pass split cmdargs to __build_cmd()
This will allow us to implement a method to run a command that is already split in a list. Signed-off-by: Eduardo Habkost --- scripts/qmp/qmp-shell | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell index c276b90..5fe6162 100755 --- a/scripts/qmp/qmp-shell +++ b/scripts/qmp/qmp-shell @@ -212,15 +212,13 @@ class QMPShell(qmp.QEMUMonitorProtocol): raise QMPShellError('Cannot set "%s" multiple times' % key) parent[optpath[-1]] = value -def __build_cmd(self, cmdline): +def __build_cmd(self, cmdargs): """ Build a QMP input object from a user provided command-line in the following format: < command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ] """ -cmdargs = cmdline.split() - # Transactional CLI entry/exit: if cmdargs[0] == 'transaction(': self._transmode = True @@ -247,7 +245,7 @@ class QMPShell(qmp.QEMUMonitorProtocol): finalize = True self.__cli_expr(cmdargs[1:], action['data']) self._actions.append(action) -return self.__build_cmd(')') if finalize else None +return self.__build_cmd([')']) if finalize else None # Standard command: parse and return it to be executed. qmpcmd = { 'execute': cmdargs[0], 'arguments': {} } @@ -262,8 +260,9 @@ class QMPShell(qmp.QEMUMonitorProtocol): print str(jsobj) def _execute_cmd(self, cmdline): +cmdargs = cmdline.split() try: -qmpcmd = self.__build_cmd(cmdline) +qmpcmd = self.__build_cmd(cmdargs) except Exception as e: print 'Error while parsing command line: %s' % e print 'command format: ', -- 2.9.4
[Qemu-devel] [PATCH for-2.11 1/5] qmp-shell: Use argparse module
It makes command-line parsing and generation of help text much simpler. Signed-off-by: Eduardo Habkost --- scripts/qmp/qmp-shell | 61 +-- 1 file changed, 20 insertions(+), 41 deletions(-) diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell index 860ffb2..c276b90 100755 --- a/scripts/qmp/qmp-shell +++ b/scripts/qmp/qmp-shell @@ -73,6 +73,7 @@ import sys import os import errno import atexit +import argparse class QMPCompleter(list): def complete(self, text, state): @@ -393,61 +394,39 @@ def die(msg): sys.stderr.write('ERROR: %s\n' % msg) sys.exit(1) -def fail_cmdline(option=None): -if option: -sys.stderr.write('ERROR: bad command-line option \'%s\'\n' % option) -sys.stderr.write('qmp-shell [ -v ] [ -p ] [ -H ] [ -N ] < UNIX socket path> | < TCP address:port >\n') -sys.stderr.write('-v Verbose (echo command sent and received)\n') -sys.stderr.write('-p Pretty-print JSON\n') -sys.stderr.write('-H Use HMP interface\n') -sys.stderr.write('-N Skip negotiate (for qemu-ga)\n') -sys.exit(1) - def main(): -addr = '' -qemu = None -hmp = False -pretty = False -verbose = False -negotiate = True +parser = argparse.ArgumentParser(description='QMP shell utility') +parser.add_argument('-v', action='store_true', dest='verbose', +help='Verbose (echo command sent and received)') +parser.add_argument('-p', action='store_true', dest='pretty', +help='Pretty-print JSON') +parser.add_argument('-H', action='store_true', dest='hmp', +help='Use HMP interface') +parser.add_argument('-N', action='store_false', dest='negotiate', +default=True, help='Skip negotiate (for qemu-ga)') +parser.add_argument('addr', metavar='ADDRESS', +help='QMP socket address (Unix socket path or TCP address:port)') +args = parser.parse_args() try: -for arg in sys.argv[1:]: -if arg == "-H": -if qemu is not None: -fail_cmdline(arg) -hmp = True -elif arg == "-p": -pretty = True -elif arg == "-N": -negotiate = False -elif arg == "-v": -verbose = True -else: -if qemu is not None: -fail_cmdline(arg) -if hmp: -qemu = HMPShell(arg) -else: -qemu = QMPShell(arg, pretty) -addr = arg - -if qemu is None: -fail_cmdline() +if args.hmp: +qemu = HMPShell(args.addr) +else: +qemu = QMPShell(args.addr, args.pretty) except QMPShellBadPort: die('bad port number in command-line') try: -qemu.connect(negotiate) +qemu.connect(args.negotiate) except qmp.QMPConnectError: die('Didn\'t get QMP greeting message') except qmp.QMPCapabilitiesError: die('Could not negotiate capabilities') except qemu.error: -die('Could not connect to %s' % addr) +die('Could not connect to %s' % args.addr) qemu.show_banner() -qemu.set_verbosity(verbose) +qemu.set_verbosity(args.verbose) while qemu.read_exec_command(qemu.get_prompt()): pass qemu.close() -- 2.9.4
[Qemu-devel] [PATCH for-2.11 0/5] qmp-shell non-interactive mode, delete scripts/qmp/qmp
This series adds the ability to run QMP commands non-interactively to qmp-shell, and deletes scripts/qmp/qmp. Eduardo Habkost (5): qmp-shell: Use argparse module qmp-shell: Pass split cmdargs to __build_cmd() qmp-shell: execute_cmdargs() method qmp-shell: Accept QMP command as argument Remove scripts/qmp/qmp scripts/qmp/qmp | 126 -- scripts/qmp/qmp-shell | 90 2 files changed, 40 insertions(+), 176 deletions(-) delete mode 100755 scripts/qmp/qmp -- 2.9.4
[Qemu-devel] [PATCH for-2.11 5/5] Remove scripts/qmp/qmp
The only purpose of scripts/qmp/qmp was the ability to run QMP commands non-interactively. Now it is possible to run qmp-shell non-interactively by providing a QMP command a command-line argument, making scripts/qmp/qmp obsolete. Signed-off-by: Eduardo Habkost --- scripts/qmp/qmp | 126 1 file changed, 126 deletions(-) delete mode 100755 scripts/qmp/qmp diff --git a/scripts/qmp/qmp b/scripts/qmp/qmp deleted file mode 100755 index 514b539..000 --- a/scripts/qmp/qmp +++ /dev/null @@ -1,126 +0,0 @@ -#!/usr/bin/python -# -# QMP command line tool -# -# Copyright IBM, Corp. 2011 -# -# Authors: -# Anthony Liguori -# -# This work is licensed under the terms of the GNU GPLv2 or later. -# See the COPYING file in the top-level directory. - -import sys, os -from qmp import QEMUMonitorProtocol - -def print_response(rsp, prefix=[]): -if type(rsp) == list: -i = 0 -for item in rsp: -if prefix == []: -prefix = ['item'] -print_response(item, prefix[:-1] + ['%s[%d]' % (prefix[-1], i)]) -i += 1 -elif type(rsp) == dict: -for key in rsp.keys(): -print_response(rsp[key], prefix + [key]) -else: -if len(prefix): -print '%s: %s' % ('.'.join(prefix), rsp) -else: -print '%s' % (rsp) - -def main(args): -path = None - -# Use QMP_PATH if it's set -if os.environ.has_key('QMP_PATH'): -path = os.environ['QMP_PATH'] - -while len(args): -arg = args[0] - -if arg.startswith('--'): -arg = arg[2:] -if arg.find('=') == -1: -value = True -else: -arg, value = arg.split('=', 1) - -if arg in ['path']: -if type(value) == str: -path = value -elif arg in ['help']: -os.execlp('man', 'man', 'qmp') -else: -print 'Unknown argument "%s"' % arg - -args = args[1:] -else: -break - -if not path: -print "QMP path isn't set, use --path=qmp-monitor-address or set QMP_PATH" -return 1 - -if len(args): -command, args = args[0], args[1:] -else: -print 'No command found' -print 'Usage: "qmp [--path=qmp-monitor-address] qmp-cmd arguments"' -return 1 - -if command in ['help']: -os.execlp('man', 'man', 'qmp') - -srv = QEMUMonitorProtocol(path) -srv.connect() - -def do_command(srv, cmd, **kwds): -rsp = srv.cmd(cmd, kwds) -if rsp.has_key('error'): -raise Exception(rsp['error']['desc']) -return rsp['return'] - -commands = map(lambda x: x['name'], do_command(srv, 'query-commands')) - -srv.close() - -if command not in commands: -fullcmd = 'qmp-%s' % command -try: -os.environ['QMP_PATH'] = path -os.execvp(fullcmd, [fullcmd] + args) -except OSError as exc: -if exc.errno == 2: -print 'Command "%s" not found.' % (fullcmd) -return 1 -raise -return 0 - -srv = QEMUMonitorProtocol(path) -srv.connect() - -arguments = {} -for arg in args: -if not arg.startswith('--'): -print 'Unknown argument "%s"' % arg -return 1 - -arg = arg[2:] -if arg.find('=') == -1: -value = True -else: -arg, value = arg.split('=', 1) - -if arg in ['help']: -os.execlp('man', 'man', 'qmp-%s' % command) -return 1 - -arguments[arg] = value - -rsp = do_command(srv, command, **arguments) -print_response(rsp) - -if __name__ == '__main__': -sys.exit(main(sys.argv[1:])) -- 2.9.4
[Qemu-devel] [PATCH for-2.11 4/5] qmp-shell: Accept QMP command as argument
This is useful for testing QMP commands in scripts. Example usage, combined with 'jq' for filtering the results: $ ./scripts/qmp/qmp-shell /tmp/qmp qom-list path=/ | jq -r .return[].name machine type chardevs backend $ Signed-off-by: Eduardo Habkost --- scripts/qmp/qmp-shell | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell index 6113aaf..947fc36 100755 --- a/scripts/qmp/qmp-shell +++ b/scripts/qmp/qmp-shell @@ -410,6 +410,10 @@ def main(): default=True, help='Skip negotiate (for qemu-ga)') parser.add_argument('addr', metavar='ADDRESS', help='QMP socket address (Unix socket path or TCP address:port)') +parser.add_argument('cmd', metavar='COMMAND', nargs='?', +help='QMP command to be exeucted') +parser.add_argument('cmdargs', metavar='ARG=VALUE', nargs='*', +help='Argument to QMP command') args = parser.parse_args() try: @@ -429,10 +433,13 @@ def main(): except qemu.error: die('Could not connect to %s' % args.addr) -qemu.show_banner() qemu.set_verbosity(args.verbose) -while qemu.read_exec_command(qemu.get_prompt()): -pass +if args.cmd: +qemu.execute_cmdargs([args.cmd] + args.cmdargs) +else: +qemu.show_banner() +while qemu.read_exec_command(qemu.get_prompt()): +pass qemu.close() if __name__ == '__main__': -- 2.9.4
[Qemu-devel] [PATCH for-2.11 3/5] qmp-shell: execute_cmdargs() method
This will allow us to execute a command that was already split in a list. Signed-off-by: Eduardo Habkost --- scripts/qmp/qmp-shell | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell index 5fe6162..6113aaf 100755 --- a/scripts/qmp/qmp-shell +++ b/scripts/qmp/qmp-shell @@ -260,7 +260,9 @@ class QMPShell(qmp.QEMUMonitorProtocol): print str(jsobj) def _execute_cmd(self, cmdline): -cmdargs = cmdline.split() +return self.execute_cmdargs(cmdline.split()) + +def execute_cmdargs(self, cmdargs): try: qmpcmd = self.__build_cmd(cmdargs) except Exception as e: @@ -386,6 +388,9 @@ class HMPShell(QMPShell): print '%s: %s' % (resp['error']['class'], resp['error']['desc']) return True +def execute_cmdargs(self, cmdargs): +return self._execute_cmd(' '.join(cmdargs)) + def show_banner(self): QMPShell.show_banner(self, msg='Welcome to the HMP shell!') -- 2.9.4
Re: [Qemu-devel] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure
2017-08-04 23:28 GMT+03:00 Laszlo Ersek : > On 08/04/17 20:59, Alexander Bezzubikov wrote: >> 2017-08-01 20:28 GMT+03:00 Alexander Bezzubikov : >>> 2017-08-01 16:38 GMT+03:00 Marcel Apfelbaum : On 31/07/2017 22:01, Alexander Bezzubikov wrote: > > 2017-07-31 21:57 GMT+03:00 Michael S. Tsirkin : >> >> On Mon, Jul 31, 2017 at 09:54:55PM +0300, Alexander Bezzubikov wrote: >>> >>> 2017-07-31 17:09 GMT+03:00 Marcel Apfelbaum : On 31/07/2017 17:00, Michael S. Tsirkin wrote: > > > On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote: >> >> >> On PCI init PCI bridge devices may need some >> extra info about bus number to reserve, IO, memory and >> prefetchable memory limits. QEMU can provide this >> with special vendor-specific PCI capability. >> >> This capability is intended to be used only >> for Red Hat PCI bridges, i.e. QEMU cooperation. >> >> Signed-off-by: Aleksandr Bezzubikov >> --- >>src/fw/dev-pci.h | 62 >> >>1 file changed, 62 insertions(+) >>create mode 100644 src/fw/dev-pci.h >> >> diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h >> new file mode 100644 >> index 000..fbd49ed >> --- /dev/null >> +++ b/src/fw/dev-pci.h >> @@ -0,0 +1,62 @@ >> +#ifndef _PCI_CAP_H >> +#define _PCI_CAP_H >> + >> +#include "types.h" >> + >> +/* >> + >> +QEMU-specific vendor(Red Hat)-specific capability. >> +It's intended to provide some hints for firmware to init PCI >> devices. >> + >> +Its is shown below: >> + >> +Header: >> + >> +u8 id; Standard PCI Capability Header field >> +u8 next; Standard PCI Capability Header field >> +u8 len; Standard PCI Capability Header field >> +u8 type; Red Hat vendor-specific capability type: >> + now only REDHAT_QEMU_CAP 1 exists >> +Data: >> + >> +u16 non_prefetchable_16; non-prefetchable memory limit >> + >> +u8 bus_res; minimum bus number to reserve; >> + this is necessary for PCI Express Root Ports >> + to support PCIE-to-PCI bridge hotplug >> + >> +u8 io_8; IO limit in case of 8-bit limit value >> +u32 io_32; IO limit in case of 16-bit limit value >> + io_8 and io_16 are mutually exclusive, in other words, >> + they can't be non-zero simultaneously >> + >> +u32 prefetchable_32; non-prefetchable memory limit >> + in case of 32-bit limit value >> +u64 prefetchable_64; non-prefetchable memory limit >> + in case of 64-bit limit value >> + prefetachable_32 and prefetchable_64 >> are >> + mutually exclusive, in other words, >> + they can't be non-zero simultaneously >> +If any field in Data section is 0, >> +it means that such kind of reservation >> +is not needed. >>> >>> >>> I really don't like this 'mutually exclusive' fields approach because >>> IMHO it increases confusion level when undertanding this capability >>> structure. >>> But - if we came to consensus on that, then IO fields should be used >>> in the same way, >>> because as I understand, this 'mutual exclusivity' serves to distinguish >>> cases >>> when we employ only *_LIMIT register and both *_LIMIT an UPPER_*_LIMIT >>> registers. >>> And this is how both IO and PREFETCHABLE works, isn't it? >> >> >> I would just use simeple 64 bit registers. PCI spec has an ugly format >> with fields spread all over the place but that is because of >> compatibility concerns. It makes not sense to spend cycles just >> to be similarly messy. > > > Then I suggest to use exactly one field of a maximum possible size > for each reserving object, and get rid of mutually exclusive fields. > Then it can be something like that (order and names can be changed): > u8 bus; > u16 non_pref; > u32 io; > u64 pref; > I think Michael suggested: u64 bus_res; u64 mem_res; u64 io_res; u64 mem_pref_res; OR: u32 bus_res; u32 mem_res; u32 io_res; u64 mem_pref_res; We can use 0XFFF..F as "not-set" value "merging" Gerd's and Michael's requests. >>> >>> Let's dwell on the second option (with -1 as 'ignore' sign), if no
Re: [Qemu-devel] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure
On 08/04/17 20:59, Alexander Bezzubikov wrote: > 2017-08-01 20:28 GMT+03:00 Alexander Bezzubikov : >> 2017-08-01 16:38 GMT+03:00 Marcel Apfelbaum : >>> On 31/07/2017 22:01, Alexander Bezzubikov wrote: 2017-07-31 21:57 GMT+03:00 Michael S. Tsirkin : > > On Mon, Jul 31, 2017 at 09:54:55PM +0300, Alexander Bezzubikov wrote: >> >> 2017-07-31 17:09 GMT+03:00 Marcel Apfelbaum : >>> >>> On 31/07/2017 17:00, Michael S. Tsirkin wrote: On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote: > > > On PCI init PCI bridge devices may need some > extra info about bus number to reserve, IO, memory and > prefetchable memory limits. QEMU can provide this > with special vendor-specific PCI capability. > > This capability is intended to be used only > for Red Hat PCI bridges, i.e. QEMU cooperation. > > Signed-off-by: Aleksandr Bezzubikov > --- >src/fw/dev-pci.h | 62 > >1 file changed, 62 insertions(+) >create mode 100644 src/fw/dev-pci.h > > diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h > new file mode 100644 > index 000..fbd49ed > --- /dev/null > +++ b/src/fw/dev-pci.h > @@ -0,0 +1,62 @@ > +#ifndef _PCI_CAP_H > +#define _PCI_CAP_H > + > +#include "types.h" > + > +/* > + > +QEMU-specific vendor(Red Hat)-specific capability. > +It's intended to provide some hints for firmware to init PCI > devices. > + > +Its is shown below: > + > +Header: > + > +u8 id; Standard PCI Capability Header field > +u8 next; Standard PCI Capability Header field > +u8 len; Standard PCI Capability Header field > +u8 type; Red Hat vendor-specific capability type: > + now only REDHAT_QEMU_CAP 1 exists > +Data: > + > +u16 non_prefetchable_16; non-prefetchable memory limit > + > +u8 bus_res; minimum bus number to reserve; > + this is necessary for PCI Express Root Ports > + to support PCIE-to-PCI bridge hotplug > + > +u8 io_8; IO limit in case of 8-bit limit value > +u32 io_32; IO limit in case of 16-bit limit value > + io_8 and io_16 are mutually exclusive, in other words, > + they can't be non-zero simultaneously > + > +u32 prefetchable_32; non-prefetchable memory limit > + in case of 32-bit limit value > +u64 prefetchable_64; non-prefetchable memory limit > + in case of 64-bit limit value > + prefetachable_32 and prefetchable_64 > are > + mutually exclusive, in other words, > + they can't be non-zero simultaneously > +If any field in Data section is 0, > +it means that such kind of reservation > +is not needed. >> >> >> I really don't like this 'mutually exclusive' fields approach because >> IMHO it increases confusion level when undertanding this capability >> structure. >> But - if we came to consensus on that, then IO fields should be used >> in the same way, >> because as I understand, this 'mutual exclusivity' serves to distinguish >> cases >> when we employ only *_LIMIT register and both *_LIMIT an UPPER_*_LIMIT >> registers. >> And this is how both IO and PREFETCHABLE works, isn't it? > > > I would just use simeple 64 bit registers. PCI spec has an ugly format > with fields spread all over the place but that is because of > compatibility concerns. It makes not sense to spend cycles just > to be similarly messy. Then I suggest to use exactly one field of a maximum possible size for each reserving object, and get rid of mutually exclusive fields. Then it can be something like that (order and names can be changed): u8 bus; u16 non_pref; u32 io; u64 pref; >>> >>> I think Michael suggested: >>> >>> u64 bus_res; >>> u64 mem_res; >>> u64 io_res; >>> u64 mem_pref_res; >>> >>> OR: >>> u32 bus_res; >>> u32 mem_res; >>> u32 io_res; >>> u64 mem_pref_res; >>> >>> >>> We can use 0XFFF..F as "not-set" value "merging" Gerd's and Michael's >>> requests. >> >> Let's dwell on the second option (with -1 as 'ignore' sign), if no new >> objections. >> > > BTW, talking about limit values provided in the capability - do we > want to completely override > existing PCI resources allocation mechanism b
Re: [Qemu-devel] [PATCH 8/8] target/arm: Implement new do_transaction_failed hook
On 08/04/2017 10:20 AM, Peter Maydell wrote: > Implement the new do_transaction_failed hook for ARM, which should > cause the CPU to take a prefetch abort or data abort. > > Signed-off-by: Peter Maydell > --- > target/arm/internals.h | 10 ++ > target/arm/cpu.c | 1 + > target/arm/op_helper.c | 43 +++ > 3 files changed, 54 insertions(+) Reviewed-by: Richard Henderson r~
Re: [Qemu-devel] [PATCH 7/8] target/arm: Allow deliver_fault() caller to specify EA bit
On 08/04/2017 10:20 AM, Peter Maydell wrote: > For external aborts, we will want to be able to specify the EA > (external abort type) bit in the syndrome field. Allow callers of > deliver_fault() to do that by adding a field to ARMMMUFaultInfo which > we use when constructing the syndrome values. > > Signed-off-by: Peter Maydell > --- > target/arm/internals.h | 2 ++ > target/arm/op_helper.c | 10 +- > 2 files changed, 7 insertions(+), 5 deletions(-) Reviewed-by: Richard Henderson r~
Re: [Qemu-devel] [PATCH 6/8] target/arm: Factor out fault delivery code
On 08/04/2017 10:20 AM, Peter Maydell wrote: > +if (fsc == 0x3f) { > +/* Caller doesn't have a long-format fault status code. This > + * should only happen if this fault will never actually be reported > + * to an EL that uses a syndrome register. Check that here. > + * 0x3f is a (currently) reserved FSR code, in case the constructed > + * syndrome does leak into the guest somehow. > + */ > +assert(target_el != 2 && !arm_el_is_aa64(env, target_el)); > +} I see that this is just code movement, but there appears to be a typo in the comment, confusing fsc vs fsr and the 0x3f reserved value. Otherwise, Reviewed-by: Richard Henderson r~
Re: [Qemu-devel] [PATCH v4 08/15] qcow2: set inactive flag
On 08/01/2017 09:19 AM, Anton Nefedov wrote: > Qcow2State and BlockDriverState flags have to be in sync > > Signed-off-by: Anton Nefedov > --- > block/qcow2.c | 1 + > 1 file changed, 1 insertion(+) Is this a bug fix needed for 2.10? > > diff --git a/block/qcow2.c b/block/qcow2.c > index 66aa8c2..2a1d2f2 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -2138,6 +2138,7 @@ static int qcow2_inactivate(BlockDriverState *bs) > > if (result == 0) { > qcow2_mark_clean(bs); > +s->flags |= BDRV_O_INACTIVE; > } > > return result; > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v4 03/15] block: introduce BDRV_REQ_ALLOCATE flag
On 08/01/2017 09:19 AM, Anton Nefedov wrote: > The flag is supposed to indicate that the region of the disk image has > to be sufficiently allocated so it reads as zeroes. The call with the flag > set has to return -ENOTSUP if allocation cannot be done efficiently > (i.e. without falling back to writing actual buffers) > > Signed-off-by: Anton Nefedov > --- > include/block/block.h | 6 +- > include/block/block_int.h | 2 +- > block/io.c| 20 +--- > 3 files changed, 23 insertions(+), 5 deletions(-) Reviewed-by: Eric Blake You might want the commit message to be a bit more verbose... > > diff --git a/include/block/block.h b/include/block/block.h > index 7fe0125..828da67 100644 > --- a/include/block/block.h > +++ b/include/block/block.h > @@ -65,9 +65,13 @@ typedef enum { > BDRV_REQ_NO_SERIALISING = 0x8, > BDRV_REQ_FUA= 0x10, > BDRV_REQ_WRITE_COMPRESSED = 0x20, > +/* The BDRV_REQ_ALLOCATE flag is used to indicate that the driver has to > + * efficiently allocate the space so it reads as zeroes, or return an > error. > + */ > +BDRV_REQ_ALLOCATE = 0x40, > > /* Mask of valid flags */ > -BDRV_REQ_MASK = 0x3f, > +BDRV_REQ_MASK = 0x7f, > } BdrvRequestFlags; > > typedef struct BlockSizes { > diff --git a/include/block/block_int.h b/include/block/block_int.h > index 9b94b32..9b64411 100644 > --- a/include/block/block_int.h > +++ b/include/block/block_int.h > @@ -585,7 +585,7 @@ struct BlockDriverState { > /* Flags honored during pwrite (so far: BDRV_REQ_FUA) */ > unsigned int supported_write_flags; > /* Flags honored during pwrite_zeroes (so far: BDRV_REQ_FUA, > - * BDRV_REQ_MAY_UNMAP) */ > + * BDRV_REQ_MAY_UNMAP, BDRV_REQ_ALLOCATE) */ > unsigned int supported_zero_flags; ...in addition to adding the new flag here and documenting its semantics for drivers... > > /* the following member gives a name to every node on the bs graph. */ > diff --git a/block/io.c b/block/io.c > index 375fc66..04d495e 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -1245,7 +1245,7 @@ static int coroutine_fn > bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > assert(!bs->supported_zero_flags); > } > > -if (ret == -ENOTSUP) { > +if (ret == -ENOTSUP && !(flags & BDRV_REQ_ALLOCATE)) { > /* Fall back to bounce buffer if write zeroes is unsupported */ > BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; ...you also made sure that anywhere the flag is in use you avoid a slow fallback... > @@ -1639,6 +1645,14 @@ int coroutine_fn bdrv_co_pwrite_zeroes(BdrvChild > *child, int64_t offset, > { > trace_bdrv_co_pwrite_zeroes(child->bs, offset, bytes, flags); > > +assert(!(flags & BDRV_REQ_MAY_UNMAP && flags & BDRV_REQ_ALLOCATE)); > + > +if (flags & BDRV_REQ_ALLOCATE && > +!(child->bs->supported_zero_flags & BDRV_REQ_ALLOCATE)) > +{ > +return -ENOTSUP; ...as well as providing a sane default to make the flag always trigger -ENOTSUP until individual drivers implement something in later patches. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 3/3] parallels: drop check that bdrv_truncate() is working
On 08/04/2017 10:32 PM, Eric Blake wrote: > On 08/04/2017 10:10 AM, Denis V. Lunev wrote: >> This would be actually strange and error prone. If truncate() nowadays >> will fail, there is something fatally wrong. Let's check for that during >> the actual work. >> >> The only fallback case is when the file is not zero initialized. In this >> case we should switch to preallocation via fallocate(). > I got confused by the commit message. Here's my attempt an an > alternative, to see if I'm understanding the point of this patch: > > The code was trying to truncate a file to its current length, as an > optimization to help decide whether an alternative prealloc mode was > useful. But it forgot to check whether bdrv_getlength() succeeded, and > dealing with that failure just complicates what was supposed to be a > no-op probe for optimizing later operation. We will still properly fail > later when an actual truncation attempt is made and the device can't > support it. So when deciding how to set prealloc_mode while opening the > device, all we really need is to check just the one condition that > matters - knowing whether the device is zero initialized. > This is not an optimization. This is check that truncate is working for underlying BDS. 2 years ago I though that I have seen some BDSes without truncate support. So, with this patch I have dropped this check. If the user has specified to change preallocation mode, he is responsible to be correct. If truncate is broken, this will be revealed at the first attempt to expand the file. >> Signed-off-by: Denis V. Lunev >> CC: Markus Armbruster >> CC: Kevin Wolf >> CC: Max Reitz >> CC: Stefan Hajnoczi >> --- >> block/parallels.c | 4 +--- >> 1 file changed, 1 insertion(+), 3 deletions(-) > The code change makes sense at first glance, but I'm a bit reluctant to > give R-b, since the commit message threw me off and I'm not familiar > with the parallels code in the first place. > >> diff --git a/block/parallels.c b/block/parallels.c >> index 6794e53c0b..e1e06d23cc 100644 >> --- a/block/parallels.c >> +++ b/block/parallels.c >> @@ -703,9 +703,7 @@ static int parallels_open(BlockDriverState *bs, QDict >> *options, int flags, >> goto fail_options; >> } >> >> -if (!(flags & BDRV_O_RESIZE) || !bdrv_has_zero_init(bs->file->bs) || >> -bdrv_truncate(bs->file, bdrv_getlength(bs->file->bs), >> - PREALLOC_MODE_OFF, NULL) != 0) { >> +if (!bdrv_has_zero_init(bs->file->bs)) { >> s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE; >> } >> >>
Re: [Qemu-devel] [PATCH 3/3] parallels: drop check that bdrv_truncate() is working
On 08/04/2017 10:10 AM, Denis V. Lunev wrote: > This would be actually strange and error prone. If truncate() nowadays > will fail, there is something fatally wrong. Let's check for that during > the actual work. > > The only fallback case is when the file is not zero initialized. In this > case we should switch to preallocation via fallocate(). I got confused by the commit message. Here's my attempt an an alternative, to see if I'm understanding the point of this patch: The code was trying to truncate a file to its current length, as an optimization to help decide whether an alternative prealloc mode was useful. But it forgot to check whether bdrv_getlength() succeeded, and dealing with that failure just complicates what was supposed to be a no-op probe for optimizing later operation. We will still properly fail later when an actual truncation attempt is made and the device can't support it. So when deciding how to set prealloc_mode while opening the device, all we really need is to check just the one condition that matters - knowing whether the device is zero initialized. > > Signed-off-by: Denis V. Lunev > CC: Markus Armbruster > CC: Kevin Wolf > CC: Max Reitz > CC: Stefan Hajnoczi > --- > block/parallels.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) The code change makes sense at first glance, but I'm a bit reluctant to give R-b, since the commit message threw me off and I'm not familiar with the parallels code in the first place. > > diff --git a/block/parallels.c b/block/parallels.c > index 6794e53c0b..e1e06d23cc 100644 > --- a/block/parallels.c > +++ b/block/parallels.c > @@ -703,9 +703,7 @@ static int parallels_open(BlockDriverState *bs, QDict > *options, int flags, > goto fail_options; > } > > -if (!(flags & BDRV_O_RESIZE) || !bdrv_has_zero_init(bs->file->bs) || > -bdrv_truncate(bs->file, bdrv_getlength(bs->file->bs), > - PREALLOC_MODE_OFF, NULL) != 0) { > +if (!bdrv_has_zero_init(bs->file->bs)) { > s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE; > } > > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [Qemu-arm] [PATCH 4/8] boards.h: Define new flag ignore_memory_transaction_failures
On 08/04/2017 11:09 AM, Philippe Mathieu-Daudé wrote: > Since create_unimplemented_device() register overlapped with low priority, why > not register it as default device directly, over the whole address space? That's a good suggestion. It makes more sense to me than adding a flag on the MachineClass. r~
Re: [Qemu-devel] [PATCH 2/3] parallels: respect error code of bdrv_getlength() in allocate_clusters()
On 08/04/2017 10:10 AM, Denis V. Lunev wrote: > If we can not get the file length, the state of BDS is broken completely. > Return error to the caller. > > Signed-off-by: Denis V. Lunev > CC: Markus Armbruster > CC: Kevin Wolf > CC: Max Reitz > CC: Stefan Hajnoczi > --- > block/parallels.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 1/3] block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes
On 08/04/2017 10:10 AM, Denis V. Lunev wrote: > Original idea beyond the code in question was the following: we have failed > to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest > approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). We have the > only chance now: if the request comes beyond end of the file. Thus we > should calculate file length and respect the error code from that op. > > Signed-off-by: Denis V. Lunev > CC: Markus Armbruster > CC: Kevin Wolf > CC: Max Reitz > CC: Stefan Hajnoczi > --- > block/file-posix.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] block: drop bdrv_set_key from BlockDriver
On 08/04/2017 10:26 AM, Paolo Bonzini wrote: > This is not used anymore since c01c214b69 ("block: remove all encryption > handling APIs", 2017-07-11). > > Signed-off-by: Paolo Bonzini > --- > include/block/block_int.h | 1 - > 1 file changed, 1 deletion(-) Reviewed-by: Eric Blake > > diff --git a/include/block/block_int.h b/include/block/block_int.h > index d4f4ea7584..7571c0aaaf 100644 > --- a/include/block/block_int.h > +++ b/include/block/block_int.h > @@ -127,7 +127,6 @@ struct BlockDriver { >Error **errp); > void (*bdrv_close)(BlockDriverState *bs); > int (*bdrv_create)(const char *filename, QemuOpts *opts, Error **errp); > -int (*bdrv_set_key)(BlockDriverState *bs, const char *key); > int (*bdrv_make_empty)(BlockDriverState *bs); > > void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options); > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure
2017-08-01 20:28 GMT+03:00 Alexander Bezzubikov : > 2017-08-01 16:38 GMT+03:00 Marcel Apfelbaum : >> On 31/07/2017 22:01, Alexander Bezzubikov wrote: >>> >>> 2017-07-31 21:57 GMT+03:00 Michael S. Tsirkin : On Mon, Jul 31, 2017 at 09:54:55PM +0300, Alexander Bezzubikov wrote: > > 2017-07-31 17:09 GMT+03:00 Marcel Apfelbaum : >> >> On 31/07/2017 17:00, Michael S. Tsirkin wrote: >>> >>> >>> On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote: On PCI init PCI bridge devices may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with special vendor-specific PCI capability. This capability is intended to be used only for Red Hat PCI bridges, i.e. QEMU cooperation. Signed-off-by: Aleksandr Bezzubikov --- src/fw/dev-pci.h | 62 1 file changed, 62 insertions(+) create mode 100644 src/fw/dev-pci.h diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h new file mode 100644 index 000..fbd49ed --- /dev/null +++ b/src/fw/dev-pci.h @@ -0,0 +1,62 @@ +#ifndef _PCI_CAP_H +#define _PCI_CAP_H + +#include "types.h" + +/* + +QEMU-specific vendor(Red Hat)-specific capability. +It's intended to provide some hints for firmware to init PCI devices. + +Its is shown below: + +Header: + +u8 id; Standard PCI Capability Header field +u8 next; Standard PCI Capability Header field +u8 len; Standard PCI Capability Header field +u8 type; Red Hat vendor-specific capability type: + now only REDHAT_QEMU_CAP 1 exists +Data: + +u16 non_prefetchable_16; non-prefetchable memory limit + +u8 bus_res; minimum bus number to reserve; + this is necessary for PCI Express Root Ports + to support PCIE-to-PCI bridge hotplug + +u8 io_8; IO limit in case of 8-bit limit value +u32 io_32; IO limit in case of 16-bit limit value + io_8 and io_16 are mutually exclusive, in other words, + they can't be non-zero simultaneously + +u32 prefetchable_32; non-prefetchable memory limit + in case of 32-bit limit value +u64 prefetchable_64; non-prefetchable memory limit + in case of 64-bit limit value + prefetachable_32 and prefetchable_64 are + mutually exclusive, in other words, + they can't be non-zero simultaneously +If any field in Data section is 0, +it means that such kind of reservation +is not needed. > > > I really don't like this 'mutually exclusive' fields approach because > IMHO it increases confusion level when undertanding this capability > structure. > But - if we came to consensus on that, then IO fields should be used > in the same way, > because as I understand, this 'mutual exclusivity' serves to distinguish > cases > when we employ only *_LIMIT register and both *_LIMIT an UPPER_*_LIMIT > registers. > And this is how both IO and PREFETCHABLE works, isn't it? I would just use simeple 64 bit registers. PCI spec has an ugly format with fields spread all over the place but that is because of compatibility concerns. It makes not sense to spend cycles just to be similarly messy. >>> >>> >>> Then I suggest to use exactly one field of a maximum possible size >>> for each reserving object, and get rid of mutually exclusive fields. >>> Then it can be something like that (order and names can be changed): >>> u8 bus; >>> u16 non_pref; >>> u32 io; >>> u64 pref; >>> >> >> I think Michael suggested: >> >> u64 bus_res; >> u64 mem_res; >> u64 io_res; >> u64 mem_pref_res; >> >> OR: >> u32 bus_res; >> u32 mem_res; >> u32 io_res; >> u64 mem_pref_res; >> >> >> We can use 0XFFF..F as "not-set" value "merging" Gerd's and Michael's >> requests. > > Let's dwell on the second option (with -1 as 'ignore' sign), if no new > objections. > BTW, talking about limit values provided in the capability - do we want to completely override existing PCI resources allocation mechanism being used in SeaBIOS, I mean, to assign capability values hardly, not taking into consideration any existing checks, or somehow make this process soft (not an obvious way, can lead to an
Re: [Qemu-devel] [PATCH 2/8] cpu: Define new cpu_transaction_failed() hook
On 08/04/2017 10:20 AM, Peter Maydell wrote: > Currently we have a rather half-baked setup for allowing CPUs to > generate exceptions on accesses to invalid memory: the CPU has a > cpu_unassigned_access() hook which the memory system calls in > unassigned_mem_write() and unassigned_mem_read() if the current_cpu > pointer is non-NULL. This was originally designed before we > implemented the MemTxResult type that allows memory operations to > report a success or failure code, which is why the hook is called > right at the bottom of the memory system. The major problem with > this is that it means that the hook can be called even when the > access was not actually done by the CPU: for instance if the CPU > writes to a DMA engine register which causes the DMA engine to begin > a transaction which has been set up by the guest to operate on > invalid memory then this will casue the CPU to take an exception > incorrectly. Another minor problem is that currently if a device > returns a transaction error then this won't turn into a CPU exception > at all. > > The right way to do this is to have allow the CPU to respond > to memory system transaction failures at the point where the > CPU specific code calls into the memory system. > > Define a new QOM CPU method and utility function > cpu_transaction_failed() which is called in these cases. > The functionality here overlaps with the existing > cpu_unassigned_access() because individual target CPUs will > need some work to convert them to the new system. When this > transition is complete we can remove the old cpu_unassigned_access() > code. > > Signed-off-by: Peter Maydell > --- > include/qom/cpu.h | 21 + > 1 file changed, 21 insertions(+) Reviewed-by: Richard Henderson r~
Re: [Qemu-devel] [PATCH v8 0/5] hypertrace: Lightweight guest-to-QEMU trace channel
Stefan Hajnoczi writes: > On Sun, Jul 30, 2017 at 05:08:18PM +0300, LluÃs Vilanova wrote: >> The hypertrace channel allows guest code to emit events in QEMU (the host) >> using >> its tracing infrastructure (see "docs/trace.txt"). This works in both >> 'system' >> and 'user' modes, is architecture-agnostic and introduces minimal noise on >> the >> guest. >> >> See first commit for a full description, use-cases and an example. >> >> Signed-off-by: LluÃs Vilanova >> --- >> >> Changes in v8 >> = >> >> * Do not use 'seq' when there's no extra hypertrace arguments (BSD behaves >> differently for "seq 0"). >> * Fix compilation for bsd-user. > Hi LluÃs, > Any changes regarding the fundamental approach since September 2016? > Back then I had the following concerns about hypertrace for full-system > virtualization: > Going to QEMU for every guest trace event has high overhead under > virtualization. The alternative approach is recording separate traces > and merging them for analysis. This is possible with trace-cmd [1] and > LTTng [2]. > Merging traces eliminates the performance bottleneck and does not > require new paravirt interfaces or guest tracing libraries. I think it > it would be a distraction to support hypertrace for the virtualization > use case because it fundamentally has a high overhead. > I see promise in using hypertrace for TCG because it is low-overhead > when running inside the QEMU process. I'll review the patches again > with this in mind and not focus on virtualization. > [1] https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg00887.html > [2] > http://archive.eclipse.org/tracecompass/doc/stable/org.eclipse.tracecompass.doc.user/Virtual-Machine-Analysis.html > and also generic trace synchronization > > http://archive.eclipse.org/tracecompass/doc/stable/org.eclipse.tracecompass.doc.user/Trace-synchronization.html#Trace_synchronization There's been no fundamental changes since then (just the few bits listed in the v5-v8 changelog). But I'm kind of split on this one. If you want high-performance trace correlation, this will work much better for TCG than virtualization (where [1] will probably be more efficient). If you want a hook to trigger other operations (like in the docs example), I think this is the right approach. In fact, my initial interest in hypertrace was for instrumentation, so maybe this should be subsumed into the proposal of a stable instrumentation API. What do you think? Cheers, Lluis > Stefan >> Changes in v7 >> = >> >> * Use 'expr' instead of assuming 'bash' when generating the "emit.c" file. >> * Restore generation of trace-events-all. >> >> >> Changes in v6 >> = >> >> * Fix compilation errors. >> >> >> Changes in v5 >> = >> >> * Rebase on 5a477a7806. >> * Fix typo in "bsd-user/main.c" [Stephan Hajnoczi]. >> * Replace abort() with exit() in command-line errors [Stephan Hajnoczi]. >> * Fix alignment of data and control channels [Stephan Hajnoczi]. >> * Fix signal reflection in user-mode (SIGINT, SIGABRT, SIGSEGV) [Stephan >> Hajnoczi]. >> * Clarify semantics of hypertrace_guest_mmap_check() [Stephan Hajnoczi]. >> * Use uintptr_t instead of unsigned long in SEGV handler [Stephan Hajnoczi]. >> * Emit hypertrace's event with host-endian arguments [Stephan Hajnoczi]. >> * Enable true concurrency between user-mode guest threads by using a >> spearate control channel page per client [Stephan Hajnoczi]. >> * Remove unused PAGE_SIZE define [Stephan Hajnoczi]. >> * Submit linux kernel API module separately to Linux upstream [Stephan >> Hajnoczi]. >> * Assume guest code events are always enabled. >> >> >> Changes in v4 >> = >> >> * Fix typo in stap script example. >> * Fix compilation instructions in doc/hypertrace.txt. >> * Rebase on 0737f32daf. >> >> >> Changes in v3 >> = >> >> * Rebase on 4a58f35. >> * Remove debugging printf's. >> * Fix style issues identified by checkpatch. >> * Fix control channel mapping in guest linux module. >> * Add a short event description in "trace-events". >> * Polish documentation in 1st patch. >> >> >> Changes in v2 >> = >> >> * Remove unnecessary casts for g2h() [Eric Blake]. >> * Use perror() [Eric Blake]. >> * Avoid expansions in application example [Eric Blake]. >> * Add copyright in document "hypertrace.txt" [Eric Blake]. >> * Make the user-mode hypertrace invocations thread-safe [Stefan Hajnoczi]. >> * Split dynamic hypertrace configuration into a separate "config" channel. >> >> LluÃs Vilanova (5): >> hypertrace: Add documentation >> hypertrace: Add tracing event "guest_hypertrace" >> hypertrace: [*-user] Add QEMU-side proxy to "guest_hypertrace" event >> hypertrace: [softmmu] Add QEMU-side proxy to "guest_hypertrace" event >> hypertrace: Add guest-side user-level library >> >> >> Makefile | 11 + >> Makefile.objs |6 + >> bsd-user/main.c
Re: [Qemu-devel] [Qemu-arm] [PATCH 4/8] boards.h: Define new flag ignore_memory_transaction_failures
On 08/04/2017 02:20 PM, Peter Maydell wrote: Define a new MachineClass field ignore_memory_transaction_failures. If this is flag is true then the CPU will ignore memory transaction failures which should cause the CPU to take an exception due to an access to an unassigned physical address; the transaction will instead return zero (for a read) or be ignored (for a write). This should be set only by legacy board models which rely on the old RAZ/WI behaviour for handling devices that QEMU does not yet model. New board models should instead use "unimplemented-device" for all memory ranges where the guest will attempt to probe for a device that QEMU doesn't implement and a stub device is required. This is a very good idea. At least it will help understanding why not all firmwares compiled for the same board can boot. Since create_unimplemented_device() register overlapped with low priority, why not register it as default device directly, over the whole address space? We need this for ARM boards, where we're about to implement support for generating external aborts on memory transaction failures. Too many of our legacy board models rely on the RAZ/WI behaviour and we would break currently working guests when their "probe for device" code provoked an external abort rather than a RAZ. I think some firmware will give some surprises, those probing device is not here and expect RAZ/WI. I remember some fw probing PCI space, or enumerating CS this way for ex. RAZ/WI is a bus-feature, this is also bus-dependent to reply with abort or behave RAZ/WI. Maybe the effort should be done on how model/use buses in QEMU? Bus device would be an alias of unimplemented_device, which current purpose is more debugging than avoiding unassigned physical access aborts. I'm pretty sure this library setup probes for unassigned access installing an handler and checking it got hit, in this case (ab)using unimplemented_device would prevent this firmware to boot: http://www.ti.com/ww/en/functional_safety/safeti/index.html (I might have self-answered my first question) Signed-off-by: Peter Maydell --- include/hw/boards.h | 11 +++ include/qom/cpu.h | 7 ++- qom/cpu.c | 7 +++ 3 files changed, 24 insertions(+), 1 deletion(-) diff --git a/include/hw/boards.h b/include/hw/boards.h index 3363dd1..7f044d1 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -131,6 +131,16 @@ typedef struct { *size than the target architecture's minimum. (Attempting to create *such a CPU will fail.) Note that changing this is a migration *compatibility break for the machine. + * @ignore_memory_transaction_failures: + *If this is flag is true then the CPU will ignore memory transaction + *failures which should cause the CPU to take an exception due to an + *access to an unassigned physical address; the transaction will instead + *return zero (for a read) or be ignored (for a write). This should be + *set only by legacy board models which rely on the old RAZ/WI behaviour + *for handling devices that QEMU does not yet model. New board models + *should instead use "unimplemented-device" for all memory ranges where + *the guest will attempt to probe for a device that QEMU doesn't + *implement and a stub device is required. */ struct MachineClass { /*< private >*/ @@ -171,6 +181,7 @@ struct MachineClass { bool rom_file_has_mr; int minimum_page_bits; bool has_hotpluggable_cpus; +bool ignore_memory_transaction_failures; int numa_mem_align_shift; void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes, int nb_nodes, ram_addr_t size); diff --git a/include/qom/cpu.h b/include/qom/cpu.h index fc54d55..8cff86f 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -311,6 +311,9 @@ struct qemu_work_item; * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all changes *to @trace_dstate). * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask). + * @ignore_memory_transaction_failures: Cached copy of the MachineState + *flag of the same name: allows the board to suppress calling of the + *CPU do_transaction_failed hook function. * * State of one CPU core or thread. */ @@ -397,6 +400,8 @@ struct CPUState { */ bool throttle_thread_scheduled; +bool ignore_memory_transaction_failures; + /* Note that this is accessed at the start of every TB via a negative offset from AREG0. Leave this field at the end so as to make the (absolute value) offset as small as possible. This reduces code @@ -853,7 +858,7 @@ static inline void cpu_transaction_failed(CPUState *cpu, hwaddr physaddr, { CPUClass *cc = CPU_GET_CLASS(cpu); -if (cc->do_transaction_failed) { +if (!cpu->ignore_memory_transaction_failures && cc->do_transaction_failed)
[Qemu-devel] [PATCH f0r 2.11] runstate/migrate: Two more transitions
From: "Dr. David Alan Gilbert" There's a race if someone does a 'stop' near the end of migrate; the migration process goes through two runstates: 'finish migrate' 'postmigrate' If the user issues a 'stop' between the two we end up with invalid state transitions. Add the transitions as valid. Signed-off-by: Dr. David Alan Gilbert --- vl.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/vl.c b/vl.c index 99fcfa0442..bacb03f49d 100644 --- a/vl.c +++ b/vl.c @@ -621,6 +621,7 @@ static const RunStateTransition runstate_transitions_def[] = { { RUN_STATE_PAUSED, RUN_STATE_RUNNING }, { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE }, +{ RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE }, { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH }, { RUN_STATE_PAUSED, RUN_STATE_COLO}, @@ -633,6 +634,7 @@ static const RunStateTransition runstate_transitions_def[] = { { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE }, { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING }, +{ RUN_STATE_FINISH_MIGRATE, RUN_STATE_PAUSED }, { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE }, { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH }, { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO}, -- 2.13.3
Re: [Qemu-devel] Do we need a "virt-2.10" machine type before the 2.10 release?
On 4 August 2017 at 18:48, Dr. David Alan Gilbert wrote: > * Peter Maydell (peter.mayd...@linaro.org) wrote: >> Hi; I noticed today that the virt board doesn't have a virt-2.10 >> machine type defined. Do we need to add it before release? >> >> (I don't know if there have in fact been any changes between >> 2.9 and 2.10 that would be compatibility issues.) > > I think there's two sub questions: > a) The virt-2.9 needs to pick up HW_COMPAT_2_9 if anything > in it is relevant for ARM > > b) If anything is relevant in it for ARM then you probably > want a virt-2.10 so that you can take advantage of whatever > was changed that needed compatibility adding in HW_COMPAT_2_9 ...and also, do we as policy want to define a virt-X.Y for every X.Y release even if it happens that there are no changes between X.Y-1 and X.Y ? From a user point of view it seems a bit odd for one to be missing. thanks -- PMM
Re: [Qemu-devel] [PATCH v8 1/5] hypertrace: Add documentation
On 07/30/2017 09:12 AM, LluÃs Vilanova wrote: > Signed-off-by: LluÃs Vilanova > --- > docs/devel/tracing.txt |3 + > docs/hypertrace.txt| 225 > Do we want the new document directly in docs/, or does it live in one of the subdirectories? > 2 files changed, 228 insertions(+) > create mode 100644 docs/hypertrace.txt > > diff --git a/docs/devel/tracing.txt b/docs/devel/tracing.txt > index 5768a0b7a2..9178a308da 100644 > --- a/docs/devel/tracing.txt > +++ b/docs/devel/tracing.txt > @@ -5,6 +5,9 @@ > This document describes the tracing infrastructure in QEMU and how to use it > for debugging, profiling, and observing execution. > > +See "docs/hypertrace.txt" to correlate guest tracing events with those in the > +QEMU host. Depending on where it should live, this may need a tweak. > + > +The hypertrace channel allows guest code to emit events in QEMU (the host) > using > +its tracing infrastructure (see "docs/trace.txt"). This works in both > 'system' Stale filename; needs adjusting. > +and 'user' modes. Therefore, hypertrace is to tracing what hypercalls are to > +system calls. > + -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] Do we need a "virt-2.10" machine type before the 2.10 release?
* Peter Maydell (peter.mayd...@linaro.org) wrote: > Hi; I noticed today that the virt board doesn't have a virt-2.10 > machine type defined. Do we need to add it before release? > > (I don't know if there have in fact been any changes between > 2.9 and 2.10 that would be compatibility issues.) I think there's two sub questions: a) The virt-2.9 needs to pick up HW_COMPAT_2_9 if anything in it is relevant for ARM b) If anything is relevant in it for ARM then you probably want a virt-2.10 so that you can take advantage of whatever was changed that needed compatibility adding in HW_COMPAT_2_9 (The other thing is figuring out what we missed in HW_COMPAT_2_9, we normally miss something). Dave > thanks > -- PMM -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 1/8] memory.h: Move MemTxResult type to memattrs.h
On 08/04/2017 10:20 AM, Peter Maydell wrote: > Move the MemTxResult type to memattrs.h. We're going to want to > use it in cpu/qom.h, which doesn't want to include all of > memory.h. In practice MemTxResult and MemTxAttrs are pretty > closely linked since both are used for the new-style > read_with_attrs and write_with_attrs callbacks, so memattrs.h > is a reasonable home for this rather than creating a whole > new header file for it. > > Signed-off-by: Peter Maydell > --- > include/exec/memattrs.h | 10 ++ > include/exec/memory.h | 10 -- > 2 files changed, 10 insertions(+), 10 deletions(-) Reviewed-by: Richard Henderson r~
[Qemu-devel] Do we need a "virt-2.10" machine type before the 2.10 release?
Hi; I noticed today that the virt board doesn't have a virt-2.10 machine type defined. Do we need to add it before release? (I don't know if there have in fact been any changes between 2.9 and 2.10 that would be compatibility issues.) thanks -- PMM
[Qemu-devel] [PATCH 3/8] cputlb: Support generating CPU exceptions on memory transaction failures
Call the new cpu_transaction_failed() hook at the places where CPU generated code interacts with the memory system: io_readx() io_writex() get_page_addr_code() Any access from C code (eg via cpu_physical_memory_rw(), address_space_rw(), ld/st_*_phys()) will *not* trigger CPU exceptions via cpu_transaction_failed(). Handling for transactions failures for this kind of call should be done by using a function which returns a MemTxResult and treating the failure case appropriately in the calling code. In an ideal world we would not generate CPU exceptions for instruction fetch failures in get_page_addr_code() but instead wait until the code translation process tried a load and it failed; however that change would require too great a restructuring and redesign to attempt at this point. Signed-off-by: Peter Maydell --- softmmu_template.h | 4 ++-- accel/tcg/cputlb.c | 32 ++-- 2 files changed, 32 insertions(+), 4 deletions(-) diff --git a/softmmu_template.h b/softmmu_template.h index 4a2b665..d756329 100644 --- a/softmmu_template.h +++ b/softmmu_template.h @@ -101,7 +101,7 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState *env, uintptr_t retaddr) { CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index]; -return io_readx(env, iotlbentry, addr, retaddr, DATA_SIZE); +return io_readx(env, iotlbentry, mmu_idx, addr, retaddr, DATA_SIZE); } #endif @@ -262,7 +262,7 @@ static inline void glue(io_write, SUFFIX)(CPUArchState *env, uintptr_t retaddr) { CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index]; -return io_writex(env, iotlbentry, val, addr, retaddr, DATA_SIZE); +return io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr, DATA_SIZE); } void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val, diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 85635ae..e72415a 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -747,6 +747,7 @@ static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr) } static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry, + int mmu_idx, target_ulong addr, uintptr_t retaddr, int size) { CPUState *cpu = ENV_GET_CPU(env); @@ -754,6 +755,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry, MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs); uint64_t val; bool locked = false; +MemTxResult r; physaddr = (physaddr & TARGET_PAGE_MASK) + addr; cpu->mem_io_pc = retaddr; @@ -767,7 +769,12 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry, qemu_mutex_lock_iothread(); locked = true; } -memory_region_dispatch_read(mr, physaddr, &val, size, iotlbentry->attrs); +r = memory_region_dispatch_read(mr, physaddr, +&val, size, iotlbentry->attrs); +if (r != MEMTX_OK) { +cpu_transaction_failed(cpu, physaddr, addr, size, MMU_DATA_LOAD, + mmu_idx, iotlbentry->attrs, r, retaddr); +} if (locked) { qemu_mutex_unlock_iothread(); } @@ -776,6 +783,7 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry, } static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry, + int mmu_idx, uint64_t val, target_ulong addr, uintptr_t retaddr, int size) { @@ -783,6 +791,7 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry, hwaddr physaddr = iotlbentry->addr; MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs); bool locked = false; +MemTxResult r; physaddr = (physaddr & TARGET_PAGE_MASK) + addr; if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) { @@ -795,7 +804,12 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry, qemu_mutex_lock_iothread(); locked = true; } -memory_region_dispatch_write(mr, physaddr, val, size, iotlbentry->attrs); +r = memory_region_dispatch_write(mr, physaddr, + val, size, iotlbentry->attrs); +if (r != MEMTX_OK) { +cpu_transaction_failed(cpu, physaddr, addr, size, MMU_DATA_STORE, + mmu_idx, iotlbentry->attrs, r, retaddr); +} if (locked) { qemu_mutex_unlock_iothread(); } @@ -845,6 +859,7 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr) MemoryRegion *mr; CPUState *cpu = ENV_GET_CPU(env); CPUIOTLBEntry *iotlbentry; +hwaddr physaddr; index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1); mmu_idx = cpu_mmu_index(env, true); @@ -868,6 +883,19 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
[Qemu-devel] [PATCH 4/8] boards.h: Define new flag ignore_memory_transaction_failures
Define a new MachineClass field ignore_memory_transaction_failures. If this is flag is true then the CPU will ignore memory transaction failures which should cause the CPU to take an exception due to an access to an unassigned physical address; the transaction will instead return zero (for a read) or be ignored (for a write). This should be set only by legacy board models which rely on the old RAZ/WI behaviour for handling devices that QEMU does not yet model. New board models should instead use "unimplemented-device" for all memory ranges where the guest will attempt to probe for a device that QEMU doesn't implement and a stub device is required. We need this for ARM boards, where we're about to implement support for generating external aborts on memory transaction failures. Too many of our legacy board models rely on the RAZ/WI behaviour and we would break currently working guests when their "probe for device" code provoked an external abort rather than a RAZ. Signed-off-by: Peter Maydell --- include/hw/boards.h | 11 +++ include/qom/cpu.h | 7 ++- qom/cpu.c | 7 +++ 3 files changed, 24 insertions(+), 1 deletion(-) diff --git a/include/hw/boards.h b/include/hw/boards.h index 3363dd1..7f044d1 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -131,6 +131,16 @@ typedef struct { *size than the target architecture's minimum. (Attempting to create *such a CPU will fail.) Note that changing this is a migration *compatibility break for the machine. + * @ignore_memory_transaction_failures: + *If this is flag is true then the CPU will ignore memory transaction + *failures which should cause the CPU to take an exception due to an + *access to an unassigned physical address; the transaction will instead + *return zero (for a read) or be ignored (for a write). This should be + *set only by legacy board models which rely on the old RAZ/WI behaviour + *for handling devices that QEMU does not yet model. New board models + *should instead use "unimplemented-device" for all memory ranges where + *the guest will attempt to probe for a device that QEMU doesn't + *implement and a stub device is required. */ struct MachineClass { /*< private >*/ @@ -171,6 +181,7 @@ struct MachineClass { bool rom_file_has_mr; int minimum_page_bits; bool has_hotpluggable_cpus; +bool ignore_memory_transaction_failures; int numa_mem_align_shift; void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes, int nb_nodes, ram_addr_t size); diff --git a/include/qom/cpu.h b/include/qom/cpu.h index fc54d55..8cff86f 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -311,6 +311,9 @@ struct qemu_work_item; * @trace_dstate_delayed: Delayed changes to trace_dstate (includes all changes *to @trace_dstate). * @trace_dstate: Dynamic tracing state of events for this vCPU (bitmask). + * @ignore_memory_transaction_failures: Cached copy of the MachineState + *flag of the same name: allows the board to suppress calling of the + *CPU do_transaction_failed hook function. * * State of one CPU core or thread. */ @@ -397,6 +400,8 @@ struct CPUState { */ bool throttle_thread_scheduled; +bool ignore_memory_transaction_failures; + /* Note that this is accessed at the start of every TB via a negative offset from AREG0. Leave this field at the end so as to make the (absolute value) offset as small as possible. This reduces code @@ -853,7 +858,7 @@ static inline void cpu_transaction_failed(CPUState *cpu, hwaddr physaddr, { CPUClass *cc = CPU_GET_CLASS(cpu); -if (cc->do_transaction_failed) { +if (!cpu->ignore_memory_transaction_failures && cc->do_transaction_failed) { cc->do_transaction_failed(cpu, physaddr, addr, size, access_type, mmu_idx, attrs, response, retaddr); } diff --git a/qom/cpu.c b/qom/cpu.c index 4f38db0..d8dcf64 100644 --- a/qom/cpu.c +++ b/qom/cpu.c @@ -29,6 +29,7 @@ #include "exec/cpu-common.h" #include "qemu/error-report.h" #include "sysemu/sysemu.h" +#include "hw/boards.h" #include "hw/qdev-properties.h" #include "trace-root.h" @@ -360,6 +361,12 @@ static void cpu_common_parse_features(const char *typename, char *features, static void cpu_common_realizefn(DeviceState *dev, Error **errp) { CPUState *cpu = CPU(dev); +Object *machine = qdev_get_machine(); +ObjectClass *oc = object_get_class(machine); +MachineClass *mc = MACHINE_CLASS(oc); + +cpu->ignore_memory_transaction_failures = +mc->ignore_memory_transaction_failures; if (dev->hotplugged) { cpu_synchronize_post_init(cpu); -- 2.7.4
[Qemu-devel] [PATCH 5/8] hw/arm: Set ignore_memory_transaction_failures for most ARM boards
Set the MachineClass flag ignore_memory_transaction_failures for almost all ARM boards. This means they retain the legacy behaviour that accesses to unimplemented addresses will RAZ/WI rather than aborting, when a subsequent commit adds support for external aborts. The exceptions are: * virt -- we know that guests won't try to prod devices that we don't describe in the device tree or ACPI tables * mps2 -- this board was written to use unimplemented-device for all the ranges with devices we don't yet handle New boards should not set the flag, but instead be written like the mps2. Signed-off-by: Peter Maydell --- hw/arm/aspeed.c | 3 +++ hw/arm/collie.c | 1 + hw/arm/cubieboard.c | 1 + hw/arm/digic_boards.c | 1 + hw/arm/exynos4_boards.c | 2 ++ hw/arm/gumstix.c| 2 ++ hw/arm/highbank.c | 2 ++ hw/arm/imx25_pdk.c | 1 + hw/arm/integratorcp.c | 1 + hw/arm/kzm.c| 1 + hw/arm/mainstone.c | 1 + hw/arm/musicpal.c | 1 + hw/arm/netduino2.c | 1 + hw/arm/nseries.c| 2 ++ hw/arm/omap_sx1.c | 2 ++ hw/arm/palm.c | 1 + hw/arm/raspi.c | 1 + hw/arm/realview.c | 4 hw/arm/sabrelite.c | 1 + hw/arm/spitz.c | 4 hw/arm/stellaris.c | 2 ++ hw/arm/tosa.c | 1 + hw/arm/versatilepb.c| 2 ++ hw/arm/vexpress.c | 1 + hw/arm/xilinx_zynq.c| 1 + hw/arm/xlnx-ep108.c | 2 ++ hw/arm/z2.c | 1 + 27 files changed, 43 insertions(+) diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c index 0c5635f..ab895ad 100644 --- a/hw/arm/aspeed.c +++ b/hw/arm/aspeed.c @@ -270,6 +270,7 @@ static void palmetto_bmc_class_init(ObjectClass *oc, void *data) mc->no_floppy = 1; mc->no_cdrom = 1; mc->no_parallel = 1; +mc->ignore_memory_transaction_failures = true; } static const TypeInfo palmetto_bmc_type = { @@ -302,6 +303,7 @@ static void ast2500_evb_class_init(ObjectClass *oc, void *data) mc->no_floppy = 1; mc->no_cdrom = 1; mc->no_parallel = 1; +mc->ignore_memory_transaction_failures = true; } static const TypeInfo ast2500_evb_type = { @@ -326,6 +328,7 @@ static void romulus_bmc_class_init(ObjectClass *oc, void *data) mc->no_floppy = 1; mc->no_cdrom = 1; mc->no_parallel = 1; +mc->ignore_memory_transaction_failures = true; } static const TypeInfo romulus_bmc_type = { diff --git a/hw/arm/collie.c b/hw/arm/collie.c index 2e69531..8830192 100644 --- a/hw/arm/collie.c +++ b/hw/arm/collie.c @@ -64,6 +64,7 @@ static void collie_machine_init(MachineClass *mc) { mc->desc = "Sharp SL-5500 (Collie) PDA (SA-1110)"; mc->init = collie_init; +mc->ignore_memory_transaction_failures = true; } DEFINE_MACHINE("collie", collie_machine_init) diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c index b98e1c4..32f1edd 100644 --- a/hw/arm/cubieboard.c +++ b/hw/arm/cubieboard.c @@ -86,6 +86,7 @@ static void cubieboard_machine_init(MachineClass *mc) mc->init = cubieboard_init; mc->block_default_type = IF_IDE; mc->units_per_default_bus = 1; +mc->ignore_memory_transaction_failures = true; } DEFINE_MACHINE("cubieboard", cubieboard_machine_init) diff --git a/hw/arm/digic_boards.c b/hw/arm/digic_boards.c index 520c8e9..9f11dcd 100644 --- a/hw/arm/digic_boards.c +++ b/hw/arm/digic_boards.c @@ -155,6 +155,7 @@ static void canon_a1100_machine_init(MachineClass *mc) { mc->desc = "Canon PowerShot A1100 IS"; mc->init = &canon_a1100_init; +mc->ignore_memory_transaction_failures = true; } DEFINE_MACHINE("canon-a1100", canon_a1100_machine_init) diff --git a/hw/arm/exynos4_boards.c b/hw/arm/exynos4_boards.c index 7c03ed3..f1441ec 100644 --- a/hw/arm/exynos4_boards.c +++ b/hw/arm/exynos4_boards.c @@ -189,6 +189,7 @@ static void nuri_class_init(ObjectClass *oc, void *data) mc->desc = "Samsung NURI board (Exynos4210)"; mc->init = nuri_init; mc->max_cpus = EXYNOS4210_NCPUS; +mc->ignore_memory_transaction_failures = true; } static const TypeInfo nuri_type = { @@ -204,6 +205,7 @@ static void smdkc210_class_init(ObjectClass *oc, void *data) mc->desc = "Samsung SMDKC210 board (Exynos4210)"; mc->init = smdkc210_init; mc->max_cpus = EXYNOS4210_NCPUS; +mc->ignore_memory_transaction_failures = true; } static const TypeInfo smdkc210_type = { diff --git a/hw/arm/gumstix.c b/hw/arm/gumstix.c index d59d9ba..092ce36 100644 --- a/hw/arm/gumstix.c +++ b/hw/arm/gumstix.c @@ -128,6 +128,7 @@ static void connex_class_init(ObjectClass *oc, void *data) mc->desc = "Gumstix Connex (PXA255)"; mc->init = connex_init; +mc->ignore_memory_transaction_failures = true; } static const TypeInfo connex_type = { @@ -142,6 +143,7 @@ static void verdex_class_init(ObjectClass *oc, void *data) mc->desc = "Gumstix Verdex (PXA270)"; mc->init = verdex_init; +mc->ignore_memory_transaction_failures = true; } static const TypeInfo ver
[Qemu-devel] [PATCH 2/8] cpu: Define new cpu_transaction_failed() hook
Currently we have a rather half-baked setup for allowing CPUs to generate exceptions on accesses to invalid memory: the CPU has a cpu_unassigned_access() hook which the memory system calls in unassigned_mem_write() and unassigned_mem_read() if the current_cpu pointer is non-NULL. This was originally designed before we implemented the MemTxResult type that allows memory operations to report a success or failure code, which is why the hook is called right at the bottom of the memory system. The major problem with this is that it means that the hook can be called even when the access was not actually done by the CPU: for instance if the CPU writes to a DMA engine register which causes the DMA engine to begin a transaction which has been set up by the guest to operate on invalid memory then this will casue the CPU to take an exception incorrectly. Another minor problem is that currently if a device returns a transaction error then this won't turn into a CPU exception at all. The right way to do this is to have allow the CPU to respond to memory system transaction failures at the point where the CPU specific code calls into the memory system. Define a new QOM CPU method and utility function cpu_transaction_failed() which is called in these cases. The functionality here overlaps with the existing cpu_unassigned_access() because individual target CPUs will need some work to convert them to the new system. When this transition is complete we can remove the old cpu_unassigned_access() code. Signed-off-by: Peter Maydell --- include/qom/cpu.h | 21 + 1 file changed, 21 insertions(+) diff --git a/include/qom/cpu.h b/include/qom/cpu.h index 25eefea..fc54d55 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -85,8 +85,10 @@ struct TranslationBlock; * @has_work: Callback for checking if there is work to do. * @do_interrupt: Callback for interrupt handling. * @do_unassigned_access: Callback for unassigned access handling. + * (this is deprecated: new targets should use do_transaction_failed instead) * @do_unaligned_access: Callback for unaligned access handling, if * the target defines #ALIGNED_ONLY. + * @do_transaction_failed: Callback for handling failed memory transactions * @virtio_is_big_endian: Callback to return %true if a CPU which supports * runtime configurable endianness is currently big-endian. Non-configurable * CPUs can use the default implementation of this method. This method should @@ -153,6 +155,10 @@ typedef struct CPUClass { void (*do_unaligned_access)(CPUState *cpu, vaddr addr, MMUAccessType access_type, int mmu_idx, uintptr_t retaddr); +void (*do_transaction_failed)(CPUState *cpu, hwaddr physaddr, vaddr addr, + unsigned size, MMUAccessType access_type, + int mmu_idx, MemTxAttrs attrs, + MemTxResult response, uintptr_t retaddr); bool (*virtio_is_big_endian)(CPUState *cpu); int (*memory_rw_debug)(CPUState *cpu, vaddr addr, uint8_t *buf, int len, bool is_write); @@ -837,6 +843,21 @@ static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr, cc->do_unaligned_access(cpu, addr, access_type, mmu_idx, retaddr); } + +static inline void cpu_transaction_failed(CPUState *cpu, hwaddr physaddr, + vaddr addr, unsigned size, + MMUAccessType access_type, + int mmu_idx, MemTxAttrs attrs, + MemTxResult response, + uintptr_t retaddr) +{ +CPUClass *cc = CPU_GET_CLASS(cpu); + +if (cc->do_transaction_failed) { +cc->do_transaction_failed(cpu, physaddr, addr, size, access_type, + mmu_idx, attrs, response, retaddr); +} +} #endif #endif /* NEED_CPU_H */ -- 2.7.4
[Qemu-devel] [PATCH 1/8] memory.h: Move MemTxResult type to memattrs.h
Move the MemTxResult type to memattrs.h. We're going to want to use it in cpu/qom.h, which doesn't want to include all of memory.h. In practice MemTxResult and MemTxAttrs are pretty closely linked since both are used for the new-style read_with_attrs and write_with_attrs callbacks, so memattrs.h is a reasonable home for this rather than creating a whole new header file for it. Signed-off-by: Peter Maydell --- include/exec/memattrs.h | 10 ++ include/exec/memory.h | 10 -- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/exec/memattrs.h b/include/exec/memattrs.h index e601061..d4a1642 100644 --- a/include/exec/memattrs.h +++ b/include/exec/memattrs.h @@ -46,4 +46,14 @@ typedef struct MemTxAttrs { */ #define MEMTXATTRS_UNSPECIFIED ((MemTxAttrs) { .unspecified = 1 }) +/* New-style MMIO accessors can indicate that the transaction failed. + * A zero (MEMTX_OK) response means success; anything else is a failure + * of some kind. The memory subsystem will bitwise-OR together results + * if it is synthesizing an operation from multiple smaller accesses. + */ +#define MEMTX_OK 0 +#define MEMTX_ERROR (1U << 0) /* device returned an error */ +#define MEMTX_DECODE_ERROR (1U << 1) /* nothing at that address */ +typedef uint32_t MemTxResult; + #endif diff --git a/include/exec/memory.h b/include/exec/memory.h index 400dd44..1dcd312 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -112,16 +112,6 @@ static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn, n->end = end; } -/* New-style MMIO accessors can indicate that the transaction failed. - * A zero (MEMTX_OK) response means success; anything else is a failure - * of some kind. The memory subsystem will bitwise-OR together results - * if it is synthesizing an operation from multiple smaller accesses. - */ -#define MEMTX_OK 0 -#define MEMTX_ERROR (1U << 0) /* device returned an error */ -#define MEMTX_DECODE_ERROR (1U << 1) /* nothing at that address */ -typedef uint32_t MemTxResult; - /* * Memory region callbacks */ -- 2.7.4
[Qemu-devel] [PATCH 6/8] target/arm: Factor out fault delivery code
We currently have some similar code in tlb_fill() and in arm_cpu_do_unaligned_access() for delivering a data abort or prefetch abort. We're also going to want to do the same thing to handle external aborts. Factor out the common code into a new function deliver_fault(). Signed-off-by: Peter Maydell --- target/arm/op_helper.c | 110 + 1 file changed, 57 insertions(+), 53 deletions(-) diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c index 2a85666..aa52a98 100644 --- a/target/arm/op_helper.c +++ b/target/arm/op_helper.c @@ -115,6 +115,51 @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn, return syn; } +static void deliver_fault(ARMCPU *cpu, vaddr addr, MMUAccessType access_type, + uint32_t fsr, uint32_t fsc, ARMMMUFaultInfo *fi) +{ +CPUARMState *env = &cpu->env; +int target_el; +bool same_el; +uint32_t syn, exc; + +target_el = exception_target_el(env); +if (fi->stage2) { +target_el = 2; +env->cp15.hpfar_el2 = extract64(fi->s2addr, 12, 47) << 4; +} +same_el = (arm_current_el(env) == target_el); + +if (fsc == 0x3f) { +/* Caller doesn't have a long-format fault status code. This + * should only happen if this fault will never actually be reported + * to an EL that uses a syndrome register. Check that here. + * 0x3f is a (currently) reserved FSR code, in case the constructed + * syndrome does leak into the guest somehow. + */ +assert(target_el != 2 && !arm_el_is_aa64(env, target_el)); +} + +if (access_type == MMU_INST_FETCH) { +syn = syn_insn_abort(same_el, 0, fi->s1ptw, fsc); +exc = EXCP_PREFETCH_ABORT; +} else { +syn = merge_syn_data_abort(env->exception.syndrome, target_el, + same_el, fi->s1ptw, + access_type == MMU_DATA_STORE, + fsc); +if (access_type == MMU_DATA_STORE +&& arm_feature(env, ARM_FEATURE_V6)) { +fsr |= (1 << 11); +} +exc = EXCP_DATA_ABORT; +} + +env->exception.vaddress = addr; +env->exception.fsr = fsr; +raise_exception(env, exc, syn, target_el); +} + /* try to fill the TLB and return an exception if error. If retaddr is * NULL, it means that the function was called in C code (i.e. not * from generated code or from helper.c) @@ -129,23 +174,13 @@ void tlb_fill(CPUState *cs, target_ulong addr, MMUAccessType access_type, ret = arm_tlb_fill(cs, addr, access_type, mmu_idx, &fsr, &fi); if (unlikely(ret)) { ARMCPU *cpu = ARM_CPU(cs); -CPUARMState *env = &cpu->env; -uint32_t syn, exc, fsc; -unsigned int target_el; -bool same_el; +uint32_t fsc; if (retaddr) { /* now we have a real cpu fault */ cpu_restore_state(cs, retaddr); } -target_el = exception_target_el(env); -if (fi.stage2) { -target_el = 2; -env->cp15.hpfar_el2 = extract64(fi.s2addr, 12, 47) << 4; -} -same_el = arm_current_el(env) == target_el; - if (fsr & (1 << 9)) { /* LPAE format fault status register : bottom 6 bits are * status code in the same form as needed for syndrome @@ -153,34 +188,15 @@ void tlb_fill(CPUState *cs, target_ulong addr, MMUAccessType access_type, fsc = extract32(fsr, 0, 6); } else { /* Short format FSR : this fault will never actually be reported - * to an EL that uses a syndrome register. Check that here, - * and use a (currently) reserved FSR code in case the constructed - * syndrome does leak into the guest somehow. + * to an EL that uses a syndrome register. Use a (currently) + * reserved FSR code in case the constructed syndrome does leak + * into the guest somehow. deliver_fault will assert that + * we don't target an EL using the syndrome. */ -assert(target_el != 2 && !arm_el_is_aa64(env, target_el)); fsc = 0x3f; } -/* For insn and data aborts we assume there is no instruction syndrome - * information; this is always true for exceptions reported to EL1. - */ -if (access_type == MMU_INST_FETCH) { -syn = syn_insn_abort(same_el, 0, fi.s1ptw, fsc); -exc = EXCP_PREFETCH_ABORT; -} else { -syn = merge_syn_data_abort(env->exception.syndrome, target_el, - same_el, fi.s1ptw, - access_type == MMU_DATA_STORE, fsc); -if (access_type == MMU_DATA_STORE -&& arm_feature(env, ARM_FEATURE_V6)) { -fsr |= (1 << 11); -} -exc = EXCP
[Qemu-devel] [PATCH 8/8] target/arm: Implement new do_transaction_failed hook
Implement the new do_transaction_failed hook for ARM, which should cause the CPU to take a prefetch abort or data abort. Signed-off-by: Peter Maydell --- target/arm/internals.h | 10 ++ target/arm/cpu.c | 1 + target/arm/op_helper.c | 43 +++ 3 files changed, 54 insertions(+) diff --git a/target/arm/internals.h b/target/arm/internals.h index a3adbd8..13bb001 100644 --- a/target/arm/internals.h +++ b/target/arm/internals.h @@ -471,6 +471,16 @@ void arm_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr, MMUAccessType access_type, int mmu_idx, uintptr_t retaddr); +/* arm_cpu_do_transaction_failed: handle a memory system error response + * (eg "no device/memory present at address") by raising an external abort + * exception + */ +void arm_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr, + vaddr addr, unsigned size, + MMUAccessType access_type, + int mmu_idx, MemTxAttrs attrs, + MemTxResult response, uintptr_t retaddr); + /* Call the EL change hook if one has been registered */ static inline void arm_call_el_change_hook(ARMCPU *cpu) { diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 05c038b..6baede0 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1670,6 +1670,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data) #else cc->do_interrupt = arm_cpu_do_interrupt; cc->do_unaligned_access = arm_cpu_do_unaligned_access; +cc->do_transaction_failed = arm_cpu_do_transaction_failed; cc->get_phys_page_attrs_debug = arm_cpu_get_phys_page_attrs_debug; cc->asidx_from_attrs = arm_asidx_from_attrs; cc->vmsd = &vmstate_arm_cpu; diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c index 7eac272..54b6dd8 100644 --- a/target/arm/op_helper.c +++ b/target/arm/op_helper.c @@ -229,6 +229,49 @@ void arm_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr, deliver_fault(cpu, vaddr, access_type, fsr, fsc, &fi); } +/* arm_cpu_do_transaction_failed: handle a memory system error response + * (eg "no device/memory present at address") by raising an external abort + * exception + */ +void arm_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr, + vaddr addr, unsigned size, + MMUAccessType access_type, + int mmu_idx, MemTxAttrs attrs, + MemTxResult response, uintptr_t retaddr) +{ +ARMCPU *cpu = ARM_CPU(cs); +CPUARMState *env = &cpu->env; +uint32_t fsr, fsc; +ARMMMUFaultInfo fi = {}; +ARMMMUIdx arm_mmu_idx = core_to_arm_mmu_idx(env, mmu_idx); + +if (retaddr) { +/* now we have a real cpu fault */ +cpu_restore_state(cs, retaddr); +} + +/* The EA bit in syndromes and fault status registers is an + * IMPDEF classification of external aborts. ARM implementations + * usually use this to indicate AXI bus Decode error (0) or + * Slave error (1); in QEMU we follow that. + */ +fi.ea = (response != MEMTX_DECODE_ERROR); + +/* The fault status register format depends on whether we're using + * the LPAE long descriptor format, or the short descriptor format. + */ +if (arm_s1_regime_using_lpae_format(env, arm_mmu_idx)) { +/* long descriptor form, STATUS 0b01: synchronous ext abort */ +fsr = (fi.ea << 12) | (1 << 9) | 0x10; +} else { +/* short descriptor form, FSR 0b01000 : synchronous ext abort */ +fsr = (fi.ea << 12) | 0x8; +} +fsc = 0x10; + +deliver_fault(cpu, addr, access_type, fsr, fsc, &fi); +} + #endif /* !defined(CONFIG_USER_ONLY) */ uint32_t HELPER(add_setq)(CPUARMState *env, uint32_t a, uint32_t b) -- 2.7.4
[Qemu-devel] [PATCH 7/8] target/arm: Allow deliver_fault() caller to specify EA bit
For external aborts, we will want to be able to specify the EA (external abort type) bit in the syndrome field. Allow callers of deliver_fault() to do that by adding a field to ARMMMUFaultInfo which we use when constructing the syndrome values. Signed-off-by: Peter Maydell --- target/arm/internals.h | 2 ++ target/arm/op_helper.c | 10 +- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/target/arm/internals.h b/target/arm/internals.h index 1f6efef..a3adbd8 100644 --- a/target/arm/internals.h +++ b/target/arm/internals.h @@ -448,12 +448,14 @@ void arm_handle_psci_call(ARMCPU *cpu); * @s2addr: Address that caused a fault at stage 2 * @stage2: True if we faulted at stage 2 * @s1ptw: True if we faulted at stage 2 while doing a stage 1 page-table walk + * @ea: True if we should set the EA (external abort type) bit in syndrome */ typedef struct ARMMMUFaultInfo ARMMMUFaultInfo; struct ARMMMUFaultInfo { target_ulong s2addr; bool stage2; bool s1ptw; +bool ea; }; /* Do a page table walk and add page to TLB if possible */ diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c index aa52a98..7eac272 100644 --- a/target/arm/op_helper.c +++ b/target/arm/op_helper.c @@ -80,7 +80,7 @@ uint32_t HELPER(neon_tbl)(CPUARMState *env, uint32_t ireg, uint32_t def, static inline uint32_t merge_syn_data_abort(uint32_t template_syn, unsigned int target_el, -bool same_el, +bool same_el, bool ea, bool s1ptw, bool is_write, int fsc) { @@ -99,7 +99,7 @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn, */ if (!(template_syn & ARM_EL_ISV) || target_el != 2 || s1ptw) { syn = syn_data_abort_no_iss(same_el, -0, 0, s1ptw, is_write, fsc); +ea, 0, s1ptw, is_write, fsc); } else { /* Fields: IL, ISV, SAS, SSE, SRT, SF and AR come from the template * syndrome created at translation time. @@ -107,7 +107,7 @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn, */ syn = syn_data_abort_with_iss(same_el, 0, 0, 0, 0, 0, - 0, 0, s1ptw, is_write, fsc, + ea, 0, s1ptw, is_write, fsc, false); /* Merge the runtime syndrome with the template syndrome. */ syn |= template_syn; @@ -141,11 +141,11 @@ static void deliver_fault(ARMCPU *cpu, vaddr addr, MMUAccessType access_type, } if (access_type == MMU_INST_FETCH) { -syn = syn_insn_abort(same_el, 0, fi->s1ptw, fsc); +syn = syn_insn_abort(same_el, fi->ea, fi->s1ptw, fsc); exc = EXCP_PREFETCH_ABORT; } else { syn = merge_syn_data_abort(env->exception.syndrome, target_el, - same_el, fi->s1ptw, + same_el, fi->ea, fi->s1ptw, access_type == MMU_DATA_STORE, fsc); if (access_type == MMU_DATA_STORE -- 2.7.4
[Qemu-devel] [PATCH 0/8] Implement ARM external abort handling
Following recent list discussion https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00063.html here's a patchseries which defines a new API for handling CPU memory transaction failures at the right level in the memory subsystem code, and implements it for ARM so that we can generate prefetch abort and data abort exceptions for external aborts. The first 3 patches here implement the core code support for the new cpu_transaction_failed_hook. The next 2 patches add support for turning it off on a per-board basis, and use that to go back to RAZ/WI on the legacy ARM board models, because right now we rely on that so that guest code doesn't blow up when it touches a device we don't have a model for yet. (We leave the support enabled for 'virt' and 'mps2'.) Finally the last 3 patches do some cleanup and then implement the new hook for ARM. (I have not as yet audited the target/arm code to check that we correctly handle failed transactions in all the places that C code does physical address accesses; but since those lookups don't generate exceptions today, the series leaves behaviour there no worse off than they were before.) thanks -- PMM Peter Maydell (8): memory.h: Move MemTxResult type to memattrs.h cpu: Define new cpu_transaction_failed() hook cputlb: Support generating CPU exceptions on memory transaction failures boards.h: Define new flag ignore_memory_transaction_failures hw/arm: Set ignore_memory_transaction_failures for most ARM boards target/arm: Factor out fault delivery code target/arm: Allow deliver_fault() caller to specify EA bit target/arm: Implement new do_transaction_failed hook include/exec/memattrs.h | 10 include/exec/memory.h | 10 include/hw/boards.h | 11 include/qom/cpu.h | 26 softmmu_template.h | 4 +- target/arm/internals.h | 12 accel/tcg/cputlb.c | 32 +- hw/arm/aspeed.c | 3 + hw/arm/collie.c | 1 + hw/arm/cubieboard.c | 1 + hw/arm/digic_boards.c | 1 + hw/arm/exynos4_boards.c | 2 + hw/arm/gumstix.c| 2 + hw/arm/highbank.c | 2 + hw/arm/imx25_pdk.c | 1 + hw/arm/integratorcp.c | 1 + hw/arm/kzm.c| 1 + hw/arm/mainstone.c | 1 + hw/arm/musicpal.c | 1 + hw/arm/netduino2.c | 1 + hw/arm/nseries.c| 2 + hw/arm/omap_sx1.c | 2 + hw/arm/palm.c | 1 + hw/arm/raspi.c | 1 + hw/arm/realview.c | 4 ++ hw/arm/sabrelite.c | 1 + hw/arm/spitz.c | 4 ++ hw/arm/stellaris.c | 2 + hw/arm/tosa.c | 1 + hw/arm/versatilepb.c| 2 + hw/arm/vexpress.c | 1 + hw/arm/xilinx_zynq.c| 1 + hw/arm/xlnx-ep108.c | 2 + hw/arm/z2.c | 1 + qom/cpu.c | 7 +++ target/arm/cpu.c| 1 + target/arm/op_helper.c | 155 +++- 37 files changed, 243 insertions(+), 68 deletions(-) -- 2.7.4
Re: [Qemu-devel] [PATCH 1/2] rcu: completely disable pthread_atfork callbacks as soon as possible
* Paolo Bonzini (pbonz...@redhat.com) wrote: > Because of -daemonize, system mode QEMU sometimes needs to fork() and > keep RCU enabled in the child. However, there is a possible deadlock > with synchronize_rcu: > > - the CPU thread is inside a RCU critical section and wants to take > the BQL in order to do MMIO > > - the monitor thread, which is owning the BQL, calls rcu_init_lock > which tries to take the rcu_sync_lock > > - the call_rcu thread has taken rcu_sync_lock in synchronize_rcu, but > synchronize_rcu needs the CPU thread to end the critical section > before returning. > > This cannot happen for user-mode emulation, because it does not have > a BQL. > > To fix it, assume that system mode QEMU only forks in preparation for > exec (except when daemonizing) and disable pthread_atfork as soon as > the double fork has happened. > > Reported-by: David Gilbert > Signed-off-by: Paolo Bonzini That seems to fix that issue (not tested it any more than that though), so for fixing that issue: Tested-by: Dr. David Alan Gilbert > --- > include/qemu/rcu.h | 6 ++ > util/rcu.c | 20 > vl.c | 1 + > 3 files changed, 27 insertions(+) > > diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h > index 83ae2808be..c0da9907e8 100644 > --- a/include/qemu/rcu.h > +++ b/include/qemu/rcu.h > @@ -105,6 +105,12 @@ extern void synchronize_rcu(void); > */ > extern void rcu_register_thread(void); > extern void rcu_unregister_thread(void); > + > +/* > + * Support for fork(). fork() support is enabled at startup. > + */ > +extern void rcu_enable_atfork(void); > +extern void rcu_disable_atfork(void); > extern void rcu_after_fork(void); > > struct rcu_head; > diff --git a/util/rcu.c b/util/rcu.c > index 9adc5e4a36..2142ddd93b 100644 > --- a/util/rcu.c > +++ b/util/rcu.c > @@ -318,15 +318,35 @@ static void rcu_init_complete(void) > rcu_register_thread(); > } > > +static int atfork_depth = 1; > + > +void rcu_enable_atfork(void) > +{ > +atfork_depth++; > +} > + > +void rcu_disable_atfork(void) > +{ > +atfork_depth--; > +} > + > #ifdef CONFIG_POSIX > static void rcu_init_lock(void) > { > +if (atfork_depth < 1) { > +return; > +} > + > qemu_mutex_lock(&rcu_sync_lock); > qemu_mutex_lock(&rcu_registry_lock); > } > > static void rcu_init_unlock(void) > { > +if (atfork_depth < 1) { > +return; > +} > + > qemu_mutex_unlock(&rcu_registry_lock); > qemu_mutex_unlock(&rcu_sync_lock); > } > diff --git a/vl.c b/vl.c > index 99fcfa0442..8967115514 100644 > --- a/vl.c > +++ b/vl.c > @@ -4121,6 +4121,7 @@ int main(int argc, char **argv, char **envp) > set_memory_options(&ram_slots, &maxram_size, machine_class); > > os_daemonize(); > +rcu_disable_atfork(); > > if (pid_file && qemu_create_pidfile(pid_file) != 0) { > error_report("could not acquire pid file: %s", strerror(errno)); > -- > 2.13.3 > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH for-2.11 22/23] tcg/ppc: Look for shifted constants
On 08/04/2017 09:39 AM, Philippe Mathieu-Daudé wrote: >> @@ -638,6 +639,14 @@ static void tcg_out_movi_int(TCGContext *s, TCGType >> type, TCGReg ret, >> return; >> } >> +lsb = ctz64(arg); >> +high = arg >> lsb; >> +if (arg == (int16_t)arg) { > > Can you move these here? > > +lsb = ctz64(arg); > +high = arg >> lsb; No, because you've found a bug -- the if should be testing high, not arg. ;-) Thanks, r~
Re: [Qemu-devel] [PATCH] kvm: workaround build break on gcc-7.1.1 / fedora26
On 03/08/2017 16:24, Cornelia Huck wrote: > On Thu, 3 Aug 2017 09:10:29 -0500 > Eric Blake wrote: > >> On 08/03/2017 08:46 AM, Philippe Mathieu-Daudé wrote: >>> Hi Greg, >>> >>> On 08/02/2017 11:47 AM, Greg Kurz wrote: Building QEMU on fedora26 with the latest gcc package fails: CC ppc64-softmmu/target/ppc/kvm.o In file included from include/sysemu/hw_accel.h:16:0, from target/ppc/kvm.c:31: target/ppc/kvm.c: In function ‘kvmppc_booke_watchdog_enable’: include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized in this function [-Werror=maybe-uninitialized] cap.args[i] = args_tmp[i]; \ ^ target/ppc/kvm.c: In function ‘kvmppc_set_papr’: include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized in this function [-Werror=maybe-uninitialized] cc1: all warnings being treated as errors >>> >>> I'm trying to reproduce this in our docker images (all x86_64 based) but >>> can't reproduce. >> >> That's because x86_64 hosts only call kvm_vm_enable_cap() with non-empty >> varargs. > > There's > > target/i386/kvm.c:kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_SYNIC, > 0)) { > >> But we have: >> >> accel/kvm/kvm-all.c: ret = kvm_vm_enable_cap(s, KVM_CAP_S390_IRQCHIP, 0); >> >> which is only compiled on s390 hosts (or, at least that's my guess, >> based on the cap name) > > I don't see how the compiler can optimize this away, as the check for > this cap is an ioctl... Indeed. This is a compiler bug and it only provides a warning (meaning --disable-werror silences it). I don't think it's a good idea to uglify the code for something that the compiler should easily optimize away... Paolo > This seems a bit ugly. And I still don't understand why this only seems > to hit on ppc...
Re: [Qemu-devel] [PATCH v2 6/6] dev-storage: Fix the unusual function name
On 08/04/2017 07:26 AM, Mao Zhongyi wrote: The function name of usb_msd_{realize,unrealize}_*, usb_msd_class_initfn_* are unusual. Rename it to usb_msd_*_{realize,unrealize}, usb_msd_class_*_initfn. Cc: Gerd Hoffmann Signed-off-by: Mao Zhongyi Reviewed-by: Philippe Mathieu-Daudé --- hw/usb/dev-storage.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c index 801f552..2a05cd5 100644 --- a/hw/usb/dev-storage.c +++ b/hw/usb/dev-storage.c @@ -596,7 +596,7 @@ static void usb_msd_unrealize_storage(USBDevice *dev, Error **errp) object_unref(OBJECT(&s->bus)); } -static void usb_msd_realize_storage(USBDevice *dev, Error **errp) +static void usb_msd_storage_realize(USBDevice *dev, Error **errp) { MSDState *s = USB_STORAGE_DEV(dev); BlockBackend *blk = s->conf.blk; @@ -643,14 +643,14 @@ static void usb_msd_realize_storage(USBDevice *dev, Error **errp) s->scsi_dev = scsi_dev; } -static void usb_msd_unrealize_bot(USBDevice *dev, Error **errp) +static void usb_msd_bot_unrealize(USBDevice *dev, Error **errp) { MSDState *s = USB_STORAGE_DEV(dev); object_unref(OBJECT(&s->bus)); } -static void usb_msd_realize_bot(USBDevice *dev, Error **errp) +static void usb_msd_bot_realize(USBDevice *dev, Error **errp) { MSDState *s = USB_STORAGE_DEV(dev); DeviceState *d = DEVICE(dev); @@ -764,12 +764,12 @@ static void usb_msd_class_initfn_common(ObjectClass *klass, void *data) dc->vmsd = &vmstate_usb_msd; } -static void usb_msd_class_initfn_storage(ObjectClass *klass, void *data) +static void usb_msd_class_storage_initfn(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); USBDeviceClass *uc = USB_DEVICE_CLASS(klass); -uc->realize = usb_msd_realize_storage; +uc->realize = usb_msd_storage_realize; uc->unrealize = usb_msd_unrealize_storage; dc->props = msd_properties; } @@ -828,26 +828,26 @@ static void usb_msd_instance_init(Object *obj) object_property_set_int(obj, -1, "bootindex", NULL); } -static void usb_msd_class_initfn_bot(ObjectClass *klass, void *data) +static void usb_msd_class_bot_initfn(ObjectClass *klass, void *data) { USBDeviceClass *uc = USB_DEVICE_CLASS(klass); -uc->realize = usb_msd_realize_bot; -uc->unrealize = usb_msd_unrealize_bot; +uc->realize = usb_msd_bot_realize; +uc->unrealize = usb_msd_bot_unrealize; uc->attached_settable = true; } static const TypeInfo msd_info = { .name = "usb-storage", .parent= TYPE_USB_STORAGE, -.class_init= usb_msd_class_initfn_storage, +.class_init= usb_msd_class_storage_initfn, .instance_init = usb_msd_instance_init, }; static const TypeInfo bot_info = { .name = "usb-bot", .parent= TYPE_USB_STORAGE, -.class_init= usb_msd_class_initfn_bot, +.class_init= usb_msd_class_bot_initfn, }; static void usb_msd_register_types(void)
Re: [Qemu-devel] [Bug 1708551] Re: macOS Guest Reading USB 3.0 Bus as USB 2.0
Removing the `bus=usb-bus.0` param from everything, [as shown here](https://pastebin.com/x0Cp70XD), boots fine; but completely breaks mouse functionality. To mitigate, i tried passing through a Logitech mousepad I own--in similar fashion to the iPhone; but Host complained about permissions & `libusb` requiring "write access to USB device nodes". FWIW, I vaguely recall the `usb-tablet` parameter being "a whole thing" with virtual macOS; however, I'll need to check my records to refresh my memory. > On Aug 4, 2017, at 11:29 AM, Thomas Huth <1708...@bugs.launchpad.net> wrote: > > Then simply omit the "bus=usb-bus.0" here - the devices then should be > put onto the XHCI bus automatically. > > -- > You received this bug notification because you are subscribed to the bug > report. > https://bugs.launchpad.net/bugs/1708551 > > Title: > macOS Guest Reading USB 3.0 Bus as USB 2.0 > > Status in QEMU: > Incomplete > > Bug description: > Description: > I'm having trouble with USB Passthrough. Running `system_profiler > SPUSBDataType` on macOS guest confirms that the system "sees" my device, and > that qemu is passing *something* through. However, the system sees my > connection as USB 2.0, even though i'm passing through XHCI controllers > (nec-usb-xhci/qemu-xhci). I suspect this is the reason why my guest is unable > to recognize my iPhone in XCode & iTunes. > > Parameters: > QEMU release version: 2.10.0-rc0 > Bios: [patched version of OVMF](https://github.com/gsomlo/edk2/tree/macboot)] > Command Line: https://pastebin.com/pBSYbrW1 > Host: Arch Linux > Guest: macOS v10.12.6 > Guest System Info: https://pastebin.com/U1Tihxei > > Notes: > 1. Observed sometime after late-May-early-June of this year. > > 2. Due to [a defect in qemu v2.8 which affected GTK > users](https://bugs.launchpad.net/qemu/+bug/1578192), and [a recent > change to macOS' booting process conflicting with qemu > v2.9](https://lists.nongnu.org/archive/html/qemu- > devel/2017-03/msg06366.html), i'm forced to use qemu v2.10.0-rc0 (as > -rc1 fails to load OVMF right now). > > To manage notifications about this bug go to: > https://bugs.launchpad.net/qemu/+bug/1708551/+subscriptions -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1708551 Title: macOS Guest Reading USB 3.0 Bus as USB 2.0 Status in QEMU: Incomplete Bug description: Description: I'm having trouble with USB Passthrough. Running `system_profiler SPUSBDataType` on macOS guest confirms that the system "sees" my device, and that qemu is passing *something* through. However, the system sees my connection as USB 2.0, even though i'm passing through XHCI controllers (nec-usb-xhci/qemu-xhci). I suspect this is the reason why my guest is unable to recognize my iPhone in XCode & iTunes. Parameters: QEMU release version: 2.10.0-rc0 Bios: [patched version of OVMF](https://github.com/gsomlo/edk2/tree/macboot)] Command Line: https://pastebin.com/pBSYbrW1 Host: Arch Linux Guest: macOS v10.12.6 Guest System Info: https://pastebin.com/U1Tihxei Notes: 1. Observed sometime after late-May-early-June of this year. 2. Due to [a defect in qemu v2.8 which affected GTK users](https://bugs.launchpad.net/qemu/+bug/1578192), and [a recent change to macOS' booting process conflicting with qemu v2.9](https://lists.nongnu.org/archive/html/qemu- devel/2017-03/msg06366.html), i'm forced to use qemu v2.10.0-rc0 (as -rc1 fails to load OVMF right now). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1708551/+subscriptions
[Qemu-devel] [Bug 1352179] Re: could not open disk image
Triaging old bug tickets ... can you still reproduce this problem with the latest release of QEMU? how did you create the disk image? ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1352179 Title: could not open disk image Status in QEMU: Incomplete Bug description: After restart the server it's show this error: Error starting domain: internal error process exited while connecting to monitor: char device redirected to /dev/pts/1 qemu-kvm: -drive file=/var/lib/nova/instances/b4535ce9-54b5-4581-a906-16b83bf1ba2f/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image /var/lib/nova/instances/b4535ce9-54b5-4581-a906-16b83bf1ba2f/disk: No such file or directory the disk info show  qemu-img info disk image: disk file format: qcow2 virtual size: 100G (107374182400 bytes) disk size: 22G cluster_size: 65536 backing file: /var/lib/nova/instances/_base/b4535ce9-54b5-4581-a906-16b83bf1ba2f but this file (backing file : /var/lib/nova/instances/_base/b4535ce9-54b5-4581-a906-16b83bf1ba2f) is empty. And all the instances can't find the disk image We use CentOS release 6.5 (64bit) kernel version : 2.6.32-431.11.2.el6.x86_64 qemu-kvm-0.12.1.2-2.415.el6_5.10.x86_64 virsh version Compiled against library: libvirt 0.10.2 Using library: libvirt 0.10.2 Using API: QEMU 0.10.2 Running hypervisor: QEMU 0.12.1 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1352179/+subscriptions
[Qemu-devel] [PATCH 1/2] rcu: completely disable pthread_atfork callbacks as soon as possible
Because of -daemonize, system mode QEMU sometimes needs to fork() and keep RCU enabled in the child. However, there is a possible deadlock with synchronize_rcu: - the CPU thread is inside a RCU critical section and wants to take the BQL in order to do MMIO - the monitor thread, which is owning the BQL, calls rcu_init_lock which tries to take the rcu_sync_lock - the call_rcu thread has taken rcu_sync_lock in synchronize_rcu, but synchronize_rcu needs the CPU thread to end the critical section before returning. This cannot happen for user-mode emulation, because it does not have a BQL. To fix it, assume that system mode QEMU only forks in preparation for exec (except when daemonizing) and disable pthread_atfork as soon as the double fork has happened. Reported-by: David Gilbert Signed-off-by: Paolo Bonzini --- include/qemu/rcu.h | 6 ++ util/rcu.c | 20 vl.c | 1 + 3 files changed, 27 insertions(+) diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h index 83ae2808be..c0da9907e8 100644 --- a/include/qemu/rcu.h +++ b/include/qemu/rcu.h @@ -105,6 +105,12 @@ extern void synchronize_rcu(void); */ extern void rcu_register_thread(void); extern void rcu_unregister_thread(void); + +/* + * Support for fork(). fork() support is enabled at startup. + */ +extern void rcu_enable_atfork(void); +extern void rcu_disable_atfork(void); extern void rcu_after_fork(void); struct rcu_head; diff --git a/util/rcu.c b/util/rcu.c index 9adc5e4a36..2142ddd93b 100644 --- a/util/rcu.c +++ b/util/rcu.c @@ -318,15 +318,35 @@ static void rcu_init_complete(void) rcu_register_thread(); } +static int atfork_depth = 1; + +void rcu_enable_atfork(void) +{ +atfork_depth++; +} + +void rcu_disable_atfork(void) +{ +atfork_depth--; +} + #ifdef CONFIG_POSIX static void rcu_init_lock(void) { +if (atfork_depth < 1) { +return; +} + qemu_mutex_lock(&rcu_sync_lock); qemu_mutex_lock(&rcu_registry_lock); } static void rcu_init_unlock(void) { +if (atfork_depth < 1) { +return; +} + qemu_mutex_unlock(&rcu_registry_lock); qemu_mutex_unlock(&rcu_sync_lock); } diff --git a/vl.c b/vl.c index 99fcfa0442..8967115514 100644 --- a/vl.c +++ b/vl.c @@ -4121,6 +4121,7 @@ int main(int argc, char **argv, char **envp) set_memory_options(&ram_slots, &maxram_size, machine_class); os_daemonize(); +rcu_disable_atfork(); if (pid_file && qemu_create_pidfile(pid_file) != 0) { error_report("could not acquire pid file: %s", strerror(errno)); -- 2.13.3
Re: [Qemu-devel] [PATCH for-2.11 22/23] tcg/ppc: Look for shifted constants
Hi Richard, On 08/04/2017 02:44 AM, Richard Henderson wrote: Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.inc.c | 9 + 1 file changed, 9 insertions(+) diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index bc14d2c9c6..4b32809217 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -598,6 +598,7 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret, { intptr_t tb_diff; int32_t high; +int lsb; tcg_debug_assert(TCG_TARGET_REG_BITS == 64 || type == TCG_TYPE_I32); @@ -638,6 +639,14 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret, return; } +lsb = ctz64(arg); +high = arg >> lsb; +if (arg == (int16_t)arg) { Can you move these here? +lsb = ctz64(arg); +high = arg >> lsb; +tcg_out32(s, ADDI | TAI(ret, 0, high)); +tcg_out_shli64(s, ret, ret, lsb); +return; +} + high = arg >> 31 >> 1; tcg_out_movi(s, TCG_TYPE_I32, ret, high); if (high) { Regards, Phil.
Re: [Qemu-devel] [PATCH 05/17] block/nbd-client: get rid of ssize_t
On Fri, Aug 04, 2017 at 06:14:28PM +0300, Vladimir Sementsov-Ogievskiy wrote: > Use int variable for nbd_co_send_request return value (as > nbd_co_send_request returns int). Hmm, nbd_co_send_request() propagates return value of nbd_send_request, which returns ssize_t. IOW, I think we need to fix nbd_co_send_request to also return ssize_t, rather than changing callers to use int. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[Qemu-devel] [PATCH 2/2] Revert "rcu: do not create thread in pthread_atfork callback"
This reverts commit a59629fcc6f603e19b516dc08f75334e5c480bd0. This is not needed anymore because the IOThread mutex is not "magic" anymore (need not kick the CPU thread) and also because fork callbacks are only enabled at the very beginning of QEMU's execution. Signed-off-by: Paolo Bonzini --- include/qemu/rcu.h | 1 - linux-user/syscall.c | 1 - os-posix.c | 2 -- util/rcu.c | 10 +++--- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h index c0da9907e8..f19413d649 100644 --- a/include/qemu/rcu.h +++ b/include/qemu/rcu.h @@ -111,7 +111,6 @@ extern void rcu_unregister_thread(void); */ extern void rcu_enable_atfork(void); extern void rcu_disable_atfork(void); -extern void rcu_after_fork(void); struct rcu_head; typedef void RCUCBFunc(struct rcu_head *head); diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 54343c06be..9b6364a266 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -6354,7 +6354,6 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp, ret = fork(); if (ret == 0) { /* Child Process. */ -rcu_after_fork(); cpu_clone_regs(env, newsp); fork_end(1); /* There is a race condition here. The parent process could diff --git a/os-posix.c b/os-posix.c index c6ddb7d830..92e9d85215 100644 --- a/os-posix.c +++ b/os-posix.c @@ -34,7 +34,6 @@ #include "sysemu/sysemu.h" #include "net/slirp.h" #include "qemu-options.h" -#include "qemu/rcu.h" #include "qemu/error-report.h" #include "qemu/log.h" #include "qemu/cutils.h" @@ -249,7 +248,6 @@ void os_daemonize(void) signal(SIGTSTP, SIG_IGN); signal(SIGTTOU, SIG_IGN); signal(SIGTTIN, SIG_IGN); -rcu_after_fork(); } } diff --git a/util/rcu.c b/util/rcu.c index 2142ddd93b..ca5a63e36a 100644 --- a/util/rcu.c +++ b/util/rcu.c @@ -350,18 +350,22 @@ static void rcu_init_unlock(void) qemu_mutex_unlock(&rcu_registry_lock); qemu_mutex_unlock(&rcu_sync_lock); } -#endif -void rcu_after_fork(void) +static void rcu_init_child(void) { +if (atfork_depth < 1) { +return; +} + memset(®istry, 0, sizeof(registry)); rcu_init_complete(); } +#endif static void __attribute__((__constructor__)) rcu_init(void) { #ifdef CONFIG_POSIX -pthread_atfork(rcu_init_lock, rcu_init_unlock, rcu_init_unlock); +pthread_atfork(rcu_init_lock, rcu_init_unlock, rcu_init_child); #endif rcu_init_complete(); } -- 2.13.3
[Qemu-devel] [PATCH for-2.10 0/2] RCU: forking fix and cleanups
The first patch fixes a migration deadlock. The second reverts a now-unnecessary hack (introduced to support the pre-MTTCG "kick the CPU thread" special behavior of qemu_mutex_lock_iothread). Paolo Paolo Bonzini (2): rcu: completely disable pthread_atfork callbacks as soon as possible Revert "rcu: do not create thread in pthread_atfork callback" include/qemu/rcu.h | 7 ++- linux-user/syscall.c | 1 - os-posix.c | 2 -- util/rcu.c | 30 +++--- vl.c | 1 + 5 files changed, 34 insertions(+), 7 deletions(-) -- 2.13.3
Re: [Qemu-devel] [Qemu-block] [PATCH] block: document semanatics of bdrv_co_preadv|pwritev
On Fri, Aug 04, 2017 at 10:41:54AM -0500, Eric Blake wrote: > On 08/04/2017 09:06 AM, Daniel P. Berrange wrote: > > On Fri, Aug 04, 2017 at 04:02:10PM +0200, Kevin Wolf wrote: > >> Am 04.08.2017 um 12:50 hat Daniel P. Berrange geschrieben: > >>> Signed-off-by: Daniel P. Berrange > >>> --- > > >> Really? We are asserting that they match in bdrv_aligned_preadv(): > >> > >> assert(!qiov || bytes == qiov->size); > > > > Hmm, why do we pass @bytes at all then ? If they're always the same, > > how about deleting it and just letting everyone read qiov->size > > directly. > > Read the assertion again: qiov can be NULL (generally, when writing > zeroes). So we can't rely on qiov->size in that scenario. This is odd. In the bdrv_aligned_readv() it looks very much like we'll reference qiov->niov, if bytes != 0, so if qiov was NULL we would crash. In bdrv_aligned_writev(), qiov->niov is also refernced if bytes != 0, *unless* flags contains BDRV_REQ_ZERO_WRITE, in which case we'll invoke bdrv_co_do_pwrite_zeroes() instead. So unless I'm missing something, bdrv_co_preadv|writev cannot be called with a NULL qiov, and bdrv_aligned_writev|readv might need their assertions tightened up. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [Qemu-devel] [Qemu-block] [PATCH] block/null: Remove 'filename' option
On Fri, Aug 04, 2017 at 04:43:54PM +0200, Kevin Wolf wrote: > This option was only added to allow 'null-co://' and 'null-aio://' as > filenames, its value never served any actual purpose and was ignored. > Nevertheless it was accepted as '-drive driver=null,filename=foo'. > > The correct way to enable the protocol prefixes (and that without adding > a useless -drive option) is implementing .bdrv_parse_filename. This is > what this patch does. > > Technically, this is an incompatible change, but the null block driver > is only used for benchmarking, testing and debugging, and an option > without effect isn't likely to be used by anyone anyway, so no bad > effects are to be expected. > > Reported-by: Markus Armbruster > Signed-off-by: Kevin Wolf > --- > block/null.c | 31 ++- > 1 file changed, 26 insertions(+), 5 deletions(-) > > diff --git a/block/null.c b/block/null.c > index 876f90965b..dd9c13f9ba 100644 > --- a/block/null.c > +++ b/block/null.c > @@ -30,11 +30,6 @@ static QemuOptsList runtime_opts = { > .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head), > .desc = { > { > -.name = "filename", > -.type = QEMU_OPT_STRING, > -.help = "", > -}, > -{ > .name = BLOCK_OPT_SIZE, > .type = QEMU_OPT_SIZE, > .help = "size of the null block", > @@ -54,6 +49,30 @@ static QemuOptsList runtime_opts = { > }, > }; > > +static void null_co_parse_filename(const char *filename, QDict *options, > + Error **errp) > +{ > +/* This functions only exists so that a null-co:// filename is accepted > + * with the null-co driver. */ > +if (strcmp(filename, "null-co://")) { > +error_setg(errp, "The only allowed filename for this driver is " > + "'null-co://'"); > +return; > +} > +} > + > +static void null_aio_parse_filename(const char *filename, QDict *options, > +Error **errp) > +{ > +/* This functions only exists so that a null-aio:// filename is accepted > + * with the null-aio driver. */ > +if (strcmp(filename, "null-aio://")) { > +error_setg(errp, "The only allowed filename for this driver is " > + "'null-aio://'"); > +return; > +} > +} > + > static int null_file_open(BlockDriverState *bs, QDict *options, int flags, >Error **errp) > { > @@ -242,6 +261,7 @@ static BlockDriver bdrv_null_co = { > .instance_size = sizeof(BDRVNullState), > > .bdrv_file_open = null_file_open, > +.bdrv_parse_filename= null_co_parse_filename, > .bdrv_close = null_close, > .bdrv_getlength = null_getlength, > > @@ -261,6 +281,7 @@ static BlockDriver bdrv_null_aio = { > .instance_size = sizeof(BDRVNullState), > > .bdrv_file_open = null_file_open, > +.bdrv_parse_filename= null_aio_parse_filename, > .bdrv_close = null_close, > .bdrv_getlength = null_getlength, > > -- > 2.13.3 > > Reviewed-by: Jeff Cody
Re: [Qemu-devel] [PATCH for-2.10] vmdk: Fix error handling/reporting of vmdk_check
On 08/04/2017 09:09 AM, Fam Zheng wrote: > Errors from the callees must be captured and propagated to our caller, > ensure this for both find_extent() and bdrv_getlength(). > > Reported-by: Markus Armbruster > Signed-off-by: Fam Zheng > --- > block/vmdk.c | 26 ++ > 1 file changed, 18 insertions(+), 8 deletions(-) > > +if (ret == VMDK_OK) { > +int64_t extent_len = bdrv_getlength(extent->file->bs); > +if (extent_len < 0) { > +fprintf(stderr, > +"ERROR: could not get extent file length for sector > %" > +PRId64 "\n", sector_num); > +ret = extent_len; Pre-existing - our use of fprintf() is not ideal. But this patch doesn't make it worse. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH for-2.10] quorum: Handle bdrv_getlength() failures in quorum_co_flush()
On 08/04/2017 09:08 AM, Alberto Garcia wrote: > A bdrv_getlength() call can fail and return a negative value. This > is not being handled in quorum_co_flush(), which can result in a > QUORUM_REPORT_BAD event with an arbitrary value on the 'sectors-count' > field. > > Reported-by: Markus Armbruster > Signed-off-by: Alberto Garcia > --- > block/quorum.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/block/quorum.c b/block/quorum.c > index 55ba916655..d77991d680 100644 > --- a/block/quorum.c > +++ b/block/quorum.c > @@ -785,8 +785,9 @@ static coroutine_fn int quorum_co_flush(BlockDriverState > *bs) > for (i = 0; i < s->num_children; i++) { > result = bdrv_co_flush(s->children[i]->bs); > if (result) { > +int64_t length = bdrv_getlength(s->children[i]->bs); > quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0, > - bdrv_getlength(s->children[i]->bs), > + length > 0 ? length : 0, In the fallback case, is always picking 0 good enough? Then again, this is in the error path, so it is unlikely in practice, and I don't see any better way to handle it. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] Is blk_getlength() in find_image_format() and img_map() kosher?
On 08/04/2017 08:55 AM, Markus Armbruster wrote: > Have a look at find_image_format(): > > if (blk_is_sg(file) || !blk_is_inserted(file) || blk_getlength(file) == > 0) { > *pdrv = &bdrv_raw; > return ret; > } > > blk_getlength() can fail. Shouldn't we error out then? > > We pretty obviously should in img_map(): > > length = blk_getlength(blk); > while (curr.start + curr.length < length) { > > Since I have to touch @length in the series I'm working on, I'll stick > in a fix. /me goes and checks my byte-based block status series Hmm, that problem is still present even after my refactoring of img_map(). I can rebase my series on top of your fix, as your fix belongs in 2.10. Alternatively, WHY do we allow blk_getlength() to fail? Are there really situations where we can open a BDS but not know its length? I know that we allow for online resizing detection, but presumably, either we always know the (prior) size of the device, or we're going to have problems talking to the device for anything beyond just a size request. What semantics would change if we could guarantee that blk_getlength() never failed? -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v4 14/15] iotest 190: test BDRV_REQ_ALLOCATE
On 08/01/2017 09:19 AM, Anton Nefedov wrote: > Signed-off-by: Anton Nefedov > --- > tests/qemu-iotests/190 | 146 > + > tests/qemu-iotests/190.out | 50 > tests/qemu-iotests/group | 1 + > 3 files changed, 197 insertions(+) > create mode 100755 tests/qemu-iotests/190 > create mode 100644 tests/qemu-iotests/190.out 190 already exists on mainline, you'll have to choose an available test id that hasn't been mentioned on-list yet (yeah, our process for reserving test ids is lousy. Max has already said he doesn't mind fixing conflicts if it is just a duplicated test number, for qemu-iotests that he manages) > +_cleanup() > +{ > + _cleanup_test_img > +} > +trap "_cleanup; exit \$status" 0 1 2 3 15 This part won't be needed, if you land after Stefan's patch that adds per-test directories with the ability to preserve temporary files -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v4 02/15] blkverify: set supported write/zero flags
On 08/01/2017 09:18 AM, Anton Nefedov wrote: > Signed-off-by: Anton Nefedov > --- > block/blkverify.c | 9 + > 1 file changed, 9 insertions(+) Basically, blkverify supports a flag if BOTH of its underlying files also support the flag; if either side can't handle the flag, then we fall back to emulation for both sides. With more overhead, we COULD state that we support both bits if at least one of the two underlying BDS supports the bit, and then emulate support for the bit on the second BDS where it was lacking, so that at least the first BDS doesn't suffer from the penalties of the fallbacks. But that means duplicating the block layer fallback code in blkverify, which is already something that we don't necessarily expect high performance from. For FUA, failure to implement the bit merely means that we have more device-wide flush calls (instead of per-transaction mini-flushes), but the end data should be the same. But for MAY_UNMAP, I'm worried that we may have situations where a plain BDS will create holes, while running the same device paired through blkverify will fall back to slower explicit zeroes. I'm wondering whether this will bite us, if we have scenarios where the mere fact of trying to verify block device behavior changes what behavior we are even verifying. Thus, while I think the code change _looks_ okay, I'm not sure if it is correct design-wise, nor whether it is 2.10 material. > +bs->supported_write_flags = BDRV_REQ_FUA & > +bs->file->bs->supported_write_flags & > +s->test_file->bs->supported_write_flags; > + > +bs->supported_zero_flags = > +(BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) & > +bs->file->bs->supported_zero_flags & > +s->test_file->bs->supported_zero_flags; > + > ret = 0; > fail: > qemu_opts_del(opts); > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v8 0/5] hypertrace: Lightweight guest-to-QEMU trace channel
On Sun, Jul 30, 2017 at 05:08:18PM +0300, LluÃs Vilanova wrote: > The hypertrace channel allows guest code to emit events in QEMU (the host) > using > its tracing infrastructure (see "docs/trace.txt"). This works in both 'system' > and 'user' modes, is architecture-agnostic and introduces minimal noise on the > guest. > > See first commit for a full description, use-cases and an example. > > Signed-off-by: LluÃs Vilanova > --- > > Changes in v8 > = > > * Do not use 'seq' when there's no extra hypertrace arguments (BSD behaves > differently for "seq 0"). > * Fix compilation for bsd-user. Hi LluÃs, Any changes regarding the fundamental approach since September 2016? Back then I had the following concerns about hypertrace for full-system virtualization: Going to QEMU for every guest trace event has high overhead under virtualization. The alternative approach is recording separate traces and merging them for analysis. This is possible with trace-cmd [1] and LTTng [2]. Merging traces eliminates the performance bottleneck and does not require new paravirt interfaces or guest tracing libraries. I think it it would be a distraction to support hypertrace for the virtualization use case because it fundamentally has a high overhead. I see promise in using hypertrace for TCG because it is low-overhead when running inside the QEMU process. I'll review the patches again with this in mind and not focus on virtualization. [1] https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg00887.html [2] http://archive.eclipse.org/tracecompass/doc/stable/org.eclipse.tracecompass.doc.user/Virtual-Machine-Analysis.html and also generic trace synchronization http://archive.eclipse.org/tracecompass/doc/stable/org.eclipse.tracecompass.doc.user/Trace-synchronization.html#Trace_synchronization Stefan > Changes in v7 > = > > * Use 'expr' instead of assuming 'bash' when generating the "emit.c" file. > * Restore generation of trace-events-all. > > > Changes in v6 > = > > * Fix compilation errors. > > > Changes in v5 > = > > * Rebase on 5a477a7806. > * Fix typo in "bsd-user/main.c" [Stephan Hajnoczi]. > * Replace abort() with exit() in command-line errors [Stephan Hajnoczi]. > * Fix alignment of data and control channels [Stephan Hajnoczi]. > * Fix signal reflection in user-mode (SIGINT, SIGABRT, SIGSEGV) [Stephan > Hajnoczi]. > * Clarify semantics of hypertrace_guest_mmap_check() [Stephan Hajnoczi]. > * Use uintptr_t instead of unsigned long in SEGV handler [Stephan Hajnoczi]. > * Emit hypertrace's event with host-endian arguments [Stephan Hajnoczi]. > * Enable true concurrency between user-mode guest threads by using a spearate > control channel page per client [Stephan Hajnoczi]. > * Remove unused PAGE_SIZE define [Stephan Hajnoczi]. > * Submit linux kernel API module separately to Linux upstream [Stephan > Hajnoczi]. > * Assume guest code events are always enabled. > > > Changes in v4 > = > > * Fix typo in stap script example. > * Fix compilation instructions in doc/hypertrace.txt. > * Rebase on 0737f32daf. > > > Changes in v3 > = > > * Rebase on 4a58f35. > * Remove debugging printf's. > * Fix style issues identified by checkpatch. > * Fix control channel mapping in guest linux module. > * Add a short event description in "trace-events". > * Polish documentation in 1st patch. > > > Changes in v2 > = > > * Remove unnecessary casts for g2h() [Eric Blake]. > * Use perror() [Eric Blake]. > * Avoid expansions in application example [Eric Blake]. > * Add copyright in document "hypertrace.txt" [Eric Blake]. > * Make the user-mode hypertrace invocations thread-safe [Stefan Hajnoczi]. > * Split dynamic hypertrace configuration into a separate "config" channel. > > LluÃs Vilanova (5): > hypertrace: Add documentation > hypertrace: Add tracing event "guest_hypertrace" > hypertrace: [*-user] Add QEMU-side proxy to "guest_hypertrace" event > hypertrace: [softmmu] Add QEMU-side proxy to "guest_hypertrace" event > hypertrace: Add guest-side user-level library > > > Makefile | 11 + > Makefile.objs |6 + > bsd-user/main.c| 17 + > bsd-user/mmap.c| 15 + > bsd-user/qemu.h|3 > bsd-user/syscall.c | 34 ++- > configure | 36 +++ > docs/devel/tracing.txt |3 > docs/hypertrace.txt| 225 > hypertrace/Makefile.objs | 25 ++ > hypertrace/common.c| 55 + > hypertrace/common.h| 25 ++ > hypertrace/guest/Makefile | 30 +++ > hypertrace/guest/common.c | 301 ++ > hypertrace/guest/qemu-hypertrace.h | 80 +++ > hypertrace/softmmu.c | 237 + > hype
Re: [Qemu-devel] [PATCH v3] block: document semanatics of bdrv_co_preadv|pwritev
On Fri, Aug 04, 2017 at 09:28:54AM -0500, Eric Blake wrote: > On 08/04/2017 09:08 AM, Daniel P. Berrange wrote: > > Signed-off-by: Daniel P. Berrange > > --- > > > > - Clarify that @bytes matches @qiov total size (Kevin) > > > > include/block/block_int.h | 31 +++ > > 1 file changed, 31 insertions(+) > > [looks like the nongnu.org infrastructure is having heavy load today, so > mails are getting through more slowly than usual - leads to lots of > crossed emails, where I'm seeing replies or direct sends sooner than > list copies] > > > +/** > > + * @offset: position in bytes to write at > > + * @bytes: number of bytes to write > > + * @qiov: the buffers containing data to write > > + * @flags: zero or more of bits allowed by 'supported_write_flags' > > maybe s/of // > > > + * > > + * @offset and @bytes will be a multiple of 'request_alignment', > > + * but the length of individual @qiov elements does not have to > > + * be a multiple. > > + * > > + * @bytes will always equal the total size of @qiov, and will be > > + * no larger than 'max_transfer'. > > + * > > + * The buffer in @qiov may point directly to guest memory. > > + */ > > int coroutine_fn (*bdrv_co_pwritev)(BlockDriverState *bs, > > uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags); > > Do we make guarantees that the driver callback is never reached if the > image is currently read-only? If so, is that a guarantee worth documenting? bdrv_co_pwritev() returns EPERM if bs->read_only, so it looks like you are right we have a guarantee we can document > Reviewed-by: Eric Blake Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[Qemu-devel] [PATCH 02/17] nbd/client: refactor nbd_read_eof
Refactor nbd_read_eof to return 1 on success, 0 on eof, when no data was read and <0 for other cases, because returned size of read data is not actually used. Signed-off-by: Vladimir Sementsov-Ogievskiy --- nbd/nbd-internal.h | 34 +- nbd/client.c | 5 - tests/qemu-iotests/083.out | 4 ++-- 3 files changed, 27 insertions(+), 16 deletions(-) diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h index 396ddb4d3e..3fb0b6098a 100644 --- a/nbd/nbd-internal.h +++ b/nbd/nbd-internal.h @@ -77,21 +77,37 @@ #define NBD_ESHUTDOWN 108 /* nbd_read_eof - * Tries to read @size bytes from @ioc. Returns number of bytes actually read. - * May return a value >= 0 and < size only on EOF, i.e. when iteratively called - * qio_channel_readv() returns 0. So, there is no need to call nbd_read_eof - * iteratively. + * Tries to read @size bytes from @ioc. + * Returns 1 on success + * 0 on eof, when no data was read (errp is not set) + * -EINVAL on eof, when some data < @size was read until eof + * < 0 on read fail */ -static inline ssize_t nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size, - Error **errp) +static inline int nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size, + Error **errp) { struct iovec iov = { .iov_base = buffer, .iov_len = size }; +ssize_t ret; + /* Sockets are kept in blocking mode in the negotiation phase. After * that, a non-readable socket simply means that another thread stole * our request/reply. Synchronization is done with recv_coroutine, so * that this is coroutine-safe. */ -return nbd_rwv(ioc, &iov, 1, size, true, errp); + +assert(size > 0); + +ret = nbd_rwv(ioc, &iov, 1, size, true, errp); +if (ret <= 0) { +return ret; +} + +if (ret != size) { +error_setg(errp, "End of file"); +return -EINVAL; +} + +return 1; } /* nbd_read @@ -100,9 +116,9 @@ static inline ssize_t nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size, static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size, Error **errp) { -ssize_t ret = nbd_read_eof(ioc, buffer, size, errp); +int ret = nbd_read_eof(ioc, buffer, size, errp); -if (ret >= 0 && ret != size) { +if (ret == 0) { ret = -EINVAL; error_setg(errp, "End of file"); } diff --git a/nbd/client.c b/nbd/client.c index f1c16b588f..4556056daa 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -925,11 +925,6 @@ ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) return ret; } -if (ret != sizeof(buf)) { -error_setg(errp, "read failed"); -return -EINVAL; -} - /* Reply [ 0 .. 3]magic (NBD_REPLY_MAGIC) [ 4 .. 7]error (0 == no error) diff --git a/tests/qemu-iotests/083.out b/tests/qemu-iotests/083.out index a24c6bfece..d3bea1b2f5 100644 --- a/tests/qemu-iotests/083.out +++ b/tests/qemu-iotests/083.out @@ -69,12 +69,12 @@ read failed: Input/output error === Check disconnect 4 reply === -read failed +End of file read failed: Input/output error === Check disconnect 8 reply === -read failed +End of file read failed: Input/output error === Check disconnect before data === -- 2.11.1
[Qemu-devel] [PATCH 00/17] nbd client refactoring and fixing
A bit more refactoring and fixing before BLOCK_STATUS series. I've tried to make individual patches simple enough, so there are a lot of them. Vladimir Sementsov-Ogievskiy (17): nbd/client: fix nbd_opt_go nbd/client: refactor nbd_read_eof nbd/client: refactor nbd_receive_reply nbd/client: fix nbd_send_request to return int block/nbd-client: get rid of ssize_t block/nbd-client: fix nbd_read_reply_entry block/nbd-client: refactor request send/receive block/nbd-client: rename nbd_recv_coroutines_enter_all block/nbd-client: move nbd_co_receive_reply content into nbd_co_request block/nbd-client: move nbd_coroutine_end content into nbd_co_request block/nbd-client: fix nbd_co_request: set s->reply.handle to 0 on error block/nbd-client: refactor nbd_co_request block/nbd-client: refactor NBDClientSession.recv_coroutine block/nbd-client: exit reply-reading coroutine on incorrect handle block/nbd-client: refactor reading reply block/nbd-client: drop reply field from NBDClientSession block/nbd-client: always return EIO on and after the first io channel error block/nbd-client.h | 9 ++- include/block/nbd.h| 4 +- nbd/nbd-internal.h | 34 ++--- block/nbd-client.c | 173 ++--- nbd/client.c | 21 +++--- tests/qemu-iotests/083.out | 4 +- 6 files changed, 115 insertions(+), 130 deletions(-) -- 2.11.1
Re: [Qemu-devel] [Qemu-block] [PATCH v2] qemu-img: Clarify about relative backing file options
On Fri, Aug 04, 2017 at 10:36:58PM +0800, Fam Zheng wrote: > It's not too surprising when a user specifies the backing file relative > to the current working directory instead of the top layer image. This > causes error when they differ. Though the error message has enough > information to infer the fact about the misunderstanding, it is better > if we document this explicitly, so that users don't have to learn from > mistakes. > > Signed-off-by: Fam Zheng > > --- > v2: Improve wording. [Eric] > --- > qemu-img.texi | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/qemu-img.texi b/qemu-img.texi > index 72dabd6b3e..90c7eab4a8 100644 > --- a/qemu-img.texi > +++ b/qemu-img.texi > @@ -244,6 +244,9 @@ only the differences from @var{backing_file}. No size > needs to be specified in > this case. @var{backing_file} will never be modified unless you use the > @code{commit} monitor command (or qemu-img commit). > > +If a relative path name is given, the backing file is looked up relative to > +the directory containing @var{filename}. > + > Note that a given backing file will be opened to check that it is valid. Use > the @code{-u} option to enable unsafe backing file mode, which means that the > image will be created even if the associated backing file cannot be opened. A > @@ -343,6 +346,9 @@ created as a copy on write image of the specified base > image; the > @var{backing_file} should have the same content as the input's base image, > however the path, image format, etc may differ. > > +If a relative path name is given, the backing file is looked up relative to > +the directory containing @var{output_filename}. > + > If the @code{-n} option is specified, the target volume creation will be > skipped. This is useful for formats such as @code{rbd} if the target > volume has already been created with site specific options that cannot > @@ -490,6 +496,9 @@ The backing file is changed to @var{backing_file} and (if > the image format of > string), then the image is rebased onto no backing file (i.e. it will exist > independently of any backing file). > > +If a relative path name is given, the backing file is looked up relative to > +the directory containing @var{filename}. > + > @var{cache} specifies the cache mode to be used for @var{filename}, whereas > @var{src_cache} specifies the cache mode for reading backing files. > > -- > 2.13.3 > > Reviewed-by: Jeff Cody
[Qemu-devel] [PATCH 07/17] block/nbd-client: refactor request send/receive
Move nbd_co_receive_reply and nbd_coroutine_end calls into nbd_co_send_request and rename the latter to just nbd_co_request. This removes code duplications in nbd_client_co_{pwrite,pread,...} functions. Also this is needed for further refactoring. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 89 -- 1 file changed, 33 insertions(+), 56 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index 0c88d84de6..c9ade9b517 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -112,12 +112,20 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) s->read_reply_co = NULL; } -static int nbd_co_send_request(BlockDriverState *bs, - NBDRequest *request, - QEMUIOVector *qiov) +static void nbd_co_receive_reply(NBDClientSession *s, + NBDRequest *request, + NBDReply *reply, + QEMUIOVector *qiov); +static void nbd_coroutine_end(BlockDriverState *bs, + NBDRequest *request); + +static int nbd_co_request(BlockDriverState *bs, + NBDRequest *request, + QEMUIOVector *qiov) { NBDClientSession *s = nbd_get_client_session(bs); int rc, ret, i; +NBDReply reply; qemu_co_mutex_lock(&s->send_mutex); while (s->in_flight == MAX_NBD_REQUESTS) { @@ -141,7 +149,8 @@ static int nbd_co_send_request(BlockDriverState *bs, return -EPIPE; } -if (qiov) { +if (request->type == NBD_CMD_WRITE) { +assert(qiov != NULL); qio_channel_set_cork(s->ioc, true); rc = nbd_send_request(s->ioc, request); if (rc >= 0) { @@ -156,6 +165,21 @@ static int nbd_co_send_request(BlockDriverState *bs, rc = nbd_send_request(s->ioc, request); } qemu_co_mutex_unlock(&s->send_mutex); + +if (rc < 0) { +goto out; +} + +if (request->type == NBD_CMD_READ) { +assert(qiov != NULL); +} else { +qiov = NULL; +} +nbd_co_receive_reply(s, request, &reply, qiov); +rc = -reply.error; + +out: +nbd_coroutine_end(bs, request); return rc; } @@ -208,26 +232,16 @@ static void nbd_coroutine_end(BlockDriverState *bs, int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) { -NBDClientSession *client = nbd_get_client_session(bs); NBDRequest request = { .type = NBD_CMD_READ, .from = offset, .len = bytes, }; -NBDReply reply; -int ret; assert(bytes <= NBD_MAX_BUFFER_SIZE); assert(!flags); -ret = nbd_co_send_request(bs, &request, NULL); -if (ret < 0) { -reply.error = -ret; -} else { -nbd_co_receive_reply(client, &request, &reply, qiov); -} -nbd_coroutine_end(bs, &request); -return -reply.error; +return nbd_co_request(bs, &request, qiov); } int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, @@ -239,8 +253,6 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, .from = offset, .len = bytes, }; -NBDReply reply; -int ret; if (flags & BDRV_REQ_FUA) { assert(client->info.flags & NBD_FLAG_SEND_FUA); @@ -249,27 +261,18 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, assert(bytes <= NBD_MAX_BUFFER_SIZE); -ret = nbd_co_send_request(bs, &request, qiov); -if (ret < 0) { -reply.error = -ret; -} else { -nbd_co_receive_reply(client, &request, &reply, NULL); -} -nbd_coroutine_end(bs, &request); -return -reply.error; +return nbd_co_request(bs, &request, qiov); } int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, BdrvRequestFlags flags) { -int ret; NBDClientSession *client = nbd_get_client_session(bs); NBDRequest request = { .type = NBD_CMD_WRITE_ZEROES, .from = offset, .len = bytes, }; -NBDReply reply; if (!(client->info.flags & NBD_FLAG_SEND_WRITE_ZEROES)) { return -ENOTSUP; @@ -283,22 +286,13 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, request.flags |= NBD_CMD_FLAG_NO_HOLE; } -ret = nbd_co_send_request(bs, &request, NULL); -if (ret < 0) { -reply.error = -ret; -} else { -nbd_co_receive_reply(client, &request, &reply, NULL); -} -nbd_coroutine_end(bs, &request); -return -reply.error; +return nbd_co_request(bs, &request, NULL); } int nbd_client_co_flush(BlockDriverState *bs) { NBDClientSession *client = nbd_get_client_session(bs); NBDRequest request = { .type = NBD_CMD_FLUSH }; -NBDReply reply; -int ret; if (!(client->info.fla
[Qemu-devel] [PATCH 15/17] block/nbd-client: refactor reading reply
Read the whole reply in one place - in nbd_read_reply_entry. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.h | 1 + block/nbd-client.c | 27 +-- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/block/nbd-client.h b/block/nbd-client.h index aa36be8950..0f84ccc073 100644 --- a/block/nbd-client.h +++ b/block/nbd-client.h @@ -30,6 +30,7 @@ typedef struct NBDClientSession { struct { Coroutine *co; NBDRequest *request; +QEMUIOVector *qiov; } requests[MAX_NBD_REQUESTS]; NBDReply reply; } NBDClientSession; diff --git a/block/nbd-client.c b/block/nbd-client.c index 0e12db4be3..61780c5df9 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -94,6 +94,18 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) break; } +if (s->reply.error == 0 && +s->requests[i].request->type == NBD_CMD_READ) +{ +assert(s->requests[i].qiov != NULL); +ret = nbd_rwv(s->ioc, s->requests[i].qiov->iov, + s->requests[i].qiov->niov, + s->requests[i].request->len, true, NULL); +if (ret != s->requests[i].request->len) { +break; +} +} + /* We're woken up by the receiving coroutine itself. Note that there * is no race between yielding and reentering read_reply_co. This * is because: @@ -138,6 +150,7 @@ static int nbd_co_request(BlockDriverState *bs, assert(i < MAX_NBD_REQUESTS); request->handle = INDEX_TO_HANDLE(s, i); s->requests[i].request = request; +s->requests[i].qiov = qiov; if (!s->ioc) { qemu_co_mutex_unlock(&s->send_mutex); @@ -165,12 +178,6 @@ static int nbd_co_request(BlockDriverState *bs, goto out; } -if (request->type == NBD_CMD_READ) { -assert(qiov != NULL); -} else { -qiov = NULL; -} - /* Wait until we're woken up by nbd_read_reply_entry. */ qemu_coroutine_yield(); if (!s->ioc || s->reply.handle == 0) { @@ -180,14 +187,6 @@ static int nbd_co_request(BlockDriverState *bs, assert(s->reply.handle == request->handle); -if (qiov && s->reply.error == 0) { -ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, true, NULL); -if (ret != request->len) { -rc = -EIO; -goto out; -} -} - rc = -s->reply.error; out: -- 2.11.1
[Qemu-devel] [PATCH 06/17] block/nbd-client: fix nbd_read_reply_entry
Set reply.handle to 0 on error path to prevent normal path of nbd_co_receive_reply. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/nbd-client.c b/block/nbd-client.c index dc19894a7c..0c88d84de6 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -107,6 +107,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) qemu_coroutine_yield(); } +s->reply.handle = 0; nbd_recv_coroutines_enter_all(s); s->read_reply_co = NULL; } -- 2.11.1
[Qemu-devel] [PATCH 13/17] block/nbd-client: refactor NBDClientSession.recv_coroutine
Move from recv_coroutine[i] to requests[i].co. This is needed for further refactoring, new fields will be added to created structure. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.h | 4 +++- block/nbd-client.c | 20 ++-- 2 files changed, 13 insertions(+), 11 deletions(-) diff --git a/block/nbd-client.h b/block/nbd-client.h index df80771357..48e2559df6 100644 --- a/block/nbd-client.h +++ b/block/nbd-client.h @@ -27,7 +27,9 @@ typedef struct NBDClientSession { Coroutine *read_reply_co; int in_flight; -Coroutine *recv_coroutine[MAX_NBD_REQUESTS]; +struct { +Coroutine *co; +} requests[MAX_NBD_REQUESTS]; NBDReply reply; } NBDClientSession; diff --git a/block/nbd-client.c b/block/nbd-client.c index d6145c7db0..5eb126c399 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -39,8 +39,8 @@ static void nbd_recv_coroutines_wake_all(NBDClientSession *s) int i; for (i = 0; i < MAX_NBD_REQUESTS; i++) { -if (s->recv_coroutine[i]) { -aio_co_wake(s->recv_coroutine[i]); +if (s->requests[i].co) { +aio_co_wake(s->requests[i].co); } } } @@ -88,22 +88,22 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) * one coroutine is called until the reply finishes. */ i = HANDLE_TO_INDEX(s, s->reply.handle); -if (i >= MAX_NBD_REQUESTS || !s->recv_coroutine[i]) { +if (i >= MAX_NBD_REQUESTS || !s->requests[i].co) { break; } -/* We're woken up by the recv_coroutine itself. Note that there +/* We're woken up by the receiving coroutine itself. Note that there * is no race between yielding and reentering read_reply_co. This * is because: * - * - if recv_coroutine[i] runs on the same AioContext, it is only + * - if requests[i].co runs on the same AioContext, it is only * entered after we yield * - * - if recv_coroutine[i] runs on a different AioContext, reentering + * - if requests[i].co runs on a different AioContext, reentering * read_reply_co happens through a bottom half, which can only * run after we yield. */ -aio_co_wake(s->recv_coroutine[i]); +aio_co_wake(s->requests[i].co); qemu_coroutine_yield(); } @@ -126,8 +126,8 @@ static int nbd_co_request(BlockDriverState *bs, s->in_flight++; for (i = 0; i < MAX_NBD_REQUESTS; i++) { -if (s->recv_coroutine[i] == NULL) { -s->recv_coroutine[i] = qemu_coroutine_self(); +if (s->requests[i].co == NULL) { +s->requests[i].co = qemu_coroutine_self(); break; } } @@ -189,7 +189,7 @@ out: /* Tell the read handler to read another header. */ s->reply.handle = 0; -s->recv_coroutine[i] = NULL; +s->requests[i].co = NULL; /* Kick the read_reply_co to get the next reply. */ if (s->read_reply_co) { -- 2.11.1
[Qemu-devel] [PATCH 01/17] nbd/client: fix nbd_opt_go
Do not send NBD_OPT_ABORT to the broken server. After sending NBD_REP_ACK on NBD_OPT_GO server is most probably in transmission phase, when option sending is finished. Signed-off-by: Vladimir Sementsov-Ogievskiy --- nbd/client.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/nbd/client.c b/nbd/client.c index 0a17de80b5..f1c16b588f 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -399,12 +399,10 @@ static int nbd_opt_go(QIOChannel *ioc, const char *wantname, phase, but make sure it sent flags */ if (len) { error_setg(errp, "server sent invalid NBD_REP_ACK"); -nbd_send_opt_abort(ioc); return -1; } if (!info->flags) { error_setg(errp, "broken server omitted NBD_INFO_EXPORT"); -nbd_send_opt_abort(ioc); return -1; } trace_nbd_opt_go_success(); -- 2.11.1
[Qemu-devel] [PATCH 09/17] block/nbd-client: move nbd_co_receive_reply content into nbd_co_request
Move code from nbd_co_receive_reply into nbd_co_request. This simplify things, makes further refactoring possible. Also, a function starting with qemu_coroutine_yield is weird. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 33 ++--- 1 file changed, 10 insertions(+), 23 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index 8ad2264a40..db1d7025fa 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -112,10 +112,6 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) s->read_reply_co = NULL; } -static void nbd_co_receive_reply(NBDClientSession *s, - NBDRequest *request, - NBDReply *reply, - QEMUIOVector *qiov); static void nbd_coroutine_end(BlockDriverState *bs, NBDRequest *request); @@ -175,39 +171,30 @@ static int nbd_co_request(BlockDriverState *bs, } else { qiov = NULL; } -nbd_co_receive_reply(s, request, &reply, qiov); -rc = -reply.error; - -out: -nbd_coroutine_end(bs, request); -return rc; -} - -static void nbd_co_receive_reply(NBDClientSession *s, - NBDRequest *request, - NBDReply *reply, - QEMUIOVector *qiov) -{ -int ret; /* Wait until we're woken up by nbd_read_reply_entry. */ qemu_coroutine_yield(); -*reply = s->reply; -if (reply->handle != request->handle || +reply = s->reply; +if (reply.handle != request->handle || !s->ioc) { -reply->error = EIO; +reply.error = EIO; } else { -if (qiov && reply->error == 0) { +if (qiov && reply.error == 0) { ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, true, NULL); if (ret != request->len) { -reply->error = EIO; +reply.error = EIO; } } /* Tell the read handler to read another header. */ s->reply.handle = 0; } +rc = -reply.error; + +out: +nbd_coroutine_end(bs, request); +return rc; } static void nbd_coroutine_end(BlockDriverState *bs, -- 2.11.1
[Qemu-devel] [PATCH 08/17] block/nbd-client: rename nbd_recv_coroutines_enter_all
Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all, as it most probably just add all recv coroutines into co_queue_wakeup, not directly enter them. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index c9ade9b517..8ad2264a40 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -34,7 +34,7 @@ #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs)) #define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs)) -static void nbd_recv_coroutines_enter_all(NBDClientSession *s) +static void nbd_recv_coroutines_wake_all(NBDClientSession *s) { int i; @@ -108,7 +108,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) } s->reply.handle = 0; -nbd_recv_coroutines_enter_all(s); +nbd_recv_coroutines_wake_all(s); s->read_reply_co = NULL; } -- 2.11.1
[Qemu-devel] [PATCH 12/17] block/nbd-client: refactor nbd_co_request
Reduce nesting, get rid of extra variable. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 25 - 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index b84cab4079..d6145c7db0 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -118,7 +118,6 @@ static int nbd_co_request(BlockDriverState *bs, { NBDClientSession *s = nbd_get_client_session(bs); int rc, ret, i; -NBDReply reply; qemu_co_mutex_lock(&s->send_mutex); while (s->in_flight == MAX_NBD_REQUESTS) { @@ -171,20 +170,20 @@ static int nbd_co_request(BlockDriverState *bs, /* Wait until we're woken up by nbd_read_reply_entry. */ qemu_coroutine_yield(); -reply = s->reply; -if (reply.handle != request->handle || -!s->ioc) { -reply.error = EIO; -} else { -if (qiov && reply.error == 0) { -ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, true, - NULL); -if (ret != request->len) { -reply.error = EIO; -} +if (s->reply.handle != request->handle || !s->ioc) { +rc = -EIO; +goto out; +} + +if (qiov && s->reply.error == 0) { +ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, true, NULL); +if (ret != request->len) { +rc = -EIO; +goto out; } } -rc = -reply.error; + +rc = -s->reply.error; out: /* Tell the read handler to read another header. */ -- 2.11.1
[Qemu-devel] [PATCH 14/17] block/nbd-client: exit reply-reading coroutine on incorrect handle
Check reply-handle == request-handle in the same place, where recv coroutine number is calculated from reply->handle and it's correctness checked - in nbd_read_reply_entry. Also finish nbd_read_reply_entry in case of reply-handle != request-handle in the same way as in case of incorrect reply-handle. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.h | 1 + block/nbd-client.c | 9 +++-- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/block/nbd-client.h b/block/nbd-client.h index 48e2559df6..aa36be8950 100644 --- a/block/nbd-client.h +++ b/block/nbd-client.h @@ -29,6 +29,7 @@ typedef struct NBDClientSession { struct { Coroutine *co; +NBDRequest *request; } requests[MAX_NBD_REQUESTS]; NBDReply reply; } NBDClientSession; diff --git a/block/nbd-client.c b/block/nbd-client.c index 5eb126c399..0e12db4be3 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -88,7 +88,9 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) * one coroutine is called until the reply finishes. */ i = HANDLE_TO_INDEX(s, s->reply.handle); -if (i >= MAX_NBD_REQUESTS || !s->requests[i].co) { +if (i >= MAX_NBD_REQUESTS || !s->requests[i].co || +s->reply.handle != s->requests[i].request->handle) +{ break; } @@ -135,6 +137,7 @@ static int nbd_co_request(BlockDriverState *bs, g_assert(qemu_in_coroutine()); assert(i < MAX_NBD_REQUESTS); request->handle = INDEX_TO_HANDLE(s, i); +s->requests[i].request = request; if (!s->ioc) { qemu_co_mutex_unlock(&s->send_mutex); @@ -170,11 +173,13 @@ static int nbd_co_request(BlockDriverState *bs, /* Wait until we're woken up by nbd_read_reply_entry. */ qemu_coroutine_yield(); -if (s->reply.handle != request->handle || !s->ioc) { +if (!s->ioc || s->reply.handle == 0) { rc = -EIO; goto out; } +assert(s->reply.handle == request->handle); + if (qiov && s->reply.error == 0) { ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, true, NULL); if (ret != request->len) { -- 2.11.1
[Qemu-devel] [PATCH 03/17] nbd/client: refactor nbd_receive_reply
Refactor nbd_receive_reply to return 1 on success, 0 on eof, when no data was read and <0 for other cases, because returned size of read data is not actually used. Signed-off-by: Vladimir Sementsov-Ogievskiy --- include/block/nbd.h | 2 +- nbd/client.c| 12 +--- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 9c3d0a5868..f7450608b4 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -164,7 +164,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name, int nbd_init(int fd, QIOChannelSocket *sioc, NBDExportInfo *info, Error **errp); ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request); -ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp); +int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp); int nbd_client(int fd); int nbd_disconnect(int fd); diff --git a/nbd/client.c b/nbd/client.c index 4556056daa..a1758a1931 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -914,11 +914,16 @@ ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request) return nbd_write(ioc, buf, sizeof(buf), NULL); } -ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) +/* nbd_receive_reply + * Returns 1 on success + * 0 on eof, when no data was read from @ioc (errp is not set) + * < 0 on fail + */ +int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) { uint8_t buf[NBD_REPLY_SIZE]; uint32_t magic; -ssize_t ret; +int ret; ret = nbd_read_eof(ioc, buf, sizeof(buf), errp); if (ret <= 0) { @@ -948,6 +953,7 @@ ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", magic); return -EINVAL; } -return sizeof(buf); + +return 1; } -- 2.11.1
Re: [Qemu-devel] [Qemu-block] [PATCH] block: document semanatics of bdrv_co_preadv|pwritev
On 08/04/2017 09:06 AM, Daniel P. Berrange wrote: > On Fri, Aug 04, 2017 at 04:02:10PM +0200, Kevin Wolf wrote: >> Am 04.08.2017 um 12:50 hat Daniel P. Berrange geschrieben: >>> Signed-off-by: Daniel P. Berrange >>> --- >> Really? We are asserting that they match in bdrv_aligned_preadv(): >> >> assert(!qiov || bytes == qiov->size); > > Hmm, why do we pass @bytes at all then ? If they're always the same, > how about deleting it and just letting everyone read qiov->size > directly. Read the assertion again: qiov can be NULL (generally, when writing zeroes). So we can't rely on qiov->size in that scenario. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] pc/acpi: Fix booting of macOS with Clover EFI bootloader
On Fri, 4 Aug 2017 16:17:07 +0530 Dhiru Kholia wrote: > On Fri, Aug 4, 2017 at 2:35 PM, Igor Mammedov wrote: > > On Fri, 4 Aug 2017 12:15:40 +0530 > > Dhiru Kholia wrote: > > > >> This was tested with macOS 10.12.5 and Clover r4114. > >> > >> Without this patch, the macOS boot process gets stuck at the Apple logo > >> without showing any progress bar. > >> > >> I have documented the process of running macOS on QEMU/KVM at, > >> > >> https://github.com/kholia/OSX-KVM/ > >> > >> Instead of using this patch, adding an additional command-line knob > >> which exposes this setting (force_rev1_fadt) to the user might be a more > >> general solution. > > > > it's been reported that OVMF had issues that were fixed, > > you probably want to read this thread: > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg468456.html > > Hi Igor, > > I have now tested various OVMF versions with the latest QEMU (commit > aaaec6acad7c). > > When using edk2.git-ovmf-x64-0-20170726.b2826.gbb4831c03d.noarch.rpm > from [1] macOS does not boot and gets stuck. The CPU usage goes to > 100%. I haven't confirmed this but this OVMF build should have both > 198a46d and 072060a edk2 commits in it, based on the build date. > > The OVMF blob from Gabriel Somlo [2] hangs on the OVMF logo screen and > it too results in 100% CPU usage after hanging. > > I am using "boot-clover.sh" script from my repository [4] to test the > various OVMF versions. > > The only OVMF blob which works with the current QEMU for booting macOS > is the one from Proxmox [3]. Unfortunately, I don't know the > corresponding commit in the edk2 repository for this working OVMF > blob. So it's guest side issue, I'd prefer if it fixed there if possible instead of adding new CLI options to QEMU to work around issue. Added to CC BALATON Zoltan for whom updating OVMF fixed the problem, perhaps you'll be able to figure out what your setup is missing. > References, > > [1] https://www.kraxel.org/repos/jenkins/edk2/ > > [2] https://www.contrib.andrew.cmu.edu/~somlo/OSXKVM/ > > [3] https://git.proxmox.com/?p=pve-qemu-kvm.git;a=tree;f=debian > > [4] https://github.com/kholia/OSX-KVM/ > > It would be great if someone else could reproduce these results too. > > Thanks, > Dhiru >
[Qemu-devel] [PATCH 11/17] block/nbd-client: fix nbd_co_request: set s->reply.handle to 0 on error
We set s->reply.handle to 0 on one error path and don't set on another. For consistancy and to avoid assert in nbd_read_reply_entry let's set s->reply.handle to 0 in case of wrong handle too. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index d6965d24db..b84cab4079 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -183,13 +183,13 @@ static int nbd_co_request(BlockDriverState *bs, reply.error = EIO; } } - -/* Tell the read handler to read another header. */ -s->reply.handle = 0; } rc = -reply.error; out: +/* Tell the read handler to read another header. */ +s->reply.handle = 0; + s->recv_coroutine[i] = NULL; /* Kick the read_reply_co to get the next reply. */ -- 2.11.1
[Qemu-devel] [PATCH 05/17] block/nbd-client: get rid of ssize_t
Use int variable for nbd_co_send_request return value (as nbd_co_send_request returns int). Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index 25dd28406b..dc19894a7c 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -214,7 +214,7 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, .len = bytes, }; NBDReply reply; -ssize_t ret; +int ret; assert(bytes <= NBD_MAX_BUFFER_SIZE); assert(!flags); @@ -239,7 +239,7 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, .len = bytes, }; NBDReply reply; -ssize_t ret; +int ret; if (flags & BDRV_REQ_FUA) { assert(client->info.flags & NBD_FLAG_SEND_FUA); @@ -261,7 +261,7 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, BdrvRequestFlags flags) { -ssize_t ret; +int ret; NBDClientSession *client = nbd_get_client_session(bs); NBDRequest request = { .type = NBD_CMD_WRITE_ZEROES, @@ -297,7 +297,7 @@ int nbd_client_co_flush(BlockDriverState *bs) NBDClientSession *client = nbd_get_client_session(bs); NBDRequest request = { .type = NBD_CMD_FLUSH }; NBDReply reply; -ssize_t ret; +int ret; if (!(client->info.flags & NBD_FLAG_SEND_FLUSH)) { return 0; @@ -325,7 +325,7 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) .len = bytes, }; NBDReply reply; -ssize_t ret; +int ret; if (!(client->info.flags & NBD_FLAG_SEND_TRIM)) { return 0; -- 2.11.1
[Qemu-devel] [Bug 1708551] Re: macOS Guest Reading USB 3.0 Bus as USB 2.0
Then simply omit the "bus=usb-bus.0" here - the devices then should be put onto the XHCI bus automatically. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1708551 Title: macOS Guest Reading USB 3.0 Bus as USB 2.0 Status in QEMU: Incomplete Bug description: Description: I'm having trouble with USB Passthrough. Running `system_profiler SPUSBDataType` on macOS guest confirms that the system "sees" my device, and that qemu is passing *something* through. However, the system sees my connection as USB 2.0, even though i'm passing through XHCI controllers (nec-usb-xhci/qemu-xhci). I suspect this is the reason why my guest is unable to recognize my iPhone in XCode & iTunes. Parameters: QEMU release version: 2.10.0-rc0 Bios: [patched version of OVMF](https://github.com/gsomlo/edk2/tree/macboot)] Command Line: https://pastebin.com/pBSYbrW1 Host: Arch Linux Guest: macOS v10.12.6 Guest System Info: https://pastebin.com/U1Tihxei Notes: 1. Observed sometime after late-May-early-June of this year. 2. Due to [a defect in qemu v2.8 which affected GTK users](https://bugs.launchpad.net/qemu/+bug/1578192), and [a recent change to macOS' booting process conflicting with qemu v2.9](https://lists.nongnu.org/archive/html/qemu- devel/2017-03/msg06366.html), i'm forced to use qemu v2.10.0-rc0 (as -rc1 fails to load OVMF right now). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1708551/+subscriptions
[Qemu-devel] [PATCH 3/3] parallels: drop check that bdrv_truncate() is working
This would be actually strange and error prone. If truncate() nowadays will fail, there is something fatally wrong. Let's check for that during the actual work. The only fallback case is when the file is not zero initialized. In this case we should switch to preallocation via fallocate(). Signed-off-by: Denis V. Lunev CC: Markus Armbruster CC: Kevin Wolf CC: Max Reitz CC: Stefan Hajnoczi --- block/parallels.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 6794e53c0b..e1e06d23cc 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -703,9 +703,7 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail_options; } -if (!(flags & BDRV_O_RESIZE) || !bdrv_has_zero_init(bs->file->bs) || -bdrv_truncate(bs->file, bdrv_getlength(bs->file->bs), - PREALLOC_MODE_OFF, NULL) != 0) { +if (!bdrv_has_zero_init(bs->file->bs)) { s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE; } -- 2.11.0
[Qemu-devel] [PATCH 2/3] parallels: respect error code of bdrv_getlength() in allocate_clusters()
If we can not get the file length, the state of BDS is broken completely. Return error to the caller. Signed-off-by: Denis V. Lunev CC: Markus Armbruster CC: Kevin Wolf CC: Max Reitz CC: Stefan Hajnoczi --- block/parallels.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 5bbdfabb7a..6794e53c0b 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -192,7 +192,7 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum) { BDRVParallelsState *s = bs->opaque; -int64_t pos, space, idx, to_allocate, i; +int64_t pos, space, idx, to_allocate, i, len; pos = block_status(s, sector_num, nb_sectors, pnum); if (pos > 0) { @@ -214,7 +214,11 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num, assert(idx < s->bat_size && idx + to_allocate <= s->bat_size); space = to_allocate * s->tracks; -if (s->data_end + space > bdrv_getlength(bs->file->bs) >> BDRV_SECTOR_BITS) { +len = bdrv_getlength(bs->file->bs); +if (len < 0) { +return len; +} +if (s->data_end + space > (len >> BDRV_SECTOR_BITS)) { int ret; space += s->prealloc_size; if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE) { -- 2.11.0
[Qemu-devel] [PATCH] block: drop bdrv_set_key from BlockDriver
This is not used anymore since c01c214b69 ("block: remove all encryption handling APIs", 2017-07-11). Signed-off-by: Paolo Bonzini --- include/block/block_int.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index d4f4ea7584..7571c0aaaf 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -127,7 +127,6 @@ struct BlockDriver { Error **errp); void (*bdrv_close)(BlockDriverState *bs); int (*bdrv_create)(const char *filename, QemuOpts *opts, Error **errp); -int (*bdrv_set_key)(BlockDriverState *bs, const char *key); int (*bdrv_make_empty)(BlockDriverState *bs); void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options); -- 2.13.3
[Qemu-devel] QAPI design mistake: RockerPort member @speed shoud bi
RockerPort member @speed is documented to be "in Mbps" (presumably Megabits/second). It uses QAPI type 'uint32'. This is inappropriate for QAPI/QMP. It should have been made plain bits/second (no multiple), and maybe 'uint64'. As far as I can tell, the QAPI part didn't get review from QAPI experts. Probably because it wasn't cc'ed to them. Too late to fix now, but let's not repeat the mistake.
[Qemu-devel] [PATCH 1/3] block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes
Original idea beyond the code in question was the following: we have failed to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). We have the only chance now: if the request comes beyond end of the file. Thus we should calculate file length and respect the error code from that op. Signed-off-by: Denis V. Lunev CC: Markus Armbruster CC: Kevin Wolf CC: Max Reitz CC: Stefan Hajnoczi --- block/file-posix.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/block/file-posix.c b/block/file-posix.c index cfbb236f6f..f4de022ae0 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1339,6 +1339,9 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb) #if defined(CONFIG_FALLOCATE) || defined(CONFIG_XFS) BDRVRawState *s = aiocb->bs->opaque; #endif +#ifdef CONFIG_FALLOCATE +int64_t len; +#endif if (aiocb->aio_type & QEMU_AIO_BLKDEV) { return handle_aiocb_write_zeroes_block(aiocb); @@ -1381,7 +1384,10 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb) #endif #ifdef CONFIG_FALLOCATE -if (s->has_fallocate && aiocb->aio_offset >= bdrv_getlength(aiocb->bs)) { +/* Last resort: we are trying to extend the file with zeroed data. This + * can be done via fallocate(fd, 0) */ +len = bdrv_getlength(aiocb->bs); +if (s->has_fallocate && len >= 0 && aiocb->aio_offset >= len) { int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes); if (ret == 0 || ret != -ENOTSUP) { return ret; -- 2.11.0
Re: [Qemu-devel] [PATCH v4 0/9] s390x: zPCI detangling
On Fri, 4 Aug 2017 13:29:37 +0200 Cornelia Huck wrote: > Next version, not so many changes from v3. > > As you might have guessed, the goals are still the same: > - Being able to disable PCI support in a build completely. > - Properly fencing off PCI if the relevant facility bit is not provided. > > Changes v3->v4: > - introduce pci_available boolean > - use pci_available to fence off setting the zcpi facility bit > - collected tags > > Branch is still git://github.com/cohuck/qemu no-zpci-cpumodel make check on a build with pci disabled revealed an interesting inconsistency: We create a virtio-9p-ccw device, but the base virtio-9p-device is in code that is not built for !pci. If I remove the pci dependency for hw/9pfs/ and fsdev/, things look fine (at least on s390x). We probably need a different dependency, though. virtio-9p maintainers, any suggestions?
[Qemu-devel] [PATCH 16/17] block/nbd-client: drop reply field from NBDClientSession
Drop 'reply' from NBDClientSession. It's usage is not very transparent: 1. it is used to deliver error to receiving coroutine, and receiving coroutine must save or handle it somehow and then zero out it in NBDClientSession. 2. it is used to inform receiving coroutines that nbd_read_reply_entry is out for some reason (error or disconnect) To simplify this scheme: - drop NBDClientSession.reply - introduce NBDClientSession.requests[...].ret for (1) - introduce NBDClientSession.eio_to_all for (2) Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.h | 3 ++- block/nbd-client.c | 25 ++--- 2 files changed, 12 insertions(+), 16 deletions(-) diff --git a/block/nbd-client.h b/block/nbd-client.h index 0f84ccc073..0b0aa67342 100644 --- a/block/nbd-client.h +++ b/block/nbd-client.h @@ -31,8 +31,9 @@ typedef struct NBDClientSession { Coroutine *co; NBDRequest *request; QEMUIOVector *qiov; +int ret; } requests[MAX_NBD_REQUESTS]; -NBDReply reply; +bool eio_to_all; } NBDClientSession; NBDClientSession *nbd_get_client_session(BlockDriverState *bs); diff --git a/block/nbd-client.c b/block/nbd-client.c index 61780c5df9..7c151b3dd3 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -72,10 +72,10 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) uint64_t i; int ret; Error *local_err = NULL; +NBDReply reply; for (;;) { -assert(s->reply.handle == 0); -ret = nbd_receive_reply(s->ioc, &s->reply, &local_err); +ret = nbd_receive_reply(s->ioc, &reply, &local_err); if (ret < 0) { error_report_err(local_err); } @@ -87,16 +87,14 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) * handler acts as a synchronization point and ensures that only * one coroutine is called until the reply finishes. */ -i = HANDLE_TO_INDEX(s, s->reply.handle); +i = HANDLE_TO_INDEX(s, reply.handle); if (i >= MAX_NBD_REQUESTS || !s->requests[i].co || -s->reply.handle != s->requests[i].request->handle) +reply.handle != s->requests[i].request->handle) { break; } -if (s->reply.error == 0 && -s->requests[i].request->type == NBD_CMD_READ) -{ +if (reply.error == 0 && s->requests[i].request->type == NBD_CMD_READ) { assert(s->requests[i].qiov != NULL); ret = nbd_rwv(s->ioc, s->requests[i].qiov->iov, s->requests[i].qiov->niov, @@ -106,6 +104,8 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) } } +s->requests[i].ret = -reply.error; + /* We're woken up by the receiving coroutine itself. Note that there * is no race between yielding and reentering read_reply_co. This * is because: @@ -121,7 +121,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) qemu_coroutine_yield(); } -s->reply.handle = 0; +s->eio_to_all = true; nbd_recv_coroutines_wake_all(s); s->read_reply_co = NULL; } @@ -180,19 +180,14 @@ static int nbd_co_request(BlockDriverState *bs, /* Wait until we're woken up by nbd_read_reply_entry. */ qemu_coroutine_yield(); -if (!s->ioc || s->reply.handle == 0) { +if (!s->ioc || s->eio_to_all) { rc = -EIO; goto out; } -assert(s->reply.handle == request->handle); - -rc = -s->reply.error; +rc = s->requests[i].ret; out: -/* Tell the read handler to read another header. */ -s->reply.handle = 0; - s->requests[i].co = NULL; /* Kick the read_reply_co to get the next reply. */ -- 2.11.1
[Qemu-devel] [PATCH 0/3] respect bdrv_getlength() error code
These cases were reported by Markus Armbruster Patches add error checking of the bdrv_getlength() call or remove the call of that function. Signed-off-by: Denis V. Lunev CC: Markus Armbruster CC: Kevin Wolf CC: Max Reitz CC: Stefan Hajnoczi
[Qemu-devel] [PATCH 04/17] nbd/client: fix nbd_send_request to return int
Fix nbd_send_request to return int, as it returns a return value of nbd_write (which is int), and the only user of nbd_send_request's return value (nbd_co_send_request) consider it as int too. Signed-off-by: Vladimir Sementsov-Ogievskiy --- include/block/nbd.h | 2 +- nbd/client.c| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index f7450608b4..040cdd2e60 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -163,7 +163,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name, Error **errp); int nbd_init(int fd, QIOChannelSocket *sioc, NBDExportInfo *info, Error **errp); -ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request); +int nbd_send_request(QIOChannel *ioc, NBDRequest *request); int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp); int nbd_client(int fd); int nbd_disconnect(int fd); diff --git a/nbd/client.c b/nbd/client.c index a1758a1931..00cba45853 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -896,7 +896,7 @@ int nbd_disconnect(int fd) } #endif -ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request) +int nbd_send_request(QIOChannel *ioc, NBDRequest *request) { uint8_t buf[NBD_REQUEST_SIZE]; -- 2.11.1
Re: [Qemu-devel] [Qemu-block] [PATCH] block/null: Remove 'filename' option
On 08/04/2017 09:43 AM, Kevin Wolf wrote: > This option was only added to allow 'null-co://' and 'null-aio://' as > filenames, its value never served any actual purpose and was ignored. > Nevertheless it was accepted as '-drive driver=null,filename=foo'. > > The correct way to enable the protocol prefixes (and that without adding > a useless -drive option) is implementing .bdrv_parse_filename. This is > what this patch does. > > Technically, this is an incompatible change, but the null block driver > is only used for benchmarking, testing and debugging, and an option > without effect isn't likely to be used by anyone anyway, so no bad > effects are to be expected. Agreed with the analysis. Still, better to get it into 2.10 rather than going yet another release with the option available. > > Reported-by: Markus Armbruster > Signed-off-by: Kevin Wolf > --- > block/null.c | 31 ++- > 1 file changed, 26 insertions(+), 5 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH 17/17] block/nbd-client: always return EIO on and after the first io channel error
Do not communicate after the first error to avoid communicating throught broken channel. The only exclusion is try to send NBD_CMD_DISC anyway on in nbd_client_close. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 26 -- 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index 7c151b3dd3..6109abf8a7 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -49,6 +49,8 @@ static void nbd_teardown_connection(BlockDriverState *bs) { NBDClientSession *client = nbd_get_client_session(bs); +client->eio_to_all = true; + if (!client->ioc) { /* Already closed */ return; } @@ -75,11 +77,15 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) NBDReply reply; for (;;) { +if (s->eio_to_all) { +break; +} + ret = nbd_receive_reply(s->ioc, &reply, &local_err); if (ret < 0) { error_report_err(local_err); } -if (ret <= 0) { +if (ret <= 0 || s->eio_to_all) { break; } @@ -99,7 +105,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque) ret = nbd_rwv(s->ioc, s->requests[i].qiov->iov, s->requests[i].qiov->niov, s->requests[i].request->len, true, NULL); -if (ret != s->requests[i].request->len) { +if (ret != s->requests[i].request->len || s->eio_to_all) { break; } } @@ -133,6 +139,10 @@ static int nbd_co_request(BlockDriverState *bs, NBDClientSession *s = nbd_get_client_session(bs); int rc, ret, i; +if (s->eio_to_all) { +return -EIO; +} + qemu_co_mutex_lock(&s->send_mutex); while (s->in_flight == MAX_NBD_REQUESTS) { qemu_co_queue_wait(&s->free_sema, &s->send_mutex); @@ -152,16 +162,16 @@ static int nbd_co_request(BlockDriverState *bs, s->requests[i].request = request; s->requests[i].qiov = qiov; -if (!s->ioc) { +if (s->eio_to_all) { qemu_co_mutex_unlock(&s->send_mutex); -return -EPIPE; +return -EIO; } if (request->type == NBD_CMD_WRITE) { assert(qiov != NULL); qio_channel_set_cork(s->ioc, true); rc = nbd_send_request(s->ioc, request); -if (rc >= 0) { +if (rc == 0 && !s->eio_to_all) { ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, false, NULL); if (ret != request->len) { @@ -174,13 +184,16 @@ static int nbd_co_request(BlockDriverState *bs, } qemu_co_mutex_unlock(&s->send_mutex); +if (s->eio_to_all) { +rc = -EIO; +} if (rc < 0) { goto out; } /* Wait until we're woken up by nbd_read_reply_entry. */ qemu_coroutine_yield(); -if (!s->ioc || s->eio_to_all) { +if (s->eio_to_all) { rc = -EIO; goto out; } @@ -335,6 +348,7 @@ int nbd_client_init(BlockDriverState *bs, logout("session init %s\n", export); qio_channel_set_blocking(QIO_CHANNEL(sioc), true, NULL); +client->eio_to_all = false; client->info.request_sizes = true; ret = nbd_receive_negotiate(QIO_CHANNEL(sioc), export, tlscreds, hostname, -- 2.11.1
[Qemu-devel] [PATCH] pc/acpi: Fix booting of macOS with Clover EFI bootloader
> On Aug 4, 2017, at 5:21 AM, qemu-devel-requ...@nongnu.org wrote: > > Message: 3 > Date: Fri, 4 Aug 2017 12:15:40 +0530 > From: > To: qemu-devel@nongnu.org > Cc: Igor Mammedov , Pankaj Gupta > , Dhiru Kholia > Subject: [Qemu-devel] [PATCH] pc/acpi: Fix booting of macOS with > Clover EFI bootloader > Message-ID: <20170804064540.10523-1-dhiru.kho...@gmail.com> > > This was tested with macOS 10.12.5 and Clover r4114. > > Without this patch, the macOS boot process gets stuck at the Apple logo > without showing any progress bar. > > I have documented the process of running macOS on QEMU/KVM at, > > https://github.com/kholia/OSX-KVM/ > > Instead of using this patch, adding an additional command-line knob > which exposes this setting (force_rev1_fadt) to the user might be a more > general solution. > --- > hw/i386/acpi-build.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > index b9c245c..0f8df19 100644 > --- a/hw/i386/acpi-build.c > +++ b/hw/i386/acpi-build.c > @@ -145,6 +145,7 @@ static void acpi_get_pm_info(AcpiPmInfo *pm) > object_property_get_uint(obj, ACPI_PCIHP_IO_LEN_PROP, NULL); > } > if (lpc) { > +pm->force_rev1_fadt = true; > obj = lpc; > pm->cpu_hp_io_base = ICH9_CPU_HOTPLUG_IO_BASE; > } > -- > 2.9.4 Very interesting. You are using the hackintosh method to install Mac OS X in QEMU. I think the supported way is to use Apple's official installer. Could you send your full command-line? I think in this instance a '-acpi-rev1' option would be useful.
Re: [Qemu-devel] [PATCH v4 00/22] Clean up around qmp() and hmp()
On Fri, 08/04 06:50, Eric Blake wrote: > On 08/03/2017 08:54 PM, no-re...@patchew.org wrote: > > Hi, > > > > This series failed automatic build test. Please find the testing commands > > and > > their output below. If you have docker installed, you can probably > > reproduce it > > locally. > > When will patchew have a syntax for stating dependencies? (Of course, I > should actually mention those dependencies in my cover letter, not after > the fact). I can give it a try this weekend, if it works, it will be announced on this list. Fam
Re: [Qemu-devel] Is the use of bdrv_getlength() in handle_aiocb_write_zeroes() kosher?
"Denis V. Lunev" writes: > On 08/04/2017 03:16 PM, Markus Armbruster wrote: >> Denis, you added this in commit d50d822: >> >> #ifdef CONFIG_FALLOCATE >> if (s->has_fallocate && aiocb->aio_offset >= bdrv_getlength(aiocb->bs)) { >> int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, >> aiocb->aio_nbytes); >> if (ret == 0 || ret != -ENOTSUP) { >> return ret; >> } >> s->has_fallocate = false; >> } >> #endif >> >> bdrv_getlength() can fail. Does it do the right thing then? For what >> it's worth, the comparison of its value is signed. > fallocate() with 0 flags can work only beyond end of file > or on top of the hole. Thus the check is made to validate > that we are beyond EOF. > > Technically fallocate should fail if that condition will be > violated. But you right, we can add sanity check here. > This would not harm. > > Should I send it? I figure an explicit check for bdrv_getlength() failure would make the code easier to understand. In short: yes, please!