Re: [Qemu-devel] [PATCH 02/17] ppc: avoid excessive TLB flushing
On Thu, Aug 28, 2014 at 09:35:27PM +0200, Paolo Bonzini wrote: Il 28/08/2014 19:30, Peter Maydell ha scritto: On 28 August 2014 18:14, Paolo Bonzini pbonz...@redhat.com wrote: [snip] Does PPC hardware do lots of TLB flushes on user-kernel transitions, or does it have some sort of info in the TLB entry about whether it should match or not? The IR and DR bits simply disable paging for respectively instructions and data. I suppose real hardware simply does not use the TLB when paging is disabled. That's right for the most part, although IR and DR transitions are still pretty horribly slow, due to the various synchronizations that need to happen. There are some other complications though. At least POWER7 and POWER8 supports a virtual real mode area (VRMA, which is where a guest OS sees itself as having translation off, but in fact translations (managed by the hypervisor) are still in use. In the (hashed) page table the VRMA co-exists with the guest's normal translations. I'm not up to date with the TLB architecture and how it's impacted. Note that POWER chips (since POWER4?) have both the ERAT (which is what most cpus would think of as fast but smallish TLB) and the TLB which is a sort of L1.5 cache, which needs to be combined with the SLB to form full translations. I strongly suspect the ERAT does need to be flushed on real-mode/virtual-mode transitions, but I'm less sure about the SLB and TLB. As yet another case, BookE (embedded) powerpc CPUs have IS/DS bits instead of IR/DR, reflecting that instead of a true translation off mode, they have two different address spaces. In this case, however, the address space is tagged into the TLB, so the whole thing doesn't need to be flushed on address space transitions. IIRC each user-kernel transition disables paging, and then the kernel can re-enable it (optionally only on data). So the transition is user-kernel unpaged-kernel paged, and the kernel unpaged-kernel paged part is what triggers the TLB flush. (Something like this---Alex explained it to me a year ago when I asked why tlb_flush was always the top function in the profile of qemu-system-ppc*). Paolo -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson pgpwrXDe3zS5U.pgp Description: PGP signature
Re: [Qemu-devel] [libvirt] NBD TLS support in QEMU
On 03.09.2014 18:44, Stefan Hajnoczi wrote: Hi, QEMU offers both NBD client and server functionality. The NBD protocol runs unencrypted, which is a problem when the client and server communicate over an untrusted network. This is not problem for NBD only, but for the rest of data that qemu sends over network, i.e. migration stream, VNC/SPICE, ... The particular use case that prompted this mail is storage migration in OpenStack. The goal is to encrypt the NBD connection between source and destination hosts during storage migration. I think we can integrate TLS into the NBD protocol as an optional flag. A quick web search does not reveal existing open source SSL/TLS NBD implementations. I do see a VMware NBDSSL protocol but there is no specification so I guess it is proprietary. In case of libvirt, we have so called tunnelled migration (both spelled misspelled :P) in which libvirt opens a local ports on both src dst side and then sets up secured forwarding pipe to the other side. Or a insecured one if user wishes so. The only problem is that when I adapted libvirt for NBD, I intentionally forbade NBD in tunnelled migration as it requires one more data stream for which libvirt migration protocol is not ready yet. Having saidy that, I feel that libvirt is the show stopper here, not QEMU. I'm not saying that I'm against this. I've heard rumors that not everybody out there uses libvirt and thus they might appreciate this ability. The NBD protocol starts with a negotiation phase. This would be the appropriate place to indicate that TLS will be used. After client and server complete TLS setup the connection can continue as normal. Yep, that's how most of the secured protocols run. Somebody mentions STARTTLS for which I vote as well. Besides QEMU, the userspace NBD tools (http://nbd.sf.net/) can also be extended to support TLS. In this case the kernel needs a localhost socket and userspace handles TLS. Michal
[Qemu-devel] [PATCH] target-ppc: Fix kvmppc_set_compat to use negotiated cpu-version
By mistake, QEMU uses the maximum compatibility level from the command line instead of the value negotiated in client-architecture-support call. This replaces @max_compat with @cpu_version. This only affects guests which do not support the host CPU. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- target-ppc/translate_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c index 48177ed..188b67e 100644 --- a/target-ppc/translate_init.c +++ b/target-ppc/translate_init.c @@ -9137,7 +9137,7 @@ int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version) break; } -if (kvm_enabled() kvmppc_set_compat(cpu, cpu-max_compat) 0) { +if (kvm_enabled() kvmppc_set_compat(cpu, cpu-cpu_version) 0) { error_report(Unable to set compatibility mode in KVM); ret = -1; } -- 2.0.0
Re: [Qemu-devel] [Qemu-ppc] [PATCH 02/17] ppc: avoid excessive TLB flushing
On 28.08.14 19:14, Paolo Bonzini wrote: PowerPC TCG flushes the TLB on every IR/DR change, which basically means on every user-kernel context switch. Use the 6-element TLB array as a cache, where each MMU index is mapped to a different state of the IR/DR/PR/HV bits. This brings the number of TLB flushes down from ~90 to ~5 for starting up the Debian installer, which is in line with x86 and gives a ~10% performance improvement. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- cputlb.c| 19 + hw/ppc/spapr_hcall.c| 6 +- include/exec/exec-all.h | 5 + target-ppc/cpu.h| 4 +++- target-ppc/excp_helper.c| 6 +- target-ppc/helper_regs.h| 52 +++-- target-ppc/translate_init.c | 5 + 7 files changed, 74 insertions(+), 23 deletions(-) diff --git a/cputlb.c b/cputlb.c index afd3705..17e1b03 100644 --- a/cputlb.c +++ b/cputlb.c @@ -67,6 +67,25 @@ void tlb_flush(CPUState *cpu, int flush_global) tlb_flush_count++; } +void tlb_flush_idx(CPUState *cpu, int mmu_idx) +{ +CPUArchState *env = cpu-env_ptr; + +#if defined(DEBUG_TLB) +printf(tlb_flush_idx %d:\n, mmu_idx); +#endif +/* must reset current TB so that interrupts cannot modify the + links while we are modifying them */ +cpu-current_tb = NULL; + +memset(env-tlb_table[mmu_idx], -1, sizeof(env-tlb_table[mmu_idx])); +memset(cpu-tb_jmp_cache, 0, sizeof(cpu-tb_jmp_cache)); + +env-tlb_flush_addr = -1; +env-tlb_flush_mask = 0; +tlb_flush_count++; +} + static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr) { if (addr == (tlb_entry-addr_read diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c index 467858c..b95961c 100644 --- a/hw/ppc/spapr_hcall.c +++ b/hw/ppc/spapr_hcall.c @@ -556,13 +556,17 @@ static target_ulong h_cede(PowerPCCPU *cpu, sPAPREnvironment *spapr, { CPUPPCState *env = cpu-env; CPUState *cs = CPU(cpu); +bool flush; env-msr |= (1ULL MSR_EE); -hreg_compute_hflags(env); +flush = hreg_compute_hflags(env); if (!cpu_has_work(cs)) { cs-halted = 1; cs-exception_index = EXCP_HLT; cs-exit_request = 1; +} else if (flush) { +cs-interrupt_request |= CPU_INTERRUPT_EXITTB; +cs-exit_request = 1; Can this ever happen? } return H_SUCCESS; } diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index 5e5d86e..629a550 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -100,6 +100,7 @@ void tcg_cpu_address_space_init(CPUState *cpu, AddressSpace *as); /* cputlb.c */ void tlb_flush_page(CPUState *cpu, target_ulong addr); void tlb_flush(CPUState *cpu, int flush_global); +void tlb_flush_idx(CPUState *cpu, int mmu_idx); void tlb_set_page(CPUState *cpu, target_ulong vaddr, hwaddr paddr, int prot, int mmu_idx, target_ulong size); @@ -112,6 +113,10 @@ static inline void tlb_flush_page(CPUState *cpu, target_ulong addr) static inline void tlb_flush(CPUState *cpu, int flush_global) { } + +static inline void tlb_flush_idx(CPUState *cpu, int mmu_idx) +{ +} #endif #define CODE_GEN_ALIGN 16 /* must be = of the size of a icache line */ diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h index b64c652..c1cb27f 100644 --- a/target-ppc/cpu.h +++ b/target-ppc/cpu.h @@ -922,7 +922,7 @@ struct ppc_segment_page_sizes { /*/ /* The whole PowerPC CPU context */ -#define NB_MMU_MODES 3 +#define NB_MMU_MODES 6 #define PPC_CPU_OPCODES_LEN 0x40 @@ -1085,6 +1085,8 @@ struct CPUPPCState { target_ulong hflags; /* hflags is a MSR HFLAGS_MASK */ target_ulong hflags_nmsr; /* specific hflags, not coming from MSR */ int mmu_idx; /* precomputed MMU index to speed up mem accesses */ +uint32_t mmu_msr[NB_MMU_MODES]; /* ir/dr/hv/pr values for TLBs */ +int mmu_fifo; /* for replacement in mmu_msr */ /* Power management */ int (*check_pow)(CPUPPCState *env); diff --git a/target-ppc/excp_helper.c b/target-ppc/excp_helper.c index be71590..bf25d44 100644 --- a/target-ppc/excp_helper.c +++ b/target-ppc/excp_helper.c @@ -623,9 +623,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp) if (env-spr[SPR_LPCR] LPCR_AIL) { new_msr |= (1 MSR_IR) | (1 MSR_DR); -} else if (msr ((1 MSR_IR) | (1 MSR_DR))) { -/* If we disactivated any translation, flush TLBs */ -tlb_flush(cs, 1); } #ifdef TARGET_PPC64 @@ -678,8 +675,7 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp) if ((env-mmu_model == POWERPC_MMU_BOOKE) || (env-mmu_model ==
Re: [Qemu-devel] [Qemu-ppc] [PATCH 03/17] ppc: fix monitor access to CR
On 03.09.14 20:21, Tom Musta wrote: On 8/28/2014 12:14 PM, Paolo Bonzini wrote: This was off-by-one. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/monitor.c b/monitor.c index 34cee74..ec73dd4 100644 --- a/monitor.c +++ b/monitor.c @@ -2968,7 +2968,7 @@ static target_long monitor_get_ccr (const struct MonitorDef *md, int val) u = 0; for (i = 0; i 8; i++) -u |= env-crf[i] (32 - (4 * i)); +u |= env-crf[i] (32 - (4 * (i + 1))); return u; } Reviewed-by: Tom Musta tommu...@gmail.com Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH 06/17] ppc: use CRF_* in int_helper.c
On 03.09.14 20:28, Tom Musta wrote: On 8/28/2014 12:15 PM, Paolo Bonzini wrote: Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- target-ppc/int_helper.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c index f6e8846..9c1c5cd 100644 --- a/target-ppc/int_helper.c +++ b/target-ppc/int_helper.c @@ -2303,25 +2303,25 @@ uint32_t helper_bcdadd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps) if (sgna == sgnb) { result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgna, ps); zero = bcd_add_mag(result, a, b, invalid, overflow); -cr = (sgna 0) ? 4 : 8; +cr = (sgna 0) ? 1 CRF_GT : 1 CRF_LT; } else if (bcd_cmp_mag(a, b) 0) { result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgna, ps); zero = bcd_sub_mag(result, a, b, invalid, overflow); -cr = (sgna 0) ? 4 : 8; +cr = (sgna 0) ? 1 CRF_GT : 1 CRF_LT; } else { result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgnb, ps); zero = bcd_sub_mag(result, b, a, invalid, overflow); -cr = (sgnb 0) ? 4 : 8; +cr = (sgnb 0) ? 1 CRF_GT : 1 CRF_LT; } } if (unlikely(invalid)) { result.u64[HI_IDX] = result.u64[LO_IDX] = -1; -cr = 1; +cr = 1 CRF_SO; } else if (overflow) { -cr |= 1; +cr |= 1 CRF_SO; } else if (zero) { -cr = 2; +cr = 1 CRF_EQ; } *r = result; Reviewed-by: Tom Musta tommu...@gmail.com Tested-by: Tom Musta tommu...@gmail.com Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [PATCH 2/3] s390x/css: support format-0 ccws
On 05/09/14 00:29, Alexander Graf wrote: On 04.09.14 17:32, Jens Freimann wrote: From: Cornelia Huck cornelia.h...@de.ibm.com Add support for format-0 ccws in channel programs. As a format-1 ccw contains the same information as format-0 ccws, only supporting larger addresses, simply convert every ccw to format-1 as we walk the chain. Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/s390x/css.c| 30 +- hw/s390x/css.h| 1 + target-s390x/ioinst.h | 10 ++ 3 files changed, 32 insertions(+), 9 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 49c2aaf..34637cb 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -243,17 +243,25 @@ static void copy_sense_id_to_guest(SenseId *dest, SenseId *src) } } -static CCW1 copy_ccw_from_guest(hwaddr addr) +static CCW1 copy_ccw_from_guest(hwaddr addr, bool fmt1) { -CCW1 tmp; +CCW0 tmp0; +CCW1 tmp1; CCW1 ret; -cpu_physical_memory_read(addr, tmp, sizeof(tmp)); -ret.cmd_code = tmp.cmd_code; -ret.flags = tmp.flags; -ret.count = be16_to_cpu(tmp.count); -ret.cda = be32_to_cpu(tmp.cda); - +if (fmt1) { +cpu_physical_memory_read(addr, tmp1, sizeof(tmp1)); +ret.cmd_code = tmp1.cmd_code; +ret.flags = tmp1.flags; +ret.count = be16_to_cpu(tmp1.count); +ret.cda = be32_to_cpu(tmp1.cda); +} else { +cpu_physical_memory_read(addr, tmp0, sizeof(tmp0)); +ret.cmd_code = tmp0.cmd_code; +ret.flags = tmp0.flags; +ret.count = be16_to_cpu(tmp0.count); +ret.cda = be16_to_cpu(tmp0.cda1) | (tmp0.cda0 16); +} return ret; } @@ -268,7 +276,8 @@ static int css_interpret_ccw(SubchDev *sch, hwaddr ccw_addr) return -EIO; } -ccw = copy_ccw_from_guest(ccw_addr); +/* Translate everything to format-1 ccws - the information is the same. */ +ccw = copy_ccw_from_guest(ccw_addr, sch-ccw_fmt_1); /* Check for invalid command codes. */ if ((ccw.cmd_code 0x0f) == 0) { @@ -386,6 +395,7 @@ static void sch_handle_start_func(SubchDev *sch, ORB *orb) s-ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND); return; } +sch-ccw_fmt_1 = !!(orb-ctrl0 ORB_CTRL0_MASK_FMT); } else { s-ctrl = ~(SCSW_ACTL_SUSP | SCSW_ACTL_RESUME_PEND); } @@ -1347,6 +1357,7 @@ void subch_device_save(SubchDev *s, QEMUFile *f) qemu_put_byte(f, s-id.ciw[i].command); qemu_put_be16(f, s-id.ciw[i].count); } +qemu_put_byte(f, s-ccw_fmt_1); This changes the migration stream format. Please increase the version number for the device that gets migrated, so that we have the chance to catch it. Though - does migration work at all yet? :) Not yet. Myself and Jens are currently testing Davids latest patches regarding CPU states. (just chasing the final (tm) bug). After that I can push the initial cpu migration patch set. So maybe we can leave this as is? Whatever you suggest. Alex return; } @@ -1402,6 +1413,7 @@ int subch_device_load(SubchDev *s, QEMUFile *f) s-id.ciw[i].command = qemu_get_byte(f); s-id.ciw[i].count = qemu_get_be16(f); } +s-ccw_fmt_1 = qemu_get_byte(f); return 0; } diff --git a/hw/s390x/css.h b/hw/s390x/css.h index c864ea7..384a455 100644 --- a/hw/s390x/css.h +++ b/hw/s390x/css.h @@ -76,6 +76,7 @@ struct SubchDev { hwaddr channel_prog; CCW1 last_cmd; bool last_cmd_valid; +bool ccw_fmt_1; bool thinint_active; /* transport-provided data: */ int (*ccw_cb) (SubchDev *, CCW1); diff --git a/target-s390x/ioinst.h b/target-s390x/ioinst.h index 5bbc67d..29f6423 100644 --- a/target-s390x/ioinst.h +++ b/target-s390x/ioinst.h @@ -156,6 +156,16 @@ typedef struct ORB { #define ORB_CTRL1_MASK_ORBX 0x01 #define ORB_CTRL1_MASK_INVALID 0x3e +/* channel command word (type 0) */ +typedef struct CCW0 { +uint8_t cmd_code; +uint8_t cda0; +uint16_t cda1; +uint8_t flags; +uint8_t reserved; +uint16_t count; +} QEMU_PACKED CCW0; + /* channel command word (type 1) */ typedef struct CCW1 { uint8_t cmd_code;
Re: [Qemu-devel] [PATCH 2/3] s390x/css: support format-0 ccws
On 05.09.14 09:23, Christian Borntraeger wrote: On 05/09/14 00:29, Alexander Graf wrote: On 04.09.14 17:32, Jens Freimann wrote: From: Cornelia Huck cornelia.h...@de.ibm.com Add support for format-0 ccws in channel programs. As a format-1 ccw contains the same information as format-0 ccws, only supporting larger addresses, simply convert every ccw to format-1 as we walk the chain. Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/s390x/css.c| 30 +- hw/s390x/css.h| 1 + target-s390x/ioinst.h | 10 ++ 3 files changed, 32 insertions(+), 9 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 49c2aaf..34637cb 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -243,17 +243,25 @@ static void copy_sense_id_to_guest(SenseId *dest, SenseId *src) } } -static CCW1 copy_ccw_from_guest(hwaddr addr) +static CCW1 copy_ccw_from_guest(hwaddr addr, bool fmt1) { -CCW1 tmp; +CCW0 tmp0; +CCW1 tmp1; CCW1 ret; -cpu_physical_memory_read(addr, tmp, sizeof(tmp)); -ret.cmd_code = tmp.cmd_code; -ret.flags = tmp.flags; -ret.count = be16_to_cpu(tmp.count); -ret.cda = be32_to_cpu(tmp.cda); - +if (fmt1) { +cpu_physical_memory_read(addr, tmp1, sizeof(tmp1)); +ret.cmd_code = tmp1.cmd_code; +ret.flags = tmp1.flags; +ret.count = be16_to_cpu(tmp1.count); +ret.cda = be32_to_cpu(tmp1.cda); +} else { +cpu_physical_memory_read(addr, tmp0, sizeof(tmp0)); +ret.cmd_code = tmp0.cmd_code; +ret.flags = tmp0.flags; +ret.count = be16_to_cpu(tmp0.count); +ret.cda = be16_to_cpu(tmp0.cda1) | (tmp0.cda0 16); +} return ret; } @@ -268,7 +276,8 @@ static int css_interpret_ccw(SubchDev *sch, hwaddr ccw_addr) return -EIO; } -ccw = copy_ccw_from_guest(ccw_addr); +/* Translate everything to format-1 ccws - the information is the same. */ +ccw = copy_ccw_from_guest(ccw_addr, sch-ccw_fmt_1); /* Check for invalid command codes. */ if ((ccw.cmd_code 0x0f) == 0) { @@ -386,6 +395,7 @@ static void sch_handle_start_func(SubchDev *sch, ORB *orb) s-ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND); return; } +sch-ccw_fmt_1 = !!(orb-ctrl0 ORB_CTRL0_MASK_FMT); } else { s-ctrl = ~(SCSW_ACTL_SUSP | SCSW_ACTL_RESUME_PEND); } @@ -1347,6 +1357,7 @@ void subch_device_save(SubchDev *s, QEMUFile *f) qemu_put_byte(f, s-id.ciw[i].command); qemu_put_be16(f, s-id.ciw[i].count); } +qemu_put_byte(f, s-ccw_fmt_1); This changes the migration stream format. Please increase the version number for the device that gets migrated, so that we have the chance to catch it. Though - does migration work at all yet? :) Not yet. Myself and Jens are currently testing Davids latest patches regarding CPU states. (just chasing the final (tm) bug). After that I can push the initial cpu migration patch set. So maybe we can leave this as is? Whatever you suggest. Well, if migration doesn't work at all yet I think it's ok to consider all of the state in flux. However, please make sure that once you have migration work for the first time, that any change like this has to also increase the version number of the device migration stream. Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH 07/17] ppc: fix result of DLMZB when no zero bytes are found
On 03.09.14 20:28, Tom Musta wrote: On 8/28/2014 12:15 PM, Paolo Bonzini wrote: It must return 8 and place 8 in XER, but the current code uses i directly which is 9 at this point of the code. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- target-ppc/int_helper.c | 1 + 1 file changed, 1 insertion(+) diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c index 9c1c5cd..7955bf7 100644 --- a/target-ppc/int_helper.c +++ b/target-ppc/int_helper.c @@ -2573,6 +2573,7 @@ target_ulong helper_dlmzb(CPUPPCState *env, target_ulong high, } i++; } +i = 8; if (update_Rc) { env-crf[0] = 0x2; } Reviewed-by: Tom Musta tommu...@gmail.com Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH 11/17] ppc: rename gen_set_cr6_from_fpscr
On 03.09.14 21:41, Tom Musta wrote: On 8/28/2014 12:15 PM, Paolo Bonzini wrote: It sets CR1, not CR6 (and the spec agrees). Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- target-ppc/translate.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/target-ppc/translate.c b/target-ppc/translate.c index 8def0ae..67f13f7 100644 --- a/target-ppc/translate.c +++ b/target-ppc/translate.c @@ -8179,7 +8179,7 @@ static inline TCGv_ptr gen_fprp_ptr(int reg) } #if defined(TARGET_PPC64) -static void gen_set_cr6_from_fpscr(DisasContext *ctx) +static void gen_set_cr1_from_fpscr(DisasContext *ctx) { TCGv_i32 tmp = tcg_temp_new_i32(); tcg_gen_trunc_tl_i32(tmp, cpu_fpscr); @@ -8187,7 +8187,7 @@ static void gen_set_cr6_from_fpscr(DisasContext *ctx) tcg_temp_free_i32(tmp); } #else -static void gen_set_cr6_from_fpscr(DisasContext *ctx) +static void gen_set_cr1_from_fpscr(DisasContext *ctx) { gen_op_mtcr(4, cpu_fpscr, 28); } @@ -8207,7 +8207,7 @@ static void gen_##name(DisasContext *ctx)\ rb = gen_fprp_ptr(rB(ctx-opcode)); \ gen_helper_##name(cpu_env, rd, ra, rb); \ if (unlikely(Rc(ctx-opcode) != 0)) {\ -gen_set_cr6_from_fpscr(ctx); \ +gen_set_cr1_from_fpscr(ctx); \ }\ tcg_temp_free_ptr(rd); \ tcg_temp_free_ptr(ra); \ @@ -8265,7 +8265,7 @@ static void gen_##name(DisasContext *ctx) \ u32_2 = tcg_const_i32(u32f2(ctx-opcode));\ gen_helper_##name(cpu_env, rt, rb, u32_1, u32_2); \ if (unlikely(Rc(ctx-opcode) != 0)) { \ -gen_set_cr6_from_fpscr(ctx); \ +gen_set_cr1_from_fpscr(ctx); \ } \ tcg_temp_free_ptr(rt);\ tcg_temp_free_ptr(rb);\ @@ -8289,7 +8289,7 @@ static void gen_##name(DisasContext *ctx)\ i32 = tcg_const_i32(i32fld(ctx-opcode));\ gen_helper_##name(cpu_env, rt, ra, rb, i32); \ if (unlikely(Rc(ctx-opcode) != 0)) {\ -gen_set_cr6_from_fpscr(ctx); \ +gen_set_cr1_from_fpscr(ctx); \ }\ tcg_temp_free_ptr(rt); \ tcg_temp_free_ptr(rb); \ @@ -8310,7 +8310,7 @@ static void gen_##name(DisasContext *ctx)\ rb = gen_fprp_ptr(rB(ctx-opcode)); \ gen_helper_##name(cpu_env, rt, rb); \ if (unlikely(Rc(ctx-opcode) != 0)) {\ -gen_set_cr6_from_fpscr(ctx); \ +gen_set_cr1_from_fpscr(ctx); \ }\ tcg_temp_free_ptr(rt); \ tcg_temp_free_ptr(rb); \ @@ -8331,7 +8331,7 @@ static void gen_##name(DisasContext *ctx) \ i32 = tcg_const_i32(i32fld(ctx-opcode)); \ gen_helper_##name(cpu_env, rt, rs, i32); \ if (unlikely(Rc(ctx-opcode) != 0)) { \ -gen_set_cr6_from_fpscr(ctx); \ +gen_set_cr1_from_fpscr(ctx); \ } \ tcg_temp_free_ptr(rt); \ tcg_temp_free_ptr(rs); \ Reviewed-by: Tom Musta tommu...@gmail.com Tested-by: Tom Musta tommu...@gmail.com Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [PATCH 1/5] s390x/gdb: don't touch the cc if tcg is not enabled
On 03/09/14 11:27, Alexander Graf wrote: On 02.09.14 09:07, Christian Borntraeger wrote: On 02/09/14 00:39, Alexander Graf wrote: On 29.08.14 15:52, Jens Freimann wrote: From: David Hildenbrand d...@linux.vnet.ibm.com When reading/writing the psw mask, the condition code may only be touched if running on tcg. Why? Shouldn't we be able to set CC from gdb as well? You can. What this patch does (and the patch description is a bit vague here) is to not modify the PSW it gets from KVM when passing it to gdb: The qemu core gets the PSW from KVM. Without this patch, we use cc_op to calculate the current CC of the PSW (No idea of TCG, I guess its evaluated lazy - at least this is how valgrind works). This is wrong for the KVM case, as cc_op does not contain any useful data for the KVM case. With this patch we simply pass the psw from KVM to gdb and back. The symptom was that the cc was always shown as zero. Ah, I see. I agree with the patch, but the patch description does not actually describe what the patch does. Please rework it. The patch was already pulled...Well the description is misleaded but with some interpretion still correct ;-). But your point is well taken. We will try to make sure that all patch description are descriptive and not ambiguous. Your question is a good indication that this was not the case in this patch. Christian I also wouldn't mind if instead of hard coding this logic in the gdbstub, we'd extract it as helper functions to read and write the PSW.MASK in cpu.h.
Re: [Qemu-devel] [Qemu-ppc] [PATCH 13/17] ppc: compute mask from BI using right shift
On 03.09.14 22:59, Tom Musta wrote: On 8/28/2014 12:15 PM, Paolo Bonzini wrote: This will match the code we use in fpu_helper.c when we flip CRF_* bit-endianness. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- target-ppc/translate.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/target-ppc/translate.c b/target-ppc/translate.c index 48c7b66..4ce7af4 100644 --- a/target-ppc/translate.c +++ b/target-ppc/translate.c @@ -794,7 +794,7 @@ static void gen_isel(DisasContext *ctx) TCGv_i32 t0; TCGv t1, true_op, zero; -mask = 1 (3 - (bi 0x03)); +mask = 0x08 (bi 0x03); t0 = tcg_temp_new_i32(); tcg_gen_andi_i32(t0, cpu_crf[bi 2], mask); t1 = tcg_temp_new(); @@ -3870,7 +3870,7 @@ static inline void gen_bcond(DisasContext *ctx, int type) if ((bo 0x10) == 0) { /* Test CR */ uint32_t bi = BI(ctx-opcode); -uint32_t mask = 1 (3 - (bi 0x03)); +uint32_t mask = 0x08 (bi 0x03); TCGv_i32 temp = tcg_temp_new_i32(); if (bo 0x8) { @@ -3952,7 +3952,7 @@ static void glue(gen_, name)(DisasContext *ctx) else \ tcg_gen_mov_i32(t1, cpu_crf[crbB(ctx-opcode) 2]); \ tcg_op(t0, t0, t1); \ -bitmask = 1 (3 - (crbD(ctx-opcode) 0x03)); \ +bitmask = 0x08 (crbD(ctx-opcode) 0x03); \ tcg_gen_andi_i32(t0, t0, bitmask); \ tcg_gen_andi_i32(t1, cpu_crf[crbD(ctx-opcode) 2], ~bitmask); \ tcg_gen_or_i32(cpu_crf[crbD(ctx-opcode) 2], t0, t1); \ Reviewed-by: Tom Musta tommu...@gmail.com Tested-by: Tom Musta tommu...@gmail.com Thanks, applied to ppc-next. Alex
[Qemu-devel] [PATCH 3/3] s390x/css: catch ccw sequence errors
From: Cornelia Huck cornelia.h...@de.ibm.com We must not allow chains of more than 255 ccws without data transfer. Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com --- hw/s390x/css.c | 10 ++ hw/s390x/css.h | 1 + 2 files changed, 11 insertions(+) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 34637cb..b67c039 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -294,6 +294,13 @@ static int css_interpret_ccw(SubchDev *sch, hwaddr ccw_addr) check_len = !((ccw.flags CCW_FLAG_SLI) !(ccw.flags CCW_FLAG_DC)); +if (!ccw.cda) { +if (sch-ccw_no_data_cnt == 255) { +return -EINVAL; +} +sch-ccw_no_data_cnt++; +} + /* Look at the command. */ switch (ccw.cmd_code) { case CCW_CMD_NOOP: @@ -396,6 +403,7 @@ static void sch_handle_start_func(SubchDev *sch, ORB *orb) return; } sch-ccw_fmt_1 = !!(orb-ctrl0 ORB_CTRL0_MASK_FMT); +sch-ccw_no_data_cnt = 0; } else { s-ctrl = ~(SCSW_ACTL_SUSP | SCSW_ACTL_RESUME_PEND); } @@ -1358,6 +1366,7 @@ void subch_device_save(SubchDev *s, QEMUFile *f) qemu_put_be16(f, s-id.ciw[i].count); } qemu_put_byte(f, s-ccw_fmt_1); +qemu_put_byte(f, s-ccw_no_data_cnt); return; } @@ -1414,6 +1423,7 @@ int subch_device_load(SubchDev *s, QEMUFile *f) s-id.ciw[i].count = qemu_get_be16(f); } s-ccw_fmt_1 = qemu_get_byte(f); +s-ccw_no_data_cnt = qemu_get_byte(f); return 0; } diff --git a/hw/s390x/css.h b/hw/s390x/css.h index 384a455..33104ac 100644 --- a/hw/s390x/css.h +++ b/hw/s390x/css.h @@ -78,6 +78,7 @@ struct SubchDev { bool last_cmd_valid; bool ccw_fmt_1; bool thinint_active; +uint8_t ccw_no_data_cnt; /* transport-provided data: */ int (*ccw_cb) (SubchDev *, CCW1); SenseId id; -- 1.8.5.5
[Qemu-devel] [PATCH 0/3 RESEND] s390x: css patches and small sclp cleanup
Cornelia, Christian, Alex, here are two css patches and a small sclp cleanup. Patch 1 remove duplicate defines in SCLP code Patch 2 adds support for format-0 ccws Patch 3 a css bugfix adding a limit of 255 to ccws chains without data transfer regards Jens Cornelia Huck (2): s390x/css: support format-0 ccws s390x/css: catch ccw sequence errors Jens Freimann (1): s390x: remove duplicate defines in SCLP code hw/s390x/css.c | 40 +++- hw/s390x/css.h | 2 ++ include/hw/s390x/sclp.h | 2 -- target-s390x/ioinst.h | 10 ++ 4 files changed, 43 insertions(+), 11 deletions(-) -- 1.8.5.5
[Qemu-devel] [PATCH 2/3] s390x/css: support format-0 ccws
From: Cornelia Huck cornelia.h...@de.ibm.com Add support for format-0 ccws in channel programs. As a format-1 ccw contains the same information as format-0 ccws, only supporting larger addresses, simply convert every ccw to format-1 as we walk the chain. Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com --- hw/s390x/css.c| 30 +- hw/s390x/css.h| 1 + target-s390x/ioinst.h | 10 ++ 3 files changed, 32 insertions(+), 9 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 49c2aaf..34637cb 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -243,17 +243,25 @@ static void copy_sense_id_to_guest(SenseId *dest, SenseId *src) } } -static CCW1 copy_ccw_from_guest(hwaddr addr) +static CCW1 copy_ccw_from_guest(hwaddr addr, bool fmt1) { -CCW1 tmp; +CCW0 tmp0; +CCW1 tmp1; CCW1 ret; -cpu_physical_memory_read(addr, tmp, sizeof(tmp)); -ret.cmd_code = tmp.cmd_code; -ret.flags = tmp.flags; -ret.count = be16_to_cpu(tmp.count); -ret.cda = be32_to_cpu(tmp.cda); - +if (fmt1) { +cpu_physical_memory_read(addr, tmp1, sizeof(tmp1)); +ret.cmd_code = tmp1.cmd_code; +ret.flags = tmp1.flags; +ret.count = be16_to_cpu(tmp1.count); +ret.cda = be32_to_cpu(tmp1.cda); +} else { +cpu_physical_memory_read(addr, tmp0, sizeof(tmp0)); +ret.cmd_code = tmp0.cmd_code; +ret.flags = tmp0.flags; +ret.count = be16_to_cpu(tmp0.count); +ret.cda = be16_to_cpu(tmp0.cda1) | (tmp0.cda0 16); +} return ret; } @@ -268,7 +276,8 @@ static int css_interpret_ccw(SubchDev *sch, hwaddr ccw_addr) return -EIO; } -ccw = copy_ccw_from_guest(ccw_addr); +/* Translate everything to format-1 ccws - the information is the same. */ +ccw = copy_ccw_from_guest(ccw_addr, sch-ccw_fmt_1); /* Check for invalid command codes. */ if ((ccw.cmd_code 0x0f) == 0) { @@ -386,6 +395,7 @@ static void sch_handle_start_func(SubchDev *sch, ORB *orb) s-ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND); return; } +sch-ccw_fmt_1 = !!(orb-ctrl0 ORB_CTRL0_MASK_FMT); } else { s-ctrl = ~(SCSW_ACTL_SUSP | SCSW_ACTL_RESUME_PEND); } @@ -1347,6 +1357,7 @@ void subch_device_save(SubchDev *s, QEMUFile *f) qemu_put_byte(f, s-id.ciw[i].command); qemu_put_be16(f, s-id.ciw[i].count); } +qemu_put_byte(f, s-ccw_fmt_1); return; } @@ -1402,6 +1413,7 @@ int subch_device_load(SubchDev *s, QEMUFile *f) s-id.ciw[i].command = qemu_get_byte(f); s-id.ciw[i].count = qemu_get_be16(f); } +s-ccw_fmt_1 = qemu_get_byte(f); return 0; } diff --git a/hw/s390x/css.h b/hw/s390x/css.h index c864ea7..384a455 100644 --- a/hw/s390x/css.h +++ b/hw/s390x/css.h @@ -76,6 +76,7 @@ struct SubchDev { hwaddr channel_prog; CCW1 last_cmd; bool last_cmd_valid; +bool ccw_fmt_1; bool thinint_active; /* transport-provided data: */ int (*ccw_cb) (SubchDev *, CCW1); diff --git a/target-s390x/ioinst.h b/target-s390x/ioinst.h index 5bbc67d..29f6423 100644 --- a/target-s390x/ioinst.h +++ b/target-s390x/ioinst.h @@ -156,6 +156,16 @@ typedef struct ORB { #define ORB_CTRL1_MASK_ORBX 0x01 #define ORB_CTRL1_MASK_INVALID 0x3e +/* channel command word (type 0) */ +typedef struct CCW0 { +uint8_t cmd_code; +uint8_t cda0; +uint16_t cda1; +uint8_t flags; +uint8_t reserved; +uint16_t count; +} QEMU_PACKED CCW0; + /* channel command word (type 1) */ typedef struct CCW1 { uint8_t cmd_code; -- 1.8.5.5
[Qemu-devel] [PATCH 1/3] s390x: remove duplicate defines in SCLP code
Let's get rid of these duplicate defines. Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com --- include/hw/s390x/sclp.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h index 5c43574..ec07a11 100644 --- a/include/hw/s390x/sclp.h +++ b/include/hw/s390x/sclp.h @@ -28,8 +28,6 @@ #define SCLP_UNASSIGN_STORAGE 0x000C0001 #define SCLP_CMD_READ_EVENT_DATA0x00770005 #define SCLP_CMD_WRITE_EVENT_DATA 0x00760005 -#define SCLP_CMD_READ_EVENT_DATA0x00770005 -#define SCLP_CMD_WRITE_EVENT_DATA 0x00760005 #define SCLP_CMD_WRITE_EVENT_MASK 0x00780005 /* SCLP Memory hotplug codes */ -- 1.8.5.5
Re: [Qemu-devel] [PATCH] target-ppc: Fix kvmppc_set_compat to use negotiated cpu-version
On 05.09.14 09:04, Alexey Kardashevskiy wrote: By mistake, QEMU uses the maximum compatibility level from the command line instead of the value negotiated in client-architecture-support call. This replaces @max_compat with @cpu_version. This only affects guests which do not support the host CPU. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru Thanks, applied to ppc-next. Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/4] target-ppc: Extend rtas-blob
On 04.09.14 13:13, Aravinda Prasad wrote: Extend rtas-blob to accommodate error log. Error log structure is saved in rtas space upon a machine check exception. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr.c |4 1 file changed, 4 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 4b20e36..4a7c0ae 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1409,6 +1409,10 @@ static void ppc_spapr_init(MachineState *machine) filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, spapr-rtas.bin); spapr-rtas_size = get_image_size(filename); + +/* Resize blob to accommodate error log at a page aligned address */ +spapr-rtas_size = TARGET_PAGE_ALIGN(spapr-rtas_size) + TARGET_PAGE_SIZE; I think it's clearer if you do #define RTAS_ERROR_LOG_SIZE 4096 and add this to the aligned size instead. Alex
Re: [Qemu-devel] [RFC][patch 0/6] pci pass-through support for qemu/KVM on s390
On Thu, Sep 04, 2014 at 07:16:24AM -0600, Alex Williamson wrote: On Thu, 2014-09-04 at 12:52 +0200, frank.blasc...@de.ibm.com wrote: This set of patches implements pci pass-through support for qemu/KVM on s390. PCI support on s390 is very different from other platforms. Major differences are: 1) all PCI operations are driven by special s390 instructions Generating config cycles is always arch specific. 2) all s390 PCI instructions are privileged While the operations to generate config cycles on x86 are not privileged, they must be arbitrated between accesses, so in a sense they're privileged. 3) PCI config and memory spaces can not be mmap'ed VFIO has mapping flags that allow any region to specify mmap support. Hi Alex, thx for your reply. Let me elaborate a little bit ore on 1 - 3. Config and memory space can not be accessed via memory operations. You have to use special s390 instructions. This instructions can not be executed in user space. So there is no other way than executing this instructions in kernel. Yes vfio does support a slow path via ioctrl we could use, but this seems suboptimal from performance point of view. 4) no classic interrupts (INTX, MSI). The pci hw understands the concept of requesting MSIX irqs but irqs are delivered as s390 adapter irqs. VFIO delivers interrupts as eventfds regardless of the underlying platform mechanism. yes that's right, but then we have to do platform specific stuff to present the irq to the guest. I do not say this is impossible but we have add s390 specific code to vfio. 5) For DMA access there is always an IOMMU required. x86 requires the same. s390 pci implementation does not support a complete memory to iommu mapping, dma mappings are created on request. Sounds like POWER. Don't know the details from power, maybe it is similar but not the same. We might be able to extend vfio to have a new interface allowing us to do DMA mappings on request. 6) The OS does not get any informations about the physical layout of the PCI bus. If that means that every device is isolated (seems unlikely for multifunction devices) then that makes IOMMU group support really easy. OK 7) To take advantage of system z specific virtualization features we need to access the SIE control block residing in the kernel KVM The KVM-VFIO device allows interaction between VFIO devices and KVM. 8) To enable system z specific virtualization features we have to manipulate the zpci device in kernel. VFIO supports different device backends, currently pci_dev and working towards platform devices. zpci might just be an extension to standard pci. 7 - 8 At least this is not as straightforward as the pure kernel approach, but I have to dig into that in more detail if we could only agree on a vfio solution. For this reasons I decided to implement a kernel based approach similar to x86 device assignment. There is a new qemu device (s390-pci) representing a pass through device on the host. Here is a sample qemu device configuration: -device s390-pci,host=:00:00.0 The device executes the KVM_ASSIGN_PCI_DEVICE ioctl to create a proxy instance in the kernel KVM and connect this instance to the host pci device. kernel patches apply to linux-kvm s390: cio: chsc function to register GIB s390: pci: export pci functions for pass-through usage KVM: s390: Add GISA support KVM: s390: Add PCI pass-through support qemu patches apply to qemu-master s390: Add PCI bus support s390: Add PCI pass-through device support Feedback and discussion is highly welcome ... KVM-based device assignment needs to go away. It's a horrible model for devices, it offers very little protection to the kernel, assumes every device is fully isolated and visible to the IOMMU, relies on smattering of sysfs files to operate, etc. x86, POWER, and ARM are all moving to VFIO-based device assignment. Why is s390 special enough to repeat all the mistakes that x86 did? Thanks, Is this your personal opinion or was this a strategic decision of the QEMU/KVM community? Can anybody give us direction about this? Actually I can understand your point. In the last weeks I did some development and testing regarding the use of vfio too. But the in kernel solutions seems to offer the best performance and most straighforward implementation for our platform. Greetings, Frank Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf
On 09/04/14 09:04, Gerd Hoffmann wrote: Related spice-only bug. We have a fixed 16 MB buffer here, being presented to the spice-server as qxl video memory in case spice is used with a non-qxl card. It's also used with qxl in vga mode. When using display resolutions requiring more than 16 MB of memory we are going to overflow that buffer. In theory the guest can write, indirectly via spice-server. The spice-server clears the memory after setting a new video mode though, triggering a segfault in the overflow case, so qemu crashes before the guest has a chance to do something evil. Fix that by switching to dynamic allocation for the buffer. CVE-2014-3615 Cc: qemu-sta...@nongnu.org Cc: secal...@redhat.com Cc: Laszlo Ersek ler...@redhat.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- ui/spice-display.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/ui/spice-display.c b/ui/spice-display.c index 66e2578..57a8fd0 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -334,6 +334,7 @@ void qemu_spice_create_host_memslot(SimpleSpiceDisplay *ssd) void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd) { QXLDevSurfaceCreate surface; +uint64_t surface_size; memset(surface, 0, sizeof(surface)); @@ -347,9 +348,18 @@ void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd) surface.mouse_mode = true; surface.flags = 0; surface.type = 0; -surface.mem= (uintptr_t)ssd-buf; surface.group_id = MEMSLOT_GROUP_HOST; +surface_size = surface.width * surface.height * 4; (1) surface.width and surface.height are uint32_t fields [spice-server/server/spice.h]: struct QXLDevSurfaceCreate { uint32_t width; uint32_t height; uint32_t equals unsigned int in our case, hence the multiplication will be carried out in unsigned int -- 32-bits. Given that the dimensions here are under guest control, I suggest to write it as surface_size = (uint64_t)surface.width * surface.height * 4; (2) However, I have a concern even that way. The above multiplication (due to the *4) can overflow in uint64_t as well. I can see that surface.width and surface.height come from pixman_image_get_width() and pixman_image_get_height(), which seem to return int values. Hence, (uint64_t)0x7FFF_ * 0x7FFF_ * 4 == 0x_FFFC__0004 which is OK. But, what if pixman returns a negative value? Can we create an image that big? From qemu_spice_create_one_update(): int bw, bh; bw = rect-right - rect-left; bh = rect-bottom - rect-top; dest = pixman_image_create_bits(PIXMAN_x8r8g8b8, bw, bh, (void *)update-bitmap, bw * 4); where typedef struct SPICE_ATTR_PACKED QXLRect { int32_t top; int32_t left; int32_t bottom; int32_t right; } QXLRect; I can't track this back far enough. I'd feel safer if you checked that the multiplication can't overflow even in uint64_t. (3) In addition: +if (ssd-bufsize surface_size) { +ssd-bufsize = surface_size; struct SimpleSpiceDisplay { [...] int bufsize; Meaning, even if the multiplication fits okay in an uint64_t, the above assignment can overflow ssd-bufsize, because that's just an int. +fprintf(stderr, %s: % PRId64 (%dx%d)\n, __func__, +surface_size, surface.width, surface.height); (4) surface_size is uint64_t, it should be printed with PRIu64, not PRId64. Similarly, surface.width and surface.height are uint32_t, they should be printed with either %u or PRIu32, not %d. +g_free(ssd-buf); +ssd-buf = g_malloc(ssd-bufsize); Then, g_malloc() takes a gsize, which means unsigned long: https://developer.gnome.org/glib/2.28/glib-Memory-Allocation.html#g-malloc https://developer.gnome.org/glib/2.28/glib-Basic-Types.html#gsize which has the range of uint32_t in an ILP32 (i686) build. Therefore even changing the type of ssd-bufsize to uint64_t wouldn't help. (5) Instead, you really need to make sure that the very first multiplication fits in a signed int: int width, height; width = surface_width(ssd-ds); height = surface_height(ssd-ds); if (width = 0 || height = 0 || width INT_MAX / 4 || height INT_MAX / (width * 4)) { /* won't ever fit */ abort(); } if (ssd-bufsize width * height * 4) { ssd-bufsize = width * height * 4; /* do the rest of the realloc */ } and do everything else after, even the population of surface's fields. +} +surface.mem = (uintptr_t)ssd-buf; + qemu_spice_create_primary_surface(ssd, 0, surface, QXL_SYNC); } @@ -369,8 +379,6 @@ void qemu_spice_display_init_common(SimpleSpiceDisplay *ssd) if (ssd-num_surfaces == 0) { ssd-num_surfaces = 1024; } -ssd-bufsize = (16 * 1024 * 1024); -ssd-buf = g_malloc(ssd-bufsize); } I'm
Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/4] target-ppc: Extend rtas-blob
On Friday 05 September 2014 01:12 PM, Alexander Graf wrote: On 04.09.14 13:13, Aravinda Prasad wrote: Extend rtas-blob to accommodate error log. Error log structure is saved in rtas space upon a machine check exception. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr.c |4 1 file changed, 4 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 4b20e36..4a7c0ae 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1409,6 +1409,10 @@ static void ppc_spapr_init(MachineState *machine) filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, spapr-rtas.bin); spapr-rtas_size = get_image_size(filename); + +/* Resize blob to accommodate error log at a page aligned address */ +spapr-rtas_size = TARGET_PAGE_ALIGN(spapr-rtas_size) + TARGET_PAGE_SIZE; I think it's clearer if you do #define RTAS_ERROR_LOG_SIZE 4096 and add this to the aligned size instead. sure Alex -- Regards, Aravinda
Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 3/4] target-ppc: Build error log
On 04.09.14 13:13, Aravinda Prasad wrote: Whenever there is a physical memory error due to bit flips, which cannot be corrected by hardware, the error is passed on to the kernel. If the memory address in error belongs to guest address space then guest kernel is responsible to take action. Hence the error is passed on to guest via KVM by invoking 0x200 NMI vector. However, guest OS, as per PAPR, expects an error log upon such error. This patch registers a new hcall which is issued from 0x200 interrupt vector and builds the error log, copies the error log to rtas space and passes the address of the error log to guest Enhancement to KVM to perform above functionality is already in upstream kernel. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr_hcall.c | 154 include/hw/ppc/spapr.h |4 + 2 files changed, 157 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c index 01650ba..c3aa448 100644 --- a/hw/ppc/spapr_hcall.c +++ b/hw/ppc/spapr_hcall.c @@ -14,6 +14,88 @@ struct SPRSyncState { target_ulong mask; }; +/* Offset from rtas-base where error log is placed */ +#define RTAS_ERROR_OFFSET (TARGET_PAGE_SIZE) You can't assume this. Please compute the value at the same place you compute the rtas-size. + +#define RTAS_ELOG_SEVERITY_SHIFT 0x5 +#define RTAS_ELOG_DISPOSITION_SHIFT 0x3 +#define RTAS_ELOG_INITIATOR_SHIFT0x4 + +/* + * Only required RTAS event severity, disposition, initiator + * target and type are copied from arch/powerpc/include/asm/rtas.h + */ + +/* RTAS event severity */ +#define RTAS_SEVERITY_ERROR_SYNC0x3 + +/* RTAS event disposition */ +#define RTAS_DISP_NOT_RECOVERED 0x2 + +/* RTAS event initiator */ +#define RTAS_INITIATOR_MEMORY 0x4 + +/* RTAS event target */ +#define RTAS_TARGET_MEMORY 0x4 + +/* RTAS event type */ +#define RTAS_TYPE_ECC_UNCORR0x09 + +/* + * Currently KVM only passes on the uncorrected machine + * check memory error to guest. Other machine check errors + * such as SLB multi-hit and TLB multi-hit are recovered + * in KVM and are not passed on to guest. + * + * DSISR Bit for uncorrected machine check error. Based + * on arch/powerpc/include/asm/mce.h Please don't include Linux code. This file is GPLv2+ licensed and I don't want to taint it as GPLv2 only just for the sake of mce. + */ +#define PPC_BIT(bit)(0x8000ULL bit) +#define P7_DSISR_MC_UE (PPC_BIT(48)) /* P8 too */ + +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */ +struct rtas_error_log { +/* Byte 0 */ +uint8_t byte0; /* Architectural version */ + +/* Byte 1 */ +uint8_t byte1; +/* + * XXX 3: Severity level of error + *XX2: Degree of recovery + * X 1: Extended log present? + * XX 2: Reserved + */ + +/* Byte 2 */ +uint8_t byte2; +/* + * 4: Initiator of event + * 4: Target of failed operation + */ +uint8_t byte3; /* General event or error*/ +}; + +/* + * Data format in RTAS-Blob + * + * This structure contains error information related to Machine + * Check exception. This is filled up and copied to rtas-blob + * upon machine check exception. + */ +struct rtas_mc_log { +target_ulong srr0; +target_ulong srr1; +/* + * Beginning of error log address. This is properly + * populated and passed on to OS registered machine + * check notification routine upon machine check + * exception + */ +target_ulong r3; +struct rtas_error_log err_log; +}; + static void do_spr_sync(void *arg) { struct SPRSyncState *s = arg; @@ -586,6 +668,77 @@ static target_ulong h_rtas_update(PowerPCCPU *cpu, sPAPREnvironment *spapr, return 0; } +static target_ulong h_report_mc_err(PowerPCCPU *cpu, sPAPREnvironment *spapr, + target_ulong opcode, target_ulong *args) +{ +struct rtas_mc_log mc_log; +CPUPPCState *env = cpu-env; +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); + +/* + * We save the original r3 register in SPRG2 in 0x200 vector, + * which is patched during call to ibm.nmi-register. Original + * r3 is required to be included in error log + */ +mc_log.r3 = env-spr[SPR_SPRG2]; Don't you have to call cpu_synchronize_registers() before you access SPRG2? Otherwise the value may be stale. + +/* + * SRR0 and SRR1, containing nip and msr at the time of exception, + * are clobbered when we return from this hcall. Hence they + * need to be properly saved and restored. We save srr0 + * and srr1 in rtas blob and restore it in 0x200 vector + * before branching to OS
Re: [Qemu-devel] [PATCH V3] vhost_net: start/stop guest notifiers properly
On 2014/9/1 16:18, Michael S. Tsirkin wrote: On Fri, Aug 29, 2014 at 06:40:24PM +0800, Zhangjie (HZ) wrote: On 2014/8/27 20:59, Michael S. Tsirkin wrote: On Thu, Aug 21, 2014 at 03:42:53PM +0800, Zhangjie (HZ) wrote: On 2014/8/21 14:53, Jason Wang wrote: On 08/21/2014 02:28 PM, Zhangjie (HZ) wrote: After migration, vhost is not disabled, virtual nic became unreachable because vhost is not awakened. By the logical of EVENT_IDX, virtio-net will not kick vhost again if the used idx is not updated. So, if one interrupts is lost during migration, virtio_net will not kick vhost again. Then, no skb from virtio-net can be sent to tap. Yes and I mean to test vhost=off to see if it was the issue of vhost. That sounds reasonable, but the test case is to test vhost. Jason's patch reduced the probability of occurrence, from about 1/20 to 1/80. It is really effective. I think the patch should be acked. May be we can try to solve the problem from another perspective. Do you have some methods to sense the migration? We can make up a signal from virtio-net after the migration. You can make a patch like this to debug. If problem disappears, it means interrupt was really lost anyway. Anyway, I will try to reproduce it by myself. The test environment is really terrible, I build a environment myself, but it problem did not occur. The environment I use now is from a colleague Responsible for test work. Two hosts, every host has about 20 vms, they send packages(ipv4 and ipv6) between each other. The VM to be migrated also sens packages itself, and there is a ping(-i 0.001) from another host to it. The physical nic is 1GE, connected through a internal nework. Just want to confirm. For the problem did not occur, you mean with my patch on top? . I mean, with your patch, I have to test 80 times before it occurs, the probability is reduced. Could you please try to apply the patch [PATCH V4] net: Forbid dealing with packets when VM is not running on top and see if this helps? Thanks! -- Best Wishes! Zhang Jie . Thanks! I will have a test. Great, once you have the result of the two patches applied together, please let us know on the list. -- Best Wishes! Zhang Jie . I'm sorry for not giving test results in time, test resource is busy these days. Perhaps I can give the result next week. -- Best Wishes! Zhang Jie
Re: [Qemu-devel] [PATCH v2 3/4] target-ppc: Build error log
On Friday 05 September 2014 07:44 AM, Alexey Kardashevskiy wrote: On 09/04/2014 09:13 PM, Aravinda Prasad wrote: Whenever there is a physical memory error due to bit flips, which cannot be corrected by hardware, the error is passed on to the kernel. If the memory address in error belongs to guest address space then guest kernel is responsible to take action. Hence the error is passed on to guest via KVM by invoking 0x200 NMI vector. However, guest OS, as per PAPR, expects an error log upon such error. This patch registers a new hcall which is issued from 0x200 interrupt vector and builds the error log, copies the error log to rtas space and passes the address of the error log to guest Enhancement to KVM to perform above functionality is already in upstream kernel. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr_hcall.c | 154 include/hw/ppc/spapr.h |4 + 2 files changed, 157 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c index 01650ba..c3aa448 100644 --- a/hw/ppc/spapr_hcall.c +++ b/hw/ppc/spapr_hcall.c @@ -14,6 +14,88 @@ struct SPRSyncState { target_ulong mask; }; +/* Offset from rtas-base where error log is placed */ +#define RTAS_ERROR_OFFSET (TARGET_PAGE_SIZE) + +#define RTAS_ELOG_SEVERITY_SHIFT 0x5 +#define RTAS_ELOG_DISPOSITION_SHIFT 0x3 +#define RTAS_ELOG_INITIATOR_SHIFT0x4 + +/* + * Only required RTAS event severity, disposition, initiator + * target and type are copied from arch/powerpc/include/asm/rtas.h + */ + +/* RTAS event severity */ +#define RTAS_SEVERITY_ERROR_SYNC0x3 + +/* RTAS event disposition */ +#define RTAS_DISP_NOT_RECOVERED 0x2 + +/* RTAS event initiator */ +#define RTAS_INITIATOR_MEMORY 0x4 + +/* RTAS event target */ +#define RTAS_TARGET_MEMORY 0x4 + +/* RTAS event type */ +#define RTAS_TYPE_ECC_UNCORR0x09 + +/* + * Currently KVM only passes on the uncorrected machine + * check memory error to guest. Other machine check errors + * such as SLB multi-hit and TLB multi-hit are recovered + * in KVM and are not passed on to guest. + * + * DSISR Bit for uncorrected machine check error. Based + * on arch/powerpc/include/asm/mce.h + */ +#define PPC_BIT(bit)(0x8000ULL bit) +#define P7_DSISR_MC_UE (PPC_BIT(48)) /* P8 too */ + +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */ +struct rtas_error_log { +/* Byte 0 */ +uint8_t byte0; /* Architectural version */ + +/* Byte 1 */ +uint8_t byte1; +/* + * XXX 3: Severity level of error + *XX2: Degree of recovery + * X 1: Extended log present? + * XX 2: Reserved + */ + +/* Byte 2 */ +uint8_t byte2; +/* + * 4: Initiator of event + * 4: Target of failed operation + */ +uint8_t byte3; /* General event or error*/ +}; Any particular reason not to copy rtas_error_log as is? No specific reason. I overlooked it when I moved my code form early prototype version. Will include rest all members. Regards, Aravinda -- Regards, Aravinda
Re: [Qemu-devel] [libvirt] NBD TLS support in QEMU
On Fri, Sep 05, 2014 at 08:23:17AM +0200, Michal Privoznik wrote: On 03.09.2014 18:44, Stefan Hajnoczi wrote: Hi, QEMU offers both NBD client and server functionality. The NBD protocol runs unencrypted, which is a problem when the client and server communicate over an untrusted network. This is not problem for NBD only, but for the rest of data that qemu sends over network, i.e. migration stream, VNC/SPICE, ... We already have TLS support for VNC and SPICE. We do need it for NBD, migration and the chardev TCP backend. I'd suggest that it is likely possible to add support for NBD, migration and chardev all at the same time by doing a general purpose TLS socket wrapper in QEMU that all those areas can use. The particular use case that prompted this mail is storage migration in OpenStack. The goal is to encrypt the NBD connection between source and destination hosts during storage migration. I think we can integrate TLS into the NBD protocol as an optional flag. A quick web search does not reveal existing open source SSL/TLS NBD implementations. I do see a VMware NBDSSL protocol but there is no specification so I guess it is proprietary. In case of libvirt, we have so called tunnelled migration (both spelled misspelled :P) in which libvirt opens a local ports on both src dst side and then sets up secured forwarding pipe to the other side. Or a insecured one if user wishes so. The only problem is that when I adapted libvirt for NBD, I intentionally forbade NBD in tunnelled migration as it requires one more data stream for which libvirt migration protocol is not ready yet. Having saidy that, I feel that libvirt is the show stopper here, not QEMU. While tunnelled migration via libvirt is doable, it is very much sub-optimal as it involves a great many data copies which is bad for performance of migration. This is the main reason that having direct TLS support in the QEMU migration and NBD data channels is desired. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] NBD TLS support in QEMU
On Fri, Sep 05, 2014 at 12:02:18AM +0200, Wouter Verhelst wrote: [Cc: to nbd-general list added] On Wed, Sep 03, 2014 at 05:44:17PM +0100, Stefan Hajnoczi wrote: Hi, QEMU offers both NBD client and server functionality. The NBD protocol runs unencrypted, which is a problem when the client and server communicate over an untrusted network. The particular use case that prompted this mail is storage migration in OpenStack. The goal is to encrypt the NBD connection between source and destination hosts during storage migration. I've never given encrypted NBD high priority, since I don't think encryption without authentication serves much purpose -- and I haven't gotten around to adding authentication yet (for which I have plans; but other things have priority). While have an authentication layer like SASL wired into the NBD protocol would be nice, it shouldn't be considered a blocker / pre-requisite. It is pretty straightforward for the server to require x509 certificates from the client and validate those as a means of authentication. We've used that as an authentication mechanism in libvirt and VNC with success, though we did later add SASL integration as an option too. I think we can integrate TLS into the NBD protocol as an optional flag. A quick web search does not reveal existing open source SSL/TLS NBD implementations. I do see a VMware NBDSSL protocol but there is no specification so I guess it is proprietary. The NBD protocol starts with a negotiation phase. This would be the appropriate place to indicate that TLS will be used. After client and server complete TLS setup the connection can continue as normal. Besides QEMU, the userspace NBD tools (http://nbd.sf.net/) can also be extended to support TLS. In this case the kernel needs a localhost socket and userspace handles TLS. That introduces a possibility for a deadlock, since now your network socket isn't on the PF_MEMALLOC-protected socket anymore, which will cause the kernel to throw away packets which are needed for your nbd connection, in hopes of clearing some memory. I suppose you could theoretically do the encryption in kernel space. Not convinced that trying TLS in kernel space is a good idea, though. I have heard of people using stunnel or the likes to pipe the NBD protocol over a secure channel, with various levels of success. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC][patch 0/6] pci pass-through support for qemu/KVM on s390
On 04.09.14 12:52, frank.blasc...@de.ibm.com wrote: This set of patches implements pci pass-through support for qemu/KVM on s390. PCI support on s390 is very different from other platforms. Major differences are: 1) all PCI operations are driven by special s390 instructions 2) all s390 PCI instructions are privileged 3) PCI config and memory spaces can not be mmap'ed That's ok, vfio abstracts config space anyway. 4) no classic interrupts (INTX, MSI). The pci hw understands the concept of requesting MSIX irqs but irqs are delivered as s390 adapter irqs. This is in line with other implementations. Interrupts go from device - PHB - PIC - CPU (some times you can have another converter device in between) In your case, the PHB converts INTX and MSI interrupts to Adapter interrupts to go to the floating interrupt controller. Same thing as everyone else really. 5) For DMA access there is always an IOMMU required. s390 pci implementation does not support a complete memory to iommu mapping, dma mappings are created on request. Sounds great :). So I suppose we should implement a guest facing IOMMU? 6) The OS does not get any informations about the physical layout of the PCI bus. So how does it know whether different devices are behind the same IOMMU context? Or can we assume that every device has its own context? 7) To take advantage of system z specific virtualization features we need to access the SIE control block residing in the kernel KVM Pleas elaborate. 8) To enable system z specific virtualization features we have to manipulate the zpci device in kernel. Why? For this reasons I decided to implement a kernel based approach similar to x86 device assignment. There is a new qemu device (s390-pci) representing a I fail to see the rationale and I definitely don't want to see anything even remotely similar to the legacy x86 device assignment on s390 ;). Can't we just enhance VFIO? Also, I think we'll get the cleanest model if we start off with an implementation that allows us to add emulated PCI devices to an s390x machine and only then follow on with physical ones. Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 3/4] target-ppc: Build error log
On Friday 05 September 2014 01:34 PM, Alexander Graf wrote: On 04.09.14 13:13, Aravinda Prasad wrote: Whenever there is a physical memory error due to bit flips, which cannot be corrected by hardware, the error is passed on to the kernel. If the memory address in error belongs to guest address space then guest kernel is responsible to take action. Hence the error is passed on to guest via KVM by invoking 0x200 NMI vector. However, guest OS, as per PAPR, expects an error log upon such error. This patch registers a new hcall which is issued from 0x200 interrupt vector and builds the error log, copies the error log to rtas space and passes the address of the error log to guest Enhancement to KVM to perform above functionality is already in upstream kernel. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr_hcall.c | 154 include/hw/ppc/spapr.h |4 + 2 files changed, 157 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c index 01650ba..c3aa448 100644 --- a/hw/ppc/spapr_hcall.c +++ b/hw/ppc/spapr_hcall.c @@ -14,6 +14,88 @@ struct SPRSyncState { target_ulong mask; }; +/* Offset from rtas-base where error log is placed */ +#define RTAS_ERROR_OFFSET (TARGET_PAGE_SIZE) You can't assume this. Please compute the value at the same place you compute the rtas-size. sure + +#define RTAS_ELOG_SEVERITY_SHIFT 0x5 +#define RTAS_ELOG_DISPOSITION_SHIFT 0x3 +#define RTAS_ELOG_INITIATOR_SHIFT0x4 + +/* + * Only required RTAS event severity, disposition, initiator + * target and type are copied from arch/powerpc/include/asm/rtas.h + */ + +/* RTAS event severity */ +#define RTAS_SEVERITY_ERROR_SYNC0x3 + +/* RTAS event disposition */ +#define RTAS_DISP_NOT_RECOVERED 0x2 + +/* RTAS event initiator */ +#define RTAS_INITIATOR_MEMORY 0x4 + +/* RTAS event target */ +#define RTAS_TARGET_MEMORY 0x4 + +/* RTAS event type */ +#define RTAS_TYPE_ECC_UNCORR0x09 + +/* + * Currently KVM only passes on the uncorrected machine + * check memory error to guest. Other machine check errors + * such as SLB multi-hit and TLB multi-hit are recovered + * in KVM and are not passed on to guest. + * + * DSISR Bit for uncorrected machine check error. Based + * on arch/powerpc/include/asm/mce.h Please don't include Linux code. This file is GPLv2+ licensed and I don't want to taint it as GPLv2 only just for the sake of mce. Hmm.. ok. In that case I should not copy rtas_error_log structure below from kernel source as well. + */ +#define PPC_BIT(bit)(0x8000ULL bit) +#define P7_DSISR_MC_UE (PPC_BIT(48)) /* P8 too */ + +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */ +struct rtas_error_log { +/* Byte 0 */ +uint8_t byte0; /* Architectural version */ + +/* Byte 1 */ +uint8_t byte1; +/* + * XXX 3: Severity level of error + *XX2: Degree of recovery + * X 1: Extended log present? + * XX 2: Reserved + */ + +/* Byte 2 */ +uint8_t byte2; +/* + * 4: Initiator of event + * 4: Target of failed operation + */ +uint8_t byte3; /* General event or error*/ +}; + +/* + * Data format in RTAS-Blob + * + * This structure contains error information related to Machine + * Check exception. This is filled up and copied to rtas-blob + * upon machine check exception. + */ +struct rtas_mc_log { +target_ulong srr0; +target_ulong srr1; +/* + * Beginning of error log address. This is properly + * populated and passed on to OS registered machine + * check notification routine upon machine check + * exception + */ +target_ulong r3; +struct rtas_error_log err_log; +}; + static void do_spr_sync(void *arg) { struct SPRSyncState *s = arg; @@ -586,6 +668,77 @@ static target_ulong h_rtas_update(PowerPCCPU *cpu, sPAPREnvironment *spapr, return 0; } +static target_ulong h_report_mc_err(PowerPCCPU *cpu, sPAPREnvironment *spapr, + target_ulong opcode, target_ulong *args) +{ +struct rtas_mc_log mc_log; +CPUPPCState *env = cpu-env; +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); + +/* + * We save the original r3 register in SPRG2 in 0x200 vector, + * which is patched during call to ibm.nmi-register. Original + * r3 is required to be included in error log + */ +mc_log.r3 = env-spr[SPR_SPRG2]; Don't you have to call cpu_synchronize_registers() before you access SPRG2? Otherwise the value may be stale. Yes true. Will include. + +/* + * SRR0 and SRR1, containing nip and msr at the time of exception, + *
Re: [Qemu-devel] [RFC][patch 3/6] KVM: s390: Add GISA support
On 04.09.14 12:52, frank.blasc...@de.ibm.com wrote: From: Frank Blaschka frank.blasc...@de.ibm.com This patch adds GISA (Guest Interrupt State Area) support to s390 kvm. GISA can be used for exitless interrupts. The patch provides a set of functions for GISA related operations like accessing GISA fields or registering ISCs for alert. Exploiters of GISA will follow with additional patches. Signed-off-by: Frank Blaschka frank.blasc...@de.ibm.com That's a nice feature. However, please make sure that you maintain the abstraction levels. What should happen is that you request an irqfd from FLIC. Then you associate that irqfd with the PCI device. Thanks to that association, both parties can now talk to each other and negotiate their GISA number space and make sure things are connected. However, it should always be possible to do things without this direct IRQ injection. So you should be able to receive an irqfd event when an IRQ happened, so that VFIO user space applications can also handle interrupts for example. And the same applies for interrupt injection. We also need to be able to inject an adapter interrupt from QEMU for emulated devices ;). Alex
Re: [Qemu-devel] [RFC][patch 0/6] pci pass-through support for qemu/KVM on s390
On 05.09.14 09:46, Frank Blaschka wrote: On Thu, Sep 04, 2014 at 07:16:24AM -0600, Alex Williamson wrote: On Thu, 2014-09-04 at 12:52 +0200, frank.blasc...@de.ibm.com wrote: This set of patches implements pci pass-through support for qemu/KVM on s390. PCI support on s390 is very different from other platforms. Major differences are: 1) all PCI operations are driven by special s390 instructions Generating config cycles is always arch specific. 2) all s390 PCI instructions are privileged While the operations to generate config cycles on x86 are not privileged, they must be arbitrated between accesses, so in a sense they're privileged. 3) PCI config and memory spaces can not be mmap'ed VFIO has mapping flags that allow any region to specify mmap support. Hi Alex, thx for your reply. Let me elaborate a little bit ore on 1 - 3. Config and memory space can not be accessed via memory operations. You have to use special s390 instructions. This instructions can not be executed in user space. So there is no other way than executing this instructions in kernel. Yes vfio does support a slow path via ioctrl we could use, but this seems suboptimal from performance point of view. Ah, I missed the memory spaces part ;). I agree that it's suboptimal to call into the kernel for every PCI access, but I still think that VFIO provides the correct abstraction layer for us to use. If nothing else, it would at least give us identical configuration to x86 and nice debugability en par with the other platforms. 4) no classic interrupts (INTX, MSI). The pci hw understands the concept of requesting MSIX irqs but irqs are delivered as s390 adapter irqs. VFIO delivers interrupts as eventfds regardless of the underlying platform mechanism. yes that's right, but then we have to do platform specific stuff to present the irq to the guest. I do not say this is impossible but we have add s390 specific code to vfio. Not at all - interrupt delivery is completely transparent to VFIO. 5) For DMA access there is always an IOMMU required. x86 requires the same. s390 pci implementation does not support a complete memory to iommu mapping, dma mappings are created on request. Sounds like POWER. Don't know the details from power, maybe it is similar but not the same. We might be able to extend vfio to have a new interface allowing us to do DMA mappings on request. We already have that. 6) The OS does not get any informations about the physical layout of the PCI bus. If that means that every device is isolated (seems unlikely for multifunction devices) then that makes IOMMU group support really easy. OK 7) To take advantage of system z specific virtualization features we need to access the SIE control block residing in the kernel KVM The KVM-VFIO device allows interaction between VFIO devices and KVM. 8) To enable system z specific virtualization features we have to manipulate the zpci device in kernel. VFIO supports different device backends, currently pci_dev and working towards platform devices. zpci might just be an extension to standard pci. 7 - 8 At least this is not as straightforward as the pure kernel approach, but I have to dig into that in more detail if we could only agree on a vfio solution. Please do so, yes :). For this reasons I decided to implement a kernel based approach similar to x86 device assignment. There is a new qemu device (s390-pci) representing a pass through device on the host. Here is a sample qemu device configuration: -device s390-pci,host=:00:00.0 The device executes the KVM_ASSIGN_PCI_DEVICE ioctl to create a proxy instance in the kernel KVM and connect this instance to the host pci device. kernel patches apply to linux-kvm s390: cio: chsc function to register GIB s390: pci: export pci functions for pass-through usage KVM: s390: Add GISA support KVM: s390: Add PCI pass-through support qemu patches apply to qemu-master s390: Add PCI bus support s390: Add PCI pass-through device support Feedback and discussion is highly welcome ... KVM-based device assignment needs to go away. It's a horrible model for devices, it offers very little protection to the kernel, assumes every device is fully isolated and visible to the IOMMU, relies on smattering of sysfs files to operate, etc. x86, POWER, and ARM are all moving to VFIO-based device assignment. Why is s390 special enough to repeat all the mistakes that x86 did? Thanks, Is this your personal opinion or was this a strategic decision of the QEMU/KVM community? Can anybody give us direction about this? Actually I can understand your point. In the last weeks I did some development and testing regarding the use of vfio too. But the in kernel solutions seems to offer the best performance and most straighforward implementation for our platform. I don't see why there should be any difference in
Re: [Qemu-devel] [RFC][patch 4/6] KVM: s390: Add PCI pass-through support
On 04.09.14 12:52, frank.blasc...@de.ibm.com wrote: From: Frank Blaschka frank.blasc...@de.ibm.com This patch implemets PCI pass-through kernel support for s390. Design approach is very similar to the x86 device assignment. User space executes the KVM_ASSIGN_PCI_DEVICE ioctl to create a proxy instance in the kernel KVM and connect this instance to the host pci device. s390 pci instructions are intercepted in kernel and operations are passed directly to the assigned pci device. To take advantage of all system z specific virtualization features we need to access the SIE control block residing in KVM. Also we have to enable z pci devices with special configuration information coming form the SIE block as well. Signed-off-by: Frank Blaschka frank.blasc...@de.ibm.com --- arch/s390/include/asm/kvm_host.h |1 arch/s390/kvm/Makefile |2 arch/s390/kvm/intercept.c|1 arch/s390/kvm/kvm-s390.c | 33 arch/s390/kvm/kvm-s390.h | 17 arch/s390/kvm/pci.c | 2130 +++ arch/s390/kvm/priv.c | 21 7 files changed, 2202 insertions(+), 3 deletions(-) I would love to review this patch, but in its current form it's impossible to do. I can't possibly keep 2000 lines of code in my head. Alex
[Qemu-devel] [PATCH v7 00/28] modify boot order of guest, and take effect after rebooting
From: Gonglei arei.gong...@huawei.com Sometimes, we want to modify boot order of a guest, but no need to shutdown it. We can call dynamic changing bootindex of a guest, which can be assured taking effect just after the guest rebooting. For example, in P2V scene, we boot a guest and then attach a new system disk, for copying some thing. We want to assign the new disk as the booting disk, which means its bootindex=1. Different nics can be assigen different bootindex dynamically also make sense. This patch series do belows works: 1. add an fw_cfg_machine_reset() assure re-read global fw_boot_order list during vm rebooting. 2. switch the property from qdev to qom, then use the set callback to also update the fw_cfg file. Note: - Do not support change pci option rom's bootindex. - Do not handle those devices which don't have use the bootindex property. changes since v6: - move all bootindex/boot-device code to a new file, named bootdevice.c. - introduce a getter/setter wrapper for all device. - call add_boot_device_path in setter bootindx callback function. - call del_boot_device_path in finalize bootindex qom callback function. - other bugfixes. Thanks for Eduardo's good suggestion! And other guys, thanks too! changes since v5: rework by Gerd and Markus's suggestion(Thanks a lot): - Set/update bootindex on reset instead of realize/init. - Switch the property from qdev to qom, then use the set callback to also update the fw_cfg file. - using qom-set instead of 'set-bootindex' qmp interface, remove it. This is a huge change relative to the previous version. Changes since v4: - using error_setg() instead of qerror_report() in patch 1/8. - call del_boot_device_path() from device_finalize() instead of placing it into each individual device in patch 4/8. Changes since v3: - rework del_* and modify_* function, because of virtio devices' specialation. For example, virtio-net's id is NULL, and its parent virtio-net-pci's id was assigned. Though the global fw_boot_order stored the virtio-net device. - call dell_boot_device_path in each individual device avoiding waste resouce. - introduce qmp query-bootindex command - introcude hmp info bootindex command - Fixes by Eric's reviewing comments, thanks. Changes since v2: *address Gerd's reviewing suggestion: - use the old entry's suffix, if the caller do not pass it in. - call del_boot_device_path() from device_finalize() instead of placing it into each individual device. Changes since v1: *rework by Gerd's suggestion: - split modify and del fw_boot_order for single function. - change modify bootindex's realization which simply lookup the device and modify the bootindex. if the new bootindex has already used by another device just throw an error. - change to del_boot_device_path(DeviceState *dev) and simply delete all entries belonging to the device. For Convenience of testing, my test case based on Andreas's patch series: [PATCH qom-next 0/4] qom: HMP commands to replace info qtree http://thread.gmane.org/gmane.comp.emulators.qemu/271513 However, there is no direct relation with this bootindex patch series. This series based on my patch posted yestoday: [PATCH] virtio-pci: fix virtio-net child refcount in transports http://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00805.html ./qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 -name redhat6.2 -drive \ file=/home/win7_32_2U,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,\ unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive \ file=/home/rhel-server-7.0-x86_64-dvd.iso,if=none,id=drive-ide0-0-1 \ -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=4 \ -vnc 0.0.0.0:10 -netdev type=user,id=net0 -device virtio-net-pci,netdev=net0,bootindex=3,id=nic1 \ -drive file=/mnt/sdb/gonglei/image/virtio-win-1.5.3.vfd,if=none,id=drive-fdc0-0-0,format=raw \ -device isa-fdc,driveA=drive-fdc0-0-0,bootindexA=5,id=floppy1 -qmp unix:/tmp/qmp,server,nowait \ -monitor stdio -netdev type=user,id=net1 -device e1000,netdev=net1,bootindex=2,id=nic \ -boot menu=on -device virtio-scsi-pci,id=scsi0 -drive file=/home/suse11_sp3_32,if=none,\ id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \ -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=8 QEMU 2.1.50 monitor - type 'help' for more information (qemu) qom-get nic1 bootindex 3 (0x3) (qemu) qom-set nic1 bootindex 3 The bootindex 3 has already been used (qemu) qom-set nic1 bootindex 0 (qemu) qom-set floppy1 bootindexA 3 (qemu) system_reset (qemu) qom-get nic1 bootindex 0 (0x0) (qemu) qom-get scsi0-0-0-0 bootindex 8 (0x8) (qemu) qom-set scsi0-0-0-0 bootindex 0 The bootindex 0 has already been used (qemu) qom-set nic1 bootindex -1 (qemu) qom-set scsi0-0-0-0 bootindex 0 (qemu) qom-get scsi0-0-0-0 bootindex 0 (0x0) (qemu) Gonglei (28): bootdevice: move bootdevice related code to new file bootdevice.c bootindex: add check
[Qemu-devel] [PATCH v7 14/28] spapr_lian: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/spapr_llan.c | 24 1 file changed, 24 insertions(+) diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c index 2d47df6..38bbb01 100644 --- a/hw/net/spapr_llan.c +++ b/hw/net/spapr_llan.c @@ -219,6 +219,29 @@ static int spapr_vlan_init(VIOsPAPRDevice *sdev) return 0; } +static void spapr_vlan_get_bootindex(Object *obj, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +VIOsPAPRVLANDevice *dev = VIO_SPAPR_VLAN_DEVICE(obj); + +get_bootindex(dev-nicconf.bootindex, v, name, errp); +} + +static void spapr_vlan_set_bootindex(Object *obj, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +VIOsPAPRVLANDevice *dev = VIO_SPAPR_VLAN_DEVICE(obj); + +set_bootindex(dev-nicconf.bootindex, v, name, errp); +} + +static void spapr_vlan_instance_init(Object *obj) +{ +object_property_add(obj, bootindex, int, +spapr_vlan_get_bootindex, +spapr_vlan_set_bootindex, NULL, NULL, NULL); +} + void spapr_vlan_create(VIOsPAPRBus *bus, NICInfo *nd) { DeviceState *dev; @@ -546,6 +569,7 @@ static const TypeInfo spapr_vlan_info = { .parent= TYPE_VIO_SPAPR_DEVICE, .instance_size = sizeof(VIOsPAPRVLANDevice), .class_init= spapr_vlan_class_init, +.instance_init = spapr_vlan_instance_init, }; static void spapr_vlan_register_types(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 11/28] ne2000: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/ne2000.c | 12 1 file changed, 12 insertions(+) diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c index a62d12d..62b86af 100644 --- a/hw/net/ne2000.c +++ b/hw/net/ne2000.c @@ -752,6 +752,17 @@ static void pci_ne2000_exit(PCIDevice *pci_dev) qemu_free_irq(s-irq); } +static void ne2000_instance_init(Object *obj) +{ +PCIDevice *pci_dev = PCI_DEVICE(obj); +PCINE2000State *d = DO_UPCAST(PCINE2000State, dev, pci_dev); +NE2000State *s = d-ne2000; + +device_add_bootindex_property(obj, s-c.bootindex, + bootindex, /ethernet-phy@0, + pci_dev-qdev, NULL); +} + static Property ne2000_properties[] = { DEFINE_NIC_PROPERTIES(PCINE2000State, ne2000.c), DEFINE_PROP_END_OF_LIST(), @@ -778,6 +789,7 @@ static const TypeInfo ne2000_info = { .parent= TYPE_PCI_DEVICE, .instance_size = sizeof(PCINE2000State), .class_init= ne2000_class_init, +.instance_init = ne2000_instance_init, }; static void ne2000_register_types(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 16/28] usb-net: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/usb/dev-network.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c index 518d536..0dd4df7 100644 --- a/hw/usb/dev-network.c +++ b/hw/usb/dev-network.c @@ -1376,6 +1376,16 @@ static int usb_net_initfn(USBDevice *dev) return 0; } +static void usb_net_instance_init(Object *obj) +{ +USBDevice *dev = USB_DEVICE(obj); +USBNetState *s = DO_UPCAST(USBNetState, dev, dev); + +device_add_bootindex_property(obj, s-conf.bootindex, + bootindex, /ethernet-phy@0, + dev-qdev, NULL); +} + static USBDevice *usb_net_init(USBBus *bus, const char *cmdline) { Error *local_err = NULL; @@ -1439,6 +1449,7 @@ static const TypeInfo net_info = { .parent= TYPE_USB_DEVICE, .instance_size = sizeof(USBNetState), .class_init= usb_net_class_initfn, +.instance_init = usb_net_instance_init, }; static void usb_net_register_types(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 19/28] pci-assign: remove bootindex property from qdev to qom
From: Gonglei arei.gong...@huawei.com Remove bootindex form qdev property to qom, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/i386/kvm/pci-assign.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index 17c7d6dc..56f387b 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -1850,13 +1850,23 @@ static void assigned_exitfn(struct PCIDevice *pci_dev) free_assigned_device(dev); } +static void assigned_dev_instance_init(Object *obj) +{ +PCIDevice *pci_dev = PCI_DEVICE(obj); +AssignedDevice *d = DO_UPCAST(AssignedDevice, dev, PCI_DEVICE(obj)); + +device_add_bootindex_property(obj, d-bootindex, + bootindex, NULL, + pci_dev-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); +} + static Property assigned_dev_properties[] = { DEFINE_PROP_PCI_HOST_DEVADDR(host, AssignedDevice, host), DEFINE_PROP_BIT(prefer_msi, AssignedDevice, features, ASSIGNED_DEVICE_PREFER_MSI_BIT, false), DEFINE_PROP_BIT(share_intx, AssignedDevice, features, ASSIGNED_DEVICE_SHARE_INTX_BIT, true), -DEFINE_PROP_INT32(bootindex, AssignedDevice, bootindex, -1), DEFINE_PROP_STRING(configfd, AssignedDevice, configfd_name), DEFINE_PROP_END_OF_LIST(), }; @@ -1882,6 +1892,7 @@ static const TypeInfo assign_info = { .parent = TYPE_PCI_DEVICE, .instance_size = sizeof(AssignedDevice), .class_init = assign_class_init, +.instance_init = assigned_dev_instance_init, }; static void assign_register_types(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 13/28] rtl8139: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/rtl8139.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c index 6e59f38..138a03a 100644 --- a/hw/net/rtl8139.c +++ b/hw/net/rtl8139.c @@ -3543,6 +3543,15 @@ static int pci_rtl8139_init(PCIDevice *dev) return 0; } +static void rtl8139_instance_init(Object *obj) +{ +RTL8139State *s = RTL8139(obj); + +device_add_bootindex_property(obj, s-conf.bootindex, + bootindex, /ethernet-phy@0, + DEVICE(obj), NULL); +} + static Property rtl8139_properties[] = { DEFINE_NIC_PROPERTIES(RTL8139State, conf), DEFINE_PROP_END_OF_LIST(), @@ -3571,6 +3580,7 @@ static const TypeInfo rtl8139_info = { .parent= TYPE_PCI_DEVICE, .instance_size = sizeof(RTL8139State), .class_init= rtl8139_class_init, +.instance_init = rtl8139_instance_init, }; static void rtl8139_register_types(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 07/28] bootindex: add a setter/getter functions wrapper for bootindex property
From: Gonglei arei.gong...@huawei.com when we remove bootindex form qdev.property to qom.property, we can use those functions set/get bootindex property for all correlative devices. Signed-off-by: Gonglei arei.gong...@huawei.com --- bootdevice.c| 70 + include/sysemu/sysemu.h | 3 +++ 2 files changed, 73 insertions(+) diff --git a/bootdevice.c b/bootdevice.c index 484d0c9..c669293 100644 --- a/bootdevice.c +++ b/bootdevice.c @@ -23,6 +23,7 @@ */ #include sysemu/sysemu.h +#include qapi/visitor.h typedef struct FWBootEntry FWBootEntry; @@ -207,3 +208,72 @@ char *get_boot_devices_list(size_t *size, bool ignore_suffixes) } return list; } + +typedef struct { +int32_t *bootindex; +const char *suffix; +DeviceState *dev; +} BootIndexProperty; + +static void device_get_bootindex(Object *obj, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +BootIndexProperty *prop = opaque; +visit_type_int32(v, prop-bootindex, name, errp); +} + +static void device_set_bootindex(Object *obj, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +BootIndexProperty *prop = opaque; +int32_t boot_index; +Error *local_err = NULL; + +visit_type_int32(v, boot_index, name, local_err); +if (local_err) { +goto out; +} +/* check whether bootindex is present in fw_boot_order list */ +check_boot_index(boot_index, local_err); +if (local_err) { +goto out; +} +/* change bootindex to a new one */ +*prop-bootindex = boot_index; + +out: +if (local_err) { +error_propagate(errp, local_err); +} +} + +static void property_release_bootindex(Object *obj, const char *name, + void *opaque) + +{ +BootIndexProperty *prop = opaque; +g_free(prop); +} + +void device_add_bootindex_property(Object *obj, int32_t *bootindex, + const char *name, const char *suffix, + DeviceState *dev, Error **errp) +{ +Error *local_err = NULL; +BootIndexProperty *prop = g_malloc0(sizeof(*prop)); + +prop-bootindex = bootindex; +prop-suffix = suffix; +prop-dev = dev; + +object_property_add(obj, name, int, +device_get_bootindex, +device_set_bootindex, +property_release_bootindex, +prop, local_err); + +if (local_err) { +error_propagate(errp, local_err); +g_free(prop); +} +} diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index 10cd573..a636d2e 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -215,6 +215,9 @@ char *get_boot_devices_list(size_t *size, bool ignore_suffixes); DeviceState *get_boot_device(uint32_t position); void check_boot_index(int32_t bootindex, Error **errp); void del_boot_device_path(DeviceState *dev); +void device_add_bootindex_property(Object *obj, int32_t *bootindex, + const char *name, const char *suffix, + DeviceState *dev, Error **errp); QemuOpts *qemu_get_machine_opts(void); -- 1.7.12.4
[Qemu-devel] [PATCH v7 03/28] bootindex: add del_boot_device_path function
From: Gonglei arei.gong...@huawei.com Introduce del_boot_device_path() to clean up fw_cfg content when hot-unplugging a device that refers to a bootindex. Signed-off-by: Gonglei arei.gong...@huawei.com Signed-off-by: Chenliang chenlian...@huawei.com --- bootdevice.c| 21 + include/sysemu/sysemu.h | 1 + 2 files changed, 22 insertions(+) diff --git a/bootdevice.c b/bootdevice.c index f5399df..89aca7f 100644 --- a/bootdevice.c +++ b/bootdevice.c @@ -51,6 +51,27 @@ void check_boot_index(int32_t bootindex, Error **errp) } } +void del_boot_device_path(DeviceState *dev) +{ +FWBootEntry *i; + +assert(dev != NULL); + +/* remove all entries of the assigned device */ +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (i-dev == NULL) { +continue; +} +if (i-dev == dev) { +QTAILQ_REMOVE(fw_boot_order, i, link); +g_free(i-suffix); +g_free(i); + +break; +} +} +} + void add_boot_device_path(int32_t bootindex, DeviceState *dev, const char *suffix) { diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index 72463de..10cd573 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -214,6 +214,7 @@ char *get_boot_devices_list(size_t *size, bool ignore_suffixes); DeviceState *get_boot_device(uint32_t position); void check_boot_index(int32_t bootindex, Error **errp); +void del_boot_device_path(DeviceState *dev); QemuOpts *qemu_get_machine_opts(void); -- 1.7.12.4
[Qemu-devel] [PATCH v7 21/28] redirect: remove bootindex property from qdev to qom
From: Gonglei arei.gong...@huawei.com Remove bootindex form qdev property to qom, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/usb/redirect.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c index 44522d9..352ba80 100644 --- a/hw/usb/redirect.c +++ b/hw/usb/redirect.c @@ -2468,7 +2468,6 @@ static Property usbredir_properties[] = { DEFINE_PROP_CHR(chardev, USBRedirDevice, cs), DEFINE_PROP_UINT8(debug, USBRedirDevice, debug, usbredirparser_warning), DEFINE_PROP_STRING(filter, USBRedirDevice, filter_str), -DEFINE_PROP_INT32(bootindex, USBRedirDevice, bootindex, -1), DEFINE_PROP_END_OF_LIST(), }; @@ -2493,11 +2492,23 @@ static void usbredir_class_initfn(ObjectClass *klass, void *data) set_bit(DEVICE_CATEGORY_MISC, dc-categories); } +static void usbredir_instance_init(Object *obj) +{ +USBDevice *udev = USB_DEVICE(obj); +USBRedirDevice *dev = DO_UPCAST(USBRedirDevice, dev, udev); + +device_add_bootindex_property(obj, dev-bootindex, + bootindex, NULL, + udev-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); +} + static const TypeInfo usbredir_dev_info = { .name = usb-redir, .parent= TYPE_USB_DEVICE, .instance_size = sizeof(USBRedirDevice), .class_init= usbredir_class_initfn, +.instance_init = usbredir_instance_init, }; static void usbredir_register_types(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 23/28] scsi: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/scsi/scsi-disk.c| 14 ++ hw/scsi/scsi-generic.c | 1 + include/hw/scsi/scsi.h | 1 + 3 files changed, 16 insertions(+) diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index e34a544..205cdb0 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -2331,6 +2331,16 @@ static void scsi_disk_realize(SCSIDevice *dev, Error **errp) } } +void scsi_dev_instance_init(Object *obj) +{ +DeviceState *dev = DEVICE(obj); +SCSIDevice *s = DO_UPCAST(SCSIDevice, qdev, dev); + +device_add_bootindex_property(obj, s-conf.bootindex, + bootindex, NULL, + s-qdev, NULL); +} + static const SCSIReqOps scsi_disk_emulate_reqops = { .size = sizeof(SCSIDiskReq), .free_req = scsi_free_request, @@ -2644,6 +2654,7 @@ static const TypeInfo scsi_hd_info = { .parent= TYPE_SCSI_DEVICE, .instance_size = sizeof(SCSIDiskState), .class_init= scsi_hd_class_initfn, +.instance_init = scsi_dev_instance_init, }; static Property scsi_cd_properties[] = { @@ -2675,6 +2686,7 @@ static const TypeInfo scsi_cd_info = { .parent= TYPE_SCSI_DEVICE, .instance_size = sizeof(SCSIDiskState), .class_init= scsi_cd_class_initfn, +.instance_init = scsi_dev_instance_init, }; #ifdef __linux__ @@ -2705,6 +2717,7 @@ static const TypeInfo scsi_block_info = { .parent= TYPE_SCSI_DEVICE, .instance_size = sizeof(SCSIDiskState), .class_init= scsi_block_class_initfn, +.instance_init = scsi_dev_instance_init, }; #endif @@ -2743,6 +2756,7 @@ static const TypeInfo scsi_disk_info = { .parent= TYPE_SCSI_DEVICE, .instance_size = sizeof(SCSIDiskState), .class_init= scsi_disk_class_initfn, +.instance_init = scsi_dev_instance_init, }; static void scsi_disk_register_types(void) diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c index 20587b4..f6100a9 100644 --- a/hw/scsi/scsi-generic.c +++ b/hw/scsi/scsi-generic.c @@ -513,6 +513,7 @@ static const TypeInfo scsi_generic_info = { .parent= TYPE_SCSI_DEVICE, .instance_size = sizeof(SCSIDevice), .class_init= scsi_generic_class_initfn, +.instance_init = scsi_dev_instance_init, }; static void scsi_generic_register_types(void) diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h index 2e3a8f9..99c1c57 100644 --- a/include/hw/scsi/scsi.h +++ b/include/hw/scsi/scsi.h @@ -270,4 +270,5 @@ SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int target, int lun); /* scsi-generic.c. */ extern const SCSIReqOps scsi_generic_req_ops; +void scsi_dev_instance_init(Object *obj); #endif -- 1.7.12.4
[Qemu-devel] [PATCH v7 17/28] net: remove bootindex property from qdev to qom
From: Gonglei arei.gong...@huawei.com Remove bootindex form qdev property to qom, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Meanwhile set the initial value of bootindex to -1. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/e1000.c | 1 + hw/net/eepro100.c | 1 + hw/net/lance.c | 1 + hw/net/ne2000.c| 1 + hw/net/pcnet-pci.c | 1 + hw/net/rtl8139.c | 1 + hw/net/spapr_llan.c| 1 + hw/net/virtio-net.c| 1 + hw/net/vmxnet3.c | 1 + hw/usb/dev-network.c | 1 + hw/virtio/virtio-pci.c | 1 + include/net/net.h | 4 +--- 12 files changed, 12 insertions(+), 3 deletions(-) diff --git a/hw/net/e1000.c b/hw/net/e1000.c index 0edbfa6..2e5dc41 100644 --- a/hw/net/e1000.c +++ b/hw/net/e1000.c @@ -1627,6 +1627,7 @@ static void e1000_instance_init(Object *obj) device_add_bootindex_property(obj, n-conf.bootindex, bootindex, /ethernet-phy@0, DEVICE(n), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static const TypeInfo e1000_base_info = { diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c index fb9c944..cc79f09 100644 --- a/hw/net/eepro100.c +++ b/hw/net/eepro100.c @@ -1912,6 +1912,7 @@ static void eepro100_instance_init(Object *obj) device_add_bootindex_property(obj, s-conf.bootindex, bootindex, /ethernet-phy@0, DEVICE(s), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static E100PCIDeviceInfo e100_devices[] = { diff --git a/hw/net/lance.c b/hw/net/lance.c index a1c49f1..4972b01 100644 --- a/hw/net/lance.c +++ b/hw/net/lance.c @@ -152,6 +152,7 @@ static void lance_instance_init(Object *obj) device_add_bootindex_property(obj, s-conf.bootindex, bootindex, /ethernet-phy@0, DEVICE(obj), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static Property lance_properties[] = { diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c index 62b86af..6d31555 100644 --- a/hw/net/ne2000.c +++ b/hw/net/ne2000.c @@ -761,6 +761,7 @@ static void ne2000_instance_init(Object *obj) device_add_bootindex_property(obj, s-c.bootindex, bootindex, /ethernet-phy@0, pci_dev-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static Property ne2000_properties[] = { diff --git a/hw/net/pcnet-pci.c b/hw/net/pcnet-pci.c index fb5f5d6..55f906d 100644 --- a/hw/net/pcnet-pci.c +++ b/hw/net/pcnet-pci.c @@ -353,6 +353,7 @@ static void pcnet_instance_init(Object *obj) device_add_bootindex_property(obj, s-conf.bootindex, bootindex, /ethernet-phy@0, DEVICE(obj), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static Property pcnet_properties[] = { diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c index 138a03a..d921ebd 100644 --- a/hw/net/rtl8139.c +++ b/hw/net/rtl8139.c @@ -3550,6 +3550,7 @@ static void rtl8139_instance_init(Object *obj) device_add_bootindex_property(obj, s-conf.bootindex, bootindex, /ethernet-phy@0, DEVICE(obj), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static Property rtl8139_properties[] = { diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c index c3eb99a..c8b0978 100644 --- a/hw/net/spapr_llan.c +++ b/hw/net/spapr_llan.c @@ -226,6 +226,7 @@ static void spapr_vlan_instance_init(Object *obj) device_add_bootindex_property(obj, dev-nicconf.bootindex, bootindex, , DEVICE(dev), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } void spapr_vlan_create(VIOsPAPRBus *bus, NICInfo *nd) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index f5f6bf7..3e6d3fa 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -1719,6 +1719,7 @@ static void virtio_net_instance_init(Object *obj) device_add_bootindex_property(obj, n-nic_conf.bootindex, bootindex, /ethernet-phy@0, DEVICE(n), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static Property virtio_net_properties[] = { diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index 88e9d9c..56b22f3 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -2183,6 +2183,7 @@ static void vmxnet3_instance_init(Object *obj) device_add_bootindex_property(obj, s-conf.bootindex, bootindex, /ethernet-phy@0, DEVICE(obj), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static void vmxnet3_pci_uninit(PCIDevice *pci_dev) diff
[Qemu-devel] [PATCH v7 15/28] vmxnet3: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/spapr_llan.c | 22 -- hw/net/vmxnet3.c| 8 2 files changed, 12 insertions(+), 18 deletions(-) diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c index 38bbb01..c3eb99a 100644 --- a/hw/net/spapr_llan.c +++ b/hw/net/spapr_llan.c @@ -219,27 +219,13 @@ static int spapr_vlan_init(VIOsPAPRDevice *sdev) return 0; } -static void spapr_vlan_get_bootindex(Object *obj, Visitor *v, void *opaque, - const char *name, Error **errp) -{ -VIOsPAPRVLANDevice *dev = VIO_SPAPR_VLAN_DEVICE(obj); - -get_bootindex(dev-nicconf.bootindex, v, name, errp); -} - -static void spapr_vlan_set_bootindex(Object *obj, Visitor *v, void *opaque, - const char *name, Error **errp) +static void spapr_vlan_instance_init(Object *obj) { VIOsPAPRVLANDevice *dev = VIO_SPAPR_VLAN_DEVICE(obj); -set_bootindex(dev-nicconf.bootindex, v, name, errp); -} - -static void spapr_vlan_instance_init(Object *obj) -{ -object_property_add(obj, bootindex, int, -spapr_vlan_get_bootindex, -spapr_vlan_set_bootindex, NULL, NULL, NULL); +device_add_bootindex_property(obj, dev-nicconf.bootindex, + bootindex, , + DEVICE(dev), NULL); } void spapr_vlan_create(VIOsPAPRBus *bus, NICInfo *nd) diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index f246fa1..88e9d9c 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -2177,6 +2177,13 @@ static int vmxnet3_pci_init(PCIDevice *pci_dev) return 0; } +static void vmxnet3_instance_init(Object *obj) +{ +VMXNET3State *s = VMXNET3(obj); +device_add_bootindex_property(obj, s-conf.bootindex, + bootindex, /ethernet-phy@0, + DEVICE(obj), NULL); +} static void vmxnet3_pci_uninit(PCIDevice *pci_dev) { @@ -2524,6 +2531,7 @@ static const TypeInfo vmxnet3_info = { .parent= TYPE_PCI_DEVICE, .instance_size = sizeof(VMXNET3State), .class_init= vmxnet3_class_init, +.instance_init = vmxnet3_instance_init, }; static void vmxnet3_register_types(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 08/28] virtio-net: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/virtio-net.c| 3 +++ hw/virtio/virtio-pci.c | 11 --- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 826a2a5..f5f6bf7 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -1716,6 +1716,9 @@ static void virtio_net_instance_init(Object *obj) * Can be overriden with virtio_net_set_config_size. */ n-config_size = sizeof(struct virtio_net_config); +device_add_bootindex_property(obj, n-nic_conf.bootindex, + bootindex, /ethernet-phy@0, + DEVICE(n), NULL); } static Property virtio_net_properties[] = { diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index 78dcd68..0779d28 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -1454,9 +1454,14 @@ static void virtio_net_pci_class_init(ObjectClass *klass, void *data) static void virtio_net_pci_instance_init(Object *obj) { VirtIONetPCI *dev = VIRTIO_NET_PCI(obj); -object_initialize(dev-vdev, sizeof(dev-vdev), TYPE_VIRTIO_NET); -object_property_add_child(obj, virtio-backend, OBJECT(dev-vdev), NULL); -object_unref(OBJECT(dev-vdev)); +VirtIONet *n = dev-vdev; + +object_initialize(n, sizeof(dev-vdev), TYPE_VIRTIO_NET); +object_property_add_child(obj, virtio-backend, OBJECT(n), NULL); +object_unref(OBJECT(n)); +device_add_bootindex_property(obj, n-nic_conf.bootindex, + bootindex, /ethernet-phy@0, + DEVICE(n), NULL); } static const TypeInfo virtio_net_pci_info = { -- 1.7.12.4
[Qemu-devel] [PATCH v7 26/28] block: remove bootindex property from qdev to qom
From: Gonglei arei.gong...@huawei.com Remove bootindex form qdev property to qom, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Meanwhile set the initial value of bootindex to -1. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/block/virtio-blk.c| 1 + hw/ide/qdev.c| 1 + hw/scsi/scsi-disk.c | 2 +- hw/scsi/scsi-generic.c | 1 - include/hw/block/block.h | 1 - 5 files changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 9d04590..4d05114 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -808,6 +808,7 @@ static void virtio_blk_instance_init(Object *obj) device_add_bootindex_property(obj, s-conf-bootindex, bootindex, /disk@0,0, DEVICE(obj), NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static Property virtio_blk_properties[] = { diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index 9e2ed40..8382f24 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -200,6 +200,7 @@ static void ide_dev_instance_init(Object *obj) bootindex, d-unit ? /disk@1 : /disk@0, d-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static int ide_hd_initfn(IDEDevice *dev) diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index 205cdb0..c37a8be 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -2339,6 +2339,7 @@ void scsi_dev_instance_init(Object *obj) device_add_bootindex_property(obj, s-conf.bootindex, bootindex, NULL, s-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); } static const SCSIReqOps scsi_disk_emulate_reqops = { @@ -2692,7 +2693,6 @@ static const TypeInfo scsi_cd_info = { #ifdef __linux__ static Property scsi_block_properties[] = { DEFINE_PROP_DRIVE(drive, SCSIDiskState, qdev.conf.bs), -DEFINE_PROP_INT32(bootindex, SCSIDiskState, qdev.conf.bootindex, -1), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c index f6100a9..a542d1a 100644 --- a/hw/scsi/scsi-generic.c +++ b/hw/scsi/scsi-generic.c @@ -482,7 +482,6 @@ static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag, uint32_t lun, static Property scsi_generic_properties[] = { DEFINE_PROP_DRIVE(drive, SCSIDevice, conf.bs), -DEFINE_PROP_INT32(bootindex, SCSIDevice, conf.bootindex, -1), DEFINE_PROP_END_OF_LIST(), }; diff --git a/include/hw/block/block.h b/include/hw/block/block.h index 3a01488..867a226 100644 --- a/include/hw/block/block.h +++ b/include/hw/block/block.h @@ -49,7 +49,6 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf) _conf.physical_block_size, 512), \ DEFINE_PROP_UINT16(min_io_size, _state, _conf.min_io_size, 0), \ DEFINE_PROP_UINT32(opt_io_size, _state, _conf.opt_io_size, 0),\ -DEFINE_PROP_INT32(bootindex, _state, _conf.bootindex, -1),\ DEFINE_PROP_UINT32(discard_granularity, _state, \ _conf.discard_granularity, -1) -- 1.7.12.4
[Qemu-devel] [PATCH v7 18/28] host-libusb: remove bootindex property from qdev to qom
From: Gonglei arei.gong...@huawei.com Remove bootindex form qdev property to qom, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/usb/host-libusb.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c index c189147..a2fa70e 100644 --- a/hw/usb/host-libusb.c +++ b/hw/usb/host-libusb.c @@ -980,6 +980,17 @@ static int usb_host_initfn(USBDevice *udev) return 0; } +static void usb_host_instance_init(Object *obj) +{ +USBDevice *udev = USB_DEVICE(obj); +USBHostDevice *s = USB_HOST_DEVICE(udev); + +device_add_bootindex_property(obj, s-bootindex, + bootindex, NULL, + udev-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); +} + static void usb_host_handle_destroy(USBDevice *udev) { USBHostDevice *s = USB_HOST_DEVICE(udev); @@ -1464,7 +1475,6 @@ static Property usb_host_dev_properties[] = { DEFINE_PROP_UINT32(productid, USBHostDevice, match.product_id, 0), DEFINE_PROP_UINT32(isobufs, USBHostDevice, iso_urb_count,4), DEFINE_PROP_UINT32(isobsize, USBHostDevice, iso_urb_frames, 32), -DEFINE_PROP_INT32(bootindex, USBHostDevice, bootindex,-1), DEFINE_PROP_UINT32(loglevel, USBHostDevice, loglevel, LIBUSB_LOG_LEVEL_WARNING), DEFINE_PROP_BIT(pipeline,USBHostDevice, options, @@ -1497,6 +1507,7 @@ static TypeInfo usb_host_dev_info = { .parent= TYPE_USB_DEVICE, .instance_size = sizeof(USBHostDevice), .class_init= usb_host_class_initfn, +.instance_init = usb_host_instance_init, }; static void usb_host_register_types(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 20/28] vfio: remove bootindex property from qdev to qom
From: Gonglei arei.gong...@huawei.com Remove bootindex form qdev property to qom, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/misc/vfio.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index 40dcaa6..cdbaedf 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -4365,13 +4365,23 @@ post_reset: vfio_pci_post_reset(vdev); } +static void vfio_instance_init(Object *obj) +{ +PCIDevice *pci_dev = PCI_DEVICE(obj); +VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, PCI_DEVICE(obj)); + +device_add_bootindex_property(obj, vdev-bootindex, + bootindex, NULL, + pci_dev-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); +} + static Property vfio_pci_dev_properties[] = { DEFINE_PROP_PCI_HOST_DEVADDR(host, VFIODevice, host), DEFINE_PROP_UINT32(x-intx-mmap-timeout-ms, VFIODevice, intx.mmap_timeout, 1100), DEFINE_PROP_BIT(x-vga, VFIODevice, features, VFIO_FEATURE_ENABLE_VGA_BIT, false), -DEFINE_PROP_INT32(bootindex, VFIODevice, bootindex, -1), /* * TODO - support passed fds... is this necessary? * DEFINE_PROP_STRING(vfiofd, VFIODevice, vfiofd_name), @@ -4407,6 +4417,7 @@ static const TypeInfo vfio_pci_dev_info = { .parent = TYPE_PCI_DEVICE, .instance_size = sizeof(VFIODevice), .class_init = vfio_pci_dev_class_init, +.instance_init = vfio_instance_init, }; static void register_vfio_pci_dev_type(void) -- 1.7.12.4
[Qemu-devel] [PATCH v7 24/28] ide: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/ide/qdev.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index efab95b..9e2ed40 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -191,6 +191,17 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind) return 0; } +static void ide_dev_instance_init(Object *obj) +{ +DeviceState *dev = DEVICE(obj); +IDEDevice *d = DO_UPCAST(IDEDevice, qdev, dev); + +device_add_bootindex_property(obj, d-conf.bootindex, + bootindex, + d-unit ? /disk@1 : /disk@0, + d-qdev, NULL); +} + static int ide_hd_initfn(IDEDevice *dev) { return ide_dev_initfn(dev, IDE_HD); @@ -238,6 +249,7 @@ static const TypeInfo ide_hd_info = { .parent= TYPE_IDE_DEVICE, .instance_size = sizeof(IDEDrive), .class_init= ide_hd_class_init, +.instance_init = ide_dev_instance_init, }; static Property ide_cd_properties[] = { @@ -260,6 +272,7 @@ static const TypeInfo ide_cd_info = { .parent= TYPE_IDE_DEVICE, .instance_size = sizeof(IDEDrive), .class_init= ide_cd_class_init, +.instance_init = ide_dev_instance_init, }; static Property ide_drive_properties[] = { @@ -282,6 +295,7 @@ static const TypeInfo ide_drive_info = { .parent= TYPE_IDE_DEVICE, .instance_size = sizeof(IDEDrive), .class_init= ide_drive_class_init, +.instance_init = ide_dev_instance_init, }; static void ide_device_class_init(ObjectClass *klass, void *data) -- 1.7.12.4
[Qemu-devel] [PATCH v7 22/28] isa-fdc: remove bootindexA/B property from qdev to qom
From: Gonglei arei.gong...@huawei.com Remove bootindexA/B form qdev property to qom, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/block/fdc.c | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/hw/block/fdc.c b/hw/block/fdc.c index 490d127..51b1f08 100644 --- a/hw/block/fdc.c +++ b/hw/block/fdc.c @@ -2217,8 +2217,6 @@ static Property isa_fdc_properties[] = { DEFINE_PROP_UINT32(dma, FDCtrlISABus, dma, 2), DEFINE_PROP_DRIVE(driveA, FDCtrlISABus, state.drives[0].bs), DEFINE_PROP_DRIVE(driveB, FDCtrlISABus, state.drives[1].bs), -DEFINE_PROP_INT32(bootindexA, FDCtrlISABus, bootindexA, -1), -DEFINE_PROP_INT32(bootindexB, FDCtrlISABus, bootindexB, -1), DEFINE_PROP_BIT(check_media_rate, FDCtrlISABus, state.check_media_rate, 0, true), DEFINE_PROP_END_OF_LIST(), @@ -2236,11 +2234,26 @@ static void isabus_fdc_class_init(ObjectClass *klass, void *data) set_bit(DEVICE_CATEGORY_STORAGE, dc-categories); } +static void isabus_fdc_instance_init(Object *obj) +{ +FDCtrlISABus *isa = ISA_FDC(obj); + +device_add_bootindex_property(obj, isa-bootindexA, + bootindexA, /floppy@0, + DEVICE(obj), NULL); +device_add_bootindex_property(obj, isa-bootindexB, + bootindexB, /floppy@1, + DEVICE(obj), NULL); +object_property_set_int(obj, -1, bootindexA, NULL); +object_property_set_int(obj, -1, bootindexB, NULL); +} + static const TypeInfo isa_fdc_info = { .name = TYPE_ISA_FDC, .parent= TYPE_ISA_DEVICE, .instance_size = sizeof(FDCtrlISABus), .class_init= isabus_fdc_class_init, +.instance_init = isabus_fdc_instance_init, }; static const VMStateDescription vmstate_sysbus_fdc ={ -- 1.7.12.4
[Qemu-devel] [PATCH v7 12/28] pcnet: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/lance.c | 12 hw/net/pcnet-pci.c | 12 hw/net/pcnet.h | 1 - 3 files changed, 24 insertions(+), 1 deletion(-) diff --git a/hw/net/lance.c b/hw/net/lance.c index 7811a9e..a1c49f1 100644 --- a/hw/net/lance.c +++ b/hw/net/lance.c @@ -42,6 +42,7 @@ #include hw/sparc/sun4m.h #include pcnet.h #include trace.h +#include sysemu/sysemu.h #define TYPE_LANCE lance #define SYSBUS_PCNET(obj) \ @@ -143,6 +144,16 @@ static void lance_reset(DeviceState *dev) pcnet_h_reset(d-state); } +static void lance_instance_init(Object *obj) +{ +SysBusPCNetState *d = SYSBUS_PCNET(obj); +PCNetState *s = d-state; + +device_add_bootindex_property(obj, s-conf.bootindex, + bootindex, /ethernet-phy@0, + DEVICE(obj), NULL); +} + static Property lance_properties[] = { DEFINE_PROP_PTR(dma, SysBusPCNetState, state.dma_opaque), DEFINE_NIC_PROPERTIES(SysBusPCNetState, state.conf), @@ -169,6 +180,7 @@ static const TypeInfo lance_info = { .parent= TYPE_SYS_BUS_DEVICE, .instance_size = sizeof(SysBusPCNetState), .class_init= lance_class_init, +.instance_init = lance_instance_init, }; static void lance_register_types(void) diff --git a/hw/net/pcnet-pci.c b/hw/net/pcnet-pci.c index 50ffe91..fb5f5d6 100644 --- a/hw/net/pcnet-pci.c +++ b/hw/net/pcnet-pci.c @@ -32,6 +32,7 @@ #include hw/loader.h #include qemu/timer.h #include sysemu/dma.h +#include sysemu/sysemu.h #include pcnet.h @@ -344,6 +345,16 @@ static void pci_reset(DeviceState *dev) pcnet_h_reset(d-state); } +static void pcnet_instance_init(Object *obj) +{ +PCIPCNetState *d = PCI_PCNET(obj); +PCNetState *s = d-state; + +device_add_bootindex_property(obj, s-conf.bootindex, + bootindex, /ethernet-phy@0, + DEVICE(obj), NULL); +} + static Property pcnet_properties[] = { DEFINE_NIC_PROPERTIES(PCIPCNetState, state.conf), DEFINE_PROP_END_OF_LIST(), @@ -372,6 +383,7 @@ static const TypeInfo pcnet_info = { .parent= TYPE_PCI_DEVICE, .instance_size = sizeof(PCIPCNetState), .class_init= pcnet_class_init, +.instance_init = pcnet_instance_init, }; static void pci_pcnet_register_types(void) diff --git a/hw/net/pcnet.h b/hw/net/pcnet.h index 9dee6f3..f8e8a6f 100644 --- a/hw/net/pcnet.h +++ b/hw/net/pcnet.h @@ -66,5 +66,4 @@ void pcnet_set_link_status(NetClientState *nc); void pcnet_common_cleanup(PCNetState *d); int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info); extern const VMStateDescription vmstate_pcnet; - #endif -- 1.7.12.4
Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 3/4] target-ppc: Build error log
On 05.09.14 10:28, Aravinda Prasad wrote: On Friday 05 September 2014 01:34 PM, Alexander Graf wrote: On 04.09.14 13:13, Aravinda Prasad wrote: Whenever there is a physical memory error due to bit flips, which cannot be corrected by hardware, the error is passed on to the kernel. If the memory address in error belongs to guest address space then guest kernel is responsible to take action. Hence the error is passed on to guest via KVM by invoking 0x200 NMI vector. However, guest OS, as per PAPR, expects an error log upon such error. This patch registers a new hcall which is issued from 0x200 interrupt vector and builds the error log, copies the error log to rtas space and passes the address of the error log to guest Enhancement to KVM to perform above functionality is already in upstream kernel. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr_hcall.c | 154 include/hw/ppc/spapr.h |4 + 2 files changed, 157 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c index 01650ba..c3aa448 100644 --- a/hw/ppc/spapr_hcall.c +++ b/hw/ppc/spapr_hcall.c @@ -14,6 +14,88 @@ struct SPRSyncState { target_ulong mask; }; +/* Offset from rtas-base where error log is placed */ +#define RTAS_ERROR_OFFSET (TARGET_PAGE_SIZE) You can't assume this. Please compute the value at the same place you compute the rtas-size. sure + +#define RTAS_ELOG_SEVERITY_SHIFT 0x5 +#define RTAS_ELOG_DISPOSITION_SHIFT 0x3 +#define RTAS_ELOG_INITIATOR_SHIFT0x4 + +/* + * Only required RTAS event severity, disposition, initiator + * target and type are copied from arch/powerpc/include/asm/rtas.h + */ + +/* RTAS event severity */ +#define RTAS_SEVERITY_ERROR_SYNC0x3 + +/* RTAS event disposition */ +#define RTAS_DISP_NOT_RECOVERED 0x2 + +/* RTAS event initiator */ +#define RTAS_INITIATOR_MEMORY 0x4 + +/* RTAS event target */ +#define RTAS_TARGET_MEMORY 0x4 + +/* RTAS event type */ +#define RTAS_TYPE_ECC_UNCORR0x09 + +/* + * Currently KVM only passes on the uncorrected machine + * check memory error to guest. Other machine check errors + * such as SLB multi-hit and TLB multi-hit are recovered + * in KVM and are not passed on to guest. + * + * DSISR Bit for uncorrected machine check error. Based + * on arch/powerpc/include/asm/mce.h Please don't include Linux code. This file is GPLv2+ licensed and I don't want to taint it as GPLv2 only just for the sake of mce. Hmm.. ok. In that case I should not copy rtas_error_log structure below from kernel source as well. + */ +#define PPC_BIT(bit)(0x8000ULL bit) +#define P7_DSISR_MC_UE (PPC_BIT(48)) /* P8 too */ + +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */ +struct rtas_error_log { +/* Byte 0 */ +uint8_t byte0; /* Architectural version */ + +/* Byte 1 */ +uint8_t byte1; +/* + * XXX 3: Severity level of error + *XX2: Degree of recovery + * X 1: Extended log present? + * XX 2: Reserved + */ + +/* Byte 2 */ +uint8_t byte2; +/* + * 4: Initiator of event + * 4: Target of failed operation + */ +uint8_t byte3; /* General event or error*/ +}; + +/* + * Data format in RTAS-Blob + * + * This structure contains error information related to Machine + * Check exception. This is filled up and copied to rtas-blob + * upon machine check exception. + */ +struct rtas_mc_log { +target_ulong srr0; +target_ulong srr1; +/* + * Beginning of error log address. This is properly + * populated and passed on to OS registered machine + * check notification routine upon machine check + * exception + */ +target_ulong r3; +struct rtas_error_log err_log; +}; + static void do_spr_sync(void *arg) { struct SPRSyncState *s = arg; @@ -586,6 +668,77 @@ static target_ulong h_rtas_update(PowerPCCPU *cpu, sPAPREnvironment *spapr, return 0; } +static target_ulong h_report_mc_err(PowerPCCPU *cpu, sPAPREnvironment *spapr, + target_ulong opcode, target_ulong *args) +{ +struct rtas_mc_log mc_log; +CPUPPCState *env = cpu-env; +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); + +/* + * We save the original r3 register in SPRG2 in 0x200 vector, + * which is patched during call to ibm.nmi-register. Original + * r3 is required to be included in error log + */ +mc_log.r3 = env-spr[SPR_SPRG2]; Don't you have to call cpu_synchronize_registers() before you access SPRG2? Otherwise the value may be stale. Yes true. Will include. + +/* + * SRR0 and SRR1,
[Qemu-devel] [PATCH v7 09/28] e1000: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/e1000.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/hw/net/e1000.c b/hw/net/e1000.c index 272df00..0edbfa6 100644 --- a/hw/net/e1000.c +++ b/hw/net/e1000.c @@ -1621,10 +1621,19 @@ static void e1000_class_init(ObjectClass *klass, void *data) dc-props = e1000_properties; } +static void e1000_instance_init(Object *obj) +{ +E1000State *n = E1000(obj); +device_add_bootindex_property(obj, n-conf.bootindex, + bootindex, /ethernet-phy@0, + DEVICE(n), NULL); +} + static const TypeInfo e1000_base_info = { .name = TYPE_E1000_BASE, .parent= TYPE_PCI_DEVICE, .instance_size = sizeof(E1000State), +.instance_init = e1000_instance_init, .class_size= sizeof(E1000BaseClass), .abstract = true, }; @@ -1668,6 +1677,7 @@ static void e1000_register_types(void) type_info.parent = TYPE_E1000_BASE; type_info.class_data = (void *)info; type_info.class_init = e1000_class_init; +type_info.instance_init = e1000_instance_init; type_register(type_info); } -- 1.7.12.4
[Qemu-devel] [PATCH v7 05/28] bootindex: rework add_boot_device_path function
From: Gonglei arei.gong...@huawei.com Add the function of updating bootindex about fw_boot_order list in add_boot_device_path(). We should delete the old one if a device has existed in global fw_boot_order list. Signed-off-by: Gonglei arei.gong...@huawei.com --- bootdevice.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/bootdevice.c b/bootdevice.c index 89aca7f..6f430ec 100644 --- a/bootdevice.c +++ b/bootdevice.c @@ -72,6 +72,34 @@ void del_boot_device_path(DeviceState *dev) } } +static void del_original_boot_device(DeviceState *dev, const char *suffix) +{ +FWBootEntry *i; + +if (dev == NULL) { +return; +} + +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (suffix) { +if (i-dev == dev !strcmp(i-suffix, suffix)) { +QTAILQ_REMOVE(fw_boot_order, i, link); +g_free(i-suffix); +g_free(i); + +break; +} +} else { /* host-usb and scsi devices do not have a suffix */ +if (i-dev == dev) { +QTAILQ_REMOVE(fw_boot_order, i, link); +g_free(i); + +break; +} +} +} +} + void add_boot_device_path(int32_t bootindex, DeviceState *dev, const char *suffix) { @@ -83,6 +111,8 @@ void add_boot_device_path(int32_t bootindex, DeviceState *dev, assert(dev != NULL || suffix != NULL); +del_original_boot_device(dev, suffix); + node = g_malloc0(sizeof(FWBootEntry)); node-bootindex = bootindex; node-suffix = g_strdup(suffix); -- 1.7.12.4
Re: [Qemu-devel] [Qemu-ppc] [PATCH 4/5] target-ppc: Handle ibm, nmi-register RTAS call
On 04.09.14 15:49, Aravinda Prasad wrote: On Thursday 04 September 2014 06:39 PM, Alexander Graf wrote: Am 04.09.2014 um 10:25 schrieb Aravinda Prasad aravi...@linux.vnet.ibm.com: On Friday 29 August 2014 03:46 AM, Alexander Graf wrote: On 28.08.14 19:42, Aravinda Prasad wrote: On Thursday 28 August 2014 02:07 PM, Alexander Graf wrote: On 28.08.14 08:38, Aravinda Prasad wrote: On Wednesday 27 August 2014 04:07 PM, Alexander Graf wrote: On 25.08.14 15:45, Aravinda Prasad wrote: This patch adds FWNMI support in qemu for powerKVM guests by handling the ibm,nmi-register rtas call. Whenever OS issues ibm,nmi-register RTAS call, the machine check notification address is saved and the machine check interrupt vector 0x200 is patched to issue a private hcall. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr_rtas.c| 91 include/hw/ppc/spapr.h |8 2 files changed, 98 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c index 02ddbf9..1135d2b 100644 --- a/hw/ppc/spapr_rtas.c +++ b/hw/ppc/spapr_rtas.c @@ -277,6 +277,91 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu, rtas_st(rets, 0, ret); } +static void rtas_ibm_nmi_register(PowerPCCPU *cpu, + sPAPREnvironment *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, + uint32_t nret, target_ulong rets) +{ +int i; +uint32_t branch_inst = 0x4802; +target_ulong guest_machine_check_addr; +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); +/* + * Trampoline saves r3 in sprg2 and issues private hcall + * to request qemu to build error log. QEMU builds the + * error log, copies to rtas-blob and returns the address. + * The initial 16 bytes in rtas-blob consists of saved srr0 + * and srr1 which we restore and pass on the actual error + * log address to OS handled mcachine check notification + * routine + */ +uint32_t trampoline[] = { +0x7c7243a6,/* mtspr SPRN_SPRG2,r3 */ +0x3860,/* li r3,0 */ +/* 0xf004 is the KVMPPC_H_REPORT_ERR private HCALL */ +0x6063f004,/* ori r3,r3,f004 */ +/* Issue H_CALL */ +0x4422,/* sc 1 */ So up to here we're saving r3 in SPRG2 (how do we know that we can clobber it?) and call our special hypercall. But what does all the cruft below here do? The saved r3 in SPRG2 is consumed in KVMPPC_H_REPORT_ERR hcall, hence we can clobber SPRG2 after hcall returns. I have included a comment in patch 3/5 while building error log. I think better I add one here as well. +0x7c9243a6,/* mtspr r4 sprg2 */ Apart from th fact that your order is wrong, this destroys the value of r3 that we saved above again. SPRG2 is saved inside hcall and hence we don't need SPRG2 further after KVMPPC_H_REPORT_ERR hcall returns. +0xe883,/* ld r4, 0(r3) */ +0x7c9a03a6,/* mtspr r4, srr0 */ +0xe8830008,/* ld r4, 8(r3) */ +0x7c9b03a6,/* mtspr r4, srr1 */ Can't we just set srr0 and srr1 directly? I checked for instructions in ISA which set srr0/1 directly given an address, but could not find any such instructions. I mean from QEMU :). srr0 and srr1, which are properly set when 0x200 is invoked, are clobbered when we return from KVMPPC_H_REPORT_ERR hcall. I think they are modified before issuing rfid (I can see them getting clobbered from QEMU monitor). However when we jump to OS registered machine check routine srr0 and srr1 should reflect the value they had when 0x200 was invoked. Hence srr0 and srr1 are saved in hcall and restored when we return from hcall. Also we don't have enough scratch registers available to save these before invoking hcall from 0x200. Or am I missing other ways to do this from QEMU? If you just do cpu_synchronize_state() and then change env-spr[SPRN_SRR0/1] inside your hypercall handler that should also change the value when you return from the hcall. I tried cpu_synchronize_state(), however, srr0 and srr1 are still clobbered. Just before I issue hcall from 0x200 I see the following values from QEMU monitor: SRR0 d0f40264 SRR1 80209033 Inside hcall, I call cpu_synchronize_state(). As soon as I return from hcall I see: SRR0 0214 SRR1 80001001 I see SRR0 is now set to nip in 0x200 and SRR1 to msr value. I think it is reset during returning from hcall before executing rfid. Ah, because the hypercall is an interrupt itself. Heh ;). True. Alex
[Qemu-devel] [PATCH v7 28/28] bootindex: delete bootindex when device is removed
From: Gonglei arei.gong...@huawei.com Device should be removed from global boot list when it is hot-unplugged. Signed-off-by: Gonglei arei.gong...@huawei.com --- bootdevice.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/bootdevice.c b/bootdevice.c index e95d085..1255c06 100644 --- a/bootdevice.c +++ b/bootdevice.c @@ -254,6 +254,8 @@ static void property_release_bootindex(Object *obj, const char *name, { BootIndexProperty *prop = opaque; + +del_boot_device_path(prop-dev); g_free(prop); } -- 1.7.12.4
[Qemu-devel] [PATCH v7 25/28] virtio-blk: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/block/virtio-blk.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index a7f2827..9d04590 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -805,6 +805,9 @@ static void virtio_blk_instance_init(Object *obj) (Object **)s-blk.iothread, qdev_prop_allow_set_link_before_realize, OBJ_PROP_LINK_UNREF_ON_RELEASE, NULL); +device_add_bootindex_property(obj, s-conf-bootindex, + bootindex, /disk@0,0, + DEVICE(obj), NULL); } static Property virtio_blk_properties[] = { -- 1.7.12.4
Re: [Qemu-devel] NBD TLS support in QEMU
On Wed, Sep 03, 2014 at 05:44:17PM +0100, Stefan Hajnoczi wrote: Hi, QEMU offers both NBD client and server functionality. The NBD protocol runs unencrypted, which is a problem when the client and server communicate over an untrusted network. The particular use case that prompted this mail is storage migration in OpenStack. The goal is to encrypt the NBD connection between source and destination hosts during storage migration. I think we can integrate TLS into the NBD protocol as an optional flag. A quick web search does not reveal existing open source SSL/TLS NBD implementations. I do see a VMware NBDSSL protocol but there is no specification so I guess it is proprietary. Not sold on this because: - It requires (unnecessary) changes to the NBD specification. - It would still be vulnerable to a protocol downgrade attack: ie. An attacker could strip the TLS flag from the server's response resulting in either: * Connection fallback to cleartext, if both ends don't force TLS. * Loss of backward compatibility, if one of the ends forces TLS, making the reason for which such a flag is added moot IIUC. IMO, it is more fool proof add some --use-tls flag to the client and server implementations to wrap the data in TLS, just like HTTPS does for instance. Also, so mean of verification is required (otherwise, back to point 0 being vulnerable to sslstrip style attacks) either that the server's cert is signed with a certain (self-generated) CA certificate or that it matches a certain fingerprint. Doing it similarly on the server-side would allow hitting a 2nd bird (authentication.) The NBD protocol starts with a negotiation phase. This would be the appropriate place to indicate that TLS will be used. After client and server complete TLS setup the connection can continue as normal. Besides QEMU, the userspace NBD tools (http://nbd.sf.net/) can also be extended to support TLS. In this case the kernel needs a localhost socket and userspace handles TLS. Thoughts? Stefan
[Qemu-devel] [PATCH v7 02/28] bootindex: add check bootindex function
From: Gonglei arei.gong...@huawei.com Determine whether a given bootindex exists or not. If exists, we report an error. Signed-off-by: Gonglei arei.gong...@huawei.com --- bootdevice.c| 15 +++ include/sysemu/sysemu.h | 1 + 2 files changed, 16 insertions(+) diff --git a/bootdevice.c b/bootdevice.c index d5b8789..f5399df 100644 --- a/bootdevice.c +++ b/bootdevice.c @@ -36,6 +36,21 @@ struct FWBootEntry { static QTAILQ_HEAD(, FWBootEntry) fw_boot_order = QTAILQ_HEAD_INITIALIZER(fw_boot_order); +void check_boot_index(int32_t bootindex, Error **errp) +{ +FWBootEntry *i; + +if (bootindex = 0) { +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (i-bootindex == bootindex) { +error_setg(errp, The bootindex %d has already been used, + bootindex); +return; +} +} +} +} + void add_boot_device_path(int32_t bootindex, DeviceState *dev, const char *suffix) { diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index 8de5100..72463de 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -213,6 +213,7 @@ void add_boot_device_path(int32_t bootindex, DeviceState *dev, char *get_boot_devices_list(size_t *size, bool ignore_suffixes); DeviceState *get_boot_device(uint32_t position); +void check_boot_index(int32_t bootindex, Error **errp); QemuOpts *qemu_get_machine_opts(void); -- 1.7.12.4
[Qemu-devel] [PATCH v7 04/28] fw_cfg: add fw_cfg_machine_reset function
From: Gonglei arei.gong...@huawei.com We must assure that the changed bootindex can take effect when guest is rebooted. So we introduce fw_cfg_machine_reset(), which change the fw_cfg file's bootindex data using the new global fw_boot_order list. Signed-off-by: Chenliang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/nvram/fw_cfg.c | 55 --- include/hw/nvram/fw_cfg.h | 2 ++ 2 files changed, 54 insertions(+), 3 deletions(-) diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index b71d251..e7ed27e 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -402,6 +402,26 @@ static void fw_cfg_add_bytes_read_callback(FWCfgState *s, uint16_t key, s-entries[arch][key].callback_opaque = callback_opaque; } +static void *fw_cfg_modify_bytes_read(FWCfgState *s, uint16_t key, + void *data, size_t len) +{ +void *ptr; +int arch = !!(key FW_CFG_ARCH_LOCAL); + +key = FW_CFG_ENTRY_MASK; + +assert(key FW_CFG_MAX_ENTRY len UINT32_MAX); + +/* return the old data to the function caller, avoid memory leak */ +ptr = s-entries[arch][key].data; +s-entries[arch][key].data = data; +s-entries[arch][key].len = len; +s-entries[arch][key].callback_opaque = NULL; +s-entries[arch][key].callback = NULL; + +return ptr; +} + void fw_cfg_add_bytes(FWCfgState *s, uint16_t key, void *data, size_t len) { fw_cfg_add_bytes_read_callback(s, key, NULL, NULL, data, len); @@ -499,13 +519,42 @@ void fw_cfg_add_file(FWCfgState *s, const char *filename, fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len); } -static void fw_cfg_machine_ready(struct Notifier *n, void *data) +void *fw_cfg_modify_file(FWCfgState *s, const char *filename, +void *data, size_t len) +{ +int i, index; + +assert(s-files); + +index = be32_to_cpu(s-files-count); +assert(index FW_CFG_FILE_SLOTS); + +for (i = 0; i index; i++) { +if (strcmp(filename, s-files-f[i].name) == 0) { +return fw_cfg_modify_bytes_read(s, FW_CFG_FILE_FIRST + i, + data, len); +} +} +/* add new one */ +fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len); +return NULL; +} + +static void fw_cfg_machine_reset(void *opaque) { +void *ptr; size_t len; -FWCfgState *s = container_of(n, FWCfgState, machine_ready); +FWCfgState *s = opaque; char *bootindex = get_boot_devices_list(len, false); -fw_cfg_add_file(s, bootorder, (uint8_t*)bootindex, len); +ptr = fw_cfg_modify_file(s, bootorder, (uint8_t *)bootindex, len); +g_free(ptr); +} + +static void fw_cfg_machine_ready(struct Notifier *n, void *data) +{ +FWCfgState *s = container_of(n, FWCfgState, machine_ready); +qemu_register_reset(fw_cfg_machine_reset, s); } FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port, diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h index 72b1549..56e1ed7 100644 --- a/include/hw/nvram/fw_cfg.h +++ b/include/hw/nvram/fw_cfg.h @@ -76,6 +76,8 @@ void fw_cfg_add_file(FWCfgState *s, const char *filename, void *data, void fw_cfg_add_file_callback(FWCfgState *s, const char *filename, FWCfgReadCallback callback, void *callback_opaque, void *data, size_t len); +void *fw_cfg_modify_file(FWCfgState *s, const char *filename, void *data, + size_t len); FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port, hwaddr crl_addr, hwaddr data_addr); -- 1.7.12.4
[Qemu-devel] [PATCH v7 06/28] bootindex: support to set a existent device's bootindex to -1
From: Gonglei arei.gong...@huawei.com When set a device's bootindex to -1, we remove it from global fw_boot_order list. Signed-off-by: Gonglei arei.gong...@huawei.com --- bootdevice.c | 1 + 1 file changed, 1 insertion(+) diff --git a/bootdevice.c b/bootdevice.c index 6f430ec..484d0c9 100644 --- a/bootdevice.c +++ b/bootdevice.c @@ -106,6 +106,7 @@ void add_boot_device_path(int32_t bootindex, DeviceState *dev, FWBootEntry *node, *i; if (bootindex 0) { +del_original_boot_device(dev, suffix); return; } -- 1.7.12.4
[Qemu-devel] [PATCH v7 10/28] eepro100: add bootindex to qom property
From: Gonglei arei.gong...@huawei.com Add a qom property with the same name 'bootindex', when we remove it form qdev property, things will continue to work just fine, and we can use qom features which are not supported by qdev property. Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/net/eepro100.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c index 3cd826a..fb9c944 100644 --- a/hw/net/eepro100.c +++ b/hw/net/eepro100.c @@ -1906,6 +1906,14 @@ static int e100_nic_init(PCIDevice *pci_dev) return 0; } +static void eepro100_instance_init(Object *obj) +{ +EEPRO100State *s = DO_UPCAST(EEPRO100State, dev, PCI_DEVICE(obj)); +device_add_bootindex_property(obj, s-conf.bootindex, + bootindex, /ethernet-phy@0, + DEVICE(s), NULL); +} + static E100PCIDeviceInfo e100_devices[] = { { .name = i82550, @@ -2104,7 +2112,8 @@ static void eepro100_register_types(void) type_info.parent = TYPE_PCI_DEVICE; type_info.class_init = eepro100_class_init; type_info.instance_size = sizeof(EEPRO100State); - +type_info.instance_init = eepro100_instance_init; + type_register(type_info); } } -- 1.7.12.4
[Qemu-devel] [PATCH v7 01/28] bootdevice: move bootdevice related code to new file bootdevice.c
From: Gonglei arei.gong...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- Makefile.target | 2 +- bootdevice.c| 142 include/sysemu/sysemu.h | 1 + vl.c| 118 +--- 4 files changed, 145 insertions(+), 118 deletions(-) create mode 100644 bootdevice.c diff --git a/Makefile.target b/Makefile.target index 1e8d7ab..e9ff1ee 100644 --- a/Makefile.target +++ b/Makefile.target @@ -127,7 +127,7 @@ endif #CONFIG_BSD_USER # System emulator target ifdef CONFIG_SOFTMMU obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o -obj-y += qtest.o +obj-y += qtest.o bootdevice.o obj-y += hw/ obj-$(CONFIG_FDT) += device_tree.o obj-$(CONFIG_KVM) += kvm-all.o diff --git a/bootdevice.c b/bootdevice.c new file mode 100644 index 000..d5b8789 --- /dev/null +++ b/bootdevice.c @@ -0,0 +1,142 @@ +/* + * QEMU Boot Device Implement + * + * Copyright (c) 2014 HUAWEI TECHNOLOGIES CO.,LTD. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include sysemu/sysemu.h + +typedef struct FWBootEntry FWBootEntry; + +struct FWBootEntry { +QTAILQ_ENTRY(FWBootEntry) link; +int32_t bootindex; +DeviceState *dev; +char *suffix; +}; + +static QTAILQ_HEAD(, FWBootEntry) fw_boot_order = +QTAILQ_HEAD_INITIALIZER(fw_boot_order); + +void add_boot_device_path(int32_t bootindex, DeviceState *dev, + const char *suffix) +{ +FWBootEntry *node, *i; + +if (bootindex 0) { +return; +} + +assert(dev != NULL || suffix != NULL); + +node = g_malloc0(sizeof(FWBootEntry)); +node-bootindex = bootindex; +node-suffix = g_strdup(suffix); +node-dev = dev; + +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (i-bootindex == bootindex) { +fprintf(stderr, Two devices with same boot index %d\n, bootindex); +exit(1); +} else if (i-bootindex bootindex) { +continue; +} +QTAILQ_INSERT_BEFORE(i, node, link); +return; +} +QTAILQ_INSERT_TAIL(fw_boot_order, node, link); +} + +DeviceState *get_boot_device(uint32_t position) +{ +uint32_t counter = 0; +FWBootEntry *i = NULL; +DeviceState *res = NULL; + +if (!QTAILQ_EMPTY(fw_boot_order)) { +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (counter == position) { +res = i-dev; +break; +} +counter++; +} +} +return res; +} + +/* + * This function returns null terminated string that consist of new line + * separated device paths. + * + * memory pointed by size is assigned total length of the array in bytes + * + */ +char *get_boot_devices_list(size_t *size, bool ignore_suffixes) +{ +FWBootEntry *i; +size_t total = 0; +char *list = NULL; + +QTAILQ_FOREACH(i, fw_boot_order, link) { +char *devpath = NULL, *bootpath; +size_t len; + +if (i-dev) { +devpath = qdev_get_fw_dev_path(i-dev); +assert(devpath); +} + +if (i-suffix !ignore_suffixes devpath) { +size_t bootpathlen = strlen(devpath) + strlen(i-suffix) + 1; + +bootpath = g_malloc(bootpathlen); +snprintf(bootpath, bootpathlen, %s%s, devpath, i-suffix); +g_free(devpath); +} else if (devpath) { +bootpath = devpath; +} else if (!ignore_suffixes) { +assert(i-suffix); +bootpath = g_strdup(i-suffix); +} else { +bootpath = g_strdup(); +} + +if (total) { +list[total-1] = '\n'; +} +len = strlen(bootpath) + 1; +list = g_realloc(list, total + len); +memcpy(list[total], bootpath, len); +total += len; +g_free(bootpath); +} + +*size = total; + +if (boot_strict
Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf
Hi, I can't track this back far enough. I'd feel safer if you checked that the multiplication can't overflow even in uint64_t. Effectively it comes from the emulated graphics hardware (anything in hw/display/*). The gfx emulation must make sure that the framebuffer fits into the video memory, which in turn pretty much implies that we can't overflow uint64_t. I think even uint32_t can't overflow with the gfx hardware we are emulating today. (5) Instead, you really need to make sure that the very first multiplication fits in a signed int: Makes sense. I think it is easier to just multiply in 64bit, then check the result is small enougth (new patch attached). /* display listener callbacks */ @@ -495,7 +503,7 @@ static void interface_get_init_info(QXLInstance *sin, QXLDevInitInfo *info) info-num_memslots = NUM_MEMSLOTS; info-num_memslots_groups = NUM_MEMSLOTS_GROUPS; info-internal_groupslot_id = 0; -info-qxl_ram_size = ssd-bufsize; +info-qxl_ram_size = 16 * 1024 * 1024; info-n_surfaces = ssd-num_surfaces; } spice-server doesn't do anything with it, other than passing to spice-client. Not fully sure what spice-client uses this for, maybe as some kind of hint for resource management. Maybe not at all. It surely doesn't matter at all for ssd-buf size. cheers, Gerd
[Qemu-devel] [PATCH v7 27/28] bootindex: move calling add_boot_device_patch to bootindex setter function
From: Gonglei arei.gong...@huawei.com On this way, we can assure the new bootindex take effect during vm rebooting. Signed-off-by: Gonglei arei.gong...@huawei.com --- bootdevice.c | 2 ++ hw/block/fdc.c | 3 --- hw/block/virtio-blk.c| 2 -- hw/i386/kvm/pci-assign.c | 2 -- hw/ide/qdev.c| 3 --- hw/misc/vfio.c | 1 - hw/net/e1000.c | 2 -- hw/net/eepro100.c| 2 -- hw/net/ne2000.c | 2 -- hw/net/pcnet.c | 2 -- hw/net/rtl8139.c | 2 -- hw/net/spapr_llan.c | 2 -- hw/net/virtio-net.c | 2 -- hw/net/vmxnet3.c | 2 -- hw/scsi/scsi-disk.c | 1 - hw/scsi/scsi-generic.c | 3 --- hw/usb/dev-network.c | 1 - hw/usb/host-libusb.c | 1 - hw/usb/redirect.c| 1 - 19 files changed, 2 insertions(+), 34 deletions(-) diff --git a/bootdevice.c b/bootdevice.c index c669293..e95d085 100644 --- a/bootdevice.c +++ b/bootdevice.c @@ -241,6 +241,8 @@ static void device_set_bootindex(Object *obj, Visitor *v, void *opaque, /* change bootindex to a new one */ *prop-bootindex = boot_index; +add_boot_device_path(*prop-bootindex, prop-dev, prop-suffix); + out: if (local_err) { error_propagate(errp, local_err); diff --git a/hw/block/fdc.c b/hw/block/fdc.c index 51b1f08..970b7c3 100644 --- a/hw/block/fdc.c +++ b/hw/block/fdc.c @@ -2142,9 +2142,6 @@ static void isabus_fdc_realize(DeviceState *dev, Error **errp) error_propagate(errp, err); return; } - -add_boot_device_path(isa-bootindexA, dev, /floppy@0); -add_boot_device_path(isa-bootindexB, dev, /floppy@1); } static void sysbus_fdc_initfn(Object *obj) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 4d05114..787955b 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -777,8 +777,6 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp) bdrv_set_guest_block_size(s-bs, s-conf-logical_block_size); bdrv_iostatus_enable(s-bs); - -add_boot_device_path(s-conf-bootindex, dev, /disk@0,0); } static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp) diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index 56f387b..4159978 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -1825,8 +1825,6 @@ static int assigned_initfn(struct PCIDevice *pci_dev) assigned_dev_load_option_rom(dev); -add_boot_device_path(dev-bootindex, pci_dev-qdev, NULL); - return 0; assigned_out: diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index 8382f24..daf6992 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -185,9 +185,6 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind) dev-serial = g_strdup(s-drive_serial_str); } -add_boot_device_path(dev-conf.bootindex, dev-qdev, - dev-unit ? /disk@1 : /disk@0); - return 0; } diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index cdbaedf..d073304 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -4296,7 +4296,6 @@ static int vfio_initfn(PCIDevice *pdev) } } -add_boot_device_path(vdev-bootindex, pdev-qdev, NULL); vfio_register_err_notifier(vdev); return 0; diff --git a/hw/net/e1000.c b/hw/net/e1000.c index 2e5dc41..34dde84 100644 --- a/hw/net/e1000.c +++ b/hw/net/e1000.c @@ -1569,8 +1569,6 @@ static int pci_e1000_init(PCIDevice *pci_dev) qemu_format_nic_info_str(qemu_get_queue(d-nic), macaddr); -add_boot_device_path(d-conf.bootindex, dev, /ethernet-phy@0); - d-autoneg_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, e1000_autoneg_timer, d); d-mit_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, e1000_mit_timer, d); diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c index cc79f09..9acc8a3 100644 --- a/hw/net/eepro100.c +++ b/hw/net/eepro100.c @@ -1901,8 +1901,6 @@ static int e100_nic_init(PCIDevice *pci_dev) s-vmstate-name = qemu_get_queue(s-nic)-model; vmstate_register(pci_dev-qdev, -1, s-vmstate, s); -add_boot_device_path(s-conf.bootindex, pci_dev-qdev, /ethernet-phy@0); - return 0; } diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c index 6d31555..b6d8cb1 100644 --- a/hw/net/ne2000.c +++ b/hw/net/ne2000.c @@ -738,8 +738,6 @@ static int pci_ne2000_init(PCIDevice *pci_dev) object_get_typename(OBJECT(pci_dev)), pci_dev-qdev.id, s); qemu_format_nic_info_str(qemu_get_queue(s-nic), s-c.macaddr.a); -add_boot_device_path(s-c.bootindex, pci_dev-qdev, /ethernet-phy@0); - return 0; } diff --git a/hw/net/pcnet.c b/hw/net/pcnet.c index 5299d52..d344c15 100644 --- a/hw/net/pcnet.c +++ b/hw/net/pcnet.c @@ -1735,8 +1735,6 @@ int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info) s-nic = qemu_new_nic(info, s-conf, object_get_typename(OBJECT(dev)), dev-id, s); qemu_format_nic_info_str(qemu_get_queue(s-nic), s-conf.macaddr.a); -add_boot_device_path(s-conf.bootindex,
Re: [Qemu-devel] [PATCH] translate-all.c: fix debug memory maps printing
I've also found that this issue in walker API leads to creating a misformed core file. elf_core_dump() uses walk_memory_regions() to build memory mapping for a core file. As a result the core file has a very small size and doesn't contain page snapshots of mapped libraries. I've compiled a simple helloworld prog (which dereference a NULL pointer at the end to get a core file) and run the prog twice without a workaround patch and with it. $ gcc -m32 -g -Wall -o /tmp/prog /tmp/prog.c $ ulimic -c unlimited I use i386 but the same works for ARM and believe for other 32bits arches. $ qemu-i386 /tmp/prog Hello world! qemu: uncaught target signal 11 (Segmentation fault) - core dumped Segmentation fault (core dumped) Here is core files sizes without a workaround patch and with it. $ ls -l 892 qemu_prog_20140905-121147_21636.core 8454144 qemu_prog_20140905-120737_21355.core Also I've investigate them with gdb. No workaround: $ gdb /tmp/prog qemu_prog_20140905-121147_21636.core Cannot access memory at address 0x407fffe4 #-1 0x08048406 in main () at /tmp/prog.c:8 Here is the fail because there is no mapping that contains address 0x407fffe4. (gdb) info files 0x00048000 - 0x00048000 is load1 0x00049000 - 0x00049000 is load2 0x0004 - 0x0004 is load3 0x00041000 - 0x00041000 is load4 0x00041800 - 0x00041800 is load5 0x00061800 - 0x00061800 is load6 0x00062800 - 0x00062800 is load7 0x00045800 - 0x00045800 is load8 0x001ee800 - 0x001ee800 is load9 0x001ef800 - 0x001ef800 is load10 0x001f1800 - 0x001f1800 is load11 With workaround: 0x08048000 - 0x08048000 is load1 0x08049000 - 0x0804a000 is load2 0x4000 - 0x4000 is load3 0x40001000 - 0x40801000 is load4 0x40801000 - 0x40801000 is load5 0x40821000 - 0x40822000 is load6 0x40822000 - 0x40827000 is load7 0x40845000 - 0x40845000 is load8 0x409ee000 - 0x409ee000 is load9 0x409ef000 - 0x409f1000 is load10 0x409f1000 - 0x409f7000 is load11 address 0x407fffe4 belongs to load4. Also this core includes other library mappings that are not presented in the core file without a workaround: 0x40845174 - 0x40845198 is /lib/i386-linux-gnu/libc.so.6 0x40845198 - 0x408451b8 is /lib/i386-linux-gnu/libc.so.6 0x408451b8 - 0x40848ec8 is /lib/i386-linux-gnu/libc.so.6 0x40848ec8 - 0x40852438 is /lib/i386-linux-gnu/libc.so.6 0x40852438 - 0x4085815e is /lib/i386-linux-gnu/libc.so.6 0x4085815e - 0x4085940c is /lib/i386-linux-gnu/libc.so.6 0x4085940c - 0x40859898 is /lib/i386-linux-gnu/libc.so.6 0x40859898 - 0x408598d8 is /lib/i386-linux-gnu/libc.so.6 0x408598d8 - 0x4085c2e8 is /lib/i386-linux-gnu/libc.so.6 0x4085c2e8 - 0x4085c348 is /lib/i386-linux-gnu/libc.so.6 0x4085c350 - 0x4085c420 is /lib/i386-linux-gnu/libc.so.6 0x4085c420 - 0x4098e44e is /lib/i386-linux-gnu/libc.so.6 0x4098e450 - 0x4098f3db is /lib/i386-linux-gnu/libc.so.6 0x4098f3e0 - 0x4098f5de is /lib/i386-linux-gnu/libc.so.6 This patch fails to compile for MIPS N32 (a 32-bit ABI with a 64-bit virtual address space). I also wonder we have separate linux-user emulators for i386 (32 bit ABI + 32 bit address space) and amd64 binaries (64 bit ABI + 64 bit address space). And we can not run 32 bits apps under qemu-x86_64 but MIPS N32 looks in some other way and it supports 32 bit ABI apps with 64 bit address space. I really not sure but is it a right design or not? Riku, do you have any ideas (or free cycles to do the change)? Please, give some clues. Thanks. On 25.08.2014 16:05, Paolo Bonzini wrote: Il 25/08/2014 13:45, Paolo Bonzini ha scritto: Il 11/08/2014 12:28, Mikhail Ilyin ha scritto: Fix memory maps textualizing function. The output was not correct because of wrong base address calculation. The initial address has to be shifted also for TARGET_PAGE_BITS. Signed-off-by: Mikhail Ilyin m.i...@samsung.com --- translate-all.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/translate-all.c b/translate-all.c index 8f7e11b..cb7a33d 100644 --- a/translate-all.c +++ b/translate-all.c @@ -1728,9 +1728,8 @@ int walk_memory_regions(void *priv, walk_memory_regions_fn fn) data.prot = 0; for (i = 0; i V_L1_SIZE; i++) { -int rc = walk_memory_regions_1(data, (abi_ulong)i V_L1_SHIFT, +int rc = walk_memory_regions_1(data, (abi_ulong)i (V_L1_SHIFT + TARGET_PAGE_BITS), V_L1_SHIFT / V_L2_BITS - 1, l1_map + i); - if (rc != 0) { return rc; } Thanks, this is simple enough that I've queued it. Ouch, I spoke too soon. This patch fails to compile for MIPS N32 (a 32-bit ABI with a 64-bit virtual address space). I'm not sure if there is a simple fix. walk_memory_regions and
Re: [Qemu-devel] [Qemu-ppc] [PATCH 4/5] target-ppc: Handle ibm, nmi-register RTAS call
On Friday 05 September 2014 02:16 PM, Alexander Graf wrote: On 04.09.14 15:49, Aravinda Prasad wrote: On Thursday 04 September 2014 06:39 PM, Alexander Graf wrote: Am 04.09.2014 um 10:25 schrieb Aravinda Prasad aravi...@linux.vnet.ibm.com: On Friday 29 August 2014 03:46 AM, Alexander Graf wrote: On 28.08.14 19:42, Aravinda Prasad wrote: On Thursday 28 August 2014 02:07 PM, Alexander Graf wrote: On 28.08.14 08:38, Aravinda Prasad wrote: On Wednesday 27 August 2014 04:07 PM, Alexander Graf wrote: On 25.08.14 15:45, Aravinda Prasad wrote: This patch adds FWNMI support in qemu for powerKVM guests by handling the ibm,nmi-register rtas call. Whenever OS issues ibm,nmi-register RTAS call, the machine check notification address is saved and the machine check interrupt vector 0x200 is patched to issue a private hcall. Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com --- hw/ppc/spapr_rtas.c| 91 include/hw/ppc/spapr.h |8 2 files changed, 98 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c index 02ddbf9..1135d2b 100644 --- a/hw/ppc/spapr_rtas.c +++ b/hw/ppc/spapr_rtas.c @@ -277,6 +277,91 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu, rtas_st(rets, 0, ret); } +static void rtas_ibm_nmi_register(PowerPCCPU *cpu, + sPAPREnvironment *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, + uint32_t nret, target_ulong rets) +{ +int i; +uint32_t branch_inst = 0x4802; +target_ulong guest_machine_check_addr; +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); +/* + * Trampoline saves r3 in sprg2 and issues private hcall + * to request qemu to build error log. QEMU builds the + * error log, copies to rtas-blob and returns the address. + * The initial 16 bytes in rtas-blob consists of saved srr0 + * and srr1 which we restore and pass on the actual error + * log address to OS handled mcachine check notification + * routine + */ +uint32_t trampoline[] = { +0x7c7243a6,/* mtspr SPRN_SPRG2,r3 */ +0x3860,/* li r3,0 */ +/* 0xf004 is the KVMPPC_H_REPORT_ERR private HCALL */ +0x6063f004,/* ori r3,r3,f004 */ +/* Issue H_CALL */ +0x4422,/* sc 1 */ So up to here we're saving r3 in SPRG2 (how do we know that we can clobber it?) and call our special hypercall. But what does all the cruft below here do? The saved r3 in SPRG2 is consumed in KVMPPC_H_REPORT_ERR hcall, hence we can clobber SPRG2 after hcall returns. I have included a comment in patch 3/5 while building error log. I think better I add one here as well. +0x7c9243a6,/* mtspr r4 sprg2 */ Apart from th fact that your order is wrong, this destroys the value of r3 that we saved above again. SPRG2 is saved inside hcall and hence we don't need SPRG2 further after KVMPPC_H_REPORT_ERR hcall returns. +0xe883,/* ld r4, 0(r3) */ +0x7c9a03a6,/* mtspr r4, srr0 */ +0xe8830008,/* ld r4, 8(r3) */ +0x7c9b03a6,/* mtspr r4, srr1 */ Can't we just set srr0 and srr1 directly? I checked for instructions in ISA which set srr0/1 directly given an address, but could not find any such instructions. I mean from QEMU :). srr0 and srr1, which are properly set when 0x200 is invoked, are clobbered when we return from KVMPPC_H_REPORT_ERR hcall. I think they are modified before issuing rfid (I can see them getting clobbered from QEMU monitor). However when we jump to OS registered machine check routine srr0 and srr1 should reflect the value they had when 0x200 was invoked. Hence srr0 and srr1 are saved in hcall and restored when we return from hcall. Also we don't have enough scratch registers available to save these before invoking hcall from 0x200. Or am I missing other ways to do this from QEMU? If you just do cpu_synchronize_state() and then change env-spr[SPRN_SRR0/1] inside your hypercall handler that should also change the value when you return from the hcall. I tried cpu_synchronize_state(), however, srr0 and srr1 are still clobbered. Just before I issue hcall from 0x200 I see the following values from QEMU monitor: SRR0 d0f40264 SRR1 80209033 Inside hcall, I call cpu_synchronize_state(). As soon as I return from hcall I see: SRR0 0214 SRR1 80001001 I see SRR0 is now set to nip in 0x200 and SRR1 to msr value. I think it is reset during returning from hcall before executing rfid. Ah, because the hypercall is an interrupt itself. Heh ;). True. Yes. However, as you mentioned, we still need cpu_synchronize_state() to avoid stale value
Re: [Qemu-devel] [PATCH v5 1/3] contrib: add ivshmem client and server
Just to point out that for the client there is also a DEBUG_LOG to uppercase, just like already pointed out for the server. diff --git a/contrib/ivshmem-client/ivshmem-client.c b/contrib/ivshmem-client/ivshmem-client.c new file mode 100644 index 000..ad210c8 --- /dev/null +++ b/contrib/ivshmem-client/ivshmem-client.c @@ -0,0 +1,405 @@ +/* + * Copyright 6WIND S.A., 2014 + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * (at your option) any later version. See the COPYING file in the + * top-level directory. + */ + +#include sys/types.h +#include sys/socket.h +#include sys/un.h + +#include qemu-common.h +#include qemu/queue.h + +#include ivshmem-client.h + +/* log a message on stdout if verbose=1 */ +#define debug_log(client, fmt, ...) do { \ +if ((client)-verbose) { \ +printf(fmt, ## __VA_ARGS__); \ +}\ +} while (0) + ..here (DEBUG_LOG?) Thanks to all who are working on this. Claudio
Re: [Qemu-devel] [PATCH v3] ide: Add resize callback to ide/core
John Snow js...@redhat.com writes: On 09/04/2014 12:13 PM, Stefan Hajnoczi wrote: This patch seems to break tests/bios-tables-test.c: ERROR:tests/bios-tables-test.c:744:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead) GTester: last random seed: R02S3d881198f35228a485b4c3d116dff3b1 I have run it many times to make sure the failure is deterministic and I used git-bisect(1) to find this commit as the cause. Not sure why but it seems to break the test. Please take a look. Dropped from the block branch. Stefan I've fixed it in a v4, but before I submit and while I'm at it, you didn't like the comments I left in the identify functions because they were prone to bitrot. would you prefer I excised them entirely? I found them helpful because it makes sense to read these functions alongside the identify data specs, and excising any references to those word indices in their natural order obfuscates the code needlessly. My inclination is to leave them in. I'd replace them by a pointer to the factored-out code. Meanwhile, the bios-tables-test problem is such: We serve the identify request out of the io_buffer, not the identify_data cache, thus for 2nd and subsequent requests for identify_data, we get correct size information, but for the 1st request to ide_identify, we get zeroes. I found that part confusing and stared at it until I believed it works. Looks like I believed too quickly. I corrected this by making ide_identify and ide_atapi_identify mimic the flow of ide_cfata_identify, which is more clear about the nature of the two buffers. Yup, that one's much better. Why this causes a failure here and for only the Q35 machine type I am not certain, but this is the causative bug. My best guess is different firmware behavior.
Re: [Qemu-devel] [PATCH 1/3] trace: Only link generated-tracers.o with simple backend
On Wed, Sep 03, 2014 at 11:44:54AM +0800, Fam Zheng wrote: In any other cases the object file is effectively empty, which is disliked by ranlib and nm on Mac OS X. Reported-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Fam Zheng f...@redhat.com --- trace/Makefile.objs | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) If another trace backend uses generate_c() there will be a link error so they'll figure out they need to add it in the makefile. Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgplLAiuKpydd.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] qemu-char: fix terminal crash when using -monitor stdio -nographic
Li Liu john.li...@huawei.com writes: Ping, any more comments? Thanks. I'd like to hear Gerd's opinion (cc'ed). On 2014/8/27 15:40, Li Liu wrote: On 2014/8/27 14:44, Markus Armbruster wrote: john.liuli john.li...@huawei.com writes: From: Li Liu john.li...@huawei.com Eeay to reproduce, just try qemu -monitor stdio -nographic and type quit, then the terminal will be crashed. There are two pathes try to call tcgetattr of stdio in vl.c: 1) Monitor_parse(optarg, readline); . qemu_opts_foreach(qemu_find_opts(chardev), chardev_init_func, NULL, 1) != 0) 2) if (default_serial) add_device_config(DEV_SERIAL, stdio); if (foreach_device_config(DEV_SERIAL, serial_parse) 0) Both of them will trigger qemu_chr_open_stdio which will disable ECHO attributes. First one has updated the attributes of stdio by calling qemu_chr_fe_set_echo(chr, false). And the tty attributes has been saved in oldtty. Then the second path will redo such actions, and the oldtty is overlapped. So till quit, term_exit can't recove the correct attributes. Signed-off-by: Li Liu john.li...@huawei.com Yes, failure to restore tty settings is a bug. But is having multiple character devices use the same terminal valid? I'm not sure. But I have found such comments in vl.c According to documentation and historically, -nographic redirects serial port, parallel port and monitor to stdio Best regards Li. If no, can we catch and reject the attempt? [...]
Re: [Qemu-devel] [PATCH v7 18/28] host-libusb: remove bootindex property from qdev to qom
+static void usb_host_instance_init(Object *obj) +{ +USBDevice *udev = USB_DEVICE(obj); +USBHostDevice *s = USB_HOST_DEVICE(udev); + +device_add_bootindex_property(obj, s-bootindex, + bootindex, NULL, + udev-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); I think the object_property_set_int() call can be moved to device_add_bootindex_property, so there is no need to duplicate the call to init bootindex with -1 in all devices. cheers, Gerd
Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf
On 09/05/14 10:58, Gerd Hoffmann wrote: Hi, I can't track this back far enough. I'd feel safer if you checked that the multiplication can't overflow even in uint64_t. Effectively it comes from the emulated graphics hardware (anything in hw/display/*). The gfx emulation must make sure that the framebuffer fits into the video memory, which in turn pretty much implies that we can't overflow uint64_t. I think even uint32_t can't overflow with the gfx hardware we are emulating today. (5) Instead, you really need to make sure that the very first multiplication fits in a signed int: Makes sense. I think it is easier to just multiply in 64bit, then check the result is small enougth (new patch attached). Okay, if you can guarantee that the product fits in uint64_t, then such a check would suffice. New patch has not been attached though :) /* display listener callbacks */ @@ -495,7 +503,7 @@ static void interface_get_init_info(QXLInstance *sin, QXLDevInitInfo *info) info-num_memslots = NUM_MEMSLOTS; info-num_memslots_groups = NUM_MEMSLOTS_GROUPS; info-internal_groupslot_id = 0; -info-qxl_ram_size = ssd-bufsize; +info-qxl_ram_size = 16 * 1024 * 1024; info-n_surfaces = ssd-num_surfaces; } spice-server doesn't do anything with it, other than passing to spice-client. Not fully sure what spice-client uses this for, maybe as some kind of hint for resource management. Maybe not at all. It surely doesn't matter at all for ssd-buf size. Okay, I'll trust you on this. Thanks Laszlo
Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC: kvm: Fix incorrect remapping of in-kernel MPIC
On 03.09.14 20:36, Bogdan Purcareata wrote: On target-ppc, the kvm-openpic memory region is part of the E500-CCSR memory region. On the kernel side, the MPIC is mapped at the same offset as the kvm-openpic within the address space. When adding the PCI BAR0 memory region, an alias is created to point to the E500-CCSR memory region. This results in firing the kvm_openpic_region_add once more, since kvm-openpic is part of the latter. Only this time, the offset is wrong - it's part of the PCI memory region. This leads to the in-kernel MPIC to be remapped at a wrong address, and thus all traps to the kvm-openpic address to be emulated in userspace. The fix consists in an additional filter in kvm_openpic_region_{add,del} to consider only addresses matching the start of the kvm-openpic memory region. If this is true, wouldn't vhost and vfio be broken too? Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC: kvm: Fix incorrect remapping of in-kernel MPIC
On 03.09.14 20:36, Bogdan Purcareata wrote: On target-ppc, the kvm-openpic memory region is part of the E500-CCSR memory region. On the kernel side, the MPIC is mapped at the same offset as the kvm-openpic within the address space. When adding the PCI BAR0 memory region, an alias is created to point to the E500-CCSR memory region. This results in firing the kvm_openpic_region_add once more, since kvm-openpic is part of the latter. Only this time, the offset is wrong - it's part of the PCI memory region. This leads to the in-kernel MPIC to be remapped at a wrong address, and thus all traps to the kvm-openpic address to be emulated in userspace. The fix consists in an additional filter in kvm_openpic_region_{add,del} to consider only addresses matching the start of the kvm-openpic memory region. If this is true, wouldn't vfio and host be broken too? Alex
Re: [Qemu-devel] [PATCH] virtio-pci: fix virtio-net child refcount in transports
Hi, CC'ing Stefan and qemu-stable@ for more attention. :) Best regards, -Gonglei -Original Message- From: Gonglei (Arei) Sent: Thursday, September 04, 2014 7:42 PM To: qemu-devel@nongnu.org Cc: m...@redhat.com; Huangweidong (C); Gonglei (Arei) Subject: [PATCH] virtio-pci: fix virtio-net child refcount in transports From: Gonglei arei.gong...@huawei.com object_initialize() leaves the object with a refcount of 1. object_property_add_child() adds its own reference which is dropped again when the property is deleted. The upshot of this is that we always have a refcount = 1. Upon hot unplug the virtio-net child is not finalized! Drop our reference after the child property has been added to the parent. Signed-off-by: Gonglei arei.gong...@huawei.com --- Stefan had post virtio-blk in commit c5d49db4, but virtio-net has the same problem. Maybe the other virtio devices have too. --- hw/virtio/virtio-pci.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index ddb5da1..78dcd68 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -1456,6 +1456,7 @@ static void virtio_net_pci_instance_init(Object *obj) VirtIONetPCI *dev = VIRTIO_NET_PCI(obj); object_initialize(dev-vdev, sizeof(dev-vdev), TYPE_VIRTIO_NET); object_property_add_child(obj, virtio-backend, OBJECT(dev-vdev), NULL); +object_unref(OBJECT(dev-vdev)); } static const TypeInfo virtio_net_pci_info = { -- 1.7.12.4
Re: [Qemu-devel] [PATCH] translate-all.c: fix debug memory maps printing
On 5 September 2014 09:59, Mikhail Ilin m.i...@samsung.com wrote: I also wonder we have separate linux-user emulators for i386 (32 bit ABI + 32 bit address space) and amd64 binaries (64 bit ABI + 64 bit address space). And we can not run 32 bits apps under qemu-x86_64 but MIPS N32 looks in some other way and it supports 32 bit ABI apps with 64 bit address space. I really not sure but is it a right design or not? The design here is that we have one QEMU executable for each Linux ABI we support. On MIPS there are a number of different ABIs which may be in use even with the same CPU type, which is why there are separate emulator binaries. On x86, if we ever supported the x86_64 x32 ABI, that would be an extra emulator binary in addition to the current x86_64 and i386 ones. This should all not matter much for core code, which simply has to use the correct types for things. Quoting from HACKING: vaddr is the best type to use to hold a CPU virtual address in target-independent code. It is guaranteed to be large enough to hold a virtual address for any target, and it does not change size from target to target. It is always unsigned. target_ulong is a type the size of a virtual address on the CPU; this means it may be 32 or 64 bits depending on which target is being built. It should therefore be used only in target-specific code, and in some performance-critical built-per-target core code such as the TLB code. There is also a signed version, target_long. abi_ulong is for the *-user targets, and represents a type the size of 'void *' in that target's ABI. (This may not be the same as the size of a full CPU virtual address in the case of target ABIs which use 32 bit pointers on 64 bit CPUs, like sparc32plus.) Definitions of structures that must match the target's ABI must use this type for anything that on the target is defined to be an 'unsigned long' or a pointer type. There is also a signed version, abi_long. The problem you're running into here is with the ABIs where abi_ulong and target_ulong are different sizes. thanks -- PMM
Re: [Qemu-devel] [PATCH v13 2/6] block: don't convert file size to sector size
On Thu, Sep 04, 2014 at 11:57:58AM +0200, Kevin Wolf wrote: Am 29.08.2014 um 10:33 hat Hu Tao geschrieben: and avoid converting it back later. Signed-off-by: Hu Tao hu...@cn.fujitsu.com diff --git a/block/raw-posix.c b/block/raw-posix.c index 9c22e3f..abe0759 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -1369,8 +1369,8 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp) strstart(filename, file:, filename); /* Read out options */ -total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0), - BDRV_SECTOR_SIZE); +total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0), + BDRV_SECTOR_SIZE); nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false); fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, @@ -1394,7 +1394,7 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp) #endif } -if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) { +if (ftruncate(fd, total_size) != 0) { result = -errno; error_setg_errno(errp, -result, Could not resize file); } You forgot changing hdev_create() in raw-posix. Doesn't make the patch less correct, but you may want to add it for v14. Thanks, changed it too. Regards, Hu Kevin
Re: [Qemu-devel] [PATCH v7 18/28] host-libusb: remove bootindex property from qdev to qom
Hi, From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Friday, September 05, 2014 5:06 PM Subject: Re: [PATCH v7 18/28] host-libusb: remove bootindex property from qdev to qom +static void usb_host_instance_init(Object *obj) +{ +USBDevice *udev = USB_DEVICE(obj); +USBHostDevice *s = USB_HOST_DEVICE(udev); + +device_add_bootindex_property(obj, s-bootindex, + bootindex, NULL, + udev-qdev, NULL); +object_property_set_int(obj, -1, bootindex, NULL); I think the object_property_set_int() call can be moved to device_add_bootindex_property, so there is no need to duplicate the call to init bootindex with -1 in all devices. Good idea. Thanks! Will fix this. Best regards, -Gonglei
Re: [Qemu-devel] [PATCH] cow: make padding in the header explicit
Am 04.09.2014 um 17:43 hat Stefan Hajnoczi geschrieben: On Thu, Sep 04, 2014 at 04:10:14PM +0200, Kevin Wolf wrote: Am 04.09.2014 um 15:51 hat Stefan Hajnoczi geschrieben: On Thu, Sep 04, 2014 at 06:07:32AM -0600, Eric Blake wrote: On 09/04/2014 02:58 AM, Stefan Hajnoczi wrote: On-disk structures should be marked packed so the compiler does not insert padding for field alignment. Padding should be explicit so on-disk layout is obvious and we don't rely on the architecture-specific ABI for alignment rules. The pahole(1) diff shows that the padding is now explicit and offsets are unchanged: char backing_file[1024]; /* 8 1024 */ /* --- cacheline 16 boundary (1024 bytes) was 8 bytes ago --- */ int32_tmtime;/* 1032 4 */ - - /* XXX 4 bytes hole, try to pack */ - + uint32_t padding; /* 1036 4 */ uint64_t size; /* 1040 8 */ Was a 32-bit build also inserting this padding, or do we have historical differences where 32-bit and 64-bit cow files are actually different, and we may need to be prepared to parse files from both sources? Good point. Let's not merge this patch since it breaks 32-bit hosts. The fact that no one hit problems when exchanging files between 32-bit and 64-bit machines shows that the cow format is rarely used. At this point we have 2 different formats: one without padding (i386-style) and one with padding (x86_64-style). The chance of more variants is small but who knows, maybe some other host architecture ABI has yet another alignment rule for uint64_t. I'd like to git rm block/cow.c but I suppose the backwards-compatible thing to do is to introduce subformats to support both variants. Opinions? Can we safely detect which of the subformats we have? But I'm not sure if it's even worth fixing. I think it would default to the subformat depending on the host architecture but allow overriding with -o subformat=i386|x86_64. Hm, okay. If we can't do it automatically, that's an option, too. I'm also not sure if it's worth fixing. The cow file format is so rarely used I wonder if we'd be better off without it. It has never been used as a native qemu image format. As I understand it, its use case is compatibility with existing images, and one of the great things about qemu-img is that it can open more or less any random image that you get from somewhere. The exotic formats are part of this. If we wanted to simplify our code, we could probably make it read-only (at the cost of losing qemu-iotests support), but then it's so simple that leaving r/w support around won't hurt us. Hm, actually, looking at the kernel (arch/um/drivers/cow_user.c), it doesn't look as if we were compatible at all for v2, at least in the current kernel version, we use a different backing file name length... And they have the struct packed today, so we should probably follow their example. On the other hand, out of three versions, we only support v2, and we don't use the same format as the kernel. Yeah, I guess we might just drop the support then, it's not helpful for compatibility. The only other way would be to update it to handle all three versions the same as the kernel code. Kevin pgp57GQ6wLdw8.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] qemu-char: fix terminal crash when using -monitor stdio -nographic
On Fr, 2014-09-05 at 11:04 +0200, Markus Armbruster wrote: Li Liu john.li...@huawei.com writes: Ping, any more comments? Thanks. I'd like to hear Gerd's opinion (cc'ed). But is having multiple character devices use the same terminal valid? No (guess we should catch that case in stdio init). Beside the tty initialization and cleanup you also have the problem that both users are racing for input. Well, maybe not in the qemu case as it is the same process and it very well might be that it polls the two chardevs in a well defined order, so one of them gets all input and the other gets nothing. With two processes reading from the terminal (try 'cat | less') it is actually random though. I'm not sure. But I have found such comments in vl.c According to documentation and historically, -nographic redirects serial port, parallel port and monitor to stdio In that case mux chardev is used (that is the piece which handles the input switching between serial and monitor via 'Ctrl-A c'). There is one stdio instance, and one mux instance, the mux is chained to stdio, and mux allows multiple backends to connect. You can construct it on the command line this way: qemu -nographic -nodefaults \ -chardev stdio,mux=on,id=terminal \ -serial chardev:terminal \ -monitor chardev:terminal [ serial is default, so no output here, unless you boot a guest with serial console configured ] [ Hit 'Ctrl-A h' now ] C-a hprint this help C-a xexit emulator C-a ssave disk data back to file (if -snapshot) C-a ttoggle console timestamps C-a bsend break (magic sysrq) C-a cswitch between console and monitor C-a C-a sends C-a [ Hit 'Ctrl-A c' now ] QEMU 2.1.50 monitor - type 'help' for more information (qemu) info chardev terminal: filename=mux terminal-base: filename=stdio (qemu) HTH, Gerd
Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf
On Fr, 2014-09-05 at 11:06 +0200, Laszlo Ersek wrote: Makes sense. I think it is easier to just multiply in 64bit, then check the result is small enougth (new patch attached). Okay, if you can guarantee that the product fits in uint64_t, then such a check would suffice. New patch has not been attached though :) Oops. Here we go. cheers, Gerd From 33c5c3d1736fd577fc1279a1f3c50d2414e98fe3 Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann kra...@redhat.com Date: Wed, 3 Sep 2014 15:50:08 +0200 Subject: [PATCH] spice: make sure we don't overflow ssd-buf Related spice-only bug. We have a fixed 16 MB buffer here, being presented to the spice-server as qxl video memory in case spice is used with a non-qxl card. It's also used with qxl in vga mode. When using display resolutions requiring more than 16 MB of memory we are going to overflow that buffer. In theory the guest can write, indirectly via spice-server. The spice-server clears the memory after setting a new video mode though, triggering a segfault in the overflow case, so qemu crashes before the guest has a chance to do something evil. Fix that by switching to dynamic allocation for the buffer. CVE-2014-3615 Cc: qemu-sta...@nongnu.org Cc: secal...@redhat.com Cc: Laszlo Ersek ler...@redhat.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- ui/spice-display.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/ui/spice-display.c b/ui/spice-display.c index 66e2578..def7b52 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -334,11 +334,23 @@ void qemu_spice_create_host_memslot(SimpleSpiceDisplay *ssd) void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd) { QXLDevSurfaceCreate surface; +uint64_t surface_size; memset(surface, 0, sizeof(surface)); -dprint(1, %s/%d: %dx%d\n, __func__, ssd-qxl.id, - surface_width(ssd-ds), surface_height(ssd-ds)); +surface_size = (uint64_t) surface_width(ssd-ds) * +surface_height(ssd-ds) * 4; +assert(surface_size 0); +assert(surface_size INT_MAX); +if (ssd-bufsize surface_size) { +ssd-bufsize = surface_size; +g_free(ssd-buf); +ssd-buf = g_malloc(ssd-bufsize); +} + +dprint(1, %s/%d: %ux%u (size % PRIu64 /%d)\n, __func__, ssd-qxl.id, + surface_width(ssd-ds), surface_height(ssd-ds), + surface_size, ssd-bufsize); surface.format = SPICE_SURFACE_FMT_32_xRGB; surface.width = surface_width(ssd-ds); @@ -369,8 +381,6 @@ void qemu_spice_display_init_common(SimpleSpiceDisplay *ssd) if (ssd-num_surfaces == 0) { ssd-num_surfaces = 1024; } -ssd-bufsize = (16 * 1024 * 1024); -ssd-buf = g_malloc(ssd-bufsize); } /* display listener callbacks */ @@ -495,7 +505,7 @@ static void interface_get_init_info(QXLInstance *sin, QXLDevInitInfo *info) info-num_memslots = NUM_MEMSLOTS; info-num_memslots_groups = NUM_MEMSLOTS_GROUPS; info-internal_groupslot_id = 0; -info-qxl_ram_size = ssd-bufsize; +info-qxl_ram_size = 16 * 1024 * 1024; info-n_surfaces = ssd-num_surfaces; } -- 1.8.3.1
Re: [Qemu-devel] [PATCH v4 0/2] add resize callback to ide/core
On Thu, Sep 04, 2014 at 11:42:15PM -0400, John Snow wrote: This patch series fixes incorrect IDENTIFY data returned for an IDE drive after a block_resize event by adding a resize callback for IDE devices. Inconsistencies between identify routines are also removed so that they read easier. V4: - Added patch that makes the buffer and cache fill order for identify routines more consistent. - Fixed a bug where the very first call to IDENTIFY does not return any size information. V3: - Factored out the size update into new functions. - Fixed the size update for CFATA. - Added assertion to clarify that ide_resize_cb is non-atapi. V2: - Do not attempt to update geometry values, to avoid clobbering user-specified values, if they exist. - Do not regenerate the entire IDENTIFY buffer to avoid losing any settings that occurred during normal operation. John Snow (2): IDE: Fill the IDENTIFY request consistently ide: Add resize callback to ide/core hw/ide/core.c | 97 +-- 1 file changed, 74 insertions(+), 23 deletions(-) -- 1.9.3 Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan pgpoJQFXydmIB.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH 0/2] vmdk: fix leaks in vmdk_parse_extents()
On Thu, Sep 04, 2014 at 09:04:41PM +0100, Stefan Hajnoczi wrote: See patches for the specific leaks. Stefan Hajnoczi (2): vmdk: fix vmdk_parse_extents() extent_file leaks vmdk: fix buf leak in vmdk_parse_extents() block/vmdk.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- 1.9.3 Applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan pgpXUoZi35KrL.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH 1/4] block: Correct bs-growable
Am 04.09.2014 um 22:01 hat Max Reitz geschrieben: On 20.08.2014 13:40, Kevin Wolf wrote: Am 12.07.2014 um 00:23 hat Max Reitz geschrieben: Currently, the field growable in a BDS is set iff the BDS is opened in protocol mode (with O_BDRV_PROTOCOL). However, not every protocol block driver allows growing: NBD, for instance, does not. On the other hand, a non-protocol block driver may allow growing: The raw driver does. Fix this by correcting the growable field in the driver-specific open function for the BDS, if necessary. Signed-off-by: Max Reitz mre...@redhat.com I'm not sure I agree with bs-growable = true for raw. It's certainly true that the backend can technically provide the functionality that writes beyond EOF grow the file. That's not the point of bs-growable, though. The point of it was to _forbid_ it to grow even when it's technically possible (non-file protocols weren't really a thing back then, apart from vvfat, so the assumption was that it's always technically possible). growable was introduced with bdrv_check_request(), which is supposed to reject guest requests after the end of the virtual disk (and this fixed a CVE, see commit 71d0770c). You're now disabling this check for raw. I think we need to make sure that bs-growable is only set if it is opened for an image that has drv-requires_growing_file set and therefore not directly used by a guest. Well, except that with node-name a guest will be able to use any image in the chain... Might this mean that it's really a BlockBackend property? Okay, the more I sit at this problem, the harder it seems to get. The only thing I currently know for sure is that I disagree with Anthony's opinion in 71d0770c (this patch makes the BlockDriver API guarantee that all requests are within 0..bdrv_getlength() which to me seems like a Good Thing). The initial point was to range check guest requests. In 2009, it may have been enough to statically check the BDS type (protocol or format) to know whether the guest directly accesses it (format) or not (protocol). However, this is no longer sufficient. Now, as far as I know, guests can access basically any BDS (as you yourself say). Therefore, it seems to me like it's impossible to determine whether the device should be marked growable or not when opening it. Instead, I think we have to check for each single request whether it comes from the guest or from within the block layer and do range checking only for the former case; but this should not be the task of the block layer core, but of the block devices instead. Theoretically, guests may write beyond the image end and grow it that way all they want, but in practice this should be limited by the block devices which have a fixed length. What about block jobs, built-in NBD server, etc.? Also, we have many device models that I don't trust a bit and having one central check is much easier to get right than n duplicates in each device emulation. Under this impression, I wanted to simply return to growable = false for raw. However, this breaks test 071 which attaches blkdebug to a raw BDS after qemu has been started. blkdebug detects raw is not growable, therefore reports not to be growable as well, and because qcow2 is on top of all that, the warning introduced by this series is emitted. Okay, so we will need growable = true for raw in some cases. I don't want to make the shortcut more tempting, but if we come to the conclusion that a fixed growable=false for raw is the right thing to do, then we simply need to fix the test case. It's not trivial to determine whether the superordinate BDS of a certain BDS has BlockDriver.requires_growing_file set or not. We could introduce a new flag for bdrv_open(), but I'd rather avoid it. In fact, I tried something like this, but I quickly got into problems because e.g. blkdebug does not have requires_growing_file_set, but decides whether its own BDS are growable based on whether the underlying file is growable or not. So if a blkdebug BDS is growable, the underlying file actually needs to be growable as well. Therefore, we'd need a more sophisticated requires_growing_file_set and maybe even propagation of growable requirement through the BDS layers which quickly turns ugly when one has to consider that a BDS may be used by multiple users. Ugh. You're right, it's not a static per-driver property, but really a per-BDS one. That's somewhat unfortunate. Also, it's actually irrelevant whether requires_growing_file is set or not. growable's current sole purpose apparently is enabling range checks for guest-accessible images. If the BDS is only a subordinate to another BDS, it doesn't matter whether that other BDS needs growable set or not. Hm... So in blockdev talk, what you're saying is that range checks are a BlockBackend thing, and if we're on BDS level, we don't care? Perhaps that's the right approach and gets us rid of bs-growable immediately.
Re: [Qemu-devel] I/O parallelism on QCOW2
On Thu, Sep 04, 2014 at 12:32:12PM -0400, Xingbo Wu wrote: After running a 16-thread sync-random-write test against qcow2, It is observed that QCOW2 seems to be serializing all its metadata-related writes. If qcow2 is designed to do this,* then what is the concern?* What would go wrong if this ordering is relaxed? How do you know that serializing part of the write request is a significant bottleneck? Please post your benchmark results with raw, qed, and qcow2 handling 1-, 8-, and 16-threads of I/O (or whatever similar benchmarks you have run). The bottleneck may actually be something else, so please share your benchmark configuration and results. By providing less features, raw-file and QED scales well on parallel I/O workload. I believe qcow2 does this with clear reasons. Thanks! QED serializes allocating writes, see qed_aio_write_alloc(). In qcow2 the BdrvQcowState-lock is held across metadata updates. The important pieces here are: * qcow2_co_writev() only releases the lock around data writes (including COW). * qcow2_co_flush_to_os() holds the lock around metadata updates Stefan pgpAi29D4DiLt.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] Fix improper usage of cpu_to_be32 in vpc
On Thu, Sep 04, 2014 at 10:43:58PM +0800, Gordon Gong wrote: From fd3f0fd9c53d7782d4d835597c8a07b897bec3d0 Mon Sep 17 00:00:00 2001 From: Xiaodong Gong gordongong0...@gmail.com Date: Sat, 30 Aug 2014 03:17:03 +0800 Subject: Fix improper usage of cpu_to_be32 in vpc cpu_to_be32() is wrong since vhd_type is an enum constant (just a regular CPU-endian integer). Signed-off-by: Xiaodong Gong gordongong0...@gmail.com --- block/vpc.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) This patch is malformed and does not apply with git-am(1). Please use git-send-email(1) to send properly formatted patches. Copy-pasting patches into GMail's web interface does not work. http://qemu-project.org/Contribute/SubmitAPatch Stefan pgpLKrQQsS2QW.pgp Description: PGP signature
Re: [Qemu-devel] [RFC PATCH v2] Support vhd type VHD_DIFFERENCING
On Thu, Sep 04, 2014 at 10:49:43PM +0800, Gordon Gong wrote: [Qemu-devel][RFC PATCH v2] Support vhd type VHD_DIFFERENCING From 5387a2a7b6ad052659a08a1fc7e89595708396d1 Mon Sep 17 00:00:00 2001 From: Xiaodong Gong gordongong0...@gmail.com Date: Thu, 4 Sep 2014 01:14:59 +0800 Subject: [PATCH 2/2] Support vhd type VHD_DIFFERENCING Now qemu only supports vhd type VHD_FIXED and VHD_DYNAMIC, so qemu can't read snapshot volume of vhd, and can't support other storage features of vhd file. This patch add read parent information in function vpc_open, read bitmap in vpc_read, and change bitmap in vpc_write. Signed-off-by: Xiaodong Gong gordongong0...@gmail.com --- block/vpc.c | 329 +++- 1 file changed, 261 insertions(+), 68 deletions(-) This patch is malformed. Please use git-send-email(1): http://qemu-project.org/Contribute/SubmitAPatch Stefan pgpiiF3wOEU6z.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] gtk.c: Fix memory leak in gd_set_keycode_type()
On Di, 2014-09-02 at 14:33 +0800, Chen Fan wrote: this memory leak is introduced by the original commit 3158a3482b0093e41f2b2596fba50774ea31ae08 added to gtk queue. thanks, Gerd
Re: [Qemu-devel] [PULL for-2.1 0/7] QOM devices patch queue 2014-09-04
On 4 September 2014 18:21, Andreas Färber afaer...@suse.de wrote: Hello Peter, This is my QOM (devices) patch queue. Please pull. Regards, Andreas Cc: Peter Maydell peter.mayd...@linaro.org Cc: Michael S. Tsirkin m...@redhat.com The following changes since commit 01eb313907dda97313b8fea62e5632fca64f069c: Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-09-03' into staging (2014-09-04 13:33:53 +0100) are available in the git repository at: git://github.com/afaerber/qemu-cpu.git tags/qom-devices-for-peter for you to fetch changes up to 1d45a705fc007a13f20d18473290082eae6d1725: qdev: Add cleanup logic in device_set_realized() to avoid resource leak (2014-09-04 19:15:54 +0200) QOM infrastructure fixes and device conversions * Cleanups for recursive device unrealization Alexey Kardashevskiy (1): qom: Make object_child_foreach() safe for objects removal Andreas Färber (1): machine: Clean up -machine handling Gonglei (3): qdev: Use error_abort instead of using local_err qdev: Use NULL instead of local_err for qbus_child unrealize qdev: Add cleanup logic in device_set_realized() to avoid resource leak Peter Crosthwaite (2): qom: Add automatic arrayification to object_property_add() memory: Remove object_property_add_child_array() Applied, thanks. -- PMM
Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf
On 09/05/14 11:33, Gerd Hoffmann wrote: On Fr, 2014-09-05 at 11:06 +0200, Laszlo Ersek wrote: Makes sense. I think it is easier to just multiply in 64bit, then check the result is small enougth (new patch attached). Okay, if you can guarantee that the product fits in uint64_t, then such a check would suffice. New patch has not been attached though :) Oops. Here we go. cheers, Gerd 0001-spice-make-sure-we-don-t-overflow-ssd-buf.patch From 33c5c3d1736fd577fc1279a1f3c50d2414e98fe3 Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann kra...@redhat.com Date: Wed, 3 Sep 2014 15:50:08 +0200 Subject: [PATCH] spice: make sure we don't overflow ssd-buf Related spice-only bug. We have a fixed 16 MB buffer here, being presented to the spice-server as qxl video memory in case spice is used with a non-qxl card. It's also used with qxl in vga mode. When using display resolutions requiring more than 16 MB of memory we are going to overflow that buffer. In theory the guest can write, indirectly via spice-server. The spice-server clears the memory after setting a new video mode though, triggering a segfault in the overflow case, so qemu crashes before the guest has a chance to do something evil. Fix that by switching to dynamic allocation for the buffer. CVE-2014-3615 Cc: qemu-sta...@nongnu.org Cc: secal...@redhat.com Cc: Laszlo Ersek ler...@redhat.com Signed-off-by: Gerd Hoffmann kra...@redhat.com --- ui/spice-display.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/ui/spice-display.c b/ui/spice-display.c index 66e2578..def7b52 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -334,11 +334,23 @@ void qemu_spice_create_host_memslot(SimpleSpiceDisplay *ssd) void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd) { QXLDevSurfaceCreate surface; +uint64_t surface_size; memset(surface, 0, sizeof(surface)); -dprint(1, %s/%d: %dx%d\n, __func__, ssd-qxl.id, - surface_width(ssd-ds), surface_height(ssd-ds)); +surface_size = (uint64_t) surface_width(ssd-ds) * +surface_height(ssd-ds) * 4; +assert(surface_size 0); +assert(surface_size INT_MAX); +if (ssd-bufsize surface_size) { +ssd-bufsize = surface_size; +g_free(ssd-buf); +ssd-buf = g_malloc(ssd-bufsize); +} + +dprint(1, %s/%d: %ux%u (size % PRIu64 /%d)\n, __func__, ssd-qxl.id, + surface_width(ssd-ds), surface_height(ssd-ds), + surface_size, ssd-bufsize); surface.format = SPICE_SURFACE_FMT_32_xRGB; surface.width = surface_width(ssd-ds); @@ -369,8 +381,6 @@ void qemu_spice_display_init_common(SimpleSpiceDisplay *ssd) if (ssd-num_surfaces == 0) { ssd-num_surfaces = 1024; } -ssd-bufsize = (16 * 1024 * 1024); -ssd-buf = g_malloc(ssd-bufsize); } /* display listener callbacks */ @@ -495,7 +505,7 @@ static void interface_get_init_info(QXLInstance *sin, QXLDevInitInfo *info) info-num_memslots = NUM_MEMSLOTS; info-num_memslots_groups = NUM_MEMSLOTS_GROUPS; info-internal_groupslot_id = 0; -info-qxl_ram_size = ssd-bufsize; +info-qxl_ram_size = 16 * 1024 * 1024; info-n_surfaces = ssd-num_surfaces; } -- 1.8.3.1 Reviewed-by: Laszlo Ersek ler...@redhat.com
Re: [Qemu-devel] [PATCH v2] qcow2: add update refcount table realization for update_refcount
Am 01.09.2014 um 12:52 hat Jun Li geschrieben: When every item of refcount block is NULL, free refcount block and reset the corresponding item of refcount table with NULL. Signed-off-by: Jun Li address@hidden The commit message should also describe why this is a relevant improvement for some use case. My gut feeling is that it complicates the code for a very minimal gain. Kevin
Re: [Qemu-devel] [PATCH v5 3/3] ivshmem: add check on protocol version in QEMU
On Thu, Sep 04, 2014 at 02:51:01PM +0200, David Marchand wrote: diff --git a/contrib/ivshmem-client/ivshmem-client.c b/contrib/ivshmem-client/ivshmem-client.c index ad210c8..0c4e016 100644 --- a/contrib/ivshmem-client/ivshmem-client.c +++ b/contrib/ivshmem-client/ivshmem-client.c @@ -184,10 +184,18 @@ ivshmem_client_connect(IvshmemClient *client) goto err_close; } -/* first, we expect our index + a fd == -1 */ +/* first, we expect a protocol version */ +if (read_one_msg(client, tmp, fd) 0 || +(tmp != IVSHMEM_PROTOCOL_VERSION) || fd != -1) { +debug_log(client, cannot read from server\n); +goto err_close; +} +debug_log(client, our_id=%ld\n, client-local.id); This debug_log() is probably not intentional. local.id will always be -1 here so the output is not useful. +static void ivshmem_check_version(void *opaque, const uint8_t * buf, int flags) +{ +IVShmemState *s = opaque; +PCIDevice *dev = PCI_DEVICE(s); +int tmp; +long version; + +memcpy(version, buf, sizeof(long)); +tmp = qemu_chr_fe_get_msgfd(s-server_chr); +if (tmp != -1 || version != IVSHMEM_PROTOCOL_VERSION) { +fprintf(stderr, incompatible version, you are connecting to a ivhsmem- +server using a different protocol please check your setup\n); +qemu_chr_delete(s-server_chr); +s-server_chr = NULL; +return; +} + +IVSHMEM_DPRINTF(version check ok, finish init and switch to real chardev +handler\n); + +pci_register_bar(dev, 2, s-ivshmem_attr, s-bar); Not sure if it is okay to delay PCI initialization to a fd hander callback. If the version message is too slow the guest could see the PCI adapter without the BAR! Did you move this code in order to prevent the guest from accessing the device before it has connected to the server? Perhaps the device needs a state field that tracks whether or not it is ready for operation. Any access before RUNNING state is reached will be ignored (?). pgpFAOfA0J1aK.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH] virtio-pci: fix virtio-net child refcount in transports
On Thu, Sep 04, 2014 at 07:41:32PM +0800, arei.gong...@huawei.com wrote: From: Gonglei arei.gong...@huawei.com object_initialize() leaves the object with a refcount of 1. object_property_add_child() adds its own reference which is dropped again when the property is deleted. The upshot of this is that we always have a refcount = 1. Upon hot unplug the virtio-net child is not finalized! Drop our reference after the child property has been added to the parent. Signed-off-by: Gonglei arei.gong...@huawei.com Aren't other virtio devices affected? what about virtio-scsi? --- Stefan had post virtio-blk in commit c5d49db4, but virtio-net has the same problem. Maybe the other virtio devices have too. --- hw/virtio/virtio-pci.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index ddb5da1..78dcd68 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -1456,6 +1456,7 @@ static void virtio_net_pci_instance_init(Object *obj) VirtIONetPCI *dev = VIRTIO_NET_PCI(obj); object_initialize(dev-vdev, sizeof(dev-vdev), TYPE_VIRTIO_NET); object_property_add_child(obj, virtio-backend, OBJECT(dev-vdev), NULL); +object_unref(OBJECT(dev-vdev)); } static const TypeInfo virtio_net_pci_info = { -- 1.7.12.4
Re: [Qemu-devel] [PATCH v5 2/3] docs: update ivshmem device spec
On Thu, Sep 04, 2014 at 02:51:00PM +0200, David Marchand wrote: Add some notes on the parts needed to use ivshmem devices: more specifically, explain the purpose of an ivshmem server and the basic concept to use the ivshmem devices in guests. Move some parts of the documentation and re-organise it. Signed-off-by: David Marchand david.march...@6wind.com Reviewed-by: Claudio Fontana claudio.font...@huawei.com --- docs/specs/ivshmem_device_spec.txt | 124 +++- 1 file changed, 93 insertions(+), 31 deletions(-) Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgpRRPmN581Xw.pgp Description: PGP signature
Re: [Qemu-devel] [Qemu-ppc] [PULL 00/52] ppc patch queue 2014-09-04
On 4 September 2014 23:17, Alexander Graf ag...@suse.de wrote: Peter, please pull the same tag name again - I updated it with the now working state. Doesn't build on Windows: hw/ppc/spapr.o: In function `spapr_populate_memory': /home/petmay01/linaro/qemu-for-merges/hw/ppc/spapr.c:708: undefined reference to `_ffsl' /home/petmay01/linaro/qemu-for-merges/hw/ppc/spapr.c:708: undefined reference to `_ffsl' /home/petmay01/linaro/qemu-for-merges/hw/ppc/spapr.c:709: undefined reference to `_ffsl' Don't try to use ffs() or ffsl() -- use the ctz32(), ctz64() or ctzl() functions in host-utils.h (whichever is appropriate for the size of the type; in this case ctz64 I think). Watch out that in the common case ctzl(x) == ffsl(x) - 1, and check the handling of the edge case of zero input is what you want. thanks -- PMM
Re: [Qemu-devel] [PATCH v5 1/3] contrib: add ivshmem client and server
On Thu, Sep 04, 2014 at 02:50:59PM +0200, David Marchand wrote: When using ivshmem devices, notifications between guests can be sent as interrupts using a ivshmem-server (typical use described in documentation). The client is provided as a debug tool. Signed-off-by: Olivier Matz olivier.m...@6wind.com Signed-off-by: David Marchand david.march...@6wind.com --- Makefile|8 + configure |3 + contrib/ivshmem-client/ivshmem-client.c | 405 +++ contrib/ivshmem-client/ivshmem-client.h | 239 ++ contrib/ivshmem-client/main.c | 237 ++ contrib/ivshmem-server/ivshmem-server.c | 395 ++ contrib/ivshmem-server/ivshmem-server.h | 186 ++ contrib/ivshmem-server/main.c | 244 +++ qemu-doc.texi | 10 +- 9 files changed, 1724 insertions(+), 3 deletions(-) create mode 100644 contrib/ivshmem-client/ivshmem-client.c create mode 100644 contrib/ivshmem-client/ivshmem-client.h create mode 100644 contrib/ivshmem-client/main.c create mode 100644 contrib/ivshmem-server/ivshmem-server.c create mode 100644 contrib/ivshmem-server/ivshmem-server.h create mode 100644 contrib/ivshmem-server/main.c Modulo Michael's comments: Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgpLEOnkg7Lw3.pgp Description: PGP signature
Re: [Qemu-devel] [RFC][patch 3/6] KVM: s390: Add GISA support
On Fri, Sep 05, 2014 at 10:29:26AM +0200, Alexander Graf wrote: On 04.09.14 12:52, frank.blasc...@de.ibm.com wrote: From: Frank Blaschka frank.blasc...@de.ibm.com This patch adds GISA (Guest Interrupt State Area) support to s390 kvm. GISA can be used for exitless interrupts. The patch provides a set of functions for GISA related operations like accessing GISA fields or registering ISCs for alert. Exploiters of GISA will follow with additional patches. Signed-off-by: Frank Blaschka frank.blasc...@de.ibm.com That's a nice feature. However, please make sure that you maintain the abstraction levels. What should happen is that you request an irqfd from FLIC. Then you associate that irqfd with the PCI device. Thanks to that association, both parties can now talk to each other and negotiate their GISA number space and make sure things are connected. However, it should always be possible to do things without this direct IRQ injection. So you should be able to receive an irqfd event when an IRQ happened, so that VFIO user space applications can also handle interrupts for example. And the same applies for interrupt injection. We also need to be able to inject an adapter interrupt from QEMU for emulated devices ;). OK, assuming we are doing the vfio solution expoiting GISA would be a second step. Will take your feedback into account. THX! Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH v4 04/20] block: Convert bdrv_em_aiocb_info.cancel to .cancel_async
On Thu, 09/04 17:21, Benoît Canet wrote: The Wednesday 03 Sep 2014 à 19:23:39 (+0800), Fam Zheng wrote : All the difference is that the old .cancel doesn't call cb, but .cancel_async does. Signed-off-by: Fam Zheng f...@redhat.com --- block.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 4aa1bd7..b7253af 100644 --- a/block.c +++ b/block.c @@ -4679,6 +4679,9 @@ static void bdrv_aio_cancel_em(BlockDriverAIOCB *blockacb) { BlockDriverAIOCBSync *acb = container_of(blockacb, BlockDriverAIOCBSync, common); + +acb-ret = -ECANCELED; +acb-common.cb(acb-common.opaque, acb-ret); qemu_bh_delete(acb-bh); acb-bh = NULL; qemu_aio_release(acb); @@ -4686,7 +4689,7 @@ static void bdrv_aio_cancel_em(BlockDriverAIOCB *blockacb) static const AIOCBInfo bdrv_em_aiocb_info = { .aiocb_size = sizeof(BlockDriverAIOCBSync), -.cancel = bdrv_aio_cancel_em, +.cancel_async = bdrv_aio_cancel_em, }; Note from an AIO noob. Could we explain somewhere in the block layer what the _em suffix means ? I didn't write these functions but I think it means emulation. Fam
Re: [Qemu-devel] [PATCH] cow: make padding in the header explicit
Stefan Hajnoczi stefa...@gmail.com writes: On Thu, Sep 04, 2014 at 04:10:14PM +0200, Kevin Wolf wrote: Am 04.09.2014 um 15:51 hat Stefan Hajnoczi geschrieben: On Thu, Sep 04, 2014 at 06:07:32AM -0600, Eric Blake wrote: On 09/04/2014 02:58 AM, Stefan Hajnoczi wrote: On-disk structures should be marked packed so the compiler does not insert padding for field alignment. Padding should be explicit so on-disk layout is obvious and we don't rely on the architecture-specific ABI for alignment rules. The pahole(1) diff shows that the padding is now explicit and offsets are unchanged: char backing_file[1024]; /* 8 1024 */ /* --- cacheline 16 boundary (1024 bytes) was 8 bytes ago --- */ int32_t mtime; /* 1032 4 */ - - /* XXX 4 bytes hole, try to pack */ - + uint32_t padding; /* 1036 4 */ uint64_t size; /* 1040 8 */ Was a 32-bit build also inserting this padding, or do we have historical differences where 32-bit and 64-bit cow files are actually different, and we may need to be prepared to parse files from both sources? Good point. Let's not merge this patch since it breaks 32-bit hosts. The fact that no one hit problems when exchanging files between 32-bit and 64-bit machines shows that the cow format is rarely used. At this point we have 2 different formats: one without padding (i386-style) and one with padding (x86_64-style). The chance of more variants is small but who knows, maybe some other host architecture ABI has yet another alignment rule for uint64_t. I'd like to git rm block/cow.c but I suppose the backwards-compatible thing to do is to introduce subformats to support both variants. Opinions? Can we safely detect which of the subformats we have? But I'm not sure if it's even worth fixing. I think it would default to the subformat depending on the host architecture but allow overriding with -o subformat=i386|x86_64. Admirable dedication to bug-compatibility, but... I'm also not sure if it's worth fixing. The cow file format is so rarely used I wonder if we'd be better off without it. ... good grief, nuke it already :)
Re: [Qemu-devel] [PATCH v4 04/20] block: Convert bdrv_em_aiocb_info.cancel to .cancel_async
The Friday 05 Sep 2014 à 18:55:51 (+0800), Fam Zheng wrote : On Thu, 09/04 17:21, Benoît Canet wrote: The Wednesday 03 Sep 2014 à 19:23:39 (+0800), Fam Zheng wrote : All the difference is that the old .cancel doesn't call cb, but .cancel_async does. Signed-off-by: Fam Zheng f...@redhat.com --- block.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/block.c b/block.c index 4aa1bd7..b7253af 100644 --- a/block.c +++ b/block.c @@ -4679,6 +4679,9 @@ static void bdrv_aio_cancel_em(BlockDriverAIOCB *blockacb) { BlockDriverAIOCBSync *acb = container_of(blockacb, BlockDriverAIOCBSync, common); + +acb-ret = -ECANCELED; +acb-common.cb(acb-common.opaque, acb-ret); qemu_bh_delete(acb-bh); acb-bh = NULL; qemu_aio_release(acb); @@ -4686,7 +4689,7 @@ static void bdrv_aio_cancel_em(BlockDriverAIOCB *blockacb) static const AIOCBInfo bdrv_em_aiocb_info = { .aiocb_size = sizeof(BlockDriverAIOCBSync), -.cancel = bdrv_aio_cancel_em, +.cancel_async = bdrv_aio_cancel_em, }; Note from an AIO noob. Could we explain somewhere in the block layer what the _em suffix means ? I didn't write these functions but I think it means emulation. Thanks Best regards Benoît Fam