Re: [Qemu-devel] [RFC] ATAPI-SCSI bridge GSoC project
On 18/07/15 20:49, Alexander Bezzubikov wrote:
> atapi: ATAPI-SCSI bridge device created
> private SCSI bus added to bridge
> ATAPI inquiry command can use a bridge

Hi!

Not everybody is familiar with your GSoC project, so it would be great if you could be a little bit more verbose in your patch description.

Unfortunately, your patch also has some style issues; please make sure to read http://qemu-project.org/Contribute/SubmitAPatch first. Especially:

- Your patch lacks a Signed-off-by line
- scripts/checkpatch.pl reports quite a bunch of errors
- Make sure to CC the right maintainers

Hope that helps,
 Thomas
[Qemu-devel] [RFC PATCH qemu 0/4] vfio: SPAPR IOMMU v2 (memory preregistration support)
Yet another try, reworked the whole patchset.

Here are a few patches to prepare the existing listener for handling memory preregistration for SPAPR guests running on POWER8.

This used to be a part of the DDW patchset but is now separated as requested.

Please comment. Thanks!

Changes:
v4:
* have 2 listeners now - "iommu" and "prereg"
* removed iommu_data
* many smaller changes

v3:
* removed incorrect "vfio: Skip PCI BARs in memory listener"
* removed page size changes from quirks as they did not completely fix the crashes happening on POWER8 (only total removal helps there)
* added "memory: Add reporting of supported page sizes"

Alexey Kardashevskiy (4):
  memory: Add reporting of supported page sizes
  vfio: Generalize IOMMU memory listener
  vfio: Use different page size for different IOMMU types
  vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)

 hw/ppc/spapr_iommu.c          |   8 ++
 hw/vfio/common.c              | 217 +-
 include/exec/memory.h         |  11 +++
 include/hw/vfio/vfio-common.h |  26 ++---
 memory.c                      |   9 ++
 trace-events                  |   2 +
 6 files changed, 214 insertions(+), 59 deletions(-)

-- 
2.4.0.rc3.8.gfb3e7d5
[Qemu-devel] [RFC PATCH qemu 3/4] vfio: Use different page size for different IOMMU types
The existing memory listener is called on RAM or PCI address space, which implies potentially different page sizes.

This uses the new memory_region_iommu_get_page_sizes() for IOMMU regions or falls back to qemu_real_host_page_size if RAM.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
Changes:
* uses the smallest page size for mask as IOMMU MR can support multiple page sizes
---
 hw/vfio/common.c | 28
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6eb85c7..171c6ad 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -312,6 +312,16 @@ out:
     rcu_read_unlock();
 }
 
+static hwaddr vfio_iommu_page_mask(MemoryRegion *mr)
+{
+    if (memory_region_is_iommu(mr)) {
+        int smallest = ffs(memory_region_iommu_get_page_sizes(mr)) - 1;
+
+        return ~((1ULL << smallest) - 1);
+    }
+    return qemu_real_host_page_mask;
+}
+
 static void vfio_listener_region_add(VFIOMemoryListener *vlistener,
                                      MemoryRegionSection *section)
 {
@@ -320,6 +330,7 @@ static void vfio_listener_region_add(VFIOMemoryListener *vlistener,
     Int128 llend;
     void *vaddr;
     int ret;
+    hwaddr page_mask = vfio_iommu_page_mask(section->mr);
 
     if (vfio_listener_skipped_section(section)) {
         trace_vfio_listener_region_add_skip(
@@ -329,16 +340,16 @@ static void vfio_listener_region_add(VFIOMemoryListener *vlistener,
         return;
     }
 
-    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
-                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
+    if (unlikely((section->offset_within_address_space & ~page_mask) !=
+                 (section->offset_within_region & ~page_mask))) {
         error_report("%s received unaligned region", __func__);
         return;
     }
 
-    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+    iova = ROUND_UP(section->offset_within_address_space, ~page_mask + 1);
     llend = int128_make64(section->offset_within_address_space);
     llend = int128_add(llend, section->size);
-    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
+    llend = int128_and(llend, int128_exts64(page_mask));
 
     if (int128_ge(int128_make64(iova), llend)) {
         return;
@@ -421,6 +432,7 @@ static void vfio_listener_region_del(VFIOMemoryListener *vlistener,
     VFIOContainer *container = vlistener->container;
     hwaddr iova, end;
     int ret;
+    hwaddr page_mask = vfio_iommu_page_mask(section->mr);
 
     if (vfio_listener_skipped_section(section)) {
         trace_vfio_listener_region_del_skip(
@@ -430,8 +442,8 @@ static void vfio_listener_region_del(VFIOMemoryListener *vlistener,
         return;
     }
 
-    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
-                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
+    if (unlikely((section->offset_within_address_space & ~page_mask) !=
+                 (section->offset_within_region & ~page_mask))) {
         error_report("%s received unaligned region", __func__);
         return;
     }
@@ -457,9 +469,9 @@ static void vfio_listener_region_del(VFIOMemoryListener *vlistener,
          */
     }
 
-    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+    iova = ROUND_UP(section->offset_within_address_space, ~page_mask + 1);
     end = (section->offset_within_address_space + int128_get64(section->size)) &
-          TARGET_PAGE_MASK;
+          page_mask;
 
     if (iova >= end) {
         return;
-- 
2.4.0.rc3.8.gfb3e7d5
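The mask arithmetic in vfio_iommu_page_mask() can be tried outside QEMU. The following is only a standalone sketch, not code from the patch: page_mask_from_sizes() and round_up_to_page() are hypothetical helpers that take the page-size bitmap directly. It uses the GCC/Clang builtin __builtin_ffsll() because ffs() takes an int, so page-size bits above bit 31 of a 64-bit bitmap would not be seen by it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: bit N of the bitmap is set when pages of
 * 2^N bytes are supported. Returns a mask for the smallest size.
 */
static uint64_t page_mask_from_sizes(uint64_t page_sizes_bitmap)
{
    /* __builtin_ffsll() returns 1 + index of the lowest set bit, or 0. */
    int smallest = __builtin_ffsll((long long)page_sizes_bitmap) - 1;

    if (smallest < 0) {
        /* Empty bitmap: byte granularity, an arbitrary choice for this sketch. */
        return ~UINT64_C(0);
    }
    return ~((UINT64_C(1) << smallest) - 1);
}

/* Same arithmetic as ROUND_UP(addr, ~page_mask + 1) in the patch. */
static uint64_t round_up_to_page(uint64_t addr, uint64_t page_mask)
{
    uint64_t page_size = ~page_mask + 1;

    return (addr + page_size - 1) & page_mask;
}
```

With a bitmap advertising both 4K and 64K pages, the mask follows the 4K size, which matches the "smallest page size" note in the changelog above.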
[Qemu-devel] [RFC PATCH qemu 1/4] memory: Add reporting of supported page sizes
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate uses when translating; however, this information is not available outside the translate context for various checks.

This adds a get_page_sizes callback to MemoryRegionIOMMUOps and a wrapper for it so IOMMU users (such as VFIO) can know the actual page size(s) used by an IOMMU.

The qemu_real_host_page_size is used as fallback.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
Changes:
v4:
* s/1<<TARGET_PAGE_BITS/qemu_real_host_page_size/
---
 hw/ppc/spapr_iommu.c  |  8
 include/exec/memory.h | 11 +++
 memory.c              |  9 +
 3 files changed, 28 insertions(+)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index f61504e..a2572c4 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -104,6 +104,13 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr,
     return ret;
 }
 
+static uint64_t spapr_tce_get_page_sizes(MemoryRegion *iommu)
+{
+    sPAPRTCETable *tcet = container_of(iommu, sPAPRTCETable, iommu);
+
+    return 1ULL << tcet->page_shift;
+}
+
 static int spapr_tce_table_post_load(void *opaque, int version_id)
 {
     sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
@@ -135,6 +142,7 @@ static const VMStateDescription vmstate_spapr_tce_table = {
 
 static MemoryRegionIOMMUOps spapr_iommu_ops = {
     .translate = spapr_tce_translate_iommu,
+    .get_page_sizes = spapr_tce_get_page_sizes,
 };
 
 static int spapr_tce_table_realize(DeviceState *dev)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 1394715..dc90403 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -150,6 +150,8 @@ struct MemoryRegionOps {
 typedef struct MemoryRegionIOMMUOps MemoryRegionIOMMUOps;
 
 struct MemoryRegionIOMMUOps {
+    /* Returns supported page sizes */
+    uint64_t (*get_page_sizes)(MemoryRegion *iommu);
     /* Return a TLB entry that contains a given address. */
     IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr, bool is_write);
 };
@@ -552,6 +554,15 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
 bool memory_region_is_iommu(MemoryRegion *mr);
 
 /**
+ * memory_region_iommu_get_page_sizes: get supported page sizes in an iommu
+ *
+ * Returns %bitmap of supported page sizes for an iommu.
+ *
+ * @mr: the memory region being queried
+ */
+uint64_t memory_region_iommu_get_page_sizes(MemoryRegion *mr);
+
+/**
  * memory_region_notify_iommu: notify a change in an IOMMU translation entry.
  *
  * @mr: the memory region that was changed
diff --git a/memory.c b/memory.c
index 5a0cc66..eec3746 100644
--- a/memory.c
+++ b/memory.c
@@ -1413,6 +1413,15 @@ bool memory_region_is_iommu(MemoryRegion *mr)
     return mr->iommu_ops;
 }
 
+uint64_t memory_region_iommu_get_page_sizes(MemoryRegion *mr)
+{
+    assert(memory_region_is_iommu(mr));
+    if (mr->iommu_ops && mr->iommu_ops->get_page_sizes) {
+        return mr->iommu_ops->get_page_sizes(mr);
+    }
+    return qemu_real_host_page_size;
+}
+
 void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n)
 {
     notifier_list_add(&mr->iommu_notify, n);
-- 
2.4.0.rc3.8.gfb3e7d5
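The "optional callback with a fallback" shape of memory_region_iommu_get_page_sizes() can be reduced to a few lines. This is a simplified sketch with made-up types: struct region, struct iommu_ops and FALLBACK_PAGE_SIZE are stand-ins invented for illustration and are not QEMU definitions.

```c
#include <stddef.h>
#include <stdint.h>

struct region;

/* Stand-in for MemoryRegionIOMMUOps: the callback is optional. */
struct iommu_ops {
    uint64_t (*get_page_sizes)(struct region *r);
};

/* Stand-in for an IOMMU memory region with a fixed page shift. */
struct region {
    const struct iommu_ops *ops;
    unsigned page_shift;
};

/* Assumed value standing in for qemu_real_host_page_size. */
#define FALLBACK_PAGE_SIZE UINT64_C(4096)

/* Wrapper: ask the IOMMU if it can answer, otherwise use the fallback. */
static uint64_t region_get_page_sizes(struct region *r)
{
    if (r->ops && r->ops->get_page_sizes) {
        return r->ops->get_page_sizes(r);
    }
    return FALLBACK_PAGE_SIZE;
}

/* An implementation in the style of spapr_tce_get_page_sizes(). */
static uint64_t tce_get_page_sizes(struct region *r)
{
    return UINT64_C(1) << r->page_shift;
}

static const struct iommu_ops tce_ops = {
    .get_page_sizes = tce_get_page_sizes,
};

/* Ops table that does not implement the callback. */
static const struct iommu_ops legacy_ops = { NULL };
```

A region using tce_ops with page_shift 16 reports 64K, while one using legacy_ops silently gets the fallback. That is why callers see a usable value even for IOMMUs that were never taught the new callback.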
[Qemu-devel] [RFC PATCH qemu 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)
This makes use of the new "memory registering" feature. The idea is to give userspace the ability to notify the host kernel about pages which are going to be used for DMA. Having this information, the host kernel can pin them all once per user process, do locked pages accounting (once) and not spend time on doing that at run time with possible failures which cannot be handled nicely in some cases.

This adds a prereg memory listener which listens on address_space_memory and notifies a VFIO container about memory which needs to be pinned/unpinned. VFIO MMIO regions (i.e. "skip dump" regions) are skipped.

The feature is only enabled for SPAPR IOMMU v2. The host kernel changes are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does not call it when v2 is detected and enabled.

This does not change the guest visible interface.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
Changes:
v4:
* s/ram_listener/prereg_listener/ - listener names suggest what they do, not what they listen on
* put prereg_listener registration first

v3:
* new RAM listener skips BARs (i.e. "skip dump" regions)

v2:
* added another listener for RAM
---
 hw/vfio/common.c              | 117 +-
 include/hw/vfio/vfio-common.h |   1 +
 trace-events                  |   2 +
 3 files changed, 108 insertions(+), 12 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 171c6ad..6d2ee2d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -326,13 +326,15 @@ static void vfio_listener_region_add(VFIOMemoryListener *vlistener,
                                      MemoryRegionSection *section)
 {
     VFIOContainer *container = vlistener->container;
+    bool is_prereg = (vlistener == &container->prereg_listener);
     hwaddr iova, end;
     Int128 llend;
     void *vaddr;
     int ret;
     hwaddr page_mask = vfio_iommu_page_mask(section->mr);
 
-    if (vfio_listener_skipped_section(section)) {
+    if (vfio_listener_skipped_section(section) ||
+        (is_prereg && memory_region_is_skip_dump(section->mr))) {
         trace_vfio_listener_region_add_skip(
                 section->offset_within_address_space,
                 section->offset_within_address_space +
@@ -357,7 +359,7 @@ static void vfio_listener_region_add(VFIOMemoryListener *vlistener,
 
     memory_region_ref(section->mr);
 
-    if (memory_region_is_iommu(section->mr)) {
+    if (!is_prereg && memory_region_is_iommu(section->mr)) {
         VFIOGuestIOMMU *giommu;
 
         trace_vfio_listener_region_add_iommu(iova,
@@ -405,6 +407,33 @@ static void vfio_listener_region_add(VFIOMemoryListener *vlistener,
 
     trace_vfio_listener_region_add_ram(iova, end - 1, vaddr);
 
+    if (is_prereg) {
+        struct vfio_iommu_spapr_register_memory reg = {
+            .argsz = sizeof(reg),
+            .flags = 0,
+            .vaddr = (uint64_t) vaddr,
+            .size = end - iova
+        };
+
+        ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+        trace_vfio_ram_register(reg.vaddr, reg.size, ret ? -errno : 0);
+        if (ret) {
+            /*
+             * On the initfn path, store the first error in the container so we
+             * can gracefully fail.  Runtime, there's not much we can do other
+             * than throw a hardware error.
+             */
+            if (!container->initialized) {
+                if (!container->error) {
+                    container->error = ret;
+                }
+            } else {
+                hw_error("vfio: DMA mapping failed, unable to continue");
+            }
+        }
+        return;
+    }
+
     ret = vfio_dma_map(container, iova, end - iova, vaddr, section->readonly);
     if (ret) {
         error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
@@ -430,11 +459,13 @@ static void vfio_listener_region_del(VFIOMemoryListener *vlistener,
                                      MemoryRegionSection *section)
 {
     VFIOContainer *container = vlistener->container;
+    bool is_prereg = (vlistener == &container->prereg_listener);
     hwaddr iova, end;
     int ret;
     hwaddr page_mask = vfio_iommu_page_mask(section->mr);
 
-    if (vfio_listener_skipped_section(section)) {
+    if (vfio_listener_skipped_section(section) ||
+        (is_prereg && memory_region_is_skip_dump(section->mr))) {
         trace_vfio_listener_region_del_skip(
                 section->offset_within_address_space,
                 section->offset_within_address_space +
@@ -448,7 +479,7 @@ static void vfio_listener_region_del(VFIOMemoryListener *vlistener,
         return;
     }
 
-    if (memory_region_is_iommu(section->mr)) {
+    if (!is_prereg && memory_region_is_iommu(section->mr)) {
         VFIOGuestIOMMU *giommu;
 
         QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
@@ -477,8 +508,24 @@ static void vfio_listener_region_del(VFIOMemoryListener *vlistener,
         return;
     }
 
+    if (is_prereg) {
+        void *vaddr =
[Qemu-devel] [RFC PATCH qemu 2/4] vfio: Generalize IOMMU memory listener
At the moment VFIOContainer has a union for per-IOMMU-type data which is now an IOMMU memory listener and setup flags. The listener listens on PCI address space for both Type1 and sPAPR IOMMUs. The setup flags (@initialized and @error) are only used by Type1 now but the next patch will use them on sPAPR too.

This introduces VFIOMemoryListener which is a wrapper for MemoryListener and stores a pointer to the container. This allows having multiple memory listeners for the same container. This replaces the Type1 listener with @iommu_listener.

This moves @initialized and @error out of @iommu_data as these will be used soon for memory pre-registration.

As there is only release() left in @iommu_data, this moves it to VFIOContainer and removes @iommu_data and VFIOType1.

This stores @iommu_type in VFIOContainer. The prereg patch will use it to know whether or not to do proper cleanup.

This should cause no change in behavior.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
Changes:
v4:
* used to be "vfio: Store IOMMU type in container"
* moved VFIOType1 content to container as it is not IOMMU type specific
---
 hw/vfio/common.c              | 74 +++
 include/hw/vfio/vfio-common.h | 25 +++
 2 files changed, 59 insertions(+), 40 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 85ee9b0..6eb85c7 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -312,11 +312,10 @@ out:
     rcu_read_unlock();
 }
 
-static void vfio_listener_region_add(MemoryListener *listener,
+static void vfio_listener_region_add(VFIOMemoryListener *vlistener,
                                      MemoryRegionSection *section)
 {
-    VFIOContainer *container = container_of(listener, VFIOContainer,
-                                            iommu_data.type1.listener);
+    VFIOContainer *container = vlistener->container;
     hwaddr iova, end;
     Int128 llend;
     void *vaddr;
@@ -406,9 +405,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
          * can gracefully fail.  Runtime, there's not much we can do other
          * than throw a hardware error.
          */
-        if (!container->iommu_data.type1.initialized) {
-            if (!container->iommu_data.type1.error) {
-                container->iommu_data.type1.error = ret;
+        if (!container->initialized) {
+            if (!container->error) {
+                container->error = ret;
             }
         } else {
             hw_error("vfio: DMA mapping failed, unable to continue");
@@ -416,11 +415,10 @@ static void vfio_listener_region_add(MemoryListener *listener,
     }
 }
 
-static void vfio_listener_region_del(MemoryListener *listener,
+static void vfio_listener_region_del(VFIOMemoryListener *vlistener,
                                      MemoryRegionSection *section)
 {
-    VFIOContainer *container = container_of(listener, VFIOContainer,
-                                            iommu_data.type1.listener);
+    VFIOContainer *container = vlistener->container;
     hwaddr iova, end;
     int ret;
 
@@ -478,14 +476,33 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }
 }
 
-static const MemoryListener vfio_memory_listener = {
-    .region_add = vfio_listener_region_add,
-    .region_del = vfio_listener_region_del,
+static void vfio_iommu_listener_region_add(MemoryListener *listener,
+                                           MemoryRegionSection *section)
+{
+    VFIOMemoryListener *vlistener = container_of(listener, VFIOMemoryListener,
+                                                 listener);
+
+    vfio_listener_region_add(vlistener, section);
+}
+
+
+static void vfio_iommu_listener_region_del(MemoryListener *listener,
+                                           MemoryRegionSection *section)
+{
+    VFIOMemoryListener *vlistener = container_of(listener, VFIOMemoryListener,
+                                                 listener);
+
+    vfio_listener_region_del(vlistener, section);
+}
+
+static const MemoryListener vfio_iommu_listener = {
+    .region_add = vfio_iommu_listener_region_add,
+    .region_del = vfio_iommu_listener_region_del,
 };
 
 static void vfio_listener_release(VFIOContainer *container)
 {
-    memory_listener_unregister(&container->iommu_data.type1.listener);
+    memory_listener_unregister(&container->iommu_listener.listener);
 }
 
 int vfio_mmap_region(Object *obj, VFIORegion *region,
@@ -676,27 +693,28 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
             goto free_container_exit;
         }
 
-        ret = ioctl(fd, VFIO_SET_IOMMU,
-                    v2 ? VFIO_TYPE1v2_IOMMU : VFIO_TYPE1_IOMMU);
+        container->iommu_type = v2 ? VFIO_TYPE1v2_IOMMU : VFIO_TYPE1_IOMMU;
+        ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
         if (ret) {
             error_report("vfio: failed to set iommu for container: %m");
             ret = -errno;
             goto free_container_exit;
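The wrapper idea from the commit message, a listener that carries a back-pointer to its container so one container can own several listeners, can be illustrated without any QEMU headers. All types below (struct listener, struct wrapped_listener, struct container) are simplified stand-ins invented for this sketch; only the container_of() technique and the pointer comparison used to tell listeners apart are taken from the series.

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for MemoryListener. */
struct listener {
    void (*region_add)(struct listener *l, int section);
};

struct container;

/* Stand-in for VFIOMemoryListener: listener plus back-pointer. */
struct wrapped_listener {
    struct container *container;
    struct listener listener;
};

/* Stand-in for VFIOContainer owning two listeners. */
struct container {
    struct wrapped_listener iommu_listener;
    struct wrapped_listener prereg_listener;
    int iommu_events;
    int prereg_events;
};

/* Shared callback: recover the wrapper, then the container. */
static void on_region_add(struct listener *l, int section)
{
    struct wrapped_listener *wl =
        container_of(l, struct wrapped_listener, listener);
    struct container *c = wl->container;

    (void)section;
    /* The same pointer comparison the prereg patch uses for is_prereg. */
    if (wl == &c->prereg_listener) {
        c->prereg_events++;
    } else {
        c->iommu_events++;
    }
}

static void container_init(struct container *c)
{
    c->iommu_events = 0;
    c->prereg_events = 0;
    c->iommu_listener.container = c;
    c->iommu_listener.listener.region_add = on_region_add;
    c->prereg_listener.container = c;
    c->prereg_listener.listener.region_add = on_region_add;
}
```

Because the callback only needs the embedded listener pointer to find both the wrapper and the container, adding a third listener would not require touching the lookup code.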
Re: [Qemu-devel] [PATCH v2] AioContext: fix broken placement of event_notifier_test_and_clear
On Mon, 07/20 07:27, Paolo Bonzini wrote:
> diff --git a/aio-win32.c b/aio-win32.c
> index ea655b0..7afc999 100644
> --- a/aio-win32.c
> +++ b/aio-win32.c
> @@ -337,10 +337,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
>              aio_context_acquire(ctx);
>          }
>
> -        if (first && aio_bh_poll(ctx)) {
> -            progress = true;
> +        if (first) {
> +            event_notifier_test_and_clear(&ctx->notifier);

I'm looking at optimizing it but I don't fully understand the relationship between aio_prepare and WaitForMultipleObjects. Do they get the same set of events? What if a new event comes in between, for example, a thread worker calls aio_notify()?

Fam

> +            progress |= aio_bh_poll(ctx);
> +            first = false;
>          }
> -        first = false;
>
>          /* if we have any signaled events, dispatch event */
>          event = NULL;
Re: [Qemu-devel] [RFC PATCH 2/2] spapr: -kernel: allow linking with specified addr
On 20/07/15 07:01, David Gibson wrote:
> On Fri, Jul 17, 2015 at 01:56:40PM +0200, Andrew Jones wrote:
>> I've started playing with adding ppc support to kvm-unit-tests, using spapr for the machine model. I wanted to link the unit test at 0x400000 to match qemu's load address, making the unit test startup code simpler, but ended up with 0x800000 instead, due to how translate_kernel_address works. The translation makes sense for how Linux kernels are linked (always at 0xc0000000 or 0xc000000000000000), but for the unit test case we need to avoid adding the offset.
>>
>> Signed-off-by: Andrew Jones <drjo...@redhat.com>
>> ---
>> Big RFC because I don't know if the "always at 0xc..." statement is 100% true for Linux, nor if this patch would break other stuff...
>
> Yeah, I'm pretty dubious about this too, especially since I don't entirely grasp what the load_elf() translation function is all about anyway.

Well, AFAIK it's used to modify the addresses before the ELF loader uses the address for loading. For example, if your ELF binary is linked at address 0x1000, the translate function would move your binary to 0x401000 instead so that it does not interfere with the SLOF firmware (which is loaded to address 0 IIRC).

So I also think your fix here is wrong, Andrew. E.g. when you have a binary that is linked to address 0x1000, you don't want to bypass the translation step here since it then would clash with the firmware.

> That said, I suspect making your unit test assume a fixed load address may not be the best idea - qemu or SLOF could change in future to move things about, so it might be more robust to have your test copy itself to the address it wants to be at before executing.

+1

... or you could try to get the elf_reloc code working for POWER, too (see include/hw/elf_ops.h). That way QEMU would take care of relocating your program. (You can peek at elf_apply_rela64() in https://github.com/aik/SLOF/blob/master/lib/libelf/elf64.c if you want to know what basically has to be done for POWER relocations.)
 Thomas
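The address translation discussed in this message can be sketched as a masking step followed by an offset. The constants below are assumptions chosen so that the result matches the examples in the thread (0x1000 ends up at 0x401000, and a kernel linked at 0xc... ends up at the load address); they are not quoted from the patch under review.

```c
#include <stdint.h>

/* Assumed load offset, implied by the 0x1000 -> 0x401000 example. */
#define KERNEL_LOAD_ADDR UINT64_C(0x400000)
/* Assumed mask that strips the high "0xc..." bits of a link address. */
#define LINK_ADDR_MASK   UINT64_C(0x0fffffff)

/* Sketch of a load_elf()-style translate hook: link address in, load address out. */
static uint64_t translate_kernel_address(uint64_t link_addr)
{
    return (link_addr & LINK_ADDR_MASK) + KERNEL_LOAD_ADDR;
}
```

Under these assumptions a binary linked at the load address itself is pushed up by one more offset, which is the surprise described in the original report, while anything linked near 0 is kept clear of the firmware.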
Re: [Qemu-devel] [PATCH v9 0/4] remove icc bus/bridge
On Mon, 20 Jul 2015 09:00:06 +0800 Zhu Guihua <zhugh.f...@cn.fujitsu.com> wrote:
> On 07/16/2015 05:52 PM, Igor Mammedov wrote:
>> On Thu, 16 Jul 2015 10:45:41 +0800 Zhu Guihua <zhugh.f...@cn.fujitsu.com> wrote:
>>> ping...
>> I'll look at it once 2.4 is released.
>
> Got it, thanks.
>
> By the way, do you know what the state of qemu socket topology is?

There were no changes/progress as far as I remember.

> Regards,
> Zhu
>
>>> On 07/03/2015 05:38 PM, Zhu Guihua wrote:
>>>> ICC Bus was used for providing a hotpluggable bus for APIC and CPU, but now we use HotplugHandler to do hotplug. So ICC Bus is unnecessary.
>>>>
>>>> This code has passed the new pc-cpu-test. And I have tested with kvm along with kernel_irqchip=on/off, it works fine.
>>>>
>>>> This patch series is based on the latest master.
>>>>
>>>> v9:
>>>> -use a callback to correct reset sequence for x86
>>>> -update apic mmio mapping
>>>>
>>>> v8:
>>>> -add a wrapper to specify reset order
>>>>
>>>> v7:
>>>> -update to register reset handler for main_system_bus when created
>>>> -register reset handler for apic after all devices are initialized
>>>>
>>>> Chen Fan (2):
>>>>   apic: map APIC's MMIO region at each CPU's address space
>>>>   cpu/apic: drop icc bus/bridge
>>>>
>>>> Zhu Guihua (2):
>>>>   x86: use new method to correct reset sequence
>>>>   icc_bus: drop the unused files
>>>>
>>>>  default-configs/i386-softmmu.mak   |   1 -
>>>>  default-configs/x86_64-softmmu.mak |   1 -
>>>>  hw/cpu/Makefile.objs               |   1 -
>>>>  hw/cpu/icc_bus.c                   | 118 -
>>>>  hw/i386/pc.c                       |  43 +++---
>>>>  hw/i386/pc_piix.c                  |   9 +--
>>>>  hw/i386/pc_q35.c                   |   9 +--
>>>>  hw/intc/apic_common.c              |  11 +---
>>>>  include/hw/cpu/icc_bus.h           |  82 --
>>>>  include/hw/i386/apic_internal.h    |   7 ++-
>>>>  include/hw/i386/pc.h               |   2 +-
>>>>  target-i386/cpu.c                  |  30 +++---
>>>>  12 files changed, 52 insertions(+), 262 deletions(-)
>>>>  delete mode 100644 hw/cpu/icc_bus.c
>>>>  delete mode 100644 include/hw/cpu/icc_bus.h
[Qemu-devel] [POC] colo-proxy in qemu
Hi, all

We are planning to implement colo-proxy in qemu to cache and compare packets. This module is one of the important components of the COLO project and it is still at an early stage, so any comments and feedback are warmly welcomed. Thanks in advance.

## Background

The COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service) project is a high availability solution. Both Primary VM (PVM) and Secondary VM (SVM) run in parallel. They receive the same requests from the client, and generate responses in parallel too. If the response packets from PVM and SVM are identical, they are released immediately. Otherwise, a VM checkpoint (on demand) is conducted.

Paper: http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
COLO on Xen: http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
COLO on Qemu/KVM: http://wiki.qemu.org/Features/COLO

Because we need to capture response packets from PVM and SVM and find out whether they are identical, we introduce a new module to qemu networking called colo-proxy.

This document describes the design of the colo-proxy module.

## Glossary

PVM - Primary VM, which provides services to clients.
SVM - Secondary VM, a hot standby and replication of PVM.
PN - Primary Node, the host which PVM runs on
SN - Secondary Node, the host which SVM runs on

## Workflow ##

The following image shows the qemu networking packet datapath between guest's NIC and qemu's backend in colo-proxy.
[Diagram, not recoverable from the archive: packet datapath on PN and SN. Labels that survive: PN, SN, chkpoint, [socket], PVM, SVM, proxy, NIC, TCP/IP stack, compare, forward, seq&ack adjust, copy&forward, QEMU, backend (tap).]

## Our Idea ##

### Net filter

In current QEMU, a packet is transported between networking backend(tap) and qemu network adapter(NIC) directly. Backend and adapter are linked by NetClientState->peer in qemu as follows:

[Diagram: two NetClientState objects, one with info->type=TAP and name=tap0, the other with info->type=NIC and name=e1000, each pointing at the other through its *peer field.]

In COLO QEMU, we insert a net filter named colo-proxy between backend and adapter like below:

    typedef struct COLOState {
        NetClientState nc;
        NetClientState *peer;
    } COLOState;

[Diagram, truncated in the archive: the TAP (tap0) and NIC (e1000) NetClientState objects, each now peered with a COLOState whose embedded NetClientState has info->type=COLO.]
Re: [Qemu-devel] [FIX PATCH] pc-dimm: Fail pc-dimm realization for invalid nodes in non-NUMA configuration
On Fri, 17 Jul 2015 18:19:40 +0530 Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote:
> pc_dimm_realize() validates the NUMA node to which memory hotplug is being performed only in case of NUMA configuration. Include a check to fail invalid nodes in case of non-NUMA configuration too.
>
> Signed-off-by: Bharata B Rao <bhar...@linux.vnet.ibm.com>

Reviewed-by: Igor Mammedov <imamm...@redhat.com>

> ---
>  hw/mem/pc-dimm.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
> index bb04862..099025e 100644
> --- a/hw/mem/pc-dimm.c
> +++ b/hw/mem/pc-dimm.c
> @@ -414,10 +414,11 @@ static void pc_dimm_realize(DeviceState *dev, Error **errp)
>          error_setg(errp, "'" PC_DIMM_MEMDEV_PROP "' property is not set");
>          return;
>      }
> -    if ((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) {
> +    if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) ||
> +        (!nb_numa_nodes && dimm->node)) {
>          error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
>                     PRIu32 "' which exceeds the number of numa nodes: %d",
> -                   dimm->node, nb_numa_nodes);
> +                   dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
>          return;
>      }
>  }
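The condition added by the patch is easier to check when written as a positive predicate. The helper below is a hypothetical restatement for illustration only (dimm_node_is_valid() does not exist in QEMU); it returns true exactly when the patched code would not report an error.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: with NUMA configured, the node must be below the
 * node count; without NUMA, only node 0 is acceptable.
 */
static bool dimm_node_is_valid(int nb_numa_nodes, uint32_t node)
{
    if (nb_numa_nodes > 0) {
        return node < (uint32_t)nb_numa_nodes;
    }
    return node == 0;
}
```

For example, node 1 is accepted on a two-node guest but rejected on a guest started without any -numa options, which is the case the patch adds.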
Re: [Qemu-devel] [RFC PATCH 3/4] ppc: Use split I/D mmu modes to avoid flushes on interrupts
On 2015-07-20 09:33, Benjamin Herrenschmidt wrote:
> On Mon, 2015-07-20 at 01:01 +0200, Aurelien Jarno wrote:
>> One way to improve this would be to reduce the size of a TLB entry. Currently we store the page address separately for read, write and code. The information is therefore quite redundant.
>>
>> We might want to have only one page address entry and encode if it is allowed for read, write or code in the low bits just like we do for invalid, mmio or dirty. This means the TLB entry can be checked with
>>
>>   env->tlb_table[mmu_idx][page_index].ADDR ==
>>       ((addr & (TARGET_PAGE_MASK | (DATA_SIZE - 1))) | READ/WRITE/CODE)
>>
>> with READ/WRITE/CODE each being a different bit (they can probably even replace invalid). In practice it means one more instruction in the fast path (one "or" with an 8-bit immediate), but it allows dividing the size of a TLB entry by two on a 64-bit machine. It might be worth a try.
>
> It might but that means fixing all tcg backends which I'm not necessarily looking forward to :-) The cost of that one "or" might be minimal on some processors but I wouldn't bet on it as we have basically all dependent instructions.

Understood. I did some tests showing that the number of instructions in the fast path does not have a big performance impact. In that case, there is dependency between instructions, but anyway the CPU is likely to be stalled by the TLB entry to the memory access, so we can add one instruction before with very little impact.

I'll keep this idea in my todo list for another day.

Aurelien

-- 
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net
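To make the "one address field plus access bits in the low bits" idea concrete, here is one possible variant as a standalone sketch. It is not QEMU code and not necessarily the exact scheme meant above: it stores "deny" bits and masks the entry before comparing, because a strict equality against a single permission bit would miss on a page that allows more than one kind of access. PAGE_BITS, the DENY_* values and the helper names are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_MASK (~((UINT64_C(1) << PAGE_BITS) - 1))

/* Deny bits sit above the alignment bits (access sizes up to 16 bytes). */
enum {
    DENY_READ  = 0x10,
    DENY_WRITE = 0x20,
    DENY_CODE  = 0x40,
};

/* One field instead of separate read/write/code addresses. */
struct tlb_entry {
    uint64_t addr_flags;
};

static struct tlb_entry tlb_make(uint64_t page, unsigned deny)
{
    struct tlb_entry e = { (page & PAGE_MASK) | deny };

    return e;
}

/*
 * Fast-path check: hit only if the page matches, the access is aligned
 * to its size, and the deny bit for this access type is clear.
 */
static bool tlb_hit(struct tlb_entry e, uint64_t addr, unsigned size,
                    unsigned deny_bit)
{
    uint64_t want = addr & (PAGE_MASK | (uint64_t)(size - 1));
    uint64_t have = e.addr_flags & (PAGE_MASK | (uint64_t)deny_bit);

    return have == want;
}
```

An entry with all three deny bits set never hits, so in this variant it can double as the invalid marker. The price is the extra mask on the entry side, which plays the role of the additional fast-path instruction discussed in the thread.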
[Qemu-devel] [Bug 1469946] Re: guest can't get IP when create guest with bridge.
Has the patch for this bug been merged into qemu.git? I tested the latest qemu.git (commit: 5b5e8cdd7da7a2214dd062afff5b866234aab228) and the bug can still be reproduced.

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1469946

Title:
  guest can't get IP when create guest with bridge.

Status in QEMU:
  New

Bug description:
  Environment:
  Host OS (ia32/ia32e/IA64): ia32e
  Guest OS (ia32/ia32e/IA64): ia32e
  Guest OS Type (Linux/Windows): linux
  kvm.git Commit: aefbef10e3ae6e2c6e3c54f906f10b34c73a2c66
  qemu.git Commit: dc1e1350f8061021df765b396295329797d66933
  Host Kernel Version: 4.1.0
  Hardware: Ivytown_EP, Haswell_EP

  Bug detailed description:
  --
  When creating a guest with a bridge, the guest cannot get an IP.

  note:
  1. fail rate: 3/5
  2. this is a qemu bug:
     kvm      + qemu     = result
     aefbef10 + dc1e1350 = bad
     aefbef10 + a4ef02fd = good

  Reproduce steps:
  1. create guest:
     qemu-system-x86_64 -enable-kvm -m 2G -smp 4 -device virtio-net-pci,netdev=net0,mac=$random_mac -netdev tap,id=net0,script=/etc/kvm/qemu-ifup rhel6u5.qcow

  Current result:
  guest can't get IP

  Expected result:
  guest can get ip

  Basic root-causing log:
  --

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1469946/+subscriptions
Re: [Qemu-devel] [PATCH] hostmem: Fix qemu_opt_get_bool() crash in host_memory_backend_init()
On Fri, 17 Jul 2015 17:33:55 -0300 Eduardo Habkost <ehabk...@redhat.com> wrote:
> On Thu, Jul 16, 2015 at 11:02:14PM +0200, Igor Mammedov wrote:
>> On Thu, 16 Jul 2015 17:39:17 -0300 Eduardo Habkost <ehabk...@redhat.com> wrote:
>>> This fixes the following crash, introduced by commit 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6:
>>>
>>>   $ gdb --args qemu-system-x86_64 -machine pc,mem-merge=off -object memory-backend-ram,id=ram-node0,size=1024
>>>   [...]
>>>   Program received signal SIGABRT, Aborted.
>>>   (gdb) bt
>>>   #0  0x7253b8c7 in raise () at /lib64/libc.so.6
>>>   #1  0x7253d52a in abort () at /lib64/libc.so.6
>>>   #2  0x7253446d in __assert_fail_base () at /lib64/libc.so.6
>>>   #3  0x72534522 in () at /lib64/libc.so.6
>>>   #4  0x558bb80a in qemu_opt_get_bool_helper (opts=0x5621b650, name=name@entry=0x558ec922 "mem-merge", defval=defval@entry=true, del=del@entry=false) at qemu/util/qemu-option.c:388
>>>   #5  0x558bbb5a in qemu_opt_get_bool (opts=<optimized out>, name=name@entry=0x558ec922 "mem-merge", defval=defval@entry=true) at qemu/util/qemu-option.c:398
>>>   #6  0x55720a24 in host_memory_backend_init (obj=0x562ac970) at qemu/backends/hostmem.c:226
>>>
>>> Instead of using qemu_opt_get_bool(), that didn't work with qemu_machine_opts for a long time, we can use the machine QOM properties directly.
>>>
>>> Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
>>> ---
>>>  backends/hostmem.c | 9 +
>>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/backends/hostmem.c b/backends/hostmem.c
>>> index 61c1ac0..38a32ed 100644
>>> --- a/backends/hostmem.c
>>> +++ b/backends/hostmem.c
>>> @@ -10,6 +10,7 @@
>>>   * See the COPYING file in the top-level directory.
>>>   */
>>>  #include "sysemu/hostmem.h"
>>> +#include "hw/boards.h"
>>>  #include "qapi/visitor.h"
>>>  #include "qapi-types.h"
>>>  #include "qapi-visit.h"
>>> @@ -223,10 +224,10 @@ static void host_memory_backend_init(Object *obj)
>>>  {
>>>      HostMemoryBackend *backend = MEMORY_BACKEND(obj);
>>>
>>> -    backend->merge = qemu_opt_get_bool(qemu_get_machine_opts(),
>>> -                                       "mem-merge", true);
>>> -    backend->dump = qemu_opt_get_bool(qemu_get_machine_opts(),
>>> -                                      "dump-guest-core", true);
>>> +    backend->merge = object_property_get_bool(OBJECT(current_machine),
>> maybe use qdev_get_machine() instead of OBJECT(current_machine)
>
> What are the advantages you see in the extra layers of indirection of qdev_get_machine()? (I am not against your proposal, but I would like to understand the point of qdev_get_machine() yet.)

current_machine might be NULL, whereas qdev_get_machine() always returns the /machine object.

> I'd prefer to use something that is guaranteed to be MachineState*, qdev_get_machine() returns Object*. I am even considering using current_machine->mem_merge and current_machine->dump_guest_core directly instead of object_property_get_bool(). That would mean extra compile-time checks, instead of runtime ones.

Check the difference between 'git grep qdev_get_machine' and 'git grep current_machine'. I was under the impression that the policy was to try not to use globals unless one has to, and not to introduce new usage in the presence of other means to get the object.
Re: [Qemu-devel] [RFC PATCH 3/4] ppc: Use split I/D mmu modes to avoid flushes on interrupts
On Mon, 2015-07-20 at 09:11 +0200, Aurelien Jarno wrote:
> Understood. I did some tests showing that the number of instructions in the fast path does not have a big performance impact. In that case, there is dependency between instructions, but anyway the CPU is likely to be stalled by the TLB entry to the memory access, so we can add one instruction before with very little impact.

Possibly, though I would expect the TLB to be pretty hot in the cache. The likelihood of successive accesses being close to each other (or even in the same page) is also quite high. It might be something worth instrumenting some day.

The one thing that might prove a gain for some workloads would be to add TCG primitives for simple vector load/stores... with gcc being more and more aggressive these days at using them for moving things around (and glibc memcpy), it might be a measurable gain.

> I'll keep this idea in my todo list for another day.

My todo list tends to be an O_WRONLY file sadly :-)

Cheers,
Ben.
Re: [Qemu-devel] [PULL for-2.4 0/1] virtio-rng: reduce wakeups
On 17 July 2015 at 14:49, Amit Shah amit.s...@redhat.com wrote: The following changes since commit 5b5e8cdd7da7a2214dd062afff5b866234aab228: Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20150717-1' into staging (2015-07-17 12:39:12 +0100) are available in the git repository at: https://git.kernel.org/pub/scm/virt/qemu/amit/virtio-rng.git tags/vrng-2.4 for you to fetch changes up to 621a20e08155179b1902c428361e80f41429f50d: virtio-rng: trigger timer only when guest requests for entropy (2015-07-17 19:05:16 +0530) Fire timer only when required. Brings down wakeups by a big number. Applied, thanks. -- PMM
Re: [Qemu-devel] [PATCH v7 18/42] Add wrappers and handlers for sending/receiving the postcopy-ram migration messages.
On (Mon) 13 Jul 2015 [13:02:09], Juan Quintela wrote:

+/* We're expecting a
+ *    Version (0)
+ *    a RAM ID string (length byte, name, 0 term)
+ *    then at least 1 16 byte chunk
+*/
+if (len < 20) {

1 + 1+1+1+1+2*8 Humm, thinking about it, why are we not needing a length field of number of entries?

hm, yea.

+error_report("CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
+return -1;
+}
+
+tmp = qemu_get_byte(mis->file);
+if (tmp != 0) {

I think that a constant telling POSTCOPY_VERSION0 or whatever?

agreed.

Amit
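As a rough illustration of the length check discussed above, here is a minimal sketch of how the minimum message length and the number of 16-byte ranges could be derived from the layout in the patch comment. All names are hypothetical and not taken from the QEMU tree, and the sketch assumes the RAM ID name has at least one character, which is what makes the minimum come out at 20 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants; the layout follows the comment in the quoted patch. */
enum {
    DISCARD_VERSION_LEN = 1,  /* version byte, expected to be 0 */
    DISCARD_NAMELEN_LEN = 1,  /* length byte of the RAM ID string */
    DISCARD_MIN_NAME    = 1,  /* assumption: the name is never empty */
    DISCARD_NUL_LEN     = 1,  /* 0 terminator after the name */
    DISCARD_RANGE_LEN   = 16  /* one start/end pair of 64-bit values */
};

/* Bytes that precede the ranges for a name of the given length. */
static size_t discard_header_len(size_t name_len)
{
    /* The name length has to fit in the single length byte. */
    assert(name_len <= UINT8_MAX);
    return DISCARD_VERSION_LEN + DISCARD_NAMELEN_LEN + name_len +
           DISCARD_NUL_LEN;
}

/* Smallest message that still carries one range. */
static size_t discard_min_len(void)
{
    return discard_header_len(DISCARD_MIN_NAME) + DISCARD_RANGE_LEN;
}

/*
 * Number of ranges implied by the total length, or -1 if the length is
 * too short or not a whole number of ranges.
 */
static int discard_range_count(size_t len, size_t name_len)
{
    size_t header = discard_header_len(name_len);

    if (name_len < DISCARD_MIN_NAME || len < header + DISCARD_RANGE_LEN) {
        return -1;
    }
    if ((len - header) % DISCARD_RANGE_LEN != 0) {
        return -1;
    }
    return (int)((len - header) / DISCARD_RANGE_LEN);
}
```

Under these assumptions the range count follows from the total length, which may be one reason an explicit entry count is not strictly required, although a separate count field would make malformed messages easier to detect.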
Re: [Qemu-devel] [POC] colo-proxy in qemu
On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote: We are planning to implement colo-proxy in qemu to cache and compare packets. I thought there is a kernel module to do that? Why does the proxy need to be part of the QEMU process? -netdev socket or host network stack features allow you to process packets in a separate process. Without details on what the proxy does it's hard to discuss this. What happens in the non-TCP case? What happens in the TCP case? Does the proxy need to perform privileged operations, create sockets, open files, etc? The slirp code is not actively developed or used much in production. It might be a good idea to audit the code for bugs if you want to use it. Stefan
Re: [Qemu-devel] [RFC] Virt machine memory map
On 20 July 2015 at 09:55, Pavel Fedin p.fe...@samsung.com wrote: Hello! In our project we work on a very fast paravirtualized network I/O drivers, based on ivshmem. We successfully got ivshmem working on ARM, however with one hack. Currently we have: --- cut --- [VIRT_PCIE_MMIO] = { 0x1000, 0x2eff }, [VIRT_PCIE_PIO] = { 0x3eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x3f00, 0x0100 }, [VIRT_MEM] ={ 0x4000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- And MMIO region is not enough for us because we want to have 1GB mapping for PCI device. In order to make it working, we modify the map as follows: --- cut --- [VIRT_PCIE_MMIO] ={ 0x1000, 0x7eff }, [VIRT_PCIE_PIO] = { 0x8eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x8f00, 0x0100 }, [VIRT_MEM] = { 0x9000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- The question is - how could we upstream this? I believe modifying 32-bit virt memory map this way is not good. Will it be OK to have different memory map for 64-bit virt ? I think the theory we discussed at the time of putting in the PCIe device was that if we wanted this we'd add support for the other PCIe memory window (which would then live at somewhere above 4GB). Alex, can you remember what the idea was? But to be honest I think we weren't expecting anybody to need 1GB of PCI MMIO space unless it was a video card... thanks -- PMM
Re: [Qemu-devel] [libvirt] [PATCH] qxl: Fix new function name for spice-server library
On Mon, Jul 20, 2015 at 09:43:23AM +0100, Frediano Ziglio wrote:

The new spice-server function to limit the number of monitors (0.12.6) changed during development from spice_qxl_set_monitors_config_limit to spice_qxl_max_monitors (accepted upstream). By mistake I posted the patch with the former name. This patch fixes the function name.

Signed-off-by: Frediano Ziglio fzig...@redhat.com
---
hw/display/qxl.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

I tested again doing a clean build, unfortunately I did some mistake and my tests worked.

diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index 4e5ff69..2288238 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -273,8 +273,7 @@ static void qxl_spice_monitors_config_async(PCIQXLDevice *qxl, int replay)
 } else {
 #if SPICE_SERVER_VERSION >= 0x000c06 /* release 0.12.6 */
 if (qxl->max_outputs) {
-spice_qxl_set_monitors_config_limit(&qxl->ssd.qxl,
-qxl->max_outputs);
+spice_qxl_set_max_monitors(&qxl->ssd.qxl, qxl->max_outputs);
 }
 #endif
 qxl->guest_monitors_config = qxl->ram->monitors_config;
--
2.1.0

Same as the fix I did in order for this to work with upstream spice. ACK. Weak, though, as I'm not a privileged one.

Martin
[Qemu-devel] [Bug 1476183] [NEW] can not create 4 serial port on window (guest os)
Public bug reported:

qemu ver: 2.1.2-Latest
guest os: window 7 64bit with 2 cpu
problem: when qemu start with 4 serial port, on linux(rhel 7) guest os, /dev/ttyS0-4 is work fine. but on window 7 guest os, only show com1,com2 in device manager, how to get com3 com4 ?
qemu cmd: -chardev spiceport,id=charserial0,name=org.qemu.console.serial.0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spiceport,id=charserial1,name=org.qemu.console.serial.1 -device isa-serial,chardev=charserial1,id=serial1 -chardev spiceport,id=charserial2,name=org.qemu.console.serial.2 -device isa-serial,chardev=charserial2,id=serial2 -chardev spiceport,id=charserial3,name=org.qemu.console.serial.3 -device isa-serial,chardev=charserial3,id=serial3

** Affects: qemu
Importance: Undecided
Status: New

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1476183
Re: [Qemu-devel] [PATCH COLO-Frame v7 04/34] colo-comm/migration: skip colo info section for special cases
On 2015/7/18 1:07, Dr. David Alan Gilbert wrote: * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: For older machine types, we skip the colo info section when do migration, in this way, we can migrate successfully between older mainchine and the new one. We also skip this section if colo is not enabled (i.e. migrate_set_capability colo on), so that, It not break compatibility with migration however the --enable-colo/disable-colo on the source/destination; Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com --- hw/i386/pc_piix.c | 1 + hw/i386/pc_q35.c | 1 + hw/ppc/spapr.c| 1 + include/migration/migration.h | 1 + migration/colo-comm.c | 13 + 5 files changed, 17 insertions(+) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 8167b12..926b0d8 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -313,6 +313,7 @@ static void pc_compat_2_3(MachineState *machine) } global_state_set_optional(); savevm_skip_configuration(); +savevm_skip_colo_state(); } static void pc_compat_2_2(MachineState *machine) diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 974aead..b5c6c85 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -296,6 +296,7 @@ static void pc_compat_2_3(MachineState *machine) } global_state_set_optional(); savevm_skip_configuration(); +savevm_skip_colo_state(); } static void pc_compat_2_2(MachineState *machine) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a6f1947..568de93 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1880,6 +1880,7 @@ static void spapr_compat_2_3(Object *obj) { savevm_skip_section_footers(); global_state_set_optional(); +savevm_skip_colo_state(); } static void spapr_compat_2_2(Object *obj) diff --git a/include/migration/migration.h b/include/migration/migration.h index 5c797d4..1b23517 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -203,4 +203,5 @@ void savevm_skip_section_footers(void); void register_global_state(void); void global_state_set_optional(void); void 
savevm_skip_configuration(void); +void savevm_skip_colo_state(void); #endif diff --git a/migration/colo-comm.c b/migration/colo-comm.c index 0a93672..3c8e361 100644 --- a/migration/colo-comm.c +++ b/migration/colo-comm.c @@ -21,6 +21,11 @@ typedef struct { static COLOInfo colo_info; +void savevm_skip_colo_state(void) +{ +colo_info.skip = true; +} + static void colo_info_pre_save(void *opaque) { COLOInfo *s = opaque; @@ -32,12 +37,20 @@ static void colo_info_pre_save(void *opaque) } } +static bool colo_info_need(void *opaque) +{ +if (migrate_enable_colo() !colo_info.skip) { +return true; + } +return false; +} That will work, but I think (untested) this can just be: +static bool colo_info_need(void *opaque) +{ +return migrate_enable_colo(); +} and then you can get rid of the skip stuff (and merge it back to the previous patch). Old qemu's will never sent the section so we're safe. New qemu's with that flag unset won't send the section, so they're still migration compatible on the machine type. New qemu's with the flag set will use it. Yes, you are right, the 'skip stuff' is redundant, will fix it in next version, thanks. static const VMStateDescription colo_state = { .name = COLOState, .version_id = 1, .minimum_version_id = 1, .pre_save = colo_info_pre_save, + .needed = colo_info_need, .fields = (VMStateField[]) { VMSTATE_UINT32(colo_requested, COLOInfo), VMSTATE_END_OF_LIST() -- 1.7.12.4 -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK .
[Qemu-devel] [PATCH] qxl: Fix new function name for spice-server library
The new spice-server function to limit the number of monitors (0.12.6) changed during development from spice_qxl_set_monitors_config_limit to spice_qxl_max_monitors (accepted upstream). By mistake I posted the patch with the former name. This patch fixes the function name.

Signed-off-by: Frediano Ziglio fzig...@redhat.com
---
hw/display/qxl.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

I tested again doing a clean build, unfortunately I did some mistake and my tests worked.

diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index 4e5ff69..2288238 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -273,8 +273,7 @@ static void qxl_spice_monitors_config_async(PCIQXLDevice *qxl, int replay)
 } else {
 #if SPICE_SERVER_VERSION >= 0x000c06 /* release 0.12.6 */
 if (qxl->max_outputs) {
-spice_qxl_set_monitors_config_limit(&qxl->ssd.qxl,
-qxl->max_outputs);
+spice_qxl_set_max_monitors(&qxl->ssd.qxl, qxl->max_outputs);
 }
 #endif
 qxl->guest_monitors_config = qxl->ram->monitors_config;
--
2.1.0
Re: [Qemu-devel] [PATCH] qxl: Fix new function name for spice-server library
On Mon, Jul 20, 2015 at 09:43:23AM +0100, Frediano Ziglio wrote:

The new spice-server function to limit the number of monitors (0.12.6) changed during development from spice_qxl_set_monitors_config_limit to spice_qxl_max_monitors (accepted upstream). By mistake I posted the patch with the former name. This patch fixes the function name.

Signed-off-by: Frediano Ziglio fzig...@redhat.com
---
hw/display/qxl.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

I tested again doing a clean build, unfortunately I did some mistake and my tests worked.

diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index 4e5ff69..2288238 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -273,8 +273,7 @@ static void qxl_spice_monitors_config_async(PCIQXLDevice *qxl, int replay)
 } else {
 #if SPICE_SERVER_VERSION >= 0x000c06 /* release 0.12.6 */
 if (qxl->max_outputs) {
-spice_qxl_set_monitors_config_limit(&qxl->ssd.qxl,
-qxl->max_outputs);
+spice_qxl_set_max_monitors(&qxl->ssd.qxl, qxl->max_outputs);
 }
 #endif
 qxl->guest_monitors_config = qxl->ram->monitors_config;

ACK from me,

Christophe
[Qemu-devel] [RFC] Virt machine memory map
Hello! In our project we work on a very fast paravirtualized network I/O drivers, based on ivshmem. We successfully got ivshmem working on ARM, however with one hack. Currently we have: --- cut --- [VIRT_PCIE_MMIO] = { 0x1000, 0x2eff }, [VIRT_PCIE_PIO] = { 0x3eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x3f00, 0x0100 }, [VIRT_MEM] ={ 0x4000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- And MMIO region is not enough for us because we want to have 1GB mapping for PCI device. In order to make it working, we modify the map as follows: --- cut --- [VIRT_PCIE_MMIO] ={ 0x1000, 0x7eff }, [VIRT_PCIE_PIO] = { 0x8eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x8f00, 0x0100 }, [VIRT_MEM] = { 0x9000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- The question is - how could we upstream this? I believe modifying 32-bit virt memory map this way is not good. Will it be OK to have different memory map for 64-bit virt ? Another possible approach is not to use PCI, but MMIO instead, and just specify our region in the device tree. This way we work around the limitation of having only a single PCI MMIO region, and we could happily place our 1GB device after system RAM. Any opinions ? Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia
Re: [Qemu-devel] [PATCH v7 18/42] Add wrappers and handlers for sending/receiving the postcopy-ram migration messages.
On (Tue) 16 Jun 2015 [11:26:31], Dr. David Alan Gilbert (git) wrote:

From: Dr. David Alan Gilbert dgilb...@redhat.com

The state of the postcopy process is managed via a series of messages;
* Add wrappers and handlers for sending/receiving these messages
* Add state variable that track the current state of postcopy

Signed-off-by: Dr. David Alan Gilbert dgilb...@redhat.com

Reviewed-by: Amit Shah amit.s...@redhat.com

But:

+void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
+ uint16_t len,
+ uint64_t *start_list,
+ uint64_t *end_list)
+{
+uint8_t *buf;
+uint16_t tmplen;
+uint16_t t;
+size_t name_len = strlen(name);
+
+trace_qemu_savevm_send_postcopy_ram_discard(name, len);
+buf = g_malloc0(len*16 + name_len + 3);
+buf[0] = 0; /* Version */
+assert(name_len < 256);
+buf[1] = name_len;
+memcpy(buf+2, name, name_len);
+tmplen = 2+name_len;
+buf[tmplen++] = '\0';

whitespace around operators missing

+static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
+ uint16_t len)

+len -= 3+strlen(ramid);

ditto

Amit
Re: [Qemu-devel] [PATCH v2] AioContext: fix broken placement of event_notifier_test_and_clear
On Mon, 07/20 15:46, Fam Zheng wrote:
On Mon, 07/20 07:27, Paolo Bonzini wrote:

diff --git a/aio-win32.c b/aio-win32.c
index ea655b0..7afc999 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -337,10 +337,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
 aio_context_acquire(ctx);
 }
-if (first && aio_bh_poll(ctx)) {
-progress = true;
+if (first) {
+event_notifier_test_and_clear(&ctx->notifier);

I'm looking at optimizing it but I don't fully understand the relationship between aio_prepare and WaitForMultipleObjects. Do they get the same set of events? What if a new event comes in between, for example, thread worker calls aio_notify()?

After some reading I think WaitForMultipleObjects is for event notifiers and aio_prepare is for select() on fd events. It's a bit trickier than aio-posix, in the first iteration there could be another event masking ctx->notifier so we don't know if we need to clear it. But since MSDN says:

... the return value minus WAIT_OBJECT_0 indicates the lpHandles array index of the object that satisfied the wait. If more than one object became signaled during the call, this is the array index of the signaled object with the smallest index value of all the signaled objects.

Maybe we can reverse events[] so that ctx->notifier will be the 0th one. And I think we can always remove it after first iteration, am I right?

Fam
Re: [Qemu-devel] [PULL for-2.4 0/7] update ipxe roms, fix efi support
On 17 July 2015 at 15:37, Gerd Hoffmann kra...@redhat.com wrote: Hi, This pull finally fixes the efi boot support. ipxe is updated to the latest master, two non-upstream commits needed to make efi work are added on top, and the build process is tweaked a bit. The ipxe changes are pushed to git://git.kraxel.org/ipxe (branch qemu, tag qemu-2.4). They should be mirrored to git://git.qemu.org/ipxe.git before merging this pull request. Is this supposed to happen automatically, or does somebody need to manually do something for that to happen? thanks -- PMM
Re: [Qemu-devel] Accessing guest kernel thread_info struct
On 20 July 2015 at 11:43, Igor R boost.li...@gmail.com wrote:

I need to access thread_info (linux kernel struct) of the guest from within qemu, when the guest is in kernel mode. To do this, I read the stack pointer and mask it with ~(stack_size - 1). This works with x86 and ARM, but doesn't seem to work with MIPS - the pointer points to something that doesn't look like thread_info. I get sp as follows: env->active_tc.gpr[29]

MIPS keeps the thread info pointer in a dedicated register. To get this right for each architecture you need to look at how the kernel implements current_thread_info(). For instance on ARM:

http://lxr.free-electrons.com/source/arch/arm/include/asm/thread_info.h#L95

    return (struct thread_info *) (current_stack_pointer & ~(THREAD_SIZE - 1));

but on MIPS:

http://lxr.free-electrons.com/source/arch/mips/include/asm/thread_info.h#L55

    return __current_thread_info;

where

    register struct thread_info *__current_thread_info __asm__("$28");

x86 doesn't use 'mask the stack pointer' either:

    static inline struct thread_info *current_thread_info(void)
    {
        return (struct thread_info *)(current_top_of_stack() - THREAD_SIZE);
    }

where current_top_of_stack() is different for x86_64 and i386 but in both cases is reading a value from a per-CPU kernel variable.

If you're trying to do something the kernel does, it's usually the case that the kernel has some kind of cross-platform abstraction, and you can just search the kernel sources to find out what the actual implementations for each architecture are.

thanks
-- PMM
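The two strategies described in this reply can be sketched on the host side as follows. This is only an illustration under stated assumptions: the enum, the function names and the example stack size are hypothetical, and the guest values (stack pointer, dedicated register) are assumed to have been read from the CPU state already.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical selector for how a guest kernel locates thread_info. */
enum thread_info_source {
    TI_MASK_STACK_POINTER,  /* ARM-style: sp & ~(THREAD_SIZE - 1) */
    TI_DEDICATED_REGISTER   /* MIPS-style: pointer kept in a register ($28) */
};

/*
 * Base of the kernel stack that contains sp. Only valid when the stack
 * size is a power of two and the stack is aligned to its own size.
 */
static uint64_t stack_base_from_sp(uint64_t sp, uint64_t stack_size)
{
    assert(stack_size != 0 && (stack_size & (stack_size - 1)) == 0);
    return sp & ~(stack_size - 1);
}

/*
 * Guest address of thread_info, given register values already read from
 * the guest CPU state. dedicated_reg is ignored for the masking variant,
 * sp and stack_size are ignored for the register variant.
 */
static uint64_t guest_thread_info_addr(enum thread_info_source how,
                                       uint64_t sp,
                                       uint64_t dedicated_reg,
                                       uint64_t stack_size)
{
    if (how == TI_DEDICATED_REGISTER) {
        return dedicated_reg;
    }
    return stack_base_from_sp(sp, stack_size);
}
```

With an example stack size of 8192 bytes, a stack pointer of 0xc0a81f40 maps to 0xc0a80000. For a MIPS guest the value to pass as dedicated_reg would presumably come from general-purpose register 28 rather than 29, following the `$28` binding quoted above; the actual THREAD_SIZE depends on the guest kernel configuration.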
Re: [Qemu-devel] [RFC] ATAPI-SCSI bridge GSoC project
On Sat, Jul 18, 2015 at 09:49:26PM +0300, Alexander Bezzubikov wrote: atapi: ATAPI-SCSI bridge device created private SCSI bus added to bridge ATAPI inquiry command can use a bridge Multiple items is a clue that this patch should be split up into patches with smaller logical changes. That will also allow you to write commit descriptions that give the rationale for the changes.
Re: [Qemu-devel] [PATCH 1/1] virtio-mmio: return the max queue num of virtio-mmio with initial value
On 16 July 2015 at 19:38, Wei Huang w...@redhat.com wrote:

Recently we found that virtio-console devices consume a lot of AArch64 guest memory, roughly 1GB with 8 devices. After debugging, it turns out that several factors contribute to this problem: i) guest PAGE_SIZE=64KB, ii) virtio-mmio based devices, and iii) virtio-console device. Here is the detailed analysis:

1. First, during initialization, the virtio-mmio driver in the guest reads the vq size from VIRTIO_MMIO_QUEUE_NUM_MAX (see virtio_mmio.c file).
2. QEMU returns VIRTQUEUE_MAX_SIZE (1024) to the guest VM, and virtio-mmio uses it as the default vq size.
3. The virtio-console driver allocates vring buffers based on this value (see add_inbuf() function of virtio_console.c file). Because PAGE_SIZE=64KB, ~64MB is allocated for each virtio-console vq.

This patch addresses the problem by returning the initialized vring size when the VM queries QEMU about VIRTIO_MMIO_QUEUE_NUM_MAX. This is similar to virtio-pci's approach. By doing this, the vq memory consumption is reduced substantially.

I don't know if this patch is sensible to apply anyway, but from this description this really sounds like a guest kernel bug. QEMU tells the kernel the maximum queue size it can cope with, and if the guest kernel cares about not using insane amounts of RAM on queues then it should not blindly use the maximum size but restrict it itself...

thanks
-- PMM
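The numbers in the quoted analysis can be checked with a little arithmetic. The sketch below assumes, as the description suggests, that the guest allocates one PAGE_SIZE buffer per queue entry; the function names are hypothetical, and the figure of two filled queues per device is only an inference that reconciles "~64MB per vq" with "roughly 1GB with 8 devices", not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of receive buffers for one virtqueue, assuming one page-sized
 * buffer per queue entry.
 */
static uint64_t vq_inbuf_bytes(uint64_t queue_size, uint64_t page_size)
{
    /* Guard against overflow in the multiplication. */
    assert(page_size == 0 || queue_size <= UINT64_MAX / page_size);
    return queue_size * page_size;
}

/*
 * Total across all devices. queues_per_device is an assumption of this
 * sketch, not a value taken from the thread.
 */
static uint64_t total_inbuf_bytes(uint64_t devices,
                                  uint64_t queues_per_device,
                                  uint64_t queue_size,
                                  uint64_t page_size)
{
    return devices * queues_per_device *
           vq_inbuf_bytes(queue_size, page_size);
}
```

With 1024 entries and 64KB pages this gives 64MB per queue, and 8 devices with two such queues each reach 1GB. The same queue size with 4KB pages would need only 4MB per queue, and a hypothetical queue size of 128 with 64KB pages would need 8MB, which shows why both the page size and the reported maximum queue size matter here.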
Re: [Qemu-devel] [RFC] Virt machine memory map
On 07/20/15 11:41, Peter Maydell wrote: On 20 July 2015 at 09:55, Pavel Fedin p.fe...@samsung.com wrote: Hello! In our project we work on a very fast paravirtualized network I/O drivers, based on ivshmem. We successfully got ivshmem working on ARM, however with one hack. Currently we have: --- cut --- [VIRT_PCIE_MMIO] = { 0x1000, 0x2eff }, [VIRT_PCIE_PIO] = { 0x3eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x3f00, 0x0100 }, [VIRT_MEM] ={ 0x4000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- And MMIO region is not enough for us because we want to have 1GB mapping for PCI device. In order to make it working, we modify the map as follows: --- cut --- [VIRT_PCIE_MMIO] ={ 0x1000, 0x7eff }, [VIRT_PCIE_PIO] = { 0x8eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x8f00, 0x0100 }, [VIRT_MEM] = { 0x9000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- The question is - how could we upstream this? I believe modifying 32-bit virt memory map this way is not good. Will it be OK to have different memory map for 64-bit virt ? I think the theory we discussed at the time of putting in the PCIe device was that if we wanted this we'd add support for the other PCIe memory window (which would then live at somewhere above 4GB). Alex, can you remember what the idea was? Yes, pretty much. It would give us an upper bound to the amount of RAM that we're able to support, but at least we would be able to support big MMIO regions like for ivshmem. I'm not really sure where to put it though. Depending on your kernel config Linux supports somewhere between 39 and 48 or so bits of phys address space. And I'd rather not crawl into the PCI hole rat hole that we have on x86 ;). We could of course also put it just above RAM - but then our device tree becomes really dynamic and heavily dependent on -m. But to be honest I think we weren't expecting anybody to need 1GB of PCI MMIO space unless it was a video card... Ivshmem was actually the most likely target that I could've thought of to require big MMIO regions ;). Alex
Re: [Qemu-devel] Accessing guest kernel thread_info struct
Thanks for the useful info! (Actually, my approach works as well - it was just endianness issue...)
Re: [Qemu-devel] [POC] colo-proxy in qemu
On 2015/7/20 18:32, Stefan Hajnoczi wrote: On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote: We are planning to implement colo-proxy in qemu to cache and compare packets. I thought there is a kernel module to do that? Yes, but we decided to re-implement it in userspace (here, in qemu). There are two main reasons for this change. One is that the in-kernel colo-proxy is narrowly used: it can only be used for COLO FT, and we have to modify iptables and nftables to support this capability. IMHO, it is unlikely to be accepted by the kernel community. The other reason is that the kernel proxy cannot be used in all situations, for example evs + vhost-user + dpdk: it can't work if the VM's network packets don't go through the host's network stack. (The new userspace colo proxy scheme also can't be used with vhost-net; we have to use virtio-net instead.) Why does the proxy need to be part of the QEMU process? -netdev socket or host network stack features allow you to process packets in a separate process. Without details on what the proxy does it's hard to discuss this. What happens in the non-TCP case? What happens in the TCP case? Does the proxy need to perform privileged operations, create sockets, open files, etc? The slirp code is not actively developed or used much in production. It might be a good idea to audit the code for bugs if you want to use it. Agreed. Besides, it seems that slirp does not support ipv6, so we would also have to add that. Thanks, zhanghailiang
[Qemu-devel] [PULL 2/6] Revert vhost-user: add multi queue support
This reverts commit 830d70db692e374b5f4407f96a1ceefdcc97. The interface isn't fully backwards-compatible, which is bad. Let's redo this properly after 2.4. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- qapi-schema.json | 6 +- hw/net/vhost_net.c| 3 +-- hw/virtio/vhost-user.c| 11 +-- net/vhost-user.c | 37 + docs/specs/vhost-user.txt | 5 - qemu-options.hx | 5 ++--- 6 files changed, 18 insertions(+), 49 deletions(-) diff --git a/qapi-schema.json b/qapi-schema.json index 1285b8c..a0a45f7 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -2466,16 +2466,12 @@ # # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false). # -# @queues: #optional number of queues to be created for multiqueue vhost-user -# (default: 1) (Since 2.4) -# # Since 2.1 ## { 'struct': 'NetdevVhostUserOptions', 'data': { 'chardev':'str', -'*vhostforce':'bool', -'*queues':'uint32' } } +'*vhostforce':'bool' } } ## # @NetClientOptions diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index 9bd360b..5c1d11f 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -160,7 +160,6 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options) net-dev.nvqs = 2; net-dev.vqs = net-vqs; -net-dev.vq_index = net-nc-queue_index; r = vhost_dev_init(net-dev, options-opaque, options-backend_type); @@ -287,7 +286,7 @@ static void vhost_net_stop_one(struct vhost_net *net, for (file.index = 0; file.index net-dev.nvqs; ++file.index) { const VhostOps *vhost_ops = net-dev.vhost_ops; int r = vhost_ops-vhost_call(net-dev, VHOST_RESET_OWNER, - file); + NULL); assert(r = 0); } } diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index d6f2163..e7ab829 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -210,12 +210,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, break; case VHOST_SET_OWNER: -break; - case VHOST_RESET_OWNER: -memcpy(msg.state, arg, sizeof(struct vhost_vring_state)); -msg.state.index += dev-vq_index; -msg.size = 
sizeof(m.state); break; case VHOST_SET_MEM_TABLE: @@ -258,20 +253,17 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, case VHOST_SET_VRING_NUM: case VHOST_SET_VRING_BASE: memcpy(msg.state, arg, sizeof(struct vhost_vring_state)); -msg.state.index += dev-vq_index; msg.size = sizeof(m.state); break; case VHOST_GET_VRING_BASE: memcpy(msg.state, arg, sizeof(struct vhost_vring_state)); -msg.state.index += dev-vq_index; msg.size = sizeof(m.state); need_reply = 1; break; case VHOST_SET_VRING_ADDR: memcpy(msg.addr, arg, sizeof(struct vhost_vring_addr)); -msg.addr.index += dev-vq_index; msg.size = sizeof(m.addr); break; @@ -279,7 +271,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, case VHOST_SET_VRING_CALL: case VHOST_SET_VRING_ERR: file = arg; -msg.u64 = (file-index + dev-vq_index) VHOST_USER_VRING_IDX_MASK; +msg.u64 = file-index VHOST_USER_VRING_IDX_MASK; msg.size = sizeof(m.u64); if (ioeventfd_enabled() file-fd 0) { fds[fd_num++] = file-fd; @@ -321,7 +313,6 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, error_report(Received bad msg size.); return -1; } -msg.state.index -= dev-vq_index; memcpy(arg, msg.state, sizeof(struct vhost_vring_state)); break; default: diff --git a/net/vhost-user.c b/net/vhost-user.c index b51bc04..93dcecd 100644 --- a/net/vhost-user.c +++ b/net/vhost-user.c @@ -120,39 +120,35 @@ static void net_vhost_user_event(void *opaque, int event) case CHR_EVENT_OPENED: vhost_user_start(s); net_vhost_link_down(s, false); -error_report(chardev \%s\ went up, s-nc.info_str); +error_report(chardev \%s\ went up, s-chr-label); break; case CHR_EVENT_CLOSED: net_vhost_link_down(s, true); vhost_user_stop(s); -error_report(chardev \%s\ went down, s-nc.info_str); +error_report(chardev \%s\ went down, s-chr-label); break; } } static int net_vhost_user_init(NetClientState *peer, const char *device, - const char *name, CharDriverState *chr, - uint32_t queues) + const char 
*name, CharDriverState *chr) { NetClientState *nc;
[Qemu-devel] [PULL 3/6] virtio-net: unbreak any layout
From: Jason Wang jasow...@redhat.com Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611 (virtio-net: byteswap virtio-net header) breaks any layout by requiring out_sg[0].iov_len = n-guest_hdr_len. Fixing this by copying header to temporary buffer if swap is needed, and then use this buffer as part of out_sg. Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611 (virtio-net: byteswap virtio-net header) Cc: qemu-sta...@nongnu.org Cc: c...@fr.ibm.com Signed-off-by: Jason Wang jasow...@redhat.com Reviewed-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- include/hw/virtio/virtio-access.h | 9 + hw/net/virtio-net.c | 23 ++- 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/include/hw/virtio/virtio-access.h b/include/hw/virtio/virtio-access.h index cee5dd7..1ec1dfd 100644 --- a/include/hw/virtio/virtio-access.h +++ b/include/hw/virtio/virtio-access.h @@ -143,6 +143,15 @@ static inline uint64_t virtio_ldq_p(VirtIODevice *vdev, const void *ptr) } } +static inline bool virtio_needs_swap(VirtIODevice *vdev) +{ +#ifdef HOST_WORDS_BIGENDIAN +return virtio_access_is_big_endian(vdev) ? false : true; +#else +return virtio_access_is_big_endian(vdev) ? 
true : false; +#endif +} + static inline uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s) { #ifdef HOST_WORDS_BIGENDIAN diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index e3c2db3..9f7e91d 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -1142,7 +1142,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) ssize_t ret, len; unsigned int out_num = elem.out_num; struct iovec *out_sg = elem.out_sg[0]; -struct iovec sg[VIRTQUEUE_MAX_SIZE]; +struct iovec sg[VIRTQUEUE_MAX_SIZE], sg2[VIRTQUEUE_MAX_SIZE + 1]; +struct virtio_net_hdr_mrg_rxbuf mhdr; if (out_num 1) { error_report(virtio-net header not in first element); @@ -1150,13 +1151,25 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) } if (n-has_vnet_hdr) { -if (out_sg[0].iov_len n-guest_hdr_len) { +if (iov_to_buf(out_sg, out_num, 0, mhdr, n-guest_hdr_len) +n-guest_hdr_len) { error_report(virtio-net header incorrect); exit(1); } -virtio_net_hdr_swap(vdev, (void *) out_sg[0].iov_base); +if (virtio_needs_swap(vdev)) { +virtio_net_hdr_swap(vdev, (void *) mhdr); +sg2[0].iov_base = mhdr; +sg2[0].iov_len = n-guest_hdr_len; +out_num = iov_copy(sg2[1], ARRAY_SIZE(sg2) - 1, + out_sg, out_num, + n-guest_hdr_len, -1); +if (out_num == VIRTQUEUE_MAX_SIZE) { +goto drop; + } +out_num += 1; +out_sg = sg2; + } } - /* * If host wants to see the guest header as is, we can * pass it on unchanged. Otherwise, copy just the parts @@ -1186,7 +1199,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) } len += ret; - +drop: virtqueue_push(q-tx_vq, elem, 0); virtio_notify(vdev, q-tx_vq); -- MST
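The virtio_needs_swap() helper added by this patch returns true when host and device endianness differ. A minimal standalone sketch of that rule follows; it passes the host endianness as a parameter instead of using HOST_WORDS_BIGENDIAN, and all names here are hypothetical rather than QEMU API.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A byte swap is needed exactly when host and device disagree on
 * endianness. This collapses the two #ifdef branches of the helper in
 * the patch into a single comparison.
 */
static bool needs_swap(bool host_big_endian, bool device_big_endian)
{
    return host_big_endian != device_big_endian;
}

/* Swap a 16-bit header field only when required. */
static uint16_t swap16_if_needed(uint16_t value, bool swap)
{
    if (!swap) {
        return value;
    }
    return (uint16_t)((value << 8) | (value >> 8));
}
```

In the patch the header is first copied into a temporary buffer and swapped there, so the guest-owned scatter-gather element is left untouched. The sketch covers only the decision and the swap of a single field, not the iovec handling.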
Re: [Qemu-devel] [PATCH for-2.4] disas/arm-a64: Add missing compiler attribute GCC_FMT_ATTR
On 18 July 2015 at 10:13, Peter Maydell peter.mayd...@linaro.org wrote: On 18 July 2015 at 09:27, Stefan Weil s...@weilnetz.de wrote: Type fprintf_function which fits here was defined with this attribute. Signed-off-by: Stefan Weil s...@weilnetz.de --- This is an optional trivial patch for 2.4 which fixes compiler warnings in my build environment (with -Wextra). Reviewed-by: Peter Maydell peter.mayd...@linaro.org No objection if you want to put it in 2.4, though I'm a bit surprised that missing GCC_FMT_ATTR provokes warnings. The cleanup is nice anyway. I've applied this to target-arm.next. Thanks. -- PMM
Re: [Qemu-devel] [PATCH v2] raw-posix.c: Make physical devices usable in QEMU under Mac OS X host
On Fri, Jul 17, 2015 at 03:24:34PM -0400, Programmingkid wrote: On Jul 17, 2015, at 9:41 AM, Stefan Hajnoczi wrote: On Thu, Jul 16, 2015 at 04:46:07PM -0400, Programmingkid wrote: @@ -2014,7 +2015,9 @@ kern_return_t GetBSDPath( io_iterator_t mediaIterator, char *bsdPath, CFIndex ma if ( bsdPathAsCFString ) { size_t devPathLength; strcpy( bsdPath, _PATH_DEV ); -strcat( bsdPath, "r" ); +if (flags & BDRV_O_NOCACHE) { +strcat(bsdPath, "r"); +} devPathLength = strlen( bsdPath ); if ( CFStringGetCString( bsdPathAsCFString, bsdPath + devPathLength, maxPathSize - devPathLength, kCFStringEncodingASCII ) ) { kernResult = KERN_SUCCESS; Is this the fix that makes CD-ROM passthrough work for you? Does the guest boot successfully when you do: -drive if=ide,media=cdrom,cache=none,file=/dev/cdrom The guest fails during the boot process with the above command line. That means the issue you originally hit hasn't been solved yet. Take a look at s->needs_alignment and raw_probe_alignment(). In the -drive cache=none case raw-posix needs to detect the correct alignment (probably 2 KB for CD-ROMs). Stefan
[Qemu-devel] Accessing guest kernel thread_info struct
Hello, I need to access the guest's thread_info (a Linux kernel struct) from within QEMU while the guest is in kernel mode. To do this, I read the stack pointer and mask it with ~(stack_size - 1). This works on x86 and ARM, but doesn't seem to work on MIPS: the resulting pointer points to something that doesn't look like thread_info. I get sp as follows: env->active_tc.gpr[29]. Is that correct? What could be the reason for the failure? Thanks.
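For readers following along, the masking step described in this message can be sketched in plain C. This is only an illustration under stated assumptions: `thread_info_base` is a made-up helper name, it is not QEMU code, and it is only meaningful for guest kernels that keep thread_info at the base of a power-of-two sized kernel stack. The stack size itself depends on the guest kernel's architecture and configuration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): round a kernel stack pointer
 * down to the base of its stack. stack_size must be a power of two.
 */
static uint64_t thread_info_base(uint64_t sp, uint64_t stack_size)
{
    /* Clearing the low bits of sp yields the lowest address of the stack. */
    return sp & ~(stack_size - 1);
}
```

With an assumed 8 KiB stack, thread_info_base(0x80123abc, 0x2000) gives 0x80122000.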
Re: [Qemu-devel] [PATCH v2] pci_add_capability: remove duplicate comments
-Original Message- From: qemu-devel-bounces+chenhanxiao=cn.fujitsu@nongnu.org [mailto:qemu-devel-bounces+chenhanxiao=cn.fujitsu@nongnu.org] On Behalf Of Chen Hanxiao Sent: Tuesday, July 14, 2015 4:16 PM To: Michael S. Tsirkin; qemu-devel@nongnu.org Subject: [Qemu-devel] [PATCH v2] pci_add_capability: remove duplicate comments Signed-off-by: Chen Hanxiao chenhanx...@cn.fujitsu.com --- hw/pci/pci.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 442f822..a017614 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -2101,12 +2101,10 @@ static void pci_del_option_rom(PCIDevice *pdev) } /* - * if !offset - * Reserve space and add capability to the linked list in pci config space - * * if offset = 0, * Find and reserve space and add capability to the linked list - * in pci config space */ + * in pci config space + */ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id, uint8_t offset, uint8_t size) { -- 2.1.0 ping Regards, - Chen
[Qemu-devel] [PATCH] exec.c: Use atomic_rcu_read() to access dispatch in memory_region_section_get_iotlb()
When accessing the dispatch pointer in an AddressSpace within an RCU
critical section we should always use atomic_rcu_read(). Fix an
access within memory_region_section_get_iotlb() which was incorrectly
doing a direct pointer access.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
I discussed this on IRC with Paolo a while back, and IIRC he said
that although this is a bug it's not one that can currently have
any ill effects, though I forget why (probably because this code
path is TCG only and TCG is single-threaded right now).

 exec.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 7d60e15..0a4a0c5 100644
--- a/exec.c
+++ b/exec.c
@@ -954,7 +954,10 @@ hwaddr memory_region_section_get_iotlb(CPUState *cpu,
             iotlb |= PHYS_SECTION_ROM;
         }
     } else {
-        iotlb = section - section->address_space->dispatch->map.sections;
+        AddressSpaceDispatch *d;
+
+        d = atomic_rcu_read(&section->address_space->dispatch);
+        iotlb = section - d->map.sections;
         iotlb += xlat;
     }
 
-- 
1.9.1
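The pattern this patch enforces (load the shared pointer once through an atomic read, then use only the local copy) can be sketched with standard C11 atomics. This is not QEMU's atomic_rcu_read(); an acquire load is used here as a stand-in, and `Dispatch`, `Space`, `publish_dispatch` and `read_dispatch` are hypothetical names invented for the sketch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real QEMU structures. */
typedef struct Dispatch {
    int nsections;
} Dispatch;

typedef struct Space {
    _Atomic(Dispatch *) dispatch;
} Space;

/* Writer side: publish a fully initialised Dispatch. */
static void publish_dispatch(Space *as, Dispatch *d)
{
    atomic_store_explicit(&as->dispatch, d, memory_order_release);
}

/*
 * Reader side: load the pointer exactly once and keep using the local
 * copy, instead of dereferencing as->dispatch repeatedly.
 */
static Dispatch *read_dispatch(Space *as)
{
    return atomic_load_explicit(&as->dispatch, memory_order_acquire);
}
```

A reader would call read_dispatch() once inside its critical section and work only with the returned pointer.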
[Qemu-devel] [PULL 1/6] ich9: fix skipped vmstate_memhp_state subsection
From: Paulo Alcantara pca...@gmail.com By declaring another .subsections array for vmstate_tco_io_state made vmstate_memhp_state not registered anymore. There must be only one .subsections array for all subsections. Cc: Michael S. Tsirkin m...@redhat.com Cc: Amit Shah amit.s...@redhat.com Reported-by: Amit Shah amit.s...@redhat.com Signed-off-by: Paulo Alcantara pca...@zytor.com Reviewed-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Amit Shah amit.s...@redhat.com --- hw/acpi/ich9.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c index 5fb7a87..f04f6dc 100644 --- a/hw/acpi/ich9.c +++ b/hw/acpi/ich9.c @@ -206,9 +206,6 @@ const VMStateDescription vmstate_ich9_pm = { }, .subsections = (const VMStateDescription*[]) { vmstate_memhp_state, -NULL -}, -.subsections = (const VMStateDescription*[]) { vmstate_tco_io_state, NULL } -- MST
[Qemu-devel] [PULL 0/6] virtio, vhost, pc fixes for 2.4
The following changes since commit b4329bf41c86bac8b56cadb097081960cc4839a0:

  Update version for v2.4.0-rc1 release (2015-07-16 20:32:20 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to f9d6dbf0bf6e91b8ed896369ab1b7e91e5a1a4df:

  virtio-net: remove virtio queues if the guest doesn't support multiqueue (2015-07-20 14:19:42 +0300)

virtio, vhost, pc fixes for 2.4

The only notable thing here is vhost-user multiqueue revert.
We'll work on making it stable in 2.5, reverting now
means we won't have to maintain bug for bug compatibility forever.

Signed-off-by: Michael S. Tsirkin m...@redhat.com

Chen Hanxiao (1):
      pci_add_capability: remove duplicate comments

Fam Zheng (1):
      virtio-net: Flush incoming queues when DRIVER_OK is being set

Jason Wang (1):
      virtio-net: unbreak any layout

Michael S. Tsirkin (1):
      Revert "vhost-user: add multi queue support"

Paulo Alcantara (1):
      ich9: fix skipped vmstate_memhp_state subsection

Wen Congyang (1):
      virtio-net: remove virtio queues if the guest doesn't support multiqueue

 qapi-schema.json                  |   6 +-
 include/hw/virtio/virtio-access.h |   9 +++
 hw/acpi/ich9.c                    |   3 -
 hw/net/vhost_net.c                |   3 +-
 hw/net/virtio-net.c               | 143 +-
 hw/pci/pci.c                      |   6 +-
 hw/virtio/vhost-user.c            |  11 +--
 net/vhost-user.c                  |  37 --
 docs/specs/vhost-user.txt         |   5 --
 qemu-options.hx                   |   5 +-
 10 files changed, 138 insertions(+), 90 deletions(-)
[Qemu-devel] [PATCH v2] PAM: make PAM emulation closer to documentation
This patch improves PAM emulation.

PAM defines 4 memory access redirection modes. In mode 1, reads are directed to RAM and writes are directed to PCI. In mode 2 it is the other way around: reads are directed to PCI and writes are directed to RAM. In mode 0 all accesses are directed to PCI. In mode 3 they are all directed to RAM.

Currently all modes are emulated using aliases. That works well for modes 0 and 3, but modes 1 and 2 require more complicated logic. The present memory API has no region type that fits this need, so the patch uses ROM-like regions for modes 1 and 2. Each region has I/O callbacks that redirect accesses to the destination defined by the current mode. Write accesses are always redirected by a callback. If the actual read source is RAM or ROM (the common case), the ram_addr of the PAM region is set to the ram_addr of the source region plus an offset. Otherwise, when the source region is an I/O region, reads are redirected by the PAM region's read callback to the source region's read callback.

The reasons for modifying ram_addr for read redirection are:
- QEMU cannot execute code outside RAM or ROM (while the BIOS tries to do exactly that);
- it is faster because the TLB is used.

Redirection is based on address spaces, one for PCI and one for RAM. QEMU does not provide them, so PAM creates private address spaces whose root regions alias the actual PCI and RAM regions. The memory commit callbacks are used to keep the read source and write destination address spaces and ram_addr up to date.

Signed-off-by: Efimov Vasily r...@ispras.ru
---
v2 change:
- use address spaces for access redirection

The QEMU distribution includes SeaBIOS, which has hacks to work around the incorrect emulation of modes 1 and 2. This patch series is tested using a modified SeaBIOS that is forced to use mode 2 for copying its data. The BIOS reads a value from memory and immediately writes it back to the same address. According to the PAM definition, reads are directed to PCI (i.e. to the BIOS ROM) and writes are directed to RAM. The patch for SeaBIOS is listed below. Both SeaBIOS versions work with the new PAM, but the modified one does not work with the old PAM.
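The four redirection modes described in this message can be summarised in a few lines of C. This is a minimal sketch, not code from the patch: `PamDest`, `pam_read_dest` and `pam_write_dest` are hypothetical names, and the only facts used are the mode descriptions given in the commit message.

```c
/* Hypothetical names; only the mode table comes from the commit message. */
typedef enum {
    PAM_DEST_PCI,
    PAM_DEST_RAM
} PamDest;

/* Reads go to RAM in modes 1 and 3, and to PCI in modes 0 and 2. */
static PamDest pam_read_dest(unsigned mode)
{
    return (mode & 1) ? PAM_DEST_RAM : PAM_DEST_PCI;
}

/* Writes go to RAM in modes 2 and 3, and to PCI in modes 0 and 1. */
static PamDest pam_write_dest(unsigned mode)
{
    return (mode & 2) ? PAM_DEST_RAM : PAM_DEST_PCI;
}
```

Modes 1 and 2 are the ones where the two functions disagree, which is why a plain alias to a single target region cannot represent them.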
== Whitespaces are added to prevent attempt to apply SeaBIOS patch to QEMU. diff --git a/src/fw/shadow.c b/src/fw/shadow.c index 4f6..7249aa2 100644 --- a/src/fw/shadow.c +++ b/src/fw/shadow.c @@ -26,32 +26,62 @@ static void __make_bios_writable_intel(u16 bdf, u32 pam0) { // Make ram from 0xc-0xf writable -int clear = 0; +dprintf(1, PAM mode 1 test begin\n); +unsigned *m = (unsigned *) BUILD_ROM_START; + +pci_config_writeb(bdf, pam0 + 1, 0x33); +*m = 0xdeafbeef; + +pci_config_writeb(bdf, pam0 + 1, 0x11); +volatile unsigned t = *m; +*m = t; + +pci_config_writeb(bdf, pam0 + 1, 0x33); +t = *m; + +pci_config_writeb(bdf, pam0 + 1, 0x00); +unsigned t2 = *m; + +dprintf(1, t = 0x%x, t2 = 0x%x\n, t, t2); + +dprintf(1, PAM mode 1 test end\n); int i; +unsigned *mem, *mem_limit; for (i=0; i6; i++) { u32 pam = pam0 + 1 + i; int reg = pci_config_readb(bdf, pam); -if (CONFIG_OPTIONROMS_DEPLOYED (reg 0x11) != 0x11) { -// Need to copy optionroms to work around qemu implementation -void *mem = (void*)(BUILD_ROM_START + i * 32*1024); -memcpy((void*)BUILD_BIOS_TMP_ADDR, mem, 32*1024); +if ((reg 0x11) != 0x11) { +mem = (unsigned *)(BUILD_ROM_START + i * 32 * 1024); +pci_config_writeb(bdf, pam, 0x22); +mem_limit = mem + 32 * 1024 / sizeof(unsigned); + +while (mem mem_limit) { +volatile unsigned tmp = *mem; +*mem = tmp; +mem++; +} pci_config_writeb(bdf, pam, 0x33); -memcpy(mem, (void*)BUILD_BIOS_TMP_ADDR, 32*1024); -clear = 1; } else { pci_config_writeb(bdf, pam, 0x33); } } -if (clear) -memset((void*)BUILD_BIOS_TMP_ADDR, 0, 32*1024); // Make ram from 0xf-0x10 writable int reg = pci_config_readb(bdf, pam0); -pci_config_writeb(bdf, pam0, 0x30); if (reg 0x10) // Ram already present. return; +pci_config_writeb(bdf, pam0, 0x22); +mem = (unsigned *)BUILD_BIOS_ADDR; +mem_limit = mem + 32 * 1024 / sizeof(unsigned); +while (mem mem_limit) { +volatile unsigned tmp = *mem; +*mem = tmp; +mem++; +} +pci_config_writeb(bdf, pam0, 0x33); + // Copy bios. 
extern u8 code32flat_start[], code32flat_end[]; memcpy(code32flat_start, code32flat_start + BIOS_SRC_OFFSET @@ -61,17 +91,6 @@ __make_bios_writable_intel(u16 bdf, u32 pam0) static void make_bios_writable_intel(u16 bdf, u32 pam0) { -int reg = pci_config_readb(bdf, pam0); -if (!(reg 0x10)) { -// QEMU doesn't fully implement the piix shadow capabilities - -// if ram isn't backing the bios segment when shadowing is -// disabled, the code itself wont be in memory. So, run the -// code from the high-memory flash location. -u32 pos =
[Qemu-devel] [PATCH] qcow2: Remove forward declaration of QCowAIOCB
This struct doesn't exist any more since commit 3fc48d09 in August 2011, it's about time to remove its forward declaration. Signed-off-by: Kevin Wolf kw...@redhat.com --- block/qcow2.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/block/qcow2.h b/block/qcow2.h index 72e1328..46e1e80 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -292,8 +292,6 @@ typedef struct BDRVQcowState { char *image_backing_format; } BDRVQcowState; -struct QCowAIOCB; - typedef struct Qcow2COWRegion { /** * Offset of the COW region in bytes from the start of the first cluster -- 1.8.3.1
Re: [Qemu-devel] [POC] colo-proxy in qemu
CC Wen Congyang

On 07/20/2015 06:32 PM, Stefan Hajnoczi wrote: On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote: We are planning to implement colo-proxy in qemu to cache and compare packets.

I thought there is a kernel module to do that? Why does the proxy need to be part of the QEMU process? -netdev socket or host network stack features allow you to process packets in a separate process.

Yes, it used to be a kernel module. We plan to re-implement colo-proxy in QEMU (user space) for the following reasons:
1. The in-kernel colo-proxy was based on netfilter and was implemented by adding a new nf_ct_ext_id. This touches existing kernel code, so the kernel must be rebuilt before the colo-proxy modules can be installed. For this reason fewer people are willing to test colo-proxy, and it becomes harder to get it accepted into the kernel.
2. COLO is the only use case for the in-kernel colo-proxy.
3. The in-kernel colo-proxy only works in the case where packets are delivered to the kernel TCP/IP stack.

The COLO project mainly consists of 3 components: COLO-Frame, COLO-Block and COLO-Proxy. The first two components are being posted to QEMU. If we integrate the proxy into QEMU, it will become more convenient to manage the whole COLO project. Furthermore, COLO will become easier to configure without depending on the kernel.

Without details on what the proxy does it's hard to discuss this. What happens in the non-TCP case? What happens in the TCP case?

More details will be posted soon.

Does the proxy need to perform privileged operations, create sockets, open files, etc?

IMO, we just need to create a new socket, like the migration socket, to forward packets between PVM and SVM.

Best regards. Li Zhijian

The slirp code is not actively developed or used much in production. It might be a good idea to audit the code for bugs if you want to use it. Stefan
[Qemu-devel] [PULL 4/6] pci_add_capability: remove duplicate comments
From: Chen Hanxiao chenhanx...@cn.fujitsu.com Signed-off-by: Chen Hanxiao chenhanx...@cn.fujitsu.com Reviewed-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/pci/pci.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 442f822..a017614 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -2101,12 +2101,10 @@ static void pci_del_option_rom(PCIDevice *pdev) } /* - * if !offset - * Reserve space and add capability to the linked list in pci config space - * * if offset = 0, * Find and reserve space and add capability to the linked list - * in pci config space */ + * in pci config space + */ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id, uint8_t offset, uint8_t size) { -- MST
Re: [Qemu-devel] [RFC PATCH 1/2] spapr: add dumpdtb support
On Mon, Jul 20, 2015 at 02:02:33PM +1000, David Gibson wrote: On Fri, Jul 17, 2015 at 01:56:39PM +0200, Andrew Jones wrote: Signed-off-by: Andrew Jones drjo...@redhat.com Looks good to me, but I'd like an actual commit message: what's dumpdtb, how and why would you use it. Ok, just sent it separately with a commit message and the RFC dropped. Thanks, drew --- hw/ppc/spapr.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a6f19473cf278..c1cbf3387ae0c 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -30,6 +30,7 @@ #include hw/fw-path-provider.h #include elf.h #include net/net.h +#include sysemu/device_tree.h #include sysemu/block-backend.h #include sysemu/cpus.h #include sysemu/kvm.h @@ -822,6 +823,7 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr, exit(1); } +qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt)); cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt)); g_free(bootlist); -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au| minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] Creating a VM from an E01 file
On 20 July 2015 at 14:03, Cervellone, Adam acervell...@ncdoj.gov wrote:

My name is Adam Cervellone. I am a digital evidence intern at the North Carolina State Crime Laboratory. As part of my time here, I am conducting a research project using the SIFT workstation to make a virtual machine of an E01 file. I’ve previously used this series of commands to attempt to create a VM

1. Sudo su
   o SIFT password entered
2. Mkdir /mnt/ewf1
3. Mount_ewf.py E01 image file path /mnt/ewf1
4. qemu-img convert /mnt/ewf1/E01 image file name -O vmdk give_a_name.vmdk

I found these steps on Forensics wiki, however they have not worked. I may have misinterpreted a section of the command and entered it incorrectly. Do you know of the correct way to do this or of someone who may be able to help me?

You've said what you were trying to do, but not what actually happened. "It didn't work" provides no information at all. Did it give an error message (if so, what?), did it do nothing, did it just hang? Giving an exact transcript of the commands you typed and the output might be useful. Identifying problems with computing has some similarities with detecting crime -- we need evidence to be able to draw good conclusions :-)

thanks
-- PMM
Re: [Qemu-devel] [RFC] Virt machine memory map
On Mon, 20 Jul 2015 13:23:45 +0200 Alexander Graf ag...@suse.de wrote: On 07/20/15 11:41, Peter Maydell wrote: On 20 July 2015 at 09:55, Pavel Fedin p.fe...@samsung.com wrote: Hello! In our project we work on a very fast paravirtualized network I/O drivers, based on ivshmem. We successfully got ivshmem working on ARM, however with one hack. Currently we have: --- cut --- [VIRT_PCIE_MMIO] = { 0x1000, 0x2eff }, [VIRT_PCIE_PIO] = { 0x3eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x3f00, 0x0100 }, [VIRT_MEM] ={ 0x4000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- And MMIO region is not enough for us because we want to have 1GB mapping for PCI device. In order to make it working, we modify the map as follows: --- cut --- [VIRT_PCIE_MMIO] ={ 0x1000, 0x7eff }, [VIRT_PCIE_PIO] = { 0x8eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x8f00, 0x0100 }, [VIRT_MEM] = { 0x9000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- The question is - how could we upstream this? I believe modifying 32-bit virt memory map this way is not good. Will it be OK to have different memory map for 64-bit virt ? I think the theory we discussed at the time of putting in the PCIe device was that if we wanted this we'd add support for the other PCIe memory window (which would then live at somewhere above 4GB). Alex, can you remember what the idea was? Yes, pretty much. It would give us an upper bound to the amount of RAM that we're able to support, but at least we would be able to support big MMIO regions like for ivshmem. I'm not really sure where to put it though. Depending on your kernel config Linux supports somewhere between 39 and 48 or so bits of phys address space. And I'd rather not crawl into the PCI hole rat hole that we have on x86 ;). We could of course also put it just above RAM - but then our device tree becomes really dynamic and heavily dependent on -m. on x86 we've made everything that is not mapped to ram/mmio fall down to PCI address space, see pc_pci_as_mapping_init(). 
So we don't have explicitly mapped PCI regions there anymore, but we are still thinking in terms of PCI hole/PCI ranges when it comes to the ACPI PCI bus description, where one needs to specify the ranges available for the bus in its _CRS. But to be honest I think we weren't expecting anybody to need 1GB of PCI MMIO space unless it was a video card... Ivshmem was actually the most likely target that I could've thought of to require big MMIO regions ;). Alex
[Qemu-devel] [PULL 5/6] virtio-net: Flush incoming queues when DRIVER_OK is being set
From: Fam Zheng f...@redhat.com This patch fixes network hang after stop then cont, while network packets keep arriving. Tested both manually (tap, host pinging guest) and with Jason's qtest series (plus his [PATCH 2.4] socket: pass correct size in net_socket_send() fix). As virtio_net_set_status is called when guest driver is setting status byte and when vm state is changing, it is a good opportunity to flush queued packets. This is necessary because during vm stop the backend (e.g. tap) would stop rx processing after .can_receive returns false, until the queue is explicitly flushed or purged. The other interesting condition in .can_receive, virtio_queue_ready(), is handled by virtio_net_handle_rx() when guest kicks; the 3rd condition is invalid queue index which doesn't need flushing. Signed-off-by: Fam Zheng f...@redhat.com Reviewed-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/net/virtio-net.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 9f7e91d..e1d9cbf 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -162,6 +162,8 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status) virtio_net_vhost_status(n, status); for (i = 0; i n-max_queues; i++) { +NetClientState *ncs = qemu_get_subqueue(n-nic, i); +bool queue_started; q = n-vqs[i]; if ((!n-multiqueue i != 0) || i = n-curr_queues) { @@ -169,12 +171,18 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status) } else { queue_status = status; } +queue_started = +virtio_net_started(n, queue_status) !n-vhost_started; + +if (queue_started) { +qemu_flush_queued_packets(ncs); +} if (!q-tx_waiting) { continue; } -if (virtio_net_started(n, queue_status) !n-vhost_started) { +if (queue_started) { if (q-tx_timer) { timer_mod(q-tx_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + n-tx_timeout); -- MST
[Qemu-devel] [PULL 6/6] virtio-net: remove virtio queues if the guest doesn't support multiqueue
From: Wen Congyang we...@cn.fujitsu.com commit da51a335 adds all queues in .realize(). But if the guest doesn't support multiqueue, we forget to remove them. And we cannot handle the ctrl vq corretly. The guest will hang. Signed-off-by: Wen Congyang we...@cn.fujitsu.com Reviewed-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com Acked-by: Jason Wang jasow...@redhat.com --- hw/net/virtio-net.c | 110 +++- 1 file changed, 82 insertions(+), 28 deletions(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index e1d9cbf..304d3dd 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -1327,9 +1327,86 @@ static void virtio_net_tx_bh(void *opaque) } } +static void virtio_net_add_queue(VirtIONet *n, int index) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(n); + +n-vqs[index].rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx); +if (n-net_conf.tx !strcmp(n-net_conf.tx, timer)) { +n-vqs[index].tx_vq = +virtio_add_queue(vdev, 256, virtio_net_handle_tx_timer); +n-vqs[index].tx_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, + virtio_net_tx_timer, + n-vqs[index]); +} else { +n-vqs[index].tx_vq = +virtio_add_queue(vdev, 256, virtio_net_handle_tx_bh); +n-vqs[index].tx_bh = qemu_bh_new(virtio_net_tx_bh, n-vqs[index]); +} + +n-vqs[index].tx_waiting = 0; +n-vqs[index].n = n; +} + +static void virtio_net_del_queue(VirtIONet *n, int index) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(n); +VirtIONetQueue *q = n-vqs[index]; +NetClientState *nc = qemu_get_subqueue(n-nic, index); + +qemu_purge_queued_packets(nc); + +virtio_del_queue(vdev, index * 2); +if (q-tx_timer) { +timer_del(q-tx_timer); +timer_free(q-tx_timer); +} else { +qemu_bh_delete(q-tx_bh); +} +virtio_del_queue(vdev, index * 2 + 1); +} + +static void virtio_net_change_num_queues(VirtIONet *n, int new_max_queues) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(n); +int old_num_queues = virtio_get_num_queues(vdev); +int new_num_queues = new_max_queues * 2 + 1; +int i; + +assert(old_num_queues = 3); 
+assert(old_num_queues % 2 == 1); + +if (old_num_queues == new_num_queues) { +return; +} + +/* + * We always need to remove and add ctrl vq if + * old_num_queues != new_num_queues. Remove ctrl_vq first, + * and then we only enter one of the following too loops. + */ +virtio_del_queue(vdev, old_num_queues - 1); + +for (i = new_num_queues - 1; i old_num_queues - 1; i += 2) { +/* new_num_queues old_num_queues */ +virtio_net_del_queue(n, i / 2); +} + +for (i = old_num_queues - 1; i new_num_queues - 1; i += 2) { +/* new_num_queues old_num_queues */ +virtio_net_add_queue(n, i / 2); +} + +/* add ctrl_vq last */ +n-ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl); +} + static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue) { +int max = multiqueue ? n-max_queues : 1; + n-multiqueue = multiqueue; +virtio_net_change_num_queues(n, max); virtio_net_set_queues(n); } @@ -1604,21 +1681,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp) } for (i = 0; i n-max_queues; i++) { -n-vqs[i].rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx); -if (n-net_conf.tx !strcmp(n-net_conf.tx, timer)) { -n-vqs[i].tx_vq = -virtio_add_queue(vdev, 256, virtio_net_handle_tx_timer); -n-vqs[i].tx_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, - virtio_net_tx_timer, - n-vqs[i]); -} else { -n-vqs[i].tx_vq = -virtio_add_queue(vdev, 256, virtio_net_handle_tx_bh); -n-vqs[i].tx_bh = qemu_bh_new(virtio_net_tx_bh, n-vqs[i]); -} - -n-vqs[i].tx_waiting = 0; -n-vqs[i].n = n; +virtio_net_add_queue(n, i); } n-ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl); @@ -1672,7 +1735,7 @@ static void virtio_net_device_unrealize(DeviceState *dev, Error **errp) { VirtIODevice *vdev = VIRTIO_DEVICE(dev); VirtIONet *n = VIRTIO_NET(dev); -int i; +int i, max_queues; /* This will stop vhost backend if appropriate. 
*/ virtio_net_set_status(vdev, 0); @@ -1687,18 +1750,9 @@ static void virtio_net_device_unrealize(DeviceState *dev, Error **errp) g_free(n-mac_table.macs); g_free(n-vlans); -for (i = 0; i n-max_queues; i++) { -VirtIONetQueue *q = n-vqs[i]; -NetClientState *nc = qemu_get_subqueue(n-nic, i); - -qemu_purge_queued_packets(nc); - -if (q-tx_timer) { -timer_del(q-tx_timer); -
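The index arithmetic used by virtio_net_add_queue(), virtio_net_del_queue() and virtio_net_change_num_queues() in this patch can be restated as a small sketch. The helper names below are hypothetical and do not appear in the patch; the sketch assumes, as the patch does, that queue pairs are added in order with rx before tx and the control queue last.

```c
/* Hypothetical helpers illustrating the virtqueue numbering in the patch. */

/* Queue pair `pair` owns two virtqueues: rx first, then tx. */
static int rx_vq_index(int pair)
{
    return pair * 2;
}

static int tx_vq_index(int pair)
{
    return pair * 2 + 1;
}

/* Total virtqueue count: two per pair plus one control queue. */
static int total_vqs(int max_queues)
{
    return max_queues * 2 + 1;
}

/* The control queue is always the last virtqueue. */
static int ctrl_vq_index(int max_queues)
{
    return total_vqs(max_queues) - 1;
}
```

This is also why the patch asserts that the old queue count is odd and at least 3: a single queue pair plus the control queue is the minimum.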
[Qemu-devel] R: Re: [PATCH v2] AioContext: fix broken placement of event_notifier_test_and_clear
I'm looking at optimizing it but I don't fully understand the relationship between aio_prepare and WaitForMultipleObjects. Do they get the same set of events? After some reading I think WaitForMultipleObjects is for event notifiers and aio_prepare is for select() on fd events. It's a bit trickier than aio-posix, in the first iteration there could be another event masking ctx-notifier so we don't know if we need to clear it. Maybe we can reverse events[] so that ctx-notifier will be the 0th one. And I think we can always remove it after first iteration, am I right? Yes, that would work. I am not sure how complex it would be. You would also need a solution for the GSource and one (probably similar to aio-posix) for your epoll implementation. With ctx-notified at least you can encapsulate it in aio_notify_accept... Stefan, any preferences? Paolo
Re: [Qemu-devel] [PATCH v2] raw-posix.c: Make physical devices usable in QEMU under Mac OS X host
On 20/07/2015 12:48, Stefan Hajnoczi wrote: On Fri, Jul 17, 2015 at 03:24:34PM -0400, Programmingkid wrote: On Jul 17, 2015, at 9:41 AM, Stefan Hajnoczi wrote: On Thu, Jul 16, 2015 at 04:46:07PM -0400, Programmingkid wrote: @@ -2014,7 +2015,9 @@ kern_return_t GetBSDPath( io_iterator_t mediaIterator, char *bsdPath, CFIndex ma if ( bsdPathAsCFString ) { size_t devPathLength; strcpy( bsdPath, _PATH_DEV ); -strcat( bsdPath, "r" ); + if (flags & BDRV_O_NOCACHE) { + strcat(bsdPath, "r"); +} devPathLength = strlen( bsdPath ); if ( CFStringGetCString( bsdPathAsCFString, bsdPath + devPathLength, maxPathSize - devPathLength, kCFStringEncodingASCII ) ) { kernResult = KERN_SUCCESS; Is this the fix that makes CD-ROM passthrough work for you? Does the guest boot successfully when you do: -drive if=ide,media=cdrom,cache=none,file=/dev/cdrom The guest fails during the boot process with the above command line. That means the issue you originally hit hasn't been solved yet. Take a look at s->needs_alignment and raw_probe_alignment(). In the -drive cache=none case raw-posix needs to detect the correct alignment (probably 2 KB for CD-ROMs). As raw_open_common() sets needs_alignment to true on BDRV_O_NOCACHE (cache=none) and raw_probe_alignment() detects alignment if needs_alignment is true, I don't understand why it doesn't work. Could you explain? Laurent
Re: [Qemu-devel] [PATCH for-2.4] timer: rename NSEC_PER_SEC due to Mac OS X header clash
On 8 July 2015 at 15:10, Stefan Hajnoczi stefa...@redhat.com wrote: Commit e0cf11f31c24cfb17f44ed46c254d84c78e7f6e9 (timer: Use a single definition of NSEC_PER_SEC for the whole codebase) renamed NANOSECONDS_PER_SECOND to NSEC_PER_SEC. On Mac OS X there is a dispatch/time.h system header which also defines NSEC_PER_SEC. This causes compiler warnings. Let's use the old name instead. It's longer but it doesn't clash. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com Do you have a plan for putting this into 2.4 or should I just apply it to master directly? thanks -- PMM
Re: [Qemu-devel] [RFC PATCH 2/2] spapr: -kernel: allow linking with specified addr
On Mon, Jul 20, 2015 at 08:47:53AM +0200, Thomas Huth wrote: On 20/07/15 07:01, David Gibson wrote: On Fri, Jul 17, 2015 at 01:56:40PM +0200, Andrew Jones wrote: I've started playing with adding ppc support to kvm-unit-tests, using spapr for the machine model. I wanted to link the unit test at 0x40 to match qemu's load address, making the unit test startup code simpler, but ended up with 0x80 instead, due to how translate_kernel_address works. The translation makes sense for how Linux kernels are linked (always at 0xc000 or 0xc000), but for the unit test case we need to avoid adding the offset. Signed-off-by: Andrew Jones drjo...@redhat.com --- Big RFC because I don't know if the always at 0xc... statement is 100% true for Linux, nor if this patch would break other stuff... Yeah, I'm pretty dubious about this too, especially since I don't entirely grasp what the load_elf() translation function is all about anyway. Well, AFAIK it's used to modify the addresses before the ELF loader uses the address for loading. For example if your ELF binary is linked at address 0x1000, the translate function would move your binary to 0x401000 instead so that it does not interfere with the SLOF firmware (which is loaded to address 0 IIRC). This is correct, but the move isn't just to make sure we don't interfere with SLOF, it's also to make sure we can load the kernel into main memory. When the link address is 0xc..., then we can't use vaddr == paddr. The Linux ppc64 kernel, for example, is linked at 0xc000. So it gets pulled down with the mask, and then the 0x40 offset is added to get it above SLOF. So I also think your fix here is wrong, Andrew. E.g. when you have a binary that is linked to address 0x1000, you don't want to bypass the translation step here since it then would clash with the firmware. I set the unit test's text segment start at 0x40, so QEMU still loads it there with this patch, and thus it wouldn't clash with SLOF. 
But, anyway, there's no need for SLOF in kvm-unit-tests, and I've replaced it with a four byte boot loader, b 0x40. That said, I suspect making your unit test assume a fixed load address may not be the best idea - qemu or SLOF could change in future to move things about, so it might be more robust to have your test copy itself to address it wants to be at before executing. Well, the reason I want vaddr == paddr is to be able to run C code without the MMU enabled, not to mention to avoid needing to do any sort of reloc dance in what's supposed to be super simple code. I don't need to worry about SLOF changing things, since I don't use it. If QEMU changes the load address, then things will indeed break, but it would be a one line Makefile fix in kvm-unit-tests to point the text segment to the new offset. +1 ... or you could try to get the elf_reloc code working for POWER, too (see include/hw/elf_ops.h). That way QEMU would take care of relocating your program. (you can peek at elf_apply_rela64() in https://github.com/aik/SLOF/blob/master/lib/libelf/elf64.c if you want to know what basically has to be done for POWER relocations). Thomas kvm-unit-tests doesn't load the unit test elf itself. It relies on QEMU's -kernel parameter to get the kernel (the unit test) into memory. Thanks, drew
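The mask-and-offset translation discussed in this thread can be sketched as follows. The constants are assumptions chosen to reproduce the example given above (a binary linked at 0x1000 ending up at 0x401000); they are not copied from the QEMU source, and `sketch_translate` is a hypothetical name.

```c
#include <stdint.h>

/* Assumed values, picked to match the 0x1000 -> 0x401000 example. */
#define SKETCH_ADDR_MASK   0x0fffffffULL
#define SKETCH_LOAD_OFFSET 0x400000ULL

/*
 * Pull a (possibly high) link address down with the mask, then add the
 * offset so the image lands above the firmware.
 */
static uint64_t sketch_translate(uint64_t addr)
{
    return (addr & SKETCH_ADDR_MASK) + SKETCH_LOAD_OFFSET;
}
```

Under these assumptions, an image linked at the offset itself is shifted by the offset a second time, which is the effect the patch author describes running into.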
Re: [Qemu-devel] [POC] colo-proxy in qemu
2015-07-20 14:55 GMT+03:00 zhanghailiang <zhang.zhanghaili...@huawei.com>:
> Agreed. Besides, it seems that slirp does not support IPv6, so we
> would also have to add that.

A patch for IPv6 slirp support was sent to the qemu list some time ago, but I don't know why it was not accepted.

-- 
Vasiliy Tolstov, e-mail: v.tols...@selfip.ru
[Qemu-devel] [PATCH] spapr: add dumpdtb support
dumpdtb (-machine dumpdtb=file) allows one to inspect the generated device tree of machine types that generate device trees. This is useful for a) seeing what's there b) debugging/testing device tree generator patches. It can be used as follows

$QEMU_CMDLINE -machine dumpdtb=dtb
dtc -I dtb -O dts dtb

Signed-off-by: Andrew Jones <drjo...@redhat.com>
---
 hw/ppc/spapr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a6f19473cf278..c1cbf3387ae0c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -30,6 +30,7 @@
 #include "hw/fw-path-provider.h"
 #include "elf.h"
 #include "net/net.h"
+#include "sysemu/device_tree.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/cpus.h"
 #include "sysemu/kvm.h"
@@ -822,6 +823,7 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
         exit(1);
     }
 
+    qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
     cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt));
 
     g_free(bootlist);
-- 
2.4.3
Re: [Qemu-devel] [PATCH] qcow2: Remove forward declaration of QCowAIOCB
On 07/20/2015 05:55 AM, Kevin Wolf wrote:
> This struct doesn't exist any more since commit 3fc48d09 in August 2011,
> it's about time to remove its forward declaration.
>
> Signed-off-by: Kevin Wolf <kw...@redhat.com>
> ---
>  block/qcow2.h | 2 --
>  1 file changed, 2 deletions(-)

Reviewed-by: Eric Blake <ebl...@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [Qemu-devel] [PULL 3/6] virtio-net: unbreak any layout
On 07/20/2015 06:12 AM, Michael S. Tsirkin wrote:
> From: Jason Wang <jasow...@redhat.com>
>
> Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611 (virtio-net: byteswap
> virtio-net header) breaks any layout by requiring
> out_sg[0].iov_len >= n->guest_hdr_len. Fixing this by copying header
> to temporary buffer if swap is needed, and then use this buffer as
> part of out_sg.
>
> Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
> (virtio-net: byteswap virtio-net header)
> Cc: qemu-sta...@nongnu.org
> Cc: c...@fr.ibm.com
> Signed-off-by: Jason Wang <jasow...@redhat.com>
> Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> Reviewed-by: Eric Blake <ebl...@redhat.com>

I think my R-b was intended for 2/6, not this one. But if this has already been pulled, it's not a show-stopper.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [Qemu-devel] [PATCH for-2.5 4/8] s390x: Dump storage keys qmp command
On 07/20/2015 07:49 AM, Cornelia Huck wrote: From: Jason J. Herne jjhe...@linux.vnet.ibm.com Provide a dump-skeys qmp command to allow the end user to dump storage keys. This is useful for debugging problems with guest storage key support within Qemu and for guest operating system developers. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Jason J. Herne jjhe...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- +void qmp_dump_skeys(const char *filename, Error **errp) +{ + +f = fopen(filename, wb); If you'll use qemu_fopen() here... +++ b/qapi-schema.json @@ -2058,6 +2058,19 @@ 'returns': 'DumpGuestMemoryCapability' } ## +# @dump-skeys +# +# Dump guest's storage keys. @filename: the path to the file to dump to. +# This command is only supported on s390 architecture. +# +# Returns: nothing on success +# +# Since: 2.5 +## +{ 'command': 'dump-skeys', + 'data': { 'filename': 'str' } } then this command will automatically accept /dev/fdset/NNN notation for allowing the user to pass in a file descriptor with add-fd then tying that fd to this command (useful for when qemu is restricted from directly calling open() for security reasons). -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] spapr: add dumpdtb support
On 07/20/15 15:19, Andrew Jones wrote:
> dumpdtb (-machine dumpdtb=file) allows one to inspect the generated
> device tree of machine types that generate device trees. This is
> useful for a) seeing what's there b) debugging/testing device tree
> generator patches. It can be used as follows
>
> $QEMU_CMDLINE -machine dumpdtb=dtb
> dtc -I dtb -O dts dtb
>
> Signed-off-by: Andrew Jones <drjo...@redhat.com>

Glad to see that it's useful for others too :).

Reviewed-by: Alexander Graf <ag...@suse.de>

Alex
[Qemu-devel] [PATCH 1/2] vhost: add vhost_has_free_slot() interface
it will allow for other parts of QEMU check if it's safe to map memory region during hotplug/runtime. That way hotplug path will have a chance to cancel hotplug operation instead of crashing in vhost_commit(). Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/virtio/vhost-backend.c | 23 ++- hw/virtio/vhost-user.c| 8 +++- hw/virtio/vhost.c | 21 + include/hw/virtio/vhost-backend.h | 2 ++ include/hw/virtio/vhost.h | 1 + stubs/Makefile.objs | 1 + stubs/vhost.c | 6 ++ 7 files changed, 60 insertions(+), 2 deletions(-) create mode 100644 stubs/vhost.c diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c index 4d68a27..46fa707 100644 --- a/hw/virtio/vhost-backend.c +++ b/hw/virtio/vhost-backend.c @@ -11,6 +11,7 @@ #include hw/virtio/vhost.h #include hw/virtio/vhost-backend.h #include qemu/error-report.h +#include linux/vhost.h #include sys/ioctl.h @@ -42,11 +43,31 @@ static int vhost_kernel_cleanup(struct vhost_dev *dev) return close(fd); } +static int vhost_kernel_memslots_limit(struct vhost_dev *dev) +{ +int limit; +int s = offsetof(struct vhost_memory, regions); +struct vhost_memory *mem = g_malloc0(s); + +assert(dev-mem-nregions); +do { +s += sizeof mem-regions[0]; +mem = g_realloc(mem, s); +mem-regions[mem-nregions] = dev-mem-regions[0]; +mem-nregions++; +} while (vhost_kernel_call(dev, VHOST_SET_MEM_TABLE, mem) != -1); +limit = mem-nregions - 1 0 ? 
mem-nregions - 1 : 0; +g_free(mem); + +return limit; +} + static const VhostOps kernel_ops = { .backend_type = VHOST_BACKEND_TYPE_KERNEL, .vhost_call = vhost_kernel_call, .vhost_backend_init = vhost_kernel_init, -.vhost_backend_cleanup = vhost_kernel_cleanup +.vhost_backend_cleanup = vhost_kernel_cleanup, +.vhost_backend_memslots_limit = vhost_kernel_memslots_limit }; int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index d6f2163..0487809 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -352,9 +352,15 @@ static int vhost_user_cleanup(struct vhost_dev *dev) return 0; } +static int vhost_user_memslots_limit(struct vhost_dev *dev) +{ +return VHOST_MEMORY_MAX_NREGIONS; +} + const VhostOps user_ops = { .backend_type = VHOST_BACKEND_TYPE_USER, .vhost_call = vhost_user_call, .vhost_backend_init = vhost_user_init, -.vhost_backend_cleanup = vhost_user_cleanup +.vhost_backend_cleanup = vhost_user_cleanup, +.vhost_backend_memslots_limit = vhost_user_memslots_limit }; diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 2712c6f..e964004 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -26,6 +26,18 @@ static struct vhost_log *vhost_log; +static int used_memslots; +static int memslots_limit = -1; + +bool vhost_has_free_slot(void) +{ +if (memslots_limit = 0) { +return memslots_limit used_memslots; +} + +return true; +} + static void vhost_dev_sync_region(struct vhost_dev *dev, MemoryRegionSection *section, uint64_t mfirst, uint64_t mlast, @@ -457,6 +469,7 @@ static void vhost_set_memory(MemoryListener *listener, dev-mem_changed_start_addr = MIN(dev-mem_changed_start_addr, start_addr); dev-mem_changed_end_addr = MAX(dev-mem_changed_end_addr, start_addr + size - 1); dev-memory_changed = true; +used_memslots = dev-mem-nregions; } static bool vhost_section(MemoryRegionSection *section) @@ -1119,6 +1132,14 @@ int vhost_dev_start(struct vhost_dev *hdev, 
VirtIODevice *vdev) if (r 0) { goto fail_features; } + +r = hdev-vhost_ops-vhost_backend_memslots_limit(hdev); +if (memslots_limit 0) { +memslots_limit = MIN(memslots_limit, r); +} else { +memslots_limit = r; +} + r = hdev-vhost_ops-vhost_call(hdev, VHOST_SET_MEM_TABLE, hdev-mem); if (r 0) { r = -errno; diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h index e472f29..28b6714 100644 --- a/include/hw/virtio/vhost-backend.h +++ b/include/hw/virtio/vhost-backend.h @@ -24,12 +24,14 @@ typedef int (*vhost_call)(struct vhost_dev *dev, unsigned long int request, void *arg); typedef int (*vhost_backend_init)(struct vhost_dev *dev, void *opaque); typedef int (*vhost_backend_cleanup)(struct vhost_dev *dev); +typedef int (*vhost_backend_memslots_limit)(struct vhost_dev *dev); typedef struct VhostOps { VhostBackendType backend_type; vhost_call vhost_call; vhost_backend_init vhost_backend_init; vhost_backend_cleanup vhost_backend_cleanup; +vhost_backend_memslots_limit
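The slot accounting in the patch above can be modelled in isolation. The following is a sketch only, based on my reading of the (extraction-damaged) diff: the comparison operators are assumed to be `>= 0` in the free-slot check and `> 0` when merging limits, and `note_backend_limit()` and `note_used_slots()` are hypothetical stand-ins for the updates done in vhost_dev_start() and vhost_set_memory().

```c
#include <assert.h>
#include <stdbool.h>

static int used_memslots;
static int memslots_limit = -1;  /* -1: no backend has reported a limit yet */

/* Record the limit reported by one backend; keep the smallest seen so far. */
static void note_backend_limit(int backend_limit)
{
    assert(backend_limit >= 0);
    if (memslots_limit > 0) {
        if (backend_limit < memslots_limit) {
            memslots_limit = backend_limit;
        }
    } else {
        memslots_limit = backend_limit;
    }
}

/* Record how many memory regions are currently in use. */
static void note_used_slots(int nregions)
{
    used_memslots = nregions;
}

/* True if one more memory region can be mapped without exceeding the limit. */
static bool has_free_slot(void)
{
    if (memslots_limit >= 0) {
        return memslots_limit > used_memslots;
    }
    return true;  /* no vhost backend started, so nothing to exceed */
}
```

With this model, a hotplug path would call `has_free_slot()` before mapping a new region and cancel the operation if it returns false, which is what the second patch in the series does for pc-dimm.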
[Qemu-devel] [PATCH 2/2] pc-dimm: add vhost slots limit check before commiting to hotplug
It allows memory hotplug to be cancelled safely if the vhost backend doesn't support the necessary number of memory slots, and prevents QEMU from crashing in vhost when it hits the vhost limit on the number of supported memory ranges.

Signed-off-by: Igor Mammedov <imamm...@redhat.com>
---
 hw/mem/pc-dimm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index bb04862..901bdbf 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -25,6 +25,7 @@
 #include "sysemu/numa.h"
 #include "sysemu/kvm.h"
 #include "trace.h"
+#include "hw/virtio/vhost.h"
 
 typedef struct pc_dimms_capacity {
     uint64_t size;
@@ -95,6 +96,12 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
         goto out;
     }
 
+    if (!vhost_has_free_slot()) {
+        error_setg(&local_err, "a used vhost backend has no free"
+                               " memory slots left");
+        goto out;
+    }
+
     memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
     vmstate_register_ram(mr, dev);
     numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);
-- 
1.8.3.1
[Qemu-devel] [PATCH 0/2] vhost: check if vhost has capacity for hotplugged memory
This is a defensive patchset which helps to avoid QEMU crashing at memory hotplug time by checking that vhost has free capacity for an additional memory slot.

Igor Mammedov (2):
  vhost: add vhost_has_free_slot() interface
  pc-dimm: add vhost slots limit check before commiting to hotplug

 hw/mem/pc-dimm.c                  |  7 +++
 hw/virtio/vhost-backend.c         | 23 ++-
 hw/virtio/vhost-user.c            |  8 +++-
 hw/virtio/vhost.c                 | 21 +
 include/hw/virtio/vhost-backend.h |  2 ++
 include/hw/virtio/vhost.h         |  1 +
 stubs/Makefile.objs               |  1 +
 stubs/vhost.c                     |  6 ++
 8 files changed, 67 insertions(+), 2 deletions(-)
 create mode 100644 stubs/vhost.c

-- 
1.8.3.1
[Qemu-devel] [PATCH for-2.5 2/8] s390x: Create QOM device for s390 storage keys
From: Jason J. Herne jjhe...@linux.vnet.ibm.com A new QOM style device is provided to back guest storage keys. A special version for KVM is created, which handles the storage key access via KVM_S390_GET_SKEYS and KVM_S390_SET_SKEYS ioctl. Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Jason J. Herne jjhe...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- MAINTAINERS | 1 + hw/s390x/Makefile.objs | 2 + hw/s390x/s390-skeys-kvm.c | 75 + hw/s390x/s390-skeys.c | 141 include/hw/s390x/storage-keys.h | 55 5 files changed, 274 insertions(+) create mode 100644 hw/s390x/s390-skeys-kvm.c create mode 100644 hw/s390x/s390-skeys.c create mode 100644 include/hw/s390x/storage-keys.h diff --git a/MAINTAINERS b/MAINTAINERS index 978b717..1387537 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -560,6 +560,7 @@ F: hw/s390x/css.[hc] F: hw/s390x/sclp*.[hc] F: hw/s390x/ipl*.[hc] F: hw/s390x/*pci*.[hc] +F: hw/s390x/s390-skeys*.c F: include/hw/s390x/ F: pc-bios/s390-ccw/ T: git git://github.com/cohuck/qemu virtio-ccw-upstr diff --git a/hw/s390x/Makefile.objs b/hw/s390x/Makefile.objs index 27cd75a..527d754 100644 --- a/hw/s390x/Makefile.objs +++ b/hw/s390x/Makefile.objs @@ -9,3 +9,5 @@ obj-y += css.o obj-y += s390-virtio-ccw.o obj-y += virtio-ccw.o obj-y += s390-pci-bus.o s390-pci-inst.o +obj-y += s390-skeys.o +obj-$(CONFIG_KVM) += s390-skeys-kvm.o diff --git a/hw/s390x/s390-skeys-kvm.c b/hw/s390x/s390-skeys-kvm.c new file mode 100644 index 000..682949a --- /dev/null +++ b/hw/s390x/s390-skeys-kvm.c @@ -0,0 +1,75 @@ +/* + * s390 storage key device + * + * Copyright 2015 IBM Corp. + * Author(s): Jason J. Herne jjhe...@linux.vnet.ibm.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or (at + * your option) any later version. See the COPYING file in the top-level + * directory. 
+ */ + +#include hw/s390x/storage-keys.h +#include sysemu/kvm.h +#include qemu/error-report.h + +static int kvm_s390_skeys_enabled(S390SKeysState *ss) +{ +S390SKeysClass *skeyclass = S390_SKEYS_GET_CLASS(ss); +uint8_t single_key; +int r; + +r = skeyclass-get_skeys(ss, 0, 1, single_key); +if (r != 0 r != KVM_S390_GET_SKEYS_NONE) { +error_report(S390_GET_KEYS error %d\n, r); +} +return (r == 0); +} + +static int kvm_s390_skeys_get(S390SKeysState *ss, uint64_t start_gfn, + uint64_t count, uint8_t *keys) +{ +struct kvm_s390_skeys args = { +.start_gfn = start_gfn, +.count = count, +.skeydata_addr = (__u64)keys +}; + +return kvm_vm_ioctl(kvm_state, KVM_S390_GET_SKEYS, args); +} + +static int kvm_s390_skeys_set(S390SKeysState *ss, uint64_t start_gfn, + uint64_t count, uint8_t *keys) +{ +struct kvm_s390_skeys args = { +.start_gfn = start_gfn, +.count = count, +.skeydata_addr = (__u64)keys +}; + +return kvm_vm_ioctl(kvm_state, KVM_S390_SET_SKEYS, args); +} + +static void kvm_s390_skeys_class_init(ObjectClass *oc, void *data) +{ +S390SKeysClass *skeyclass = S390_SKEYS_CLASS(oc); + +skeyclass-skeys_enabled = kvm_s390_skeys_enabled; +skeyclass-get_skeys = kvm_s390_skeys_get; +skeyclass-set_skeys = kvm_s390_skeys_set; +} + +static const TypeInfo kvm_s390_skeys_info = { +.name = TYPE_KVM_S390_SKEYS, +.parent= TYPE_S390_SKEYS, +.instance_size = sizeof(S390SKeysState), +.class_init= kvm_s390_skeys_class_init, +.class_size= sizeof(S390SKeysClass), +}; + +static void kvm_s390_skeys_register_types(void) +{ +type_register_static(kvm_s390_skeys_info); +} + +type_init(kvm_s390_skeys_register_types) diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c new file mode 100644 index 000..77c42ff --- /dev/null +++ b/hw/s390x/s390-skeys.c @@ -0,0 +1,141 @@ +/* + * s390 storage key device + * + * Copyright 2015 IBM Corp. + * Author(s): Jason J. Herne jjhe...@linux.vnet.ibm.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or (at + * your option) any later version. 
See the COPYING file in the top-level + * directory. + */ + +#include hw/boards.h +#include hw/s390x/storage-keys.h +#include qemu/error-report.h + +S390SKeysState *s390_get_skeys_device(void) +{ +S390SKeysState *ss; + +ss = S390_SKEYS(object_resolve_path_type(, TYPE_S390_SKEYS, NULL)); +assert(ss); +return ss; +} + +void s390_skeys_init(void) +{ +Object *obj; + +if (kvm_enabled()) { +obj = object_new(TYPE_KVM_S390_SKEYS); +} else { +obj = object_new(TYPE_QEMU_S390_SKEYS); +} +object_property_add_child(qdev_get_machine(), TYPE_S390_SKEYS, + obj, NULL); +object_unref(obj); + +qdev_init_nofail(DEVICE(obj)); +} + +static void qemu_s390_skeys_init(Object *obj)
Re: [Qemu-devel] Creating a VM from an E01 file
I apologize for my error. I have now re-run all the commands in the same order and attached a screen shot of the terminal window. I have selected desktop as the location for the VMDK file to be stored and it is included in the command after -O vmdk.

Thank you,
Adam Cervellone

-----Original Message-----
From: Peter Maydell [mailto:peter.mayd...@linaro.org]
Sent: Monday, July 20, 2015 9:21 AM
To: Cervellone, Adam
Cc: qemu-devel@nongnu.org
Subject: Re: Creating a VM from an E01 file

On 20 July 2015 at 14:03, Cervellone, Adam <acervell...@ncdoj.gov> wrote:
> My name is Adam Cervellone. I am a digital evidence intern at the
> North Carolina State Crime Laboratory. As part of my time here, I am
> conducting a research project using the SIFT workstation to make a
> virtual machine of an E01 file. I’ve previously used this series of
> commands to attempt to create a VM
>
> 1. sudo su
>    o SIFT password entered
> 2. mkdir /mnt/ewf1
> 3. mount_ewf.py <E01 image file path> /mnt/ewf1
> 4. qemu-img convert /mnt/ewf1/<E01 image file name> -O vmdk give_a_name.vmdk
>
> I found these steps on Forensics wiki, however they have not worked.
> I may have misinterpreted a section of the command and entered it
> incorrectly. Do you know of the correct way to do this or of someone
> who may be able to help me?

You've said what you were trying to do, but not what actually happened. "It didn't work" provides no information at all. Did it give an error message (if so, what?), did it do nothing, did it just hang? Giving an exact transcript of the commands you typed and the output might be useful.

Identifying problems with computing has some similarities with detecting crime -- we need evidence to be able to draw good conclusions :-)

thanks
-- PMM
Re: [Qemu-devel] Creating a VM from an E01 file
On 20 July 2015 at 14:57, Cervellone, Adam <acervell...@ncdoj.gov> wrote:

Your instructions say:

4. qemu-img convert /mnt/ewf1/<E01 image file name> -O vmdk give_a_name.vmdk

but in your screenshot the command you run is:

qemu-img convert /mnt/ewf1 -O vmdk /home/sansforensics/Desktop/Item1.vmdk

and you haven't filled in the 'E01 image file name' part. qemu-img seems to give this slightly unhelpful error message if you pass it a directory name rather than a filename for the input file.

thanks
-- PMM
Re: [Qemu-devel] Creating a VM from an E01 file
I have now changed the command to

qemu-img convert /mnt/ewf1/ewf1 -O vmdk /home/sansforensics/Desktop/Item1.vmdk

and two things have happened.

1. An Item1.vmdk file is now on the desktop.
2. The terminal just hangs after running the command. The cursor is blinking and the shell prompt has not returned.

The SIFT Workstation is running in VMware Player 7. Should I try exporting the newly made vmdk to the Windows host and running it in VirtualBox or VMware Player?

Thank you,
Adam Cervellone

-----Original Message-----
From: Peter Maydell [mailto:peter.mayd...@linaro.org]
Sent: Monday, July 20, 2015 10:06 AM
To: Cervellone, Adam
Cc: qemu-devel@nongnu.org
Subject: Re: Creating a VM from an E01 file

On 20 July 2015 at 14:57, Cervellone, Adam <acervell...@ncdoj.gov> wrote:

Your instructions say:

4. qemu-img convert /mnt/ewf1/<E01 image file name> -O vmdk give_a_name.vmdk

but in your screenshot the command you run is:

qemu-img convert /mnt/ewf1 -O vmdk /home/sansforensics/Desktop/Item1.vmdk

and you haven't filled in the 'E01 image file name' part. qemu-img seems to give this slightly unhelpful error message if you pass it a directory name rather than a filename for the input file.

thanks
-- PMM
[Qemu-devel] Creating a VM from an E01 file
To whom it may concern,

My name is Adam Cervellone. I am a digital evidence intern at the North Carolina State Crime Laboratory. As part of my time here, I am conducting a research project using the SIFT workstation to make a virtual machine of an E01 file. I've previously used this series of commands to attempt to create a VM:

1. sudo su
   o SIFT password entered
2. mkdir /mnt/ewf1
3. mount_ewf.py <E01 image file path> /mnt/ewf1
4. qemu-img convert /mnt/ewf1/<E01 image file name> -O vmdk give_a_name.vmdk

I found these steps on Forensics wiki, however they have not worked. I may have misinterpreted a section of the command and entered it incorrectly. Do you know of the correct way to do this or of someone who may be able to help me?

Thank you,
Adam Cervellone
Re: [Qemu-devel] [RFC] Virt machine memory map
On 07/20/15 15:30, Igor Mammedov wrote: On Mon, 20 Jul 2015 13:23:45 +0200 Alexander Graf ag...@suse.de wrote: On 07/20/15 11:41, Peter Maydell wrote: On 20 July 2015 at 09:55, Pavel Fedin p.fe...@samsung.com wrote: Hello! In our project we work on a very fast paravirtualized network I/O drivers, based on ivshmem. We successfully got ivshmem working on ARM, however with one hack. Currently we have: --- cut --- [VIRT_PCIE_MMIO] = { 0x1000, 0x2eff }, [VIRT_PCIE_PIO] = { 0x3eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x3f00, 0x0100 }, [VIRT_MEM] ={ 0x4000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- And MMIO region is not enough for us because we want to have 1GB mapping for PCI device. In order to make it working, we modify the map as follows: --- cut --- [VIRT_PCIE_MMIO] ={ 0x1000, 0x7eff }, [VIRT_PCIE_PIO] = { 0x8eff, 0x0001 }, [VIRT_PCIE_ECAM] = { 0x8f00, 0x0100 }, [VIRT_MEM] = { 0x9000, 30ULL * 1024 * 1024 * 1024 }, --- cut --- The question is - how could we upstream this? I believe modifying 32-bit virt memory map this way is not good. Will it be OK to have different memory map for 64-bit virt ? I think the theory we discussed at the time of putting in the PCIe device was that if we wanted this we'd add support for the other PCIe memory window (which would then live at somewhere above 4GB). Alex, can you remember what the idea was? Yes, pretty much. It would give us an upper bound to the amount of RAM that we're able to support, but at least we would be able to support big MMIO regions like for ivshmem. I'm not really sure where to put it though. Depending on your kernel config Linux supports somewhere between 39 and 48 or so bits of phys address space. And I'd rather not crawl into the PCI hole rat hole that we have on x86 ;). We could of course also put it just above RAM - but then our device tree becomes really dynamic and heavily dependent on -m. on x86 we've made everything that is not mapped to ram/mmio fall down to PCI address space, see pc_pci_as_mapping_init(). 
So we don't have explicitly mapped PCI regions anymore there, but we still thinking in terms of PCI hole/PCI ranges when it comes to ACPI PCI bus description where one need to specify ranges available for bus in its _CRS. Yes, and in the ARM case we pass those in as a region in device tree which gets generated from QEMU :). Alex
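To make the size constraint in this thread concrete, here is a small sketch that checks whether a naturally aligned PCI BAR fits in an MMIO window. The archive truncates the hexadecimal constants, so the window values below are assumptions (0x10000000 base with a 0x2eff0000 size for the current map, and a 0x7eff0000 size for the proposed one); `MemMapEntry` mirrors the base/size pairs quoted above, and `bar_fits()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t base;
    uint64_t size;
} MemMapEntry;

/* Assumed constants: the archived message truncates the real ones. */
static const MemMapEntry current_mmio  = { 0x10000000ULL, 0x2eff0000ULL };
static const MemMapEntry proposed_mmio = { 0x10000000ULL, 0x7eff0000ULL };

/*
 * Return true if a BAR of bar_size bytes (a power of two) can be placed
 * in the window at an address aligned to its own size.
 */
static bool bar_fits(MemMapEntry win, uint64_t bar_size)
{
    assert(bar_size != 0 && (bar_size & (bar_size - 1)) == 0);

    uint64_t end = win.base + win.size;
    /* First address at or above the window base that is aligned to bar_size. */
    uint64_t start = (win.base + bar_size - 1) & ~(bar_size - 1);

    return start < end && end - start >= bar_size;
}
```

Under these assumed values a 1 GiB BAR does not fit the current window but does fit the proposed one, which is the situation described for ivshmem. A second window above 4GB, as suggested in the replies, would avoid moving RAM at all.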
[Qemu-devel] [PATCH for-2.5 1/8] s390x: add 2.5 compat s390-ccw-virtio machine
Reviewed-by: Jason J. Herne jjhe...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/s390x/s390-virtio-ccw.c | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 4c51d1a..708763e 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -289,7 +289,6 @@ static void ccw_machine_2_4_class_init(ObjectClass *oc, void *data) mc-name = s390-ccw-virtio-2.4; mc-alias = s390-ccw-virtio; mc-desc = VirtIO-ccw based S390 machine v2.4; -mc-is_default = 1; } static const TypeInfo ccw_machine_2_4_info = { @@ -298,10 +297,27 @@ static const TypeInfo ccw_machine_2_4_info = { .class_init= ccw_machine_2_4_class_init, }; +static void ccw_machine_2_5_class_init(ObjectClass *oc, void *data) +{ +MachineClass *mc = MACHINE_CLASS(oc); + +mc-name = s390-ccw-virtio-2.5; +mc-alias = s390-ccw-virtio; +mc-desc = VirtIO-ccw based S390 machine v2.5; +mc-is_default = 1; +} + +static const TypeInfo ccw_machine_2_5_info = { +.name = TYPE_S390_CCW_MACHINE 2.5, +.parent= TYPE_S390_CCW_MACHINE, +.class_init= ccw_machine_2_5_class_init, +}; + static void ccw_machine_register_types(void) { type_register_static(ccw_machine_info); type_register_static(ccw_machine_2_4_info); +type_register_static(ccw_machine_2_5_info); } type_init(ccw_machine_register_types) -- 2.4.6
[Qemu-devel] [PATCH for-2.5 5/8] s390x: Dump-skeys hmp support
From: Jason J. Herne jjhe...@linux.vnet.ibm.com Add dump-skeys command to the human monitor. Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Jason J. Herne jjhe...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hmp-commands.hx | 16 hw/s390x/s390-skeys.c | 12 include/hw/s390x/storage-keys.h | 2 ++ monitor.c | 4 4 files changed, 34 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index d3b7932..803ff91 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1053,6 +1053,22 @@ gdb. Without -z|-l|-s, the dump format is ELF. together with begin. ETEXI +#if defined(TARGET_S390X) +{ +.name = dump-skeys, +.args_type = filename:F, +.params = , +.help = Save guest storage keys into file 'filename'.\n, +.mhandler.cmd = hmp_dump_skeys, +}, +#endif + +STEXI +@item dump-skeys @var{filename} +@findex dump-skeys +Save guest storage keys to a file. +ETEXI + { .name = snapshot_blkdev, .args_type = reuse:-n,device:B,snapshot-file:s?,format:s?, diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c index a7b7a01..5e2948d 100644 --- a/hw/s390x/s390-skeys.c +++ b/hw/s390x/s390-skeys.c @@ -66,6 +66,18 @@ static void write_keys(FILE *f, uint8_t *keys, uint64_t startgfn, } } +void hmp_dump_skeys(Monitor *mon, const QDict *qdict) +{ +const char *filename = qdict_get_str(qdict, filename); +Error *err = NULL; + +qmp_dump_skeys(filename, err); +if (err) { +monitor_printf(mon, %s\n, error_get_pretty(err)); +error_free(err); +} +} + void qmp_dump_skeys(const char *filename, Error **errp) { S390SKeysState *ss = s390_get_skeys_device(); diff --git a/include/hw/s390x/storage-keys.h b/include/hw/s390x/storage-keys.h index cfd7da7..0d04f19 100644 --- a/include/hw/s390x/storage-keys.h +++ b/include/hw/s390x/storage-keys.h @@ -13,6 +13,7 @@ #define __S390_STORAGE_KEYS_H #include hw/qdev.h +#include monitor/monitor.h #define TYPE_S390_SKEYS s390-skeys #define S390_SKEYS(obj) \ @@ -52,4 
+53,5 @@ void s390_skeys_init(void); S390SKeysState *s390_get_skeys_device(void); +void hmp_dump_skeys(Monitor *mon, const QDict *qdict); #endif /* __S390_STORAGE_KEYS_H */ diff --git a/monitor.c b/monitor.c index f1501cd..cfe31a4 100644 --- a/monitor.c +++ b/monitor.c @@ -82,6 +82,10 @@ #endif #include hw/lm32/lm32_pic.h +#if defined(TARGET_S390X) +#include hw/s390x/storage-keys.h +#endif + /* * Supported types: * -- 2.4.6
[Qemu-devel] [PATCH for-2.5 0/8] s390x: storage key migration
Here's the first batch of s390x patches I plan to send for 2.5. This one deals with storage keys, which may be set by guests and lacked a proper resting place so far. Introducing a device (that is backed by the KVM_S390_{SET,GET}_SKEYS ioctls in the kvm case) allows us to migrate them properly. Also available as a branch on git://github.com/cohuck/qemu s390-skey Cornelia Huck (1): s390x: add 2.5 compat s390-ccw-virtio machine Jason J. Herne (7): s390x: Create QOM device for s390 storage keys s390x: Enable new s390-storage-keys device s390x: Dump storage keys qmp command s390x: Dump-skeys hmp support s390x: Info skeys sub-command s390x: Migrate guest storage keys (initial memory only) s390x: Disable storage key migration on old machine type MAINTAINERS | 1 + hmp-commands.hx | 18 ++ hw/s390x/Makefile.objs | 2 + hw/s390x/s390-skeys-kvm.c | 75 hw/s390x/s390-skeys.c | 402 hw/s390x/s390-virtio-ccw.c | 38 +++- hw/s390x/s390-virtio.c | 11 +- hw/s390x/s390-virtio.h | 2 +- include/hw/s390x/storage-keys.h | 60 ++ monitor.c | 20 ++ qapi-schema.json| 13 ++ qmp-commands.hx | 25 +++ target-s390x/cpu.h | 2 - target-s390x/mem_helper.c | 46 - target-s390x/mmu_helper.c | 28 ++- trace-events| 4 + 16 files changed, 722 insertions(+), 25 deletions(-) create mode 100644 hw/s390x/s390-skeys-kvm.c create mode 100644 hw/s390x/s390-skeys.c create mode 100644 include/hw/s390x/storage-keys.h -- 2.4.6
[Qemu-devel] [PATCH for-2.5 3/8] s390x: Enable new s390-storage-keys device
From: Jason J. Herne jjhe...@linux.vnet.ibm.com s390 guest initialization is modified to make use of new s390-storage-keys device. Old code that globally allocated storage key array is removed. The new device enables storage key access for kvm guests. Cache storage key QOM objects in frequently used helper functions to avoid a performance hit every time we use one of these functions. Reviewed-by: Cornelia Huck cornelia.h...@de.ibm.com Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com Signed-off-by: Jason J. Herne jjhe...@linux.vnet.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/s390x/s390-virtio-ccw.c | 8 hw/s390x/s390-virtio.c | 11 +-- hw/s390x/s390-virtio.h | 2 +- target-s390x/cpu.h | 2 -- target-s390x/mem_helper.c | 46 -- target-s390x/mmu_helper.c | 28 +++- trace-events | 4 7 files changed, 77 insertions(+), 24 deletions(-) diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 708763e..8f1b1fc 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -19,6 +19,7 @@ #include virtio-ccw.h #include qemu/config-file.h #include s390-pci-bus.h +#include hw/s390x/storage-keys.h #define TYPE_S390_CCW_MACHINE s390-ccw-machine @@ -105,7 +106,6 @@ static void ccw_init(MachineState *machine) MemoryRegion *sysmem = get_system_memory(); MemoryRegion *ram = g_new(MemoryRegion, 1); sclpMemoryHotplugDev *mhd = init_sclp_memory_hotplug_dev(); -uint8_t *storage_keys; int ret; VirtualCssBus *css_bus; DeviceState *dev; @@ -179,11 +179,11 @@ static void ccw_init(MachineState *machine) mhd-standby_mem_size = standby_mem_size; } -/* allocate storage keys */ -storage_keys = g_malloc0(my_ram_size / TARGET_PAGE_SIZE); +/* Initialize storage key device */ +s390_skeys_init(); /* init CPUs */ -s390_init_cpus(machine-cpu_model, storage_keys); +s390_init_cpus(machine-cpu_model); if (kvm_enabled()) { kvm_s390_enable_css_support(s390_cpu_addr2state(0)); diff --git 
a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c index 1284e77..6cc6b5d 100644 --- a/hw/s390x/s390-virtio.c +++ b/hw/s390x/s390-virtio.c @@ -38,6 +38,7 @@ #include hw/s390x/sclp.h #include hw/s390x/s390_flic.h #include hw/s390x/s390-virtio.h +#include hw/s390x/storage-keys.h #include cpu.h //#define DEBUG_S390 @@ -164,7 +165,7 @@ void s390_init_ipl_dev(const char *kernel_filename, qdev_init_nofail(dev); } -void s390_init_cpus(const char *cpu_model, uint8_t *storage_keys) +void s390_init_cpus(const char *cpu_model) { int i; @@ -184,7 +185,6 @@ void s390_init_cpus(const char *cpu_model, uint8_t *storage_keys) ipi_states[i] = cpu; cs-halted = 1; cs-exception_index = EXCP_HLT; -cpu-env.storage_keys = storage_keys; } } @@ -264,7 +264,6 @@ static void s390_init(MachineState *machine) MemoryRegion *sysmem = get_system_memory(); MemoryRegion *ram = g_new(MemoryRegion, 1); int increment_size = 20; -uint8_t *storage_keys; void *virtio_region; hwaddr virtio_region_len; hwaddr virtio_region_start; @@ -306,11 +305,11 @@ static void s390_init(MachineState *machine) cpu_physical_memory_unmap(virtio_region, virtio_region_len, 1, virtio_region_len); -/* allocate storage keys */ -storage_keys = g_malloc0(my_ram_size / TARGET_PAGE_SIZE); +/* Initialize storage key device */ +s390_skeys_init(); /* init CPUs */ -s390_init_cpus(machine-cpu_model, storage_keys); +s390_init_cpus(machine-cpu_model); /* Create VirtIO network adapters */ s390_create_virtio_net((BusState *)s390_bus, virtio-net-s390); diff --git a/hw/s390x/s390-virtio.h b/hw/s390x/s390-virtio.h index c847853..cf68796 100644 --- a/hw/s390x/s390-virtio.h +++ b/hw/s390x/s390-virtio.h @@ -19,7 +19,7 @@ typedef int (*s390_virtio_fn)(const uint64_t *args); void s390_register_virtio_hypercall(uint64_t code, s390_virtio_fn fn); -void s390_init_cpus(const char *cpu_model, uint8_t *storage_keys); +void s390_init_cpus(const char *cpu_model); void s390_init_ipl_dev(const char *kernel_filename, const char *kernel_cmdline, const char 
*initrd_filename, diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h index 63aebf4..b650890 100644 --- a/target-s390x/cpu.h +++ b/target-s390x/cpu.h @@ -143,8 +143,6 @@ typedef struct CPUS390XState { uint32_t cpu_num; uint32_t machine_type; -uint8_t *storage_keys; - uint64_t tod_offset; uint64_t tod_basetime; QEMUTimer *tod_timer; diff --git a/target-s390x/mem_helper.c b/target-s390x/mem_helper.c index 6f8bd79..84bf198 100644 --- a/target-s390x/mem_helper.c +++ b/target-s390x/mem_helper.c @@
[Qemu-devel] [PATCH for-2.5 7/8] s390x: Migrate guest storage keys (initial memory only)
From: Jason J. Herne jjhe...@linux.vnet.ibm.com

Routines to save/load guest storage keys are provided. register_savevm is
called to register them as migration handlers.

We prepare the protocol to support more complex parameters. So we will
later be able to support standby memory (having empty holes), compression
and state live migration like done for ram.

Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
Signed-off-by: Jason J. Herne jjhe...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 hw/s390x/s390-skeys.c | 113 ++
 1 file changed, 113 insertions(+)

diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index d355c8f..a927c98 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -11,10 +11,13 @@
 
 #include "hw/boards.h"
 #include "qmp-commands.h"
+#include "migration/qemu-file.h"
 #include "hw/s390x/storage-keys.h"
 #include "qemu/error-report.h"
 
 #define S390_SKEYS_BUFFER_SIZE 131072  /* Room for 128k storage keys */
+#define S390_SKEYS_SAVE_FLAG_EOS 0x01
+#define S390_SKEYS_SAVE_FLAG_SKEYS 0x02
 
 S390SKeysState *s390_get_skeys_device(void)
 {
@@ -241,6 +244,115 @@ static const TypeInfo qemu_s390_skeys_info = {
     .instance_size = sizeof(S390SKeysClass),
 };
 
+static void s390_storage_keys_save(QEMUFile *f, void *opaque)
+{
+    S390SKeysState *ss = S390_SKEYS(opaque);
+    S390SKeysClass *skeyclass = S390_SKEYS_GET_CLASS(ss);
+    const uint64_t total_count = ram_size / TARGET_PAGE_SIZE;
+    uint64_t cur_count, handled_count = 0;
+    vaddr cur_gfn = 0;
+    uint8_t *buf;
+    int ret;
+
+    if (!skeyclass->skeys_enabled(ss)) {
+        goto end_stream;
+    }
+
+    buf = g_try_malloc(S390_SKEYS_BUFFER_SIZE);
+    if (!buf) {
+        error_report("storage key save could not allocate memory\n");
+        goto end_stream;
+    }
+
+    /* We only support initital memory. Standby memory is not handled yet. */
+    qemu_put_be64(f, (cur_gfn * TARGET_PAGE_SIZE) |
+                  S390_SKEYS_SAVE_FLAG_SKEYS);
+    qemu_put_be64(f, total_count);
+
+    while (handled_count < total_count) {
+        cur_count = MIN(total_count - handled_count, S390_SKEYS_BUFFER_SIZE);
+
+        ret = skeyclass->get_skeys(ss, cur_gfn, cur_count, buf);
+        if (ret < 0) {
+            error_report("S390_GET_KEYS error %d\n", ret);
+            break;
+        }
+
+        /* write keys to stream */
+        qemu_put_buffer(f, buf, cur_count);
+
+        cur_gfn += cur_count;
+        handled_count += cur_count;
+    }
+
+    g_free(buf);
+end_stream:
+    qemu_put_be64(f, S390_SKEYS_SAVE_FLAG_EOS);
+}
+
+static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
+{
+    S390SKeysState *ss = S390_SKEYS(opaque);
+    S390SKeysClass *skeyclass = S390_SKEYS_GET_CLASS(ss);
+    int ret = 0;
+
+    while (!ret) {
+        ram_addr_t addr;
+        int flags;
+
+        addr = qemu_get_be64(f);
+        flags = addr & ~TARGET_PAGE_MASK;
+        addr &= TARGET_PAGE_MASK;
+
+        switch (flags) {
+        case S390_SKEYS_SAVE_FLAG_SKEYS: {
+            const uint64_t total_count = qemu_get_be64(f);
+            uint64_t handled_count = 0, cur_count;
+            uint64_t cur_gfn = addr / TARGET_PAGE_SIZE;
+            uint8_t *buf = g_try_malloc(S390_SKEYS_BUFFER_SIZE);
+
+            if (!buf) {
+                error_report("storage key load could not allocate memory\n");
+                ret = -ENOMEM;
+                break;
+            }
+
+            while (handled_count < total_count) {
+                cur_count = MIN(total_count - handled_count,
+                                S390_SKEYS_BUFFER_SIZE);
+                qemu_get_buffer(f, buf, cur_count);
+
+                ret = skeyclass->set_skeys(ss, cur_gfn, cur_count, buf);
+                if (ret < 0) {
+                    error_report("S390_SET_KEYS error %d\n", ret);
+                    break;
+                }
+                handled_count += cur_count;
+                cur_gfn += cur_count;
+            }
+            g_free(buf);
+            break;
+        }
+        case S390_SKEYS_SAVE_FLAG_EOS:
+            /* normal exit */
+            return 0;
+        default:
+            error_report("Unexpected storage key flag data: %#x", flags);
+            ret = -EINVAL;
+        }
+    }
+
+    return ret;
+}
+
+static void s390_skeys_instance_init(Object *obj)
+{
+    S390SKeysState *ss = S390_SKEYS(obj);
+
+    register_savevm(NULL, TYPE_S390_SKEYS, 0, 1, s390_storage_keys_save,
+                    s390_storage_keys_load, ss);
+}
+
 static void s390_skeys_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
@@ -252,6 +364,7 @@ static void s390_skeys_class_init(ObjectClass *oc, void *data)
 static const TypeInfo s390_skeys_info = {
     .name          = TYPE_S390_SKEYS,
     .parent        = TYPE_DEVICE,
+    .instance_init = s390_skeys_instance_init,
     .instance_size = sizeof(S390SKeysState),
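For readers following the wire format rather than the QEMU plumbing, the stream written by s390_storage_keys_save() can be sketched in standalone C as below. This is only an illustration under assumptions: a 4 KiB page size, no bounds checking, and hypothetical helper names (put_be64, get_be64, skeys_encode, skeys_decode) that are not part of QEMU. It writes one key section followed by the end-of-stream marker, then parses it back the way the load routine splits the first word into flags and address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   4096u   /* assumption; the patch uses TARGET_PAGE_SIZE */
#define PAGE_MASK   (~(uint64_t)(PAGE_SIZE - 1))
#define FLAG_EOS    0x01u   /* mirrors S390_SKEYS_SAVE_FLAG_EOS */
#define FLAG_SKEYS  0x02u   /* mirrors S390_SKEYS_SAVE_FLAG_SKEYS */

/* Store a 64-bit value big-endian; returns the number of bytes written. */
static size_t put_be64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (56 - 8 * i));
    }
    return 8;
}

/* Load a 64-bit big-endian value. */
static uint64_t get_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Write one key section plus the EOS marker; returns bytes used. */
static size_t skeys_encode(uint8_t *out, uint64_t start_gfn,
                           const uint8_t *keys, uint64_t count)
{
    size_t n = 0;

    assert(out && keys);
    /* The flag lives in the low bits of the page-aligned address. */
    n += put_be64(out + n, (start_gfn * PAGE_SIZE) | FLAG_SKEYS);
    n += put_be64(out + n, count);
    memcpy(out + n, keys, count);   /* one byte per page */
    n += count;
    n += put_be64(out + n, FLAG_EOS);
    return n;
}

/* Read sections until EOS; returns 0 on success, -1 on an unknown flag. */
static int skeys_decode(const uint8_t *in, uint8_t *keys_by_gfn)
{
    size_t n = 0;

    for (;;) {
        uint64_t word = get_be64(in + n);
        uint64_t flags = word & ~PAGE_MASK;
        uint64_t gfn = (word & PAGE_MASK) / PAGE_SIZE;
        uint64_t count;

        n += 8;
        if (flags == FLAG_EOS) {
            return 0;
        }
        if (flags != FLAG_SKEYS) {
            return -1;
        }
        count = get_be64(in + n);
        n += 8;
        memcpy(keys_by_gfn + gfn, in + n, count);
        n += count;
    }
}
```

Because the flags share a word with a page-aligned address, a later revision could emit several sections with different start addresses (for example to skip holes) without changing the framing; the sketch's decoder already loops over sections for that reason.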
[Qemu-devel] [PATCH for-2.5 4/8] s390x: Dump storage keys qmp command
From: Jason J. Herne jjhe...@linux.vnet.ibm.com

Provide a dump-skeys qmp command to allow the end user to dump storage
keys. This is useful for debugging problems with guest storage key support
within Qemu and for guest operating system developers.

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
Signed-off-by: Jason J. Herne jjhe...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 hw/s390x/s390-skeys.c | 91 +++
 monitor.c             |  7
 qapi-schema.json      | 13
 qmp-commands.hx       | 25 ++
 4 files changed, 136 insertions(+)

diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index 77c42ff..a7b7a01 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -10,9 +10,12 @@
  */
 
 #include "hw/boards.h"
+#include "qmp-commands.h"
 #include "hw/s390x/storage-keys.h"
 #include "qemu/error-report.h"
 
+#define S390_SKEYS_BUFFER_SIZE 131072  /* Room for 128k storage keys */
+
 S390SKeysState *s390_get_skeys_device(void)
 {
     S390SKeysState *ss;
@@ -38,6 +41,94 @@ void s390_skeys_init(void)
     qdev_init_nofail(DEVICE(obj));
 }
 
+static void write_keys(FILE *f, uint8_t *keys, uint64_t startgfn,
+                       uint64_t count, Error **errp)
+{
+    uint64_t curpage = startgfn;
+    uint64_t maxpage = curpage + count - 1;
+    int r;
+
+    for (; curpage <= maxpage; curpage++) {
+        uint8_t acc = (*keys & 0xF0) >> 4;
+        int fp = (*keys & 0x08);
+        int ref = (*keys & 0x04);
+        int ch = (*keys & 0x02);
+        int reserved = (*keys & 0x01);
+
+        r = fprintf(f, "page=%03" PRIx64 ": key(%d) => ACC=%X, FP=%d, REF=%d,"
+                    " ch=%d, reserved=%d\n", curpage, *keys, acc, fp, ref,
+                    ch, reserved);
+        if (r < 0) {
+            error_setg(errp, "I/O error");
+            return;
+        }
+        keys++;
+    }
+}
+
+void qmp_dump_skeys(const char *filename, Error **errp)
+{
+    S390SKeysState *ss = s390_get_skeys_device();
+    S390SKeysClass *skeyclass = S390_SKEYS_GET_CLASS(ss);
+    const uint64_t total_count = ram_size / TARGET_PAGE_SIZE;
+    uint64_t handled_count = 0, cur_count;
+    Error *lerr = NULL;
+    vaddr cur_gfn = 0;
+    uint8_t *buf;
+    int ret;
+    FILE *f;
+
+    /* Quick check to see if guest is using storage keys*/
+    if (!skeyclass->skeys_enabled(ss)) {
+        error_setg(&lerr, "This guest is not using storage keys. "
+                   "Nothing to dump.");
+        error_propagate(errp, lerr);
+        return;
+    }
+
+    f = fopen(filename, "wb");
+    if (!f) {
+        error_setg(&lerr, "Could not open file");
+        error_propagate(errp, lerr);
+        return;
+    }
+
+    buf = g_try_malloc(S390_SKEYS_BUFFER_SIZE);
+    if (!buf) {
+        error_setg(&lerr, "Could not allocate memory");
+        error_propagate(errp, lerr);
+        goto out;
+    }
+
+    /* we'll only dump initial memory for now */
+    while (handled_count < total_count) {
+        /* Calculate how many keys to ask for & handle overflow case */
+        cur_count = MIN(total_count - handled_count, S390_SKEYS_BUFFER_SIZE);
+
+        ret = skeyclass->get_skeys(ss, cur_gfn, cur_count, buf);
+        if (ret < 0) {
+            error_setg(&lerr, "get_keys error %d", ret);
+            error_propagate(errp, lerr);
+            goto out_free;
+        }
+
+        /* write keys to stream */
+        write_keys(f, buf, cur_gfn, cur_count, &lerr);
+        if (lerr) {
+            error_propagate(errp, lerr);
+            goto out_free;
+        }
+
+        cur_gfn += cur_count;
+        handled_count += cur_count;
+    }
+
+out_free:
+    g_free(buf);
+out:
+    fclose(f);
+}
+
 static void qemu_s390_skeys_init(Object *obj)
 {
     QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
diff --git a/monitor.c b/monitor.c
index aeea2b5..f1501cd 100644
--- a/monitor.c
+++ b/monitor.c
@@ -5361,3 +5361,10 @@ void qmp_rtc_reset_reinjection(Error **errp)
     error_setg(errp, QERR_FEATURE_DISABLED, "rtc-reset-reinjection");
 }
 #endif
+
+#ifndef TARGET_S390X
+void qmp_dump_skeys(const char *filename, Error **errp)
+{
+    error_setg(errp, QERR_FEATURE_DISABLED, "dump-skeys");
+}
+#endif
diff --git a/qapi-schema.json b/qapi-schema.json
index 1285b8c..d1c1c25 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2058,6 +2058,19 @@
   'returns': 'DumpGuestMemoryCapability' }
 
 ##
+# @dump-skeys
+#
+# Dump guest's storage keys. @filename: the path to the file to dump to.
+# This command is only supported on s390 architecture.
+#
+# Returns: nothing on success
+#
+# Since: 2.5
+##
+{ 'command': 'dump-skeys',
+  'data': { 'filename': 'str' } }
+
+##
 # @netdev_add:
 #
 # Add a network backend.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index ba630b1..9848fd8 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -872,6 +872,31 @@ Example:
 EQMP
 
+#if defined TARGET_S390X
+    {
+        .name
[Qemu-devel] [PATCH for-2.5 6/8] s390x: Info skeys sub-command
From: Jason J. Herne jjhe...@linux.vnet.ibm.com

Provide an info skeys hmp sub-command to allow the end user to dump a
storage key for a given address. This is useful for guest operating system
developers.

Reviewed-by: Thomas Huth th...@linux.vnet.ibm.com
Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
Signed-off-by: Jason J. Herne jjhe...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 hmp-commands.hx                 |  2 ++
 hw/s390x/s390-skeys.c           | 23 +++++++++++++++++++++++
 include/hw/s390x/storage-keys.h |  2 ++
 monitor.c                       |  9 +++++++++
 4 files changed, 36 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 803ff91..c61468e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1806,6 +1806,8 @@ show roms
 show the TPM device
 @item info memory-devices
 show the memory devices
+@item info skeys
+Display the value of a storage key (s390 only)
 @end table
 ETEXI
 
diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index 5e2948d..d355c8f 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -66,6 +66,29 @@ static void write_keys(FILE *f, uint8_t *keys, uint64_t startgfn,
     }
 }
 
+void hmp_info_skeys(Monitor *mon, const QDict *qdict)
+{
+    S390SKeysState *ss = s390_get_skeys_device();
+    S390SKeysClass *skeyclass = S390_SKEYS_GET_CLASS(ss);
+    uint64_t addr = qdict_get_int(qdict, "addr");
+    uint8_t key;
+    int r;
+
+    /* Quick check to see if guest is using storage keys*/
+    if (!skeyclass->skeys_enabled(ss)) {
+        monitor_printf(mon, "Error: This guest is not using storage keys.\n");
+        return;
+    }
+
+    r = skeyclass->get_skeys(ss, addr / TARGET_PAGE_SIZE, 1, &key);
+    if (r < 0) {
+        monitor_printf(mon, "Error: %s\n", strerror(-r));
+        return;
+    }
+
+    monitor_printf(mon, "key: 0x%X\n", key);
+}
+
 void hmp_dump_skeys(Monitor *mon, const QDict *qdict)
 {
     const char *filename = qdict_get_str(qdict, "filename");
diff --git a/include/hw/s390x/storage-keys.h b/include/hw/s390x/storage-keys.h
index 0d04f19..18e08d2 100644
--- a/include/hw/s390x/storage-keys.h
+++ b/include/hw/s390x/storage-keys.h
@@ -54,4 +54,6 @@ void s390_skeys_init(void);
 S390SKeysState *s390_get_skeys_device(void);
 
 void hmp_dump_skeys(Monitor *mon, const QDict *qdict);
+void hmp_info_skeys(Monitor *mon, const QDict *qdict);
+
 #endif /* __S390_STORAGE_KEYS_H */
diff --git a/monitor.c b/monitor.c
index cfe31a4..d2153fa 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2881,6 +2881,15 @@ static mon_cmd_t info_cmds[] = {
         .help       = "Show rocker OF-DPA groups",
         .mhandler.cmd = hmp_rocker_of_dpa_groups,
     },
+#if defined(TARGET_S390X)
+    {
+        .name       = "skeys",
+        .args_type  = "addr:l",
+        .params     = "address",
+        .help       = "Display the value of a storage key",
+        .mhandler.cmd = hmp_info_skeys,
+    },
+#endif
     {
         .name       = NULL,
     },
-- 
2.4.6
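Since info skeys prints the key as a single raw byte, a reader may want to split it into its fields by hand. The following is a minimal standalone sketch using the same bit masks as write_keys() in the dump-skeys patch of this series; the SKeyFields type, its member names, and skey_decode() are hypothetical names chosen for illustration and do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of one storage key byte (illustrative type, not QEMU's). */
typedef struct {
    uint8_t acc;        /* access-control bits, upper nibble (0xF0) */
    bool fetch_prot;    /* fetch-protection bit (0x08) */
    bool referenced;    /* reference bit (0x04) */
    bool changed;       /* change bit (0x02) */
    bool reserved;      /* lowest bit (0x01), printed as "reserved" */
} SKeyFields;

/* Split a raw key byte using the masks from write_keys(). */
static SKeyFields skey_decode(uint8_t key)
{
    SKeyFields f = {
        .acc        = (uint8_t)((key & 0xF0) >> 4),
        .fetch_prot = (key & 0x08) != 0,
        .referenced = (key & 0x04) != 0,
        .changed    = (key & 0x02) != 0,
        .reserved   = (key & 0x01) != 0,
    };
    return f;
}
```

For example, a value shown as key: 0xF6 would decode to ACC=0xF with the reference and change bits set and fetch protection clear.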
[Qemu-devel] [PATCH for-2.5 8/8] s390x: Disable storage key migration on old machine type
From: Jason J. Herne jjhe...@linux.vnet.ibm.com

This code disables storage key migration when an older machine type is
specified.

Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
Signed-off-by: Jason J. Herne jjhe...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 hw/s390x/s390-skeys.c           | 28 +++++++++++++++++++++++++---
 hw/s390x/s390-virtio-ccw.c      | 12 ++++++++++++
 include/hw/s390x/storage-keys.h |  1 +
 3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index a927c98..2f23600 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -345,12 +345,34 @@ static int s390_storage_keys_load(QEMUFile *f, void *opaque, int version_id)
     return ret;
 }
 
-static void s390_skeys_instance_init(Object *obj)
+static inline bool s390_skeys_get_migration_enabled(Object *obj, Error **errp)
 {
     S390SKeysState *ss = S390_SKEYS(obj);
 
-    register_savevm(NULL, TYPE_S390_SKEYS, 0, 1, s390_storage_keys_save,
-                    s390_storage_keys_load, ss);
+    return ss->migration_enabled;
+}
+
+static inline void s390_skeys_set_migration_enabled(Object *obj, bool value,
+                                                    Error **errp)
+{
+    S390SKeysState *ss = S390_SKEYS(obj);
+
+    ss->migration_enabled = value;
+
+    if (ss->migration_enabled) {
+        register_savevm(NULL, TYPE_S390_SKEYS, 0, 1, s390_storage_keys_save,
+                        s390_storage_keys_load, ss);
+    } else {
+        unregister_savevm(DEVICE(ss), TYPE_S390_SKEYS, ss);
+    }
+}
+
+static void s390_skeys_instance_init(Object *obj)
+{
+    object_property_add_bool(obj, "migration-enabled",
+                             s390_skeys_get_migration_enabled,
+                             s390_skeys_set_migration_enabled, NULL);
+    object_property_set_bool(obj, true, "migration-enabled", NULL);
 }
 
 static void s390_skeys_class_init(ObjectClass *oc, void *data)
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 8f1b1fc..80d4714 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -282,13 +282,25 @@ static const TypeInfo ccw_machine_info = {
     },
 };
 
+#define CCW_COMPAT_2_4 \
+        {\
+            .driver   = TYPE_S390_SKEYS,\
+            .property = "migration-enabled",\
+            .value    = "off",\
+        },
+
 static void ccw_machine_2_4_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    static GlobalProperty compat_props[] = {
+        CCW_COMPAT_2_4
+        { /* end of list */ }
+    };
 
     mc->name = "s390-ccw-virtio-2.4";
     mc->alias = "s390-ccw-virtio";
     mc->desc = "VirtIO-ccw based S390 machine v2.4";
+    mc->compat_props = compat_props;
 }
 
 static const TypeInfo ccw_machine_2_4_info = {
diff --git a/include/hw/s390x/storage-keys.h b/include/hw/s390x/storage-keys.h
index 18e08d2..72b850c 100644
--- a/include/hw/s390x/storage-keys.h
+++ b/include/hw/s390x/storage-keys.h
@@ -21,6 +21,7 @@
 
 typedef struct S390SKeysState {
     DeviceState parent_obj;
+    bool migration_enabled;
 
 } S390SKeysState;
 
-- 
2.4.6
Re: [Qemu-devel] [PATCH for-2.5 1/8] s390x: add 2.5 compat s390-ccw-virtio machine
On 20.07.2015 at 15:49, Cornelia Huck wrote:
> Reviewed-by: Jason J. Herne jjhe...@linux.vnet.ibm.com
> Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com

for post 2.4
Acked-by: Christian Borntraeger borntrae...@de.ibm.com

> ---
>  hw/s390x/s390-virtio-ccw.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 4c51d1a..708763e 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -289,7 +289,6 @@ static void ccw_machine_2_4_class_init(ObjectClass *oc, void *data)
>      mc->name = "s390-ccw-virtio-2.4";
>      mc->alias = "s390-ccw-virtio";
>      mc->desc = "VirtIO-ccw based S390 machine v2.4";
> -    mc->is_default = 1;
>  }
>
>  static const TypeInfo ccw_machine_2_4_info = {
> @@ -298,10 +297,27 @@ static const TypeInfo ccw_machine_2_4_info = {
>      .class_init    = ccw_machine_2_4_class_init,
>  };
>
> +static void ccw_machine_2_5_class_init(ObjectClass *oc, void *data)
> +{
> +    MachineClass *mc = MACHINE_CLASS(oc);
> +
> +    mc->name = "s390-ccw-virtio-2.5";
> +    mc->alias = "s390-ccw-virtio";
> +    mc->desc = "VirtIO-ccw based S390 machine v2.5";
> +    mc->is_default = 1;
> +}
> +
> +static const TypeInfo ccw_machine_2_5_info = {
> +    .name          = TYPE_S390_CCW_MACHINE "2.5",
> +    .parent        = TYPE_S390_CCW_MACHINE,
> +    .class_init    = ccw_machine_2_5_class_init,
> +};
> +
>  static void ccw_machine_register_types(void)
>  {
>      type_register_static(&ccw_machine_info);
>      type_register_static(&ccw_machine_2_4_info);
> +    type_register_static(&ccw_machine_2_5_info);
>  }
>
>  type_init(ccw_machine_register_types)
[Qemu-devel] [PULL 1/1] crypto: Fix aes_decrypt_wrapper()
Commit d3462e3 broke qcow2's encryption functionality by using encrypt
instead of decrypt in the wrapper function it introduces. This was found
by qemu-iotests case 134.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Daniel P. Berrange berra...@redhat.com
---
 crypto/cipher-nettle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/cipher-nettle.c b/crypto/cipher-nettle.c
index a55a8e8..b01cb1c 100644
--- a/crypto/cipher-nettle.c
+++ b/crypto/cipher-nettle.c
@@ -47,7 +47,7 @@ static void aes_encrypt_wrapper(cipher_ctx_t ctx, cipher_length_t length,
 static void aes_decrypt_wrapper(cipher_ctx_t ctx, cipher_length_t length,
                                 uint8_t *dst, const uint8_t *src)
 {
-    aes_encrypt(ctx, length, dst, src);
+    aes_decrypt(ctx, length, dst, src);
 }
 
 static void des_encrypt_wrapper(cipher_ctx_t ctx, cipher_length_t length,
-- 
1.8.3.1
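To illustrate the class of bug fixed here, the following standalone sketch uses a toy byte-wise "cipher" (add the key to encrypt, subtract it to decrypt) behind function-pointer wrappers. It is not nettle and not QEMU code; all names (toy_encrypt, toy_decrypt_wrapper_buggy, round_trips, and so on) are hypothetical. It only shows why a decrypt wrapper that forwards to the encrypt routine still compiles cleanly yet fails a round trip, which is the kind of failure an end-to-end test such as iotest 134 can catch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Common wrapper signature, loosely modelled on a generic cipher callback. */
typedef void cipher_fn(const void *ctx, size_t len,
                       uint8_t *dst, const uint8_t *src);

/* Toy cipher: NOT real cryptography, just an invertible byte transform. */
static void toy_encrypt(const uint8_t *key, size_t len,
                        uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = (uint8_t)(src[i] + *key);
    }
}

static void toy_decrypt(const uint8_t *key, size_t len,
                        uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = (uint8_t)(src[i] - *key);
    }
}

static void toy_encrypt_wrapper(const void *ctx, size_t len,
                                uint8_t *dst, const uint8_t *src)
{
    toy_encrypt(ctx, len, dst, src);
}

/* Correct wrapper: forwards to the decrypt routine. */
static void toy_decrypt_wrapper(const void *ctx, size_t len,
                                uint8_t *dst, const uint8_t *src)
{
    toy_decrypt(ctx, len, dst, src);
}

/* Buggy wrapper: same signature, but forwards to encrypt by mistake. */
static void toy_decrypt_wrapper_buggy(const void *ctx, size_t len,
                                      uint8_t *dst, const uint8_t *src)
{
    toy_encrypt(ctx, len, dst, src);
}

/* Returns 1 if decrypt(encrypt(plain)) == plain, else 0. */
static int round_trips(cipher_fn *enc, cipher_fn *dec)
{
    const uint8_t key = 0x5A;
    const uint8_t plain[4] = { 1, 2, 3, 4 };
    uint8_t tmp[4], out[4];

    enc(&key, sizeof(plain), tmp, plain);
    dec(&key, sizeof(plain), out, tmp);
    return memcmp(out, plain, sizeof(plain)) == 0;
}
```

Because both wrappers share one signature, the compiler cannot flag the mix-up; only a round-trip check distinguishes them.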
[Qemu-devel] [PULL 0/1] Block layer patches for 2.4.0-rc2
The following changes since commit 71358470eec668f5dc53def25e585ce250cea9bf:

  Merge remote-tracking branch 'remotes/amit-virtio-rng/tags/vrng-2.4' into staging (2015-07-17 15:22:45 +0100)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to bd09594603f1498e7623f0030988b62e2052f7da:

  crypto: Fix aes_decrypt_wrapper() (2015-07-20 13:35:45 +0200)

Block layer patches for 2.4.0-rc2

Kevin Wolf (1):
      crypto: Fix aes_decrypt_wrapper()

 crypto/cipher-nettle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Re: [Qemu-devel] [POC] colo-proxy in qemu
On Mon, Jul 20, 2015 at 2:12 PM, Vasiliy Tolstov v.tols...@selfip.ru wrote:
> 2015-07-20 14:55 GMT+03:00 zhanghailiang zhang.zhanghaili...@huawei.com:
>> Agreed. Besides, it seems that slirp does not support IPv6, so we
>> would also have to add that.
>
> A patch for IPv6 slirp support was sent to the qemu list some time ago,
> but I don't know why it was not accepted.

I think no one reviewed it, but there was no objection to IPv6 support
in principle.

Jan: Can we merge slirp IPv6 support for QEMU 2.5?

Stefan
Re: [Qemu-devel] [PULL 0/6] virtio, vhost, pc fixes for 2.4
On 20 July 2015 at 13:12, Michael S. Tsirkin m...@redhat.com wrote:
> The following changes since commit b4329bf41c86bac8b56cadb097081960cc4839a0:
>
>   Update version for v2.4.0-rc1 release (2015-07-16 20:32:20 +0100)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream
>
> for you to fetch changes up to f9d6dbf0bf6e91b8ed896369ab1b7e91e5a1a4df:
>
>   virtio-net: remove virtio queues if the guest doesn't support multiqueue (2015-07-20 14:19:42 +0300)
>
> virtio, vhost, pc fixes for 2.4
>
> The only notable thing here is the vhost-user multiqueue revert.
> We'll work on making it stable in 2.5; reverting now means we won't
> have to maintain bug-for-bug compatibility forever.
>
> Signed-off-by: Michael S. Tsirkin m...@redhat.com

Applied, thanks.

-- PMM
Re: [Qemu-devel] [Qemu-block] [PATCH 1/2] ignore bdrv_flush operation when no qcow2 cache item is dirty
[patches should always be sent to qemu-devel, even if qemu-block is also in the to/cc list]

On 07/08/2015 01:26 AM, Qingshu Chen wrote:
> qcow2_cache_flush() writes the dirty cache to disk and invokes
> bdrv_flush() to make the data durable. But even if there is no dirty
> cache, qcow2_cache_flush() still invokes bdrv_flush(). bdrv_flush() in
> turn invokes fdatasync(), which is an expensive operation. With this
> patch, bdrv_flush() is not invoked if there is no dirty cache. The
> return value of qcow2_cache_flush() is modified because
> qcow2_co_flush_to_os needs to know whether a flush operation was called.
>
> Following is the patch:
>
> From 23f9f83da4178e8fbb53d2cffe128f5a2d3a239a Mon Sep 17 00:00:00 2001
> From: Qingshu Chen qingshu.chen...@gmail.com
> Date: Wed, 1 Jul 2015 14:45:23 +0800
> Subject: [PATCH 1/2] ignore bdrv_flush operation when no qcow2 cache item is dirty
>
> Signed-off-by: Qingshu Chen qingshu.chen...@gmail.com

I didn't quickly find an associated 2/2 patch; are you sure you sent the
series correctly?

> ---
>  block/qcow2-cache.c | 9 ++++++++-
>  block/qcow2.c       | 2 ++
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
> index ed92a09..57c0601 100644
> --- a/block/qcow2-cache.c
> +++ b/block/qcow2-cache.c
> @@ -174,6 +174,7 @@ int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
>      int result = 0;
>      int ret;
>      int i;
> +    int flag = 0;

This is used as a bool, so declare it as such (bool flag = false;).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
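The idea under review (skip the expensive sync when nothing was dirty, and tell the caller whether a sync happened) can be sketched in isolation as follows. This is a toy model under assumptions, not the qcow2 cache: ToyCache, its counters, and toy_cache_flush() are hypothetical names, and the syncs counter merely stands in for a call to bdrv_flush(). It uses bool for the flag, in line with the review comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_ENTRIES 4

/* Toy write-back cache: only tracks which entries are dirty. */
typedef struct {
    bool dirty[CACHE_ENTRIES];
    int writebacks;   /* number of entries written out */
    int syncs;        /* number of expensive syncs (stand-in for bdrv_flush()) */
} ToyCache;

/*
 * Write back every dirty entry, then sync only if something was written.
 * Returns true if a sync was performed, so the caller can decide whether
 * it still needs to flush on its own.
 */
static bool toy_cache_flush(ToyCache *c)
{
    bool wrote = false;

    for (size_t i = 0; i < CACHE_ENTRIES; i++) {
        if (c->dirty[i]) {
            c->dirty[i] = false;
            c->writebacks++;
            wrote = true;
        }
    }

    if (wrote) {
        c->syncs++;
    }
    return wrote;
}
```

One design point the sketch leaves open is error reporting: a real implementation also has to return I/O errors, so overloading a single return value with both "error" and "did flush" needs care.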
Re: [Qemu-devel] [PATCH 09/10] qga: added bus type and disk location path
On 07/06/2015 10:40 PM, Michael Roth wrote:
> From: Olga Krishtal okrish...@virtuozzo.com
>
> According to Microsoft disk location path can be obtained via
> IOCTL_SCSI_GET_ADDRESS. Unfortunately this ioctl can not be used for all
> devices. There are certain bus types which could be obtained with this
> API. Please, refer to the following link for more details
> https://technet.microsoft.com/en-us/library/ee851589(v=ws.10).aspx
>
> Bus type could be obtained using IOCTL_STORAGE_QUERY_PROPERTY. Enum
> STORAGE_BUS_TYPE describes all buses supported by OS.
>
> Windows defines more bus types than Linux. Thus some values have been
> added to GuestDiskBusType.
>
> Signed-off-by: Olga Krishtal okrish...@virtuozzo.com
> Signed-off-by: Denis V. Lunev d...@openvz.org
> CC: Eric Blake ebl...@redhat.com
> CC: Michael Roth mdr...@linux.vnet.ibm.com
> * fixed warning in CreateFile due to use of NULL instead of 0
> Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
> ---
> +++ b/qga/qapi-schema.json
> @@ -703,12 +703,24 @@
>  # @uml: UML disks
>  # @sata: SATA disks
>  # @sd: SD cards
> +# @unknown: Unknown bus type
> +# @ieee1394: Win IEEE 1394 bus type
> +# @ssa: Win SSA bus type
> +# @fibre: Win fiber channel bus type
> +# @raid: Win RAID bus type
> +# @iscsi: Win iScsi bus type
> +# @sas: Win serial-attaches SCSI bus type
> +# @mmc: Win multimedia card (MMC) bus type
> +# @virtual: Win virtual bus type
> +# @file-backed virtual: Win file-backed bus type
>  #
>  # Since: 2.2

It would be nice to have a followup patch (since it is doc-only, it could
still make 2.4) that mentions that all these new enum members were added
in 2.4.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[Qemu-devel] [PATCH v4 00/38] blockdev: BlockBackend and media
First of all: Thank you, Eric and Berto, for reviewing v3! And thank you,
Fam, for at least having a peek at it and being confident enough to base
a series of your own on it. :-)

This series reworks a lot regarding BlockBackend and media. Basically,
it allows empty BlockBackends, that is BBs without a BDS tree.

Before this series, empty drives are represented by a BlockBackend with
an empty BDS attached to it (a BDS with a NULL driver). However, now we
have BlockBackends, thus an empty drive should be represented by a
BlockBackend without any BDS tree attached to it. This is what this
series does.

Quick and early summary for the v4 changes:
- Rebase on master (most changes due to the new throttle groups)
- Addressed comments for v3
- Fixed a bug: Exchanging media should always be possible for BBs
  without an attached device model (it wasn't in v3)

Justification for each of the patches and their order:

-- Preparation before _is_inserted() patches --
 1: Patch 9 will not take care not to break host floppy support, so that
    support needs to be removed first.
 2: Needed for patch 3. Patch 24 is a follow-up after BDS-less BBs are
    allowed.
 3: bdrv_close_all() is broken (block: Rework bdrv_close_all()). Patch 6
    will break iotest 071 (actually, just make the problem apparent). So
    this patch is required to work around the issue.
    (with the issue being that bdrv_close_all() does not unref() the BDSs
    it is closing, but just force-closes everything, even if the BDS may
    still be in use somewhere)

-- _is_inserted() patches --
 4: General clean-up work, nice to have before patch 6 (and goes in tune
    with patch 5).
 5: Using the same BB as a guest device means that the data read from
    there should be exactly the same. Opening the guest tray should
    therefore result in no data being readable. This is what we then need
    this function for.
 6: General clean-up work (in the _is_inserted() area).
 7: General clean-up work (in the _is_inserted() area).
 8: General clean-up work (also regarding _is_inserted()).
 9: Required so inserting a floppy will not result in the tray being
    reported as closed (you need to push in the floppy first, using
    blockdev-close-tray). It's here in the _is_inserted() patches area
    because I feel like that's a very related topic.

-- Support for BDS-less BBs --
10: Preparation for BDS-less BBs
11: Preparation for BDS-less BBs
12: Preparation for BDS-less BBs (BB properties should be in the BB, and
    not in the root BDS)
13: Patch 14 removes BlockAcctStats from the BDS, but wr_highest_sector
    is BDS-dependent, so it needs to stay here
14: Preparation for BDS-less BBs (BB properties should be in the BB, and
    not in the root BDS)
15: Preparation for BDS-less BBs (BB properties should be in the BB, and
    not in the root BDS)
16: Preparation for BDS-less BBs (Removing a BDS tree should retain some
    properties for legacy reasons, which must therefore be stored in the
    BB(RS))
17: Preparation for BDS-less BBs
18: Preparation for BDS-less BBs
19: Preparation for BDS-less BBs
20: Ability to add BDS trees to empty BBs (inserting a medium)
21: Preparation for BDS-less BBs (needs patch 20)
22: One goal of this series, and fixes the opening tray event for empty
    drives when shutting down qemu
23: Needed for patch 24
24: Completion of what patch 2 began
25: Ability to detach BDS trees from BBs

-- Atomic QMP tray operations --
26: blockdev-open-tray
27: blockdev-close-tray
28: blockdev-remove-medium
29: blockdev-insert-medium

-- Reimplementation of change/eject --
30: eject
31: change
32: Clean-up patch

-- New QMP blockdev-change-medium command --
33: New QMP command
34: Use for HMP change command
35: Add flag to that command for changing the read-only access mode
    (which was my original intention for this series)
36: Same flag for HMP

-- Tests --
37: Required for patch 38
38: iotests are always nice, so here is one

v4:
- Rebased on current master:
  - Patch 1 (all in code being removed):
    - DEBUG_FLOPPY has been removed
    - floppy_probe_device() was a bit more specific
  - Patch 16 (because of throttle groups): Store the throttle group name
    in the BBRS
  - Patch 21: hmp_drive_del() no longer drains or flushes explicitly
  - Patch 22: Retain throttle group for BB-BDS trees, and save throttle
    group name in the BBRS for empty BBs
  - Patch 23:
    - Read the throttle group name in the new function
    - Use macros for cache.*
  - Patch 24: Set the throttle group for BB-less BDS trees
  - Patches 26, 27, 28, 29, 30, 31: Drop QERR_DEVICE_NOT_FOUND
  - Patch 31:
    - Use error_setg() for QERR_INVALID_BLOCK_FORMAT
    - Respect the throttle group name stored in the BBRS
- Patch 23: Fixed potentially unused variable [Fam], and removed the
  empty line at the function's end
- Patches 26, 27, 28, 29, 33: s/2\.3/2\.5/ [Eric]
- Patches 28, 29: Assume the tray to be open if no device model is
  attached to the BB; while
Re: [Qemu-devel] [PATCH RFC v2 04/47] qapi-event: Clean up how name of enum QAPIEvent is made
On 07/01/2015 02:21 PM, Markus Armbruster wrote:
> Use c_name() instead of ad hoc code. Doesn't upcase the -p prefix,
> which is an improvement in my book. Unbreaks prefix containing '.',
> but other funny characters remain broken. To be fixed next.
>
> Signed-off-by: Markus Armbruster arm...@redhat.com
> ---
>  scripts/qapi-event.py  | 2 +-
>  tests/test-qmp-event.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

No change to the generated qapi-event.[ch], so it looks like only the
testsuite is affected. [In fact, as far as I can tell, only
docs/qapi-code-gen.txt and tests/Makefile even take advantage of the
'-p prefix' argument.] Fine by me.

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[Qemu-devel] [PATCH v4 03/38] iotests: Only create BB if necessary
Tests 071 and 081 test giving references in blockdev-add. It is not necessary to create a BlockBackend here, so omit it. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com Reviewed-by: Alberto Garcia be...@igalia.com --- tests/qemu-iotests/071 | 50 ++ tests/qemu-iotests/071.out | 12 +++ tests/qemu-iotests/081 | 14 - tests/qemu-iotests/081.out | 5 +++-- 4 files changed, 70 insertions(+), 11 deletions(-) diff --git a/tests/qemu-iotests/071 b/tests/qemu-iotests/071 index 9eaa49b..68bedd4 100755 --- a/tests/qemu-iotests/071 +++ b/tests/qemu-iotests/071 @@ -104,11 +104,20 @@ echo echo === Testing blkdebug on existing block device === echo -run_qemu -drive file=$TEST_IMG,format=raw,if=none,id=drive0 EOF +run_qemu EOF { execute: qmp_capabilities } { execute: blockdev-add, arguments: { options: { +node-name: drive0, +driver: file, +filename: $TEST_IMG +} +} +} +{ execute: blockdev-add, +arguments: { +options: { driver: $IMGFMT, id: drive0-debug, file: { @@ -133,11 +142,23 @@ echo echo === Testing blkverify on existing block device === echo -run_qemu -drive file=$TEST_IMG,format=$IMGFMT,if=none,id=drive0 EOF +run_qemu EOF { execute: qmp_capabilities } { execute: blockdev-add, arguments: { options: { +node-name: drive0, +driver: $IMGFMT, +file: { +driver: file, +filename: $TEST_IMG +} +} +} +} +{ execute: blockdev-add, +arguments: { +options: { driver: blkverify, id: drive0-verify, test: drive0, @@ -163,11 +184,23 @@ echo echo === Testing blkverify on existing raw block device === echo -run_qemu -drive file=$TEST_IMG.base,format=raw,if=none,id=drive0 EOF +run_qemu EOF { execute: qmp_capabilities } { execute: blockdev-add, arguments: { options: { +node-name: drive0, +driver: raw, +file: { +driver: file, +filename: $TEST_IMG.base +} +} +} +} +{ execute: blockdev-add, +arguments: { +options: { driver: blkverify, id: drive0-verify, test: { @@ -193,11 +226,20 @@ echo echo === Testing blkdebug's set-state through QMP === echo -run_qemu -drive 
file=$TEST_IMG,format=raw,if=none,id=drive0 EOF +run_qemu EOF { execute: qmp_capabilities } { execute: blockdev-add, arguments: { options: { +node-name: drive0, +driver: file, +filename: $TEST_IMG +} +} +} +{ execute: blockdev-add, +arguments: { +options: { driver: $IMGFMT, id: drive0-debug, file: { diff --git a/tests/qemu-iotests/071.out b/tests/qemu-iotests/071.out index 9205ce2..c8ecfaf 100644 --- a/tests/qemu-iotests/071.out +++ b/tests/qemu-iotests/071.out @@ -42,10 +42,11 @@ read failed: Input/output error === Testing blkdebug on existing block device === -Testing: -drive file=TEST_DIR/t.IMGFMT,format=raw,if=none,id=drive0 +Testing: QMP_VERSION {return: {}} {return: {}} +{return: {}} read failed: Input/output error {return: } {return: {}} @@ -58,28 +59,31 @@ QEMU_PROG: Failed to flush the refcount block cache: Input/output error === Testing blkverify on existing block device === -Testing: -drive file=TEST_DIR/t.IMGFMT,format=IMGFMT,if=none,id=drive0 +Testing: QMP_VERSION {return: {}} {return: {}} +{return: {}} blkverify: read sector_num=0 nb_sectors=1 contents mismatch in sector 0 === Testing blkverify on existing raw block device === -Testing: -drive file=TEST_DIR/t.IMGFMT.base,format=raw,if=none,id=drive0 +Testing: QMP_VERSION {return: {}} {return: {}} +{return: {}} blkverify: read sector_num=0 nb_sectors=1 contents mismatch in sector 0 === Testing blkdebug's set-state through QMP === -Testing: -drive file=TEST_DIR/t.IMGFMT,format=raw,if=none,id=drive0 +Testing: QMP_VERSION {return: {}} {return: {}} +{return: {}} read 512/512 bytes at offset 0 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) {return: } diff --git a/tests/qemu-iotests/081 b/tests/qemu-iotests/081 index d9b042c..5c8a8fa 100755 --- a/tests/qemu-iotests/081 +++ b/tests/qemu-iotests/081 @@ -101,11 +101,23 @@ $QEMU_IO -c open -o $quorum -c read -P 0x32 0 $size | _filter_qemu_io echo echo == checking mixed reference/option specification == -run_qemu -drive 
file=$TEST_DIR/2.raw,format=$IMGFMT,if=none,id=drive2 EOF +run_qemu EOF { execute: qmp_capabilities } { execute: blockdev-add, arguments: { options: { +node-name: drive2, +driver: raw, +file: { +driver: file, +filename: $TEST_DIR/2.raw +} +
[Qemu-devel] [PATCH v4 29/38] blockdev: Add blockdev-insert-medium
And a helper function for that, which directly takes a pointer to the BDS to be inserted instead of its node-name (which will be used for implementing 'change' using blockdev-insert-medium). Signed-off-by: Max Reitz mre...@redhat.com --- blockdev.c | 48 qapi/block-core.json | 17 + qmp-commands.hx | 37 + 3 files changed, 102 insertions(+) diff --git a/blockdev.c b/blockdev.c index 481760a..a80d0e2 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2164,6 +2164,54 @@ void qmp_blockdev_remove_medium(const char *device, Error **errp) } } +static void qmp_blockdev_insert_anon_medium(const char *device, +BlockDriverState *bs, Error **errp) +{ +BlockBackend *blk; +bool has_device; + +blk = blk_by_name(device); +if (!blk) { +error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, + Device '%s' not found, device); +return; +} + +/* For BBs without a device, we can exchange the BDS tree at will */ +has_device = blk_get_attached_dev(blk); + +if (has_device !blk_dev_has_removable_media(blk)) { +error_setg(errp, Device '%s' is not removable, device); +return; +} + +if (has_device !blk_dev_is_tray_open(blk)) { +error_setg(errp, Tray of device '%s' is not open, device); +return; +} + +if (blk_bs(blk)) { +error_setg(errp, There already is a medium in device '%s', device); +return; +} + +blk_insert_bs(blk, bs); +} + +void qmp_blockdev_insert_medium(const char *device, const char *node_name, +Error **errp) +{ +BlockDriverState *bs; + +bs = bdrv_find_node(node_name); +if (!bs) { +error_setg(errp, Node '%s' not found, node_name); +return; +} + +qmp_blockdev_insert_anon_medium(device, bs, errp); +} + /* throttling disk I/O limits */ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd, int64_t bps_wr, diff --git a/qapi/block-core.json b/qapi/block-core.json index 63a83e4..84c9b23 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1925,6 +1925,23 @@ { 'command': 'blockdev-remove-medium', 'data': { 'device': 'str' } } +## +# @blockdev-insert-medium: +# +# Inserts 
a medium (a block driver state tree) into a block device. That block +# device's tray must currently be open and there must be no medium inserted +# already. +# +# @device:block device name +# +# @node-name: name of a node in the block driver state graph +# +# Since: 2.5 +## +{ 'command': 'blockdev-insert-medium', + 'data': { 'device': 'str', +'node-name': 'str'} } + ## # @BlockErrorAction diff --git a/qmp-commands.hx b/qmp-commands.hx index ff6c572..b4c34fe 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3991,6 +3991,43 @@ Example: EQMP { +.name = blockdev-insert-medium, +.args_type = device:s,node-name:s, +.mhandler.cmd_new = qmp_marshal_input_blockdev_insert_medium, +}, + +SQMP +blockdev-insert-medium +-- + +Inserts a medium (a block driver state tree) into a block device. That block +device's tray must currently be open and there must be no medium inserted +already. + +Arguments: + +- device: block device name (json-string) +- node-name: root node of the BDS tree to insert into the block device + +Example: + +- { execute: blockdev-add, + arguments: { options: { node-name: node0, + driver: raw, + file: { driver: file, + filename: fedora.iso } } } } + +- { return: {} } + +- { execute: blockdev-insert-medium, + arguments: { device: ide1-cd0, +node-name: node0 } } + +- { return: {} } + +EQMP + +{ .name = query-named-block-nodes, .args_type = , .mhandler.cmd_new = qmp_marshal_input_query_named_block_nodes, -- 2.4.6
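The two-step QMP exchange in the example above (create a named node, then insert it into the device) can be sketched as plain JSON construction. This is only an illustrative sketch: the helper name `qmp_command` is hypothetical and not part of QEMU or its test suite, and the argument values are simply the ones from the documentation example.

```python
import json


def qmp_command(name, **arguments):
    """Build a QMP command object (hypothetical helper, not a QEMU API).

    Python keyword names cannot contain '-', so top-level argument names
    are written with '_' and converted here.
    """
    return {"execute": name,
            "arguments": {k.replace("_", "-"): v for k, v in arguments.items()}}


# Step 1: create a named node for the image (values from the example above).
add_node = qmp_command(
    "blockdev-add",
    options={"node-name": "node0",
             "driver": "raw",
             "file": {"driver": "file", "filename": "fedora.iso"}})

# Step 2: insert that node into the device; per the documentation the tray
# must be open and no medium may be inserted already.
insert = qmp_command("blockdev-insert-medium",
                     device="ide1-cd0", node_name="node0")

for message in (add_node, insert):
    print(json.dumps(message))
```

Sending these objects over a QMP socket is left out on purpose; the sketch only shows the shape of the two messages.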
[Qemu-devel] [PATCH v4 10/38] hw/usb-storage: Check whether BB is inserted
Only call bdrv_add_key() on the BlockDriverState if it is not NULL.

Signed-off-by: Max Reitz <mre...@redhat.com>
Reviewed-by: Eric Blake <ebl...@redhat.com>
Reviewed-by: Alberto Garcia <be...@igalia.com>
---
 hw/usb/dev-storage.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
index 9a4e7dc..597d8fd 100644
--- a/hw/usb/dev-storage.c
+++ b/hw/usb/dev-storage.c
@@ -613,20 +613,22 @@ static void usb_msd_realize_storage(USBDevice *dev, Error **errp)
         return;
     }
 
-    bdrv_add_key(blk_bs(blk), NULL, &err);
-    if (err) {
-        if (monitor_cur_is_qmp()) {
-            error_propagate(errp, err);
-            return;
-        }
-        error_free(err);
-        err = NULL;
-        if (cur_mon) {
-            monitor_read_bdrv_key_start(cur_mon, blk_bs(blk),
-                                        usb_msd_password_cb, s);
-            s->dev.auto_attach = 0;
-        } else {
-            autostart = 0;
+    if (blk_bs(blk)) {
+        bdrv_add_key(blk_bs(blk), NULL, &err);
+        if (err) {
+            if (monitor_cur_is_qmp()) {
+                error_propagate(errp, err);
+                return;
+            }
+            error_free(err);
+            err = NULL;
+            if (cur_mon) {
+                monitor_read_bdrv_key_start(cur_mon, blk_bs(blk),
+                                            usb_msd_password_cb, s);
+                s->dev.auto_attach = 0;
+            } else {
+                autostart = 0;
+            }
         }
     }
 
-- 
2.4.6
[Qemu-devel] [PATCH v4 13/38] block: Remove wr_highest_sector from BlockAcctStats
BlockAcctStats contains statistics about the data transferred to and
from the device; wr_highest_sector does not fit in with the rest.

Furthermore, those statistics are supposed to be specific to a certain
device and not necessarily to a BDS (see the comment above
bdrv_get_stats()); on the other hand, wr_highest_sector may be rather
important information to know for each BDS. When BlockAcctStats is
finally removed from the BDS, we will want to keep wr_highest_sector in
the BDS.

Finally, wr_highest_sector is renamed to wr_highest_offset and given the
appropriate meaning. Externally, it is represented as an offset, so there
is no point in doing something different internally. Its definition is
changed to match that in qapi/block-core.json, which is the offset after
the greatest byte written to. Doing so should not cause any harm: if
external programs tried to calculate the volume usage by
(wr_highest_offset + 512) / volume_size, after this patch they will just
assume the volume to be full slightly earlier than before.

Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com Reviewed-by: Alberto Garcia be...@igalia.com --- block/accounting.c | 8 block/io.c | 4 +++- block/qapi.c | 4 ++-- include/block/accounting.h | 3 --- include/block/block_int.h | 3 +++ qmp-commands.hx| 4 ++-- 6 files changed, 10 insertions(+), 16 deletions(-) diff --git a/block/accounting.c b/block/accounting.c index 01d594f..a423560 100644 --- a/block/accounting.c +++ b/block/accounting.c @@ -47,14 +47,6 @@ void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie) } -void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num, - unsigned int nb_sectors) -{ -if (stats-wr_highest_sector sector_num + nb_sectors - 1) { -stats-wr_highest_sector = sector_num + nb_sectors - 1; -} -} - void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type, int num_requests) { diff --git a/block/io.c b/block/io.c index d4bc83b..21cc82a 100644 --- a/block/io.c +++ b/block/io.c @@ -1141,7 +1141,9 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs, bdrv_set_dirty(bs, sector_num, nb_sectors); -block_acct_highest_sector(bs-stats, sector_num, nb_sectors); +if (bs-wr_highest_offset offset + bytes) { +bs-wr_highest_offset = offset + bytes; +} if (ret = 0) { bs-total_sectors = MAX(bs-total_sectors, sector_num + nb_sectors); diff --git a/block/qapi.c b/block/qapi.c index 2ce5097..d3cbc80 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -350,13 +350,13 @@ static BlockStats *bdrv_query_stats(const BlockDriverState *bs, s-stats-wr_operations = bs-stats.nr_ops[BLOCK_ACCT_WRITE]; s-stats-rd_merged = bs-stats.merged[BLOCK_ACCT_READ]; s-stats-wr_merged = bs-stats.merged[BLOCK_ACCT_WRITE]; -s-stats-wr_highest_offset = -bs-stats.wr_highest_sector * BDRV_SECTOR_SIZE; s-stats-flush_operations = bs-stats.nr_ops[BLOCK_ACCT_FLUSH]; s-stats-wr_total_time_ns = bs-stats.total_time_ns[BLOCK_ACCT_WRITE]; s-stats-rd_total_time_ns = bs-stats.total_time_ns[BLOCK_ACCT_READ]; 
s-stats-flush_total_time_ns = bs-stats.total_time_ns[BLOCK_ACCT_FLUSH]; +s-stats-wr_highest_offset = bs-wr_highest_offset; + if (bs-file) { s-has_parent = true; s-parent = bdrv_query_stats(bs-file, query_backing); diff --git a/include/block/accounting.h b/include/block/accounting.h index 4c406cf..66637cd 100644 --- a/include/block/accounting.h +++ b/include/block/accounting.h @@ -40,7 +40,6 @@ typedef struct BlockAcctStats { uint64_t nr_ops[BLOCK_MAX_IOTYPE]; uint64_t total_time_ns[BLOCK_MAX_IOTYPE]; uint64_t merged[BLOCK_MAX_IOTYPE]; -uint64_t wr_highest_sector; } BlockAcctStats; typedef struct BlockAcctCookie { @@ -52,8 +51,6 @@ typedef struct BlockAcctCookie { void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie, int64_t bytes, enum BlockAcctType type); void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie); -void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num, - unsigned int nb_sectors); void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type, int num_requests); diff --git a/include/block/block_int.h b/include/block/block_int.h index b7e1e16..67e05ac 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -403,6 +403,9 @@ struct BlockDriverState { /* I/O stats (display with info blockstats). */ BlockAcctStats stats; +/* Offset after the highest byte written to */ +uint64_t wr_highest_offset; + /* I/O Limits */ BlockLimits bl; diff --git a/qmp-commands.hx b/qmp-commands.hx index ba630b1..df3b116 100644 --- a/qmp-commands.hx +++
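The effect of the definition change described in the commit message can be checked with a little arithmetic. The following is a minimal sketch, not QEMU code: the function names are hypothetical, and the list of writes is made-up sample input. It models the old value (highest sector index written, multiplied by the sector size when reported) against the new value (offset just past the highest byte written).

```python
SECTOR_SIZE = 512  # corresponds to BDRV_SECTOR_SIZE in the patch


def old_reported_offset(writes):
    """Old scheme: remember the highest sector index, report it in bytes."""
    highest_sector = 0
    for sector_num, nb_sectors in writes:
        highest_sector = max(highest_sector, sector_num + nb_sectors - 1)
    return highest_sector * SECTOR_SIZE


def new_reported_offset(writes):
    """New scheme: remember the offset after the highest byte written."""
    highest_offset = 0
    for sector_num, nb_sectors in writes:
        offset = sector_num * SECTOR_SIZE
        nbytes = nb_sectors * SECTOR_SIZE
        highest_offset = max(highest_offset, offset + nbytes)
    return highest_offset


# Sample input (an assumption for illustration): (sector_num, nb_sectors).
writes = [(0, 8), (16, 4)]
old = old_reported_offset(writes)
new = new_reported_offset(writes)
print(old, new, new - old)
```

For sector-aligned writes the new value is exactly one sector larger than the old one, which is why a consumer that still adds 512 would consider the volume full slightly earlier.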
[Qemu-devel] [PATCH v4 26/38] blockdev: Add blockdev-open-tray
Signed-off-by: Max Reitz mre...@redhat.com --- blockdev.c | 49 + qapi/block-core.json | 23 +++ qmp-commands.hx | 39 +++ 3 files changed, 111 insertions(+) diff --git a/blockdev.c b/blockdev.c index 44a8c6b..265b7a9 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2062,6 +2062,55 @@ out: aio_context_release(aio_context); } +void qmp_blockdev_open_tray(const char *device, bool has_force, bool force, +Error **errp) +{ +BlockBackend *blk; +BlockDriverState *bs; +AioContext *aio_context = NULL; + +if (!has_force) { +force = false; +} + +blk = blk_by_name(device); +if (!blk) { +error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, + Device '%s' not found, device); +return; +} + +if (!blk_dev_has_removable_media(blk)) { +error_setg(errp, Device '%s' is not removable, device); +return; +} + +if (blk_dev_is_tray_open(blk)) { +return; +} + +bs = blk_bs(blk); +if (bs) { +aio_context = bdrv_get_aio_context(bs); +aio_context_acquire(aio_context); + +if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) { +goto out; +} +} + +if (blk_dev_is_medium_locked(blk)) { +blk_dev_eject_request(blk, force); +} else { +blk_dev_change_media_cb(blk, false); +} + +out: +if (aio_context) { +aio_context_release(aio_context); +} +} + /* throttling disk I/O limits */ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd, int64_t bps_wr, diff --git a/qapi/block-core.json b/qapi/block-core.json index bc12934..f593245 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1871,6 +1871,29 @@ ## { 'command': 'blockdev-add', 'data': { 'options': 'BlockdevOptions' } } +## +# @blockdev-open-tray: +# +# Opens a block device's tray. If there is a block driver state tree inserted as +# a medium, it will become inaccessible to the guest (but it will remain +# associated to the block device, so closing the tray will make it accessible +# again). +# +# If the tray was already open before, this will be a no-op. 
+# +# @device: block device name +# +# @force: #optional if false (the default), an eject request will be sent to +# the guest if it has locked the tray (and the tray will not be opened +# immediately); if true, the tray will be opened regardless of whether +# it is locked +# +# Since: 2.5 +## +{ 'command': 'blockdev-open-tray', + 'data': { 'device': 'str', +'*force': 'bool' } } + ## # @BlockErrorAction diff --git a/qmp-commands.hx b/qmp-commands.hx index df3b116..01c5d6e 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3872,6 +3872,45 @@ Example (2): EQMP { +.name = blockdev-open-tray, +.args_type = device:s,force:b?, +.mhandler.cmd_new = qmp_marshal_input_blockdev_open_tray, +}, + +SQMP +blockdev-open-tray +-- + +Opens a block device's tray. If there is a block driver state tree inserted as a +medium, it will become inaccessible to the guest (but it will remain associated +to the block device, so closing the tray will make it accessible again). + +If the tray was already open before, this will be a no-op. + +Arguments: + +- device: block device name (json-string) +- force: if false (the default), an eject request will be sent to the guest if + it has locked the tray (and the tray will not be opened immediately); + if true, the tray will be opened regardless of whether it is locked + (json-bool, optional) + +Example: + +- { execute: blockdev-open-tray, + arguments: { device: ide1-cd0 } } + +- { timestamp: { seconds: 1418751016, +microseconds: 716996 }, + event: DEVICE_TRAY_MOVED, + data: { device: ide1-cd0, + tray-open: true } } + +- { return: {} } + +EQMP + +{ .name = query-named-block-nodes, .args_type = , .mhandler.cmd_new = qmp_marshal_input_query_named_block_nodes, -- 2.4.6
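The branching in qmp_blockdev_open_tray() can be summarised as a small decision function. This is a toy model under stated assumptions, not the QEMU implementation: the function name and the returned labels are hypothetical, and the op-blocker check and AioContext handling are left out.

```python
def open_tray_action(removable, tray_open, medium_locked, force=False):
    """Return a label for what the command would do (toy model only)."""
    if not removable:
        raise ValueError("device is not removable")
    if tray_open:
        # Documented behaviour: opening an already open tray is a no-op.
        return ("no-op", None)
    if medium_locked:
        # The patch forwards 'force' to the eject request in this branch.
        return ("eject-request", force)
    # Unlocked tray: the tray is opened directly.
    return ("open-tray", None)


print(open_tray_action(removable=True, tray_open=False, medium_locked=True))
print(open_tray_action(removable=True, tray_open=False, medium_locked=False))
```

The labels map onto blk_dev_eject_request() and blk_dev_change_media_cb(blk, false) in the patch; everything else about the model is a simplification.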
[Qemu-devel] [PATCH v4 36/38] hmp: Add read-only-mode option to change command
Expose the new read-only-mode option of 'blockdev-change-medium' for the 'change' HMP command. Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- hmp-commands.hx | 20 +--- hmp.c | 22 +- 2 files changed, 38 insertions(+), 4 deletions(-) diff --git a/hmp-commands.hx b/hmp-commands.hx index d3b7932..b3e1632 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -194,8 +194,8 @@ ETEXI { .name = change, -.args_type = device:B,target:F,arg:s?, -.params = device filename [format], +.args_type = device:B,target:F,arg:s?,read-only-mode:s?, +.params = device filename [format [read-only-mode]], .help = change a removable medium, optional format, .mhandler.cmd = hmp_change, }, @@ -206,7 +206,7 @@ STEXI Change the configuration of a device. @table @option -@item change @var{diskdevice} @var{filename} [@var{format}] +@item change @var{diskdevice} @var{filename} [@var{format} [@var{read-only-mode}]] Change the medium for a removable disk device to point to @var{filename}. eg @example @@ -215,6 +215,20 @@ Change the medium for a removable disk device to point to @var{filename}. eg @var{format} is optional. +@var{read-only-mode} may be used to change the read-only status of the device. +It accepts the following values: + +@table @var +@item retain +Retains the current status; this is the default. + +@item read-only +Makes the device read-only. + +@item read-write +Makes the device writable. +@end table + @item change vnc @var{display},@var{options} Change the configuration of the VNC server. The valid syntax for @var{display} and @var{options} are described at @ref{sec_invocation}. 
eg diff --git a/hmp.c b/hmp.c index a4f3634..e503414 100644 --- a/hmp.c +++ b/hmp.c @@ -27,6 +27,7 @@ #include qapi/opts-visitor.h #include qapi/qmp/qerror.h #include qapi/string-output-visitor.h +#include qapi/util.h #include qapi-visit.h #include ui/console.h #include block/qapi.h @@ -1315,9 +1316,16 @@ void hmp_change(Monitor *mon, const QDict *qdict) const char *device = qdict_get_str(qdict, device); const char *target = qdict_get_str(qdict, target); const char *arg = qdict_get_try_str(qdict, arg); +const char *read_only = qdict_get_try_str(qdict, read-only-mode); +BlockdevChangeReadOnlyMode read_only_mode = 0; Error *err = NULL; if (strcmp(device, vnc) == 0) { +if (read_only) { +monitor_printf(mon, + Parameter 'read-only-mode' is invalid for VNC); +return; +} if (strcmp(target, passwd) == 0 || strcmp(target, password) == 0) { if (!arg) { @@ -1327,7 +1335,19 @@ void hmp_change(Monitor *mon, const QDict *qdict) } qmp_change(vnc, target, !!arg, arg, err); } else { -qmp_blockdev_change_medium(device, target, !!arg, arg, false, 0, err); +if (read_only) { +read_only_mode = +qapi_enum_parse(BlockdevChangeReadOnlyMode_lookup, +read_only, BLOCKDEV_CHANGE_READ_ONLY_MODE_MAX, +BLOCKDEV_CHANGE_READ_ONLY_MODE_RETAIN, err); +if (err) { +hmp_handle_error(mon, err); +return; +} +} + +qmp_blockdev_change_medium(device, target, !!arg, arg, + !!read_only, read_only_mode, err); if (err error_get_class(err) == ERROR_CLASS_DEVICE_ENCRYPTED) { error_free(err); -- 2.4.6
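The handling of the optional read-only-mode argument (absent means "retain", otherwise one of three documented strings, anything else is an error) can be sketched as follows. This is an illustrative model only: the enum and function names are hypothetical and do not correspond to QEMU's generated QAPI code.

```python
from enum import Enum


class ReadOnlyMode(Enum):
    """The three values documented for read-only-mode."""
    RETAIN = "retain"
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"


def parse_read_only_mode(text=None):
    """Map the optional HMP argument to a mode; None means 'not given'."""
    if text is None:
        return ReadOnlyMode.RETAIN  # documented default
    try:
        return ReadOnlyMode(text)
    except ValueError:
        raise ValueError("invalid read-only-mode %r" % text) from None


print(parse_read_only_mode())
print(parse_read_only_mode("read-write"))
```

In the patch the same distinction is carried by passing !!read_only as the "has" flag to qmp_blockdev_change_medium(), so an omitted argument is never confused with an explicit value.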
[Qemu-devel] [PATCH v4 21/38] block: Prepare for NULL BDS
blk_bs() will not necessarily return a non-NULL value any more (unless blk_is_available() is true or it can be assumed to otherwise, e.g. because it is called immediately after a successful blk_new_with_bs() or blk_new_open()). Signed-off-by: Max Reitz mre...@redhat.com --- block.c | 5 ++ block/qapi.c| 4 +- blockdev.c | 203 ++-- hw/block/xen_disk.c | 4 +- migration/block.c | 5 ++ monitor.c | 4 ++ 6 files changed, 154 insertions(+), 71 deletions(-) diff --git a/block.c b/block.c index 5e05c72..f51005d 100644 --- a/block.c +++ b/block.c @@ -2771,6 +2771,11 @@ BlockDriverState *bdrv_lookup_bs(const char *device, blk = blk_by_name(device); if (blk) { +if (!blk_bs(blk)) { +error_setg(errp, Device '%s' has no medium, device); +return NULL; +} + return blk_bs(blk); } } diff --git a/block/qapi.c b/block/qapi.c index f295692..e936ba7 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -306,12 +306,12 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info, info-io_status = blk_iostatus(blk); } -if (!QLIST_EMPTY(bs-dirty_bitmaps)) { +if (bs !QLIST_EMPTY(bs-dirty_bitmaps)) { info-has_dirty_bitmaps = true; info-dirty_bitmaps = bdrv_query_dirty_bitmaps(bs); } -if (bs-drv) { +if (bs bs-drv) { info-has_inserted = true; info-inserted = bdrv_block_device_info(bs, errp); if (info-inserted == NULL) { diff --git a/blockdev.c b/blockdev.c index 7da8c45..7bb6ad0 100644 --- a/blockdev.c +++ b/blockdev.c @@ -124,14 +124,16 @@ void blockdev_mark_auto_del(BlockBackend *blk) return; } -aio_context = bdrv_get_aio_context(bs); -aio_context_acquire(aio_context); +if (bs) { +aio_context = bdrv_get_aio_context(bs); +aio_context_acquire(aio_context); -if (bs-job) { -block_job_cancel(bs-job); -} +if (bs-job) { +block_job_cancel(bs-job); +} -aio_context_release(aio_context); +aio_context_release(aio_context); +} dinfo-auto_del = 1; } @@ -229,8 +231,8 @@ bool drive_check_orphaned(void) dinfo-type != IF_NONE) { fprintf(stderr, Warning: Orphaned drive without device: 
id=%s,file=%s,if=%s,bus=%d,unit=%d\n, -blk_name(blk), blk_bs(blk)-filename, if_name[dinfo-type], -dinfo-bus, dinfo-unit); +blk_name(blk), blk_bs(blk) ? blk_bs(blk)-filename : , +if_name[dinfo-type], dinfo-bus, dinfo-unit); rs = true; } } @@ -1038,6 +1040,10 @@ void hmp_commit(Monitor *mon, const QDict *qdict) monitor_printf(mon, Device '%s' not found\n, device); return; } +if (!blk_is_available(blk)) { +monitor_printf(mon, Device '%s' has no medium\n, device); +return; +} ret = bdrv_commit(blk_bs(blk)); } if (ret 0) { @@ -1117,7 +1123,9 @@ SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device, Device '%s' not found, device); return NULL; } -bs = blk_bs(blk); + +aio_context = blk_get_aio_context(blk); +aio_context_acquire(aio_context); if (!has_id) { id = NULL; @@ -1129,11 +1137,14 @@ SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device, if (!id !name) { error_setg(errp, Name or id must be provided); -return NULL; +goto out_aio_context; } -aio_context = bdrv_get_aio_context(bs); -aio_context_acquire(aio_context); +if (!blk_is_available(blk)) { +error_setg(errp, Device '%s' has no medium, device); +goto out_aio_context; +} +bs = blk_bs(blk); if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE, errp)) { goto out_aio_context; @@ -1307,16 +1318,16 @@ static void internal_snapshot_prepare(BlkTransactionState *common, Device '%s' not found, device); return; } -bs = blk_bs(blk); /* AioContext is released in .clean() */ -state-aio_context = bdrv_get_aio_context(bs); +state-aio_context = blk_get_aio_context(blk); aio_context_acquire(state-aio_context); -if (!bdrv_is_inserted(bs)) { +if (!blk_is_available(blk)) { error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); return; } +bs = blk_bs(blk); if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT, errp)) { return; @@ -1574,7 +1585,6 @@ typedef struct DriveBackupState { static void drive_backup_prepare(BlkTransactionState *common, Error **errp) { DriveBackupState 
*state = DO_UPCAST(DriveBackupState, common, common); -BlockDriverState *bs;
[Qemu-devel] [PATCH v4 38/38] iotests: Add test for change-related QMP commands
Signed-off-by: Max Reitz mre...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- tests/qemu-iotests/118 | 638 + tests/qemu-iotests/118.out | 5 + tests/qemu-iotests/group | 1 + 3 files changed, 644 insertions(+) create mode 100755 tests/qemu-iotests/118 create mode 100644 tests/qemu-iotests/118.out diff --git a/tests/qemu-iotests/118 b/tests/qemu-iotests/118 new file mode 100755 index 000..915e439 --- /dev/null +++ b/tests/qemu-iotests/118 @@ -0,0 +1,638 @@ +#!/usr/bin/env python +# +# Test case for the QMP 'change' command and all other associated +# commands +# +# Copyright (C) 2015 Red Hat, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see http://www.gnu.org/licenses/. 
+# + +import os +import stat +import time +import iotests +from iotests import qemu_img + +old_img = os.path.join(iotests.test_dir, 'test0.img') +new_img = os.path.join(iotests.test_dir, 'test1.img') + +class ChangeBaseClass(iotests.QMPTestCase): +has_opened = False +has_closed = False + +def process_events(self): +for event in self.vm.get_qmp_events(wait=False): +if (event['event'] == 'DEVICE_TRAY_MOVED' and +event['data']['device'] == 'drive0'): +if event['data']['tray-open'] == False: +self.has_closed = True +else: +self.has_opened = True + +def wait_for_open(self): +timeout = time.clock() + 3 +while not self.has_opened and time.clock() timeout: +self.process_events() +if not self.has_opened: +self.fail('Timeout while waiting for the tray to open') + +def wait_for_close(self): +timeout = time.clock() + 3 +while not self.has_closed and time.clock() timeout: +self.process_events() +if not self.has_opened: +self.fail('Timeout while waiting for the tray to close') + +class GeneralChangeTestsBaseClass(ChangeBaseClass): +def test_change(self): +result = self.vm.qmp('change', device='drive0', target=new_img, + arg=iotests.imgfmt) +self.assert_qmp(result, 'return', {}) + +self.wait_for_open() +self.wait_for_close() + +result = self.vm.qmp('query-block') +self.assert_qmp(result, 'return[0]/tray_open', False) +self.assert_qmp(result, 'return[0]/inserted/image/filename', new_img) + +def test_blockdev_change_medium(self): +result = self.vm.qmp('blockdev-change-medium', device='drive0', + filename=new_img, + format=iotests.imgfmt) +self.assert_qmp(result, 'return', {}) + +self.wait_for_open() +self.wait_for_close() + +result = self.vm.qmp('query-block') +self.assert_qmp(result, 'return[0]/tray_open', False) +self.assert_qmp(result, 'return[0]/inserted/image/filename', new_img) + +def test_eject(self): +result = self.vm.qmp('eject', device='drive0', force=True) +self.assert_qmp(result, 'return', {}) + +self.wait_for_open() + +result = self.vm.qmp('query-block') 
+self.assert_qmp(result, 'return[0]/tray_open', True) +self.assert_qmp_absent(result, 'return[0]/inserted') + +def test_tray_eject_change(self): +result = self.vm.qmp('eject', device='drive0', force=True) +self.assert_qmp(result, 'return', {}) + +self.wait_for_open() + +result = self.vm.qmp('query-block') +self.assert_qmp(result, 'return[0]/tray_open', True) +self.assert_qmp_absent(result, 'return[0]/inserted') + +result = self.vm.qmp('blockdev-change-medium', device='drive0', + filename=new_img, + format=iotests.imgfmt) +self.assert_qmp(result, 'return', {}) + +self.wait_for_close() + +result = self.vm.qmp('query-block') +self.assert_qmp(result, 'return[0]/tray_open', False) +self.assert_qmp(result, 'return[0]/inserted/image/filename', new_img) + +def test_tray_open_close(self): +result = self.vm.qmp('blockdev-open-tray', device='drive0', force=True) +self.assert_qmp(result, 'return', {}) + +
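The wait_for_open() and wait_for_close() helpers in the test poll for a tray event until a three-second deadline passes. A minimal standalone sketch of that polling pattern follows; it is not part of the patch, the helper name `wait_until` is hypothetical, and it uses time.monotonic() because time.clock() measures processor time on Unix and was removed in Python 3.8.

```python
import time


def wait_until(predicate, timeout=3.0, interval=0.01):
    """Poll predicate() until it is true or `timeout` seconds have elapsed.

    Returns True if the predicate became true in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One last check so a very short timeout still tests the condition once.
    return bool(predicate())


# Example: a condition that becomes true on the third poll.
polls = {"count": 0}


def third_time_lucky():
    polls["count"] += 1
    return polls["count"] >= 3


print(wait_until(third_time_lucky, timeout=1.0))
```

A test built on such a helper would check the flag that matches the event it waited for (the closed flag after waiting for a close) before reporting a timeout.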
[Qemu-devel] [PATCH v4 31/38] blockdev: Implement change with basic operations
Implement 'change' on block devices by calling blockdev-open-tray, blockdev-remove-medium, blockdev-insert-medium (a variation of that which does not need a node-name) and blockdev-close-tray. Signed-off-by: Max Reitz mre...@redhat.com --- blockdev.c | 187 ++--- 1 file changed, 78 insertions(+), 109 deletions(-) diff --git a/blockdev.c b/blockdev.c index 0a4a761..3ea9d8d 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1915,41 +1915,6 @@ exit: } } - -static void eject_device(BlockBackend *blk, int force, Error **errp) -{ -BlockDriverState *bs = blk_bs(blk); -AioContext *aio_context; - -aio_context = blk_get_aio_context(blk); -aio_context_acquire(aio_context); - -if (bs bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) { -goto out; -} -if (!blk_dev_has_removable_media(blk)) { -error_setg(errp, Device '%s' is not removable, - bdrv_get_device_name(bs)); -goto out; -} - -if (blk_dev_is_medium_locked(blk) !blk_dev_is_tray_open(blk)) { -blk_dev_eject_request(blk, force); -if (!force) { -error_setg(errp, Device '%s' is locked, - bdrv_get_device_name(bs)); -goto out; -} -} - -if (bs) { -bdrv_close(bs); -} - -out: -aio_context_release(aio_context); -} - void qmp_eject(const char *device, bool has_force, bool force, Error **errp) { Error *local_err = NULL; @@ -1987,80 +1952,6 @@ void qmp_block_passwd(bool has_device, const char *device, aio_context_release(aio_context); } -/* Assumes AioContext is held */ -static void qmp_bdrv_open_encrypted(BlockDriverState **pbs, -const char *filename, -int bdrv_flags, BlockDriver *drv, -const char *password, Error **errp) -{ -BlockDriverState *bs; -Error *local_err = NULL; -int ret; - -ret = bdrv_open(pbs, filename, NULL, NULL, bdrv_flags, drv, local_err); -if (ret 0) { -error_propagate(errp, local_err); -return; -} -bs = *pbs; - -bdrv_add_key(bs, password, errp); -} - -void qmp_change_blockdev(const char *device, const char *filename, - const char *format, Error **errp) -{ -BlockBackend *blk; -BlockDriverState *bs; -AioContext *aio_context; 
-BlockDriver *drv = NULL; -int bdrv_flags; -bool new_bs; -Error *err = NULL; - -blk = blk_by_name(device); -if (!blk) { -error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, - Device '%s' not found, device); -return; -} -bs = blk_bs(blk); -new_bs = !bs; - -aio_context = blk_get_aio_context(blk); -aio_context_acquire(aio_context); - -if (format) { -drv = bdrv_find_whitelisted_format(format, blk_is_read_only(blk)); -if (!drv) { -error_setg(errp, QERR_INVALID_BLOCK_FORMAT, format); -goto out; -} -} - -eject_device(blk, 0, err); -if (err) { -error_propagate(errp, err); -goto out; -} - -bdrv_flags = blk_is_read_only(blk) ? 0 : BDRV_O_RDWR; -bdrv_flags |= blk_get_root_state(blk)-open_flags ~BDRV_O_RDWR; - -qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, err); -if (err) { -error_propagate(errp, err); -} else if (new_bs) { -blk_insert_bs(blk, bs); -/* Has been sent automatically by bdrv_open() if blk_bs(blk) was not - * NULL */ -blk_dev_change_media_cb(blk, true); -} - -out: -aio_context_release(aio_context); -} - void qmp_blockdev_open_tray(const char *device, bool has_force, bool force, Error **errp) { @@ -2211,6 +2102,84 @@ void qmp_blockdev_insert_medium(const char *device, const char *node_name, qmp_blockdev_insert_anon_medium(device, bs, errp); } +void qmp_change_blockdev(const char *device, const char *filename, + const char *format, Error **errp) +{ +BlockBackend *blk; +BlockBackendRootState *blk_rs; +BlockDriverState *medium_bs = NULL; +BlockDriver *drv = NULL; +int bdrv_flags, ret; +Error *err = NULL; + +blk = blk_by_name(device); +if (!blk) { +error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, + Device '%s' not found, device); +goto fail; +} + +if (blk_bs(blk)) { +blk_update_root_state(blk); +} + +blk_rs = blk_get_root_state(blk); +bdrv_flags = blk_rs-read_only ? 
0 : BDRV_O_RDWR; +bdrv_flags |= blk_rs-open_flags ~BDRV_O_RDWR; + +if (format) { +drv = bdrv_find_whitelisted_format(format, bdrv_flags BDRV_O_RDWR); +if (!drv) { +error_setg(errp, Invalid block format '%s', format); +goto fail; +} +} + +assert(!medium_bs); +ret = bdrv_open(medium_bs,
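The commit message describes 'change' as a composition of four basic operations. Seen from the QMP side, that sequence might be written out as below. This is a sketch under assumptions: the function name is hypothetical, the new medium is assumed to exist already as a named node (for example created with blockdev-add), and blockdev-close-tray is assumed to take only a device argument, which is not shown in this section.

```python
def change_medium_commands(device, node_name, force=False):
    """Return the QMP commands that together swap the medium of `device`."""
    return [
        {"execute": "blockdev-open-tray",
         "arguments": {"device": device, "force": force}},
        {"execute": "blockdev-remove-medium",
         "arguments": {"device": device}},
        {"execute": "blockdev-insert-medium",
         "arguments": {"device": device, "node-name": node_name}},
        # Assumption: close-tray takes just the device name.
        {"execute": "blockdev-close-tray",
         "arguments": {"device": device}},
    ]


for command in change_medium_commands("ide1-cd0", "node0"):
    print(command["execute"])
```

The patch itself calls the internal helpers directly and inserts the new BDS without a node-name, so this list is only the externally visible equivalent.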
Re: [Qemu-devel] Summary MTTCG related patch sets
On 20/07/2015 19:41, alvise rigo wrote:

Hi Alex,

Thank you for this summary. Some comments below.

On Mon, Jul 20, 2015 at 6:17 PM, Alex Bennée alex.ben...@linaro.org wrote:

Hi,

Following this afternoon's call I thought I'd summarise the state of the
various patch series and their relative dependencies. We re-stated that
the aim should be to get what is up-streamable through the review process
and heading for merge, so the delta for a full working MTTCG can be as
low as possible. There was some concern about the practicality of
submitting patches where the full benefit will not be seen until MTTCG is
finally merged.

On the patch submission note, could I encourage posting public git trees
along with the patches for ease of review?

BQL lock breaking patches, Paolo/Jan
  - required for working virt-io in MTTCG
  - supersedes some of Fred's patches
  - merged upstream as of -rc0

TCG async_safe_work, Fred
  - http://git.greensocs.com/fkonrad/mttcg.git async_work_v3
  - [1437144337-21442-1-git-send-email-fred.kon...@greensocs.com]
  - split from earlier MTTCG patch series
  - needed for cross-cpu sync mechanism for main series and slow-path
  - candidate for upstreaming, but only MTTCG uses for now?

Slow-path for atomic instruction translation, Alvise
  - [1436516626-8322-1-git-send-email-a.r...@virtualopensystems.com]
  - Needs re-basing to use TCG async_safe_work
  - Earlier part of series (pre MTTCG) could be upstreamed as is

I will create a branch for upstreaming (pre MTTCG) and another one based
on MTTCG.

  - Concern about performance impact in non-MTTCG scenarios
  - Single CPU thread impact may be minimal with latest version, needs
    benchmarking
  - Also incomplete backend support, would BACKEND_HAS_LLSC_OPS be
    acceptable to maintainers while support is added by more
    knowledgeable backend people for non-x86/arm backends?

Multi-threaded TCG V6, Fred
  - g...@git.greensocs.com:fkonrad/mttcg.git branch multi_tcg_v6
  - [1435330053-18733-1-git-send-email-fred.kon...@greensocs.com]
  - Needs re-basing on top of latest -rc (BQL breaking)
  - Contains the rest of the MTTCG work (tb locking, tlb stuff etc)
  - Currently target-arm only, other builds broken

As far as balancing the desire to get things upstreamed versus having a
stable base for testing, I suggest we try an approach like this:

  - select the current upstream -rc as the common base point
  - create a branch from -rc with:
    - stuff submitted for upstream (reviewed, not nacked)
    - doesn't break any tree
    - has minimal performance impact

Then both Fred and Alvise could base their trees off this point, and we
aim to rebase onto a new branch each time the patches get merged into a
new upstream RC. The question then becomes how to deal with any
cross-dependencies between the slow-path and the main MTTCG branches?

From my side I will take care of rebasing my patch series on the latest
MTTCG branch as often as possible. Up to now, there are not so many
cross-dependencies, so I don't see it as a big issue.

Is this a workable solution?

Thank you,
alvise

The RFC V3 you sent is based on MTTCG if I remember right. That's why you
introduced this rendez-vous, right? And the point was to use
async_safe_work for this, as I actually need it for tb_flush and
tb_invalidate (if we don't find any other solution for tb_invalidate).
So this is the cross-dependency we are talking about. But this is
probably not needed with upstream, as there is only one TCG thread.

Thanks,
Fred

I suspect the initial common branch point would just be
2.4.0-rc1+safe_async. Does that sound workable?

-- 
Alex Bennée
[Qemu-devel] [PULL 3/3] tests: Fix broken targets check-report-qtest-*
From: Stefan Weil s...@weilnetz.de

They need QTEST_QEMU_IMG. Without it, the tests raise an assertion:

$ make -C bin check-report-qtest-i386.xml
make: Entering directory 'bin'
GTESTER check-report-qtest-i386.xml
blkdebug: Suspended request 'A'
blkdebug: Resuming request 'A'
ahci-test: tests/libqos/libqos.c:162: mkimg: Assertion `qemu_img_path' failed.
main-loop: WARNING: I/O thread spun for 1000 iterations

Signed-off-by: Stefan Weil s...@weilnetz.de
Reviewed-by: John Snow js...@redhat.com
Message-id: 1437231284-17455-1-git-send-email...@weilnetz.de
Signed-off-by: John Snow js...@redhat.com
---
 tests/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/Makefile b/tests/Makefile
index 2c4b8dc..8d26736 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -478,6 +478,7 @@ $(patsubst %, check-%, $(check-unit-y)): check-%: %
 $(patsubst %, check-report-qtest-%.xml, $(QTEST_TARGETS)): check-report-qtest-%.xml: $(check-qtest-y)
 	$(call quiet-command,QTEST_QEMU_BINARY=$*-softmmu/qemu-system-$* \
+		QTEST_QEMU_IMG=qemu-img$(EXESUF) \
 		gtester -q $(GTESTER_OPTIONS) -o $@ -m=$(SPEED) $(check-qtest-$*-y),GTESTER $@)

 check-report-unit.xml: $(check-unit-y)
--
2.1.0
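The assertion in the log above fires because the test cannot find a path to qemu-img when QTEST_QEMU_IMG is not exported. The following is only a minimal sketch of that kind of environment lookup, not the actual libqos code: the function names qemu_img_path_is_usable, lookup_qemu_img_path and require_qemu_img_path are hypothetical and were chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from libqos): a path is usable only if the
 * variable was set and is not the empty string. */
int qemu_img_path_is_usable(const char *path)
{
    return path != NULL && path[0] != '\0';
}

/* Hypothetical helper: read QTEST_QEMU_IMG from the environment.
 * Returns NULL when the variable is unset or empty, which is the
 * situation the Makefile change is meant to avoid. */
const char *lookup_qemu_img_path(void)
{
    const char *path = getenv("QTEST_QEMU_IMG");

    return qemu_img_path_is_usable(path) ? path : NULL;
}

/* Hypothetical helper: abort early with an assertion, similar in
 * spirit to the failure shown in the log, if no path is available. */
const char *require_qemu_img_path(void)
{
    const char *path = lookup_qemu_img_path();

    assert(path != NULL);
    return path;
}
```

Under these assumptions, a test that calls require_qemu_img_path() aborts when run from a make target that does not set QTEST_QEMU_IMG, and proceeds once the variable is passed on the command line as the patch does.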
Re: [Qemu-devel] [PATCH] No change in userland tools after resizing qcow2 image
On 07/19/2015 05:24 AM, Taeha Kim wrote:

Hello,

After a qcow2 image is resized, no userland tool except the "file" utility reflects the change. For example, after resizing a qcow2 image, "file" reports the increased size, but "ls", "stat", and "du" still report the old size.

The following patch lets the userland tools ls, du, and stat see the changed size after a qcow2 image created by the qemu-img tool is resized. Without this patch, only the "file" utility sees it. In addition, the file utility sometimes does not recognize the qcow2 image format's version, as follows:

$ file 100G.qcow2
100G.qcow2: QEMU QCOW Image (Unknown)

In that case the user has no information about the qcow2 image size.

==
Signed-off-by: Taeha Kim kthg...@gmail.com

diff --git a/block.c b/block.c
index d088ee0..93427f8 100644
--- a/block.c
+++ b/block.c
@@ -2529,6 +2529,9 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
         if (bs->blk) {
             blk_dev_resize_cb(bs->blk);
         }
+        if (ret == 0) {
+            ret = truncate(bs->filename, offset);
+        }
     }
     return ret;
 }
=

$ ./qemu/qemu-img --version
qemu-img version 2.3.91, Copyright (c) 2004-2008 Fabrice Bellard

The problem can be reproduced in three steps.

First, we create a qcow2 image using the qemu-img tool as follows. There is no problem yet.

$ ./qemu/qemu-img create -f qcow2 -o preallocation=metadata 100G.qcow2 100G
Formatting '100G.qcow2', fmt=qcow2 size=107374182400 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off refcount_bits=16
$ ./qemu/qemu-img check 100G.qcow2
No errors were found on the image.
1638400/1638400 = 100.00% allocated, 0.00% fragmented, 0.00% compressed clusters
Image end offset: 107390828544
$ ls -l 100G.qcow2
-rw-r--r--. 1 devjames devjames 107390828544 Jul 15 09:55 100G.qcow2
$ stat 100G.qcow2
  File: ‘100G.qcow2’
  Size: 107390828544    Blocks: 32408      IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 5778245     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/devjames)   Gid: ( 1000/devjames)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2015-07-15 09:55:17.269620129 +0900
Modify: 2015-07-15 09:55:17.269620129 +0900
Change: 2015-07-15 09:55:17.269620129 +0900
 Birth: -
$ du -bb 100G.qcow2
107390828544    100G.qcow2
$ file 100G.qcow2
100G.qcow2: QEMU QCOW Image (v3), 107374182400 bytes

From here on there is a problem. Second, we resize the qcow2 image from 100G to 200G using the current qemu-img without the patch.

$ ./qemu/qemu-img resize 100G.qcow2 200G
Image resized.
$ ./qemu/qemu-img check 100G.qcow2
No errors were found on the image.
1638400/3276800 = 50.00% allocated, 0.00% fragmented, 0.00% compressed clusters
Image end offset: 107390894080
$ ls -l 100G.qcow2
-rw-r--r--. 1 devjames devjames 107390832128 Jul 15 10:02 100G.qcow2
$ stat 100G.qcow2
  File: ‘100G.qcow2’
  Size: 107390832128    Blocks: 32416      IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 5778245     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/devjames)   Gid: ( 1000/devjames)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2015-07-15 10:02:04.842425479 +0900
Modify: 2015-07-15 10:02:04.854425682 +0900
Change: 2015-07-15 10:02:04.854425682 +0900
 Birth: -
$ du -bb 100G.qcow2
107390832128    100G.qcow2
$ file 100G.qcow2
100G.qcow2: QEMU QCOW Image (v3), 214748364800 bytes

We can see that, unlike "file", the "ls", "stat", and "du" utilities do not see the changed size of the qcow2 image.

Third, we resize the qcow2 image from 100G to 200G using qemu-img with the patch.

$ ./qemu/qemu-img resize 100G.qcow2 200G
Image resized.
$ ./qemu/qemu-img check 100G.qcow2
No errors were found on the image.
1638400/3276800 = 50.00% allocated, 0.00% fragmented, 0.00% compressed clusters
Image end offset: 107390894080
$ ls -l 100G.qcow2
-rw-r--r--. 1 devjames devjames 214748364800 Jul 15 10:08 100G.qcow2
$ stat 100G.qcow2
  File: ‘100G.qcow2’
  Size: 214748364800    Blocks: 32416      IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 5778245     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/devjames)   Gid: ( 1000/devjames)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2015-07-15 10:04:37.595018501 +0900
Modify: 2015-07-15 10:08:01.622484709 +0900
Change: 2015-07-15 10:08:01.622484709 +0900
 Birth: -
$ du -bb 100G.qcow2
214748364800    100G.qcow2
$ file 100G.qcow2
100G.qcow2: QEMU QCOW Image (v3), 214748364800 bytes

Now all four utilities report the changed size of the qcow2 image. That is why I wrote the patch above for the qemu-img tool: so that the ls, stat, and du utilities reflect the increased size of the qcow2 image file.

It seems to me that