Re: [Qemu-devel] [PATCH 12/12] Add disk_size field to BlockDriverState structure
On Mon, Jun 20, 2011 at 6:37 AM, Fam Zheng famc...@gmail.com wrote:
> Is there any difference between bdrv_getlength and
> bdrv_get_allocated_file_size for bs->file? If not, I can simplify it by
> reusing it in two raw devices.

Yes, the two functions are different: POSIX sparse files (files with holes) take up less space on disk than their file size. For example a 1 GB file where you've only written the last byte and never touched any other blocks will only take up one block - the rest will be unallocated.

So bdrv_getlength() == 1 GB and bdrv_get_allocated_file_size() == 4 KB (or whatever the file system block size is).

You can look at this using the stat(1) command and dd(1) to only write the last byte of a file.

Stefan
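The difference between file length and allocated size can be sketched in a few lines of C. This is only an illustration, not QEMU code: the helper names (file_length, file_allocated, make_sparse_file) are made up for this example, and it assumes a Linux-like system where st_blocks counts 512-byte units. Whether the file really ends up sparse depends on the file system.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: apparent file length in bytes (st_size). */
long long file_length(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0) {
        return -1;
    }
    return (long long)st.st_size;
}

/* Hypothetical helper: bytes actually allocated on disk.
 * Assumption: st_blocks is in 512-byte units, as on Linux. */
long long file_allocated(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0) {
        return -1;
    }
    return (long long)st.st_blocks * 512;
}

/* Create an unnamed temporary file of 'length' bytes in which only the
 * last byte has been written. Returns an open fd, or -1 on error. */
int make_sparse_file(char *path_template, off_t length)
{
    int fd;

    assert(length > 0);
    fd = mkstemp(path_template);
    if (fd < 0) {
        return -1;
    }
    unlink(path_template);              /* file goes away when fd is closed */
    if (pwrite(fd, "x", 1, length - 1) != 1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

On a file system that supports holes, file_length() reports the full length while file_allocated() typically reports a single block; the first corresponds to what bdrv_getlength() describes and the second to bdrv_get_allocated_file_size().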
[Qemu-devel] [PATCH v3] linux-user: Define AT_RANDOM to support target stack protection mechanism.
From: Laurent ALFONSI laurent.alfo...@st.com

Note that the support for the command-line argument requires:

1. add the new field uint8_t rand_bytes[16] to struct image_info, since only the variable info lives both in main() and in create_elf_tables()

2. write a dedicated parser to convert the command-line to fill rand_bytes[]

These two steps aren't really hard to achieve but I finally think they are a little bit overkill regarding the purpose of these 16 bytes. Maybe we could always fill the 16 bytes pointed to by AT_RANDOM with zero if we really want to get reproducibility.

Regards,
Cédric.

8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----

The dynamic linker from the GNU C library v2.10+ uses the ELF auxiliary vector AT_RANDOM [1] as a pointer to 16 bytes with random values to initialize the stack protection mechanism. Technically the emulated GNU dynamic linker crashes due to a NULL pointer dereference if it is built with stack protection enabled and if AT_RANDOM is not defined by the QEMU ELF loader.

[1] This ELF auxiliary vector was introduced in Linux v2.6.29.

This patch can be tested with the code below:

    #include <elf.h>    /* Elf*_auxv_t, AT_RANDOM, */
    #include <stdio.h>  /* printf(3), */
    #include <stdlib.h> /* exit(3), EXIT_*, */
    #include <stdint.h> /* uint8_t, */
    #include <string.h> /* memcpy(3), */

    #if defined(__LP64__) || defined(__ILP64__) || defined(__LLP64__)
    #define Elf_auxv_t Elf64_auxv_t
    #else
    #define Elf_auxv_t Elf32_auxv_t
    #endif

    int main(int argc, char* argv[], char* envp[])
    {
        Elf_auxv_t *auxv;

        /* *envp = NULL marks end of envp. */
        while (*envp++ != NULL);

        /* auxv->a_type = AT_NULL marks the end of auxv. */
        for (auxv = (Elf_auxv_t *)envp; auxv->a_type != AT_NULL; auxv++) {
            if (auxv->a_type == AT_RANDOM) {
                int i;
                uint8_t rand_bytes[16];

                printf("AT_RANDOM is: 0x%lx\n", (unsigned long)auxv->a_un.a_val);
                memcpy(rand_bytes, (const uint8_t *)auxv->a_un.a_val, sizeof(rand_bytes));
                printf("it points to: ");
                for (i = 0; i < 16; i++) {
                    printf("0x%02x ", rand_bytes[i]);
                }
                printf("\n");
                exit(EXIT_SUCCESS);
            }
        }
        exit(EXIT_FAILURE);
    }

Changes introduced in v2 and v3:

* Fix typos + thinko (AT_RANDOM is used for stack canary, not for ASLR)
* AT_RANDOM points to 16 random bytes stored inside the user stack.
* Add a small test program.

Signed-off-by: Cédric VINCENT cedric.vinc...@st.com
Signed-off-by: Laurent ALFONSI laurent.alfo...@st.com
---
 linux-user/elfload.c |   21 ++++++++++++++++++++-
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index dcfeb7a..23c69d9 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -927,7 +927,7 @@ struct exec
 #define TARGET_ELF_PAGESTART(_v) ((_v) & ~(unsigned long)(TARGET_ELF_EXEC_PAGESIZE-1))
 #define TARGET_ELF_PAGEOFFSET(_v) ((_v) & (TARGET_ELF_EXEC_PAGESIZE-1))
 
-#define DLINFO_ITEMS 12
+#define DLINFO_ITEMS 13
 
 static inline void memcpy_fromfs(void * to, const void * from, unsigned long n)
 {
@@ -1202,6 +1202,9 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
 {
     abi_ulong sp;
     int size;
+    int i;
+    abi_ulong u_rand_bytes;
+    uint8_t k_rand_bytes[16];
     abi_ulong u_platform;
     const char *k_platform;
     const int n = sizeof(elf_addr_t);
@@ -1231,6 +1234,20 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
         /* FIXME - check return value of memcpy_to_target() for failure */
         memcpy_to_target(sp, k_platform, len);
     }
+
+    /*
+     * Generate 16 random bytes for userspace PRNG seeding (not
+     * cryptically secure but it's not the aim of QEMU).
+     */
+    srand((unsigned int) time(NULL));
+    for (i = 0; i < 16; i++) {
+        k_rand_bytes[i] = rand();
+    }
+    sp -= 16;
+    u_rand_bytes = sp;
+    /* FIXME - check return value of memcpy_to_target() for failure */
+    memcpy_to_target(sp, k_rand_bytes, 16);
+
     /*
      * Force 16 byte _final_ alignment here for generality.
      */
@@ -1271,6 +1288,8 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
     NEW_AUX_ENT(AT_EGID, (abi_ulong) getegid());
     NEW_AUX_ENT(AT_HWCAP, (abi_ulong) ELF_HWCAP);
     NEW_AUX_ENT(AT_CLKTCK, (abi_ulong) sysconf(_SC_CLK_TCK));
+    NEW_AUX_ENT(AT_RANDOM, (abi_ulong) u_rand_bytes);
+
     if (k_platform)
         NEW_AUX_ENT(AT_PLATFORM, u_platform);
 #ifdef ARCH_DLINFO
-- 
1.7.5.1
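For reference, the "dedicated parser" mentioned in the note at the top of this mail could look roughly like the sketch below. It is not part of the patch: parse_rand_bytes() and hex_nibble() are hypothetical names, and the accepted format (exactly 32 hex digits, no prefix) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if invalid. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }
    if (c >= 'A' && c <= 'F') {
        return c - 'A' + 10;
    }
    return -1;
}

/* Hypothetical parser: fill rand_bytes[] from a string of exactly 32 hex
 * digits. Returns 0 on success, -1 on malformed input (rand_bytes[] may
 * then be partially written). */
int parse_rand_bytes(const char *arg, uint8_t rand_bytes[16])
{
    size_t i;

    assert(rand_bytes != NULL);
    if (arg == NULL || strlen(arg) != 32) {
        return -1;
    }
    for (i = 0; i < 16; i++) {
        int hi = hex_nibble(arg[2 * i]);
        int lo = hex_nibble(arg[2 * i + 1]);

        if (hi < 0 || lo < 0) {
            return -1;
        }
        rand_bytes[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}
```

A caller would pass the option string and copy the result into the 16 bytes that AT_RANDOM points to, which would give reproducible runs without hard-coding zeros.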
Re: [Qemu-devel] [PATCH 1/3] kvm: ppc: booke206: use MMU API
On 2011-06-18 01:28, Alexander Graf wrote:
> On 17.06.2011, at 22:39, Scott Wood wrote:
>> Share the TLB array with KVM. This allows us to set the initial TLB both on initial boot and reset, is useful for debugging, and could eventually be used to support migration.
>>
>> Signed-off-by: Scott Wood scottw...@freescale.com
>> ---
>>  hw/ppce500_mpc8544ds.c |    2 +
>>  target-ppc/cpu.h       |    2 +
>>  target-ppc/kvm.c       |   85
>>  3 files changed, 89 insertions(+), 0 deletions(-)
>>
>> diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
>> index 5ac8843..3cdeb43 100644
>> --- a/hw/ppce500_mpc8544ds.c
>> +++ b/hw/ppce500_mpc8544ds.c
>> @@ -192,6 +192,8 @@ static void mmubooke_create_initial_mapping(CPUState *env,
>>      tlb->mas2 = va & TARGET_PAGE_MASK;
>>      tlb->mas7_3 = pa & TARGET_PAGE_MASK;
>>      tlb->mas7_3 |= MAS3_UR | MAS3_UW | MAS3_UX | MAS3_SR | MAS3_SW | MAS3_SX;
>> +
>> +    env->tlb_dirty = true;
>>  }
>>
>>  static void mpc8544ds_cpu_reset(void *opaque)
>> diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
>> index 46d86be..8191ed2 100644
>> --- a/target-ppc/cpu.h
>> +++ b/target-ppc/cpu.h
>> @@ -921,6 +921,8 @@ struct CPUPPCState {
>>      ppc_tlb_t tlb; /* TLB is optional. Allocate them only if needed */
>>      /* 403 dedicated access protection registers */
>>      target_ulong pb[4];
>> +    bool tlb_dirty; /* Set to non-zero when modifying TLB */
>> +    bool kvm_sw_tlb; /* non-zero if KVM SW TLB API is active */
>>  #endif
>>
>>      /* Other registers */
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index e7b1b10..9a88fc9 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -122,6 +122,51 @@ static int kvm_arch_sync_sregs(CPUState *cenv)
>>      return kvm_vcpu_ioctl(cenv, KVM_SET_SREGS, &sregs);
>>  }
>>
>> +static int kvm_booke206_tlb_init(CPUState *env)
>> +{
>> +#if defined(KVM_CAP_SW_TLB) && defined(KVM_MMU_FSL_BOOKE_NOHV)
>
> Those hopefully shouldn't be required anymore soon - when Jan's patches make it upstream. Jan, how's progress on that front?

I can only forward this question: Avi, what are the plans for http://thread.gmane.org/gmane.comp.emulators.kvm.devel/73917?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
[Qemu-devel] KVM call agenda for June 21
Please send in any agenda items you are interested in covering. thanks, -juan
Re: [Qemu-devel] [PATCH 1/3] kvm: ppc: booke206: use MMU API
On 06/20/2011 10:41 AM, Jan Kiszka wrote:
>> Those hopefully shouldn't be required anymore soon - when Jan's patches make it upstream. Jan, how's progress on that front?
>
> I can only forward this question: Avi, what are the plans for http://thread.gmane.org/gmane.comp.emulators.kvm.devel/73917?

Will apply once all comments are addressed.

-- 
error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH] Optimize screendump
On 06/19/2011 08:00 PM, Alexander Graf wrote:
> On 19.06.2011, at 18:04, Avi Kivity wrote:
>> On 06/19/2011 06:53 PM, Andreas Färber wrote:
>>> On 19.06.2011 at 17:46, Avi Kivity wrote:
>>>> On 06/19/2011 06:22 PM, Stefan Hajnoczi wrote:
>>>>> I wonder if this will break non-Linux platforms. Perhaps buffer an entire row of pixels instead and only fwrite(3) at the end of the outer loop.
>>>>
>>>> That's how I wrote this in the first place. Since the consensus is against these functions, I'll submit that version instead.
>>>
>>> Maybe add a qemu_fputc_unlocked() and do a configure check for it?
>>
>> Good idea. I'll try that, unless people disagree.
>
> Writing by row should be faster and pretty straight forward, no?

I don't see how it's faster, but I guess I'll do that, it's a local issue and is best addressed locally.

-- 
error compiling committee.c: too many arguments to function
[Qemu-devel] [PATCH v2] Optimize screendump
When running kvm-autotest, fputc() is often the second highest (sometimes #1) function showing up in a profile. This is due to fputc() locking the file for every byte written.

Optimize by buffering a line's worth of pixels and writing that out in a single call.

Signed-off-by: Avi Kivity a...@redhat.com
---

v2: drop unportable fputc_unlocked

 hw/vga.c |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/vga.c b/hw/vga.c
index d5bc582..97c96bf 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -2349,15 +2349,19 @@ int ppm_save(const char *filename, struct DisplaySurface *ds)
     uint32_t v;
     int y, x;
     uint8_t r, g, b;
+    int ret;
+    char *linebuf, *pbuf;
 
     f = fopen(filename, "wb");
     if (!f)
         return -1;
     fprintf(f, "P6\n%d %d\n%d\n",
             ds->width, ds->height, 255);
+    linebuf = qemu_malloc(ds->width * 3);
     d1 = ds->data;
     for(y = 0; y < ds->height; y++) {
         d = d1;
+        pbuf = linebuf;
         for(x = 0; x < ds->width; x++) {
             if (ds->pf.bits_per_pixel == 32)
                 v = *(uint32_t *)d;
@@ -2369,13 +2373,16 @@ int ppm_save(const char *filename, struct DisplaySurface *ds)
                 (ds->pf.gmax + 1);
             b = ((v >> ds->pf.bshift) & ds->pf.bmax) * 256 /
                 (ds->pf.bmax + 1);
-            fputc(r, f);
-            fputc(g, f);
-            fputc(b, f);
+            *pbuf++ = r;
+            *pbuf++ = g;
+            *pbuf++ = b;
             d += ds->pf.bytes_per_pixel;
         }
         d1 += ds->linesize;
+        ret = fwrite(linebuf, 1, pbuf - linebuf, f);
+        (void)ret;
     }
+    qemu_free(linebuf);
     fclose(f);
     return 0;
 }
-- 
1.7.5.3
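The row-buffering idea can be shown outside of QEMU with a small standalone sketch. This is an illustration under simplifying assumptions, not the patched ppm_save(): ppm_save_rows() is a made-up name, pixels are assumed to be packed 32-bit 0x00RRGGBB values, and unlike the patch it checks the fwrite() result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: write a binary PPM (P6) image to f, issuing one
 * fwrite() per row instead of three fputc() calls per pixel.
 * Assumption: pixels are packed 0x00RRGGBB, width * height entries. */
int ppm_save_rows(FILE *f, const uint32_t *pixels, int width, int height)
{
    size_t row_bytes;
    uint8_t *linebuf;
    int x, y;

    assert(f != NULL && pixels != NULL && width > 0 && height > 0);
    row_bytes = (size_t)width * 3;
    linebuf = malloc(row_bytes);
    if (linebuf == NULL) {
        return -1;
    }
    if (fprintf(f, "P6\n%d %d\n%d\n", width, height, 255) < 0) {
        free(linebuf);
        return -1;
    }
    for (y = 0; y < height; y++) {
        uint8_t *pbuf = linebuf;

        for (x = 0; x < width; x++) {
            uint32_t v = pixels[(size_t)y * width + x];

            *pbuf++ = (v >> 16) & 0xff;   /* red */
            *pbuf++ = (v >> 8) & 0xff;    /* green */
            *pbuf++ = v & 0xff;           /* blue */
        }
        /* One stdio call (and one lock acquisition) per row. */
        if (fwrite(linebuf, 1, row_bytes, f) != row_bytes) {
            free(linebuf);
            return -1;
        }
    }
    free(linebuf);
    return 0;
}
```

The number of locked stdio calls drops from 3 * width * height to height, which is the effect the commit message describes.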
[Qemu-devel] [PATCH] Support logging xen-guest console
Add code to support logging xen-domU console, as what xenconsoled does. Log info will be saved in /var/log/xen/console/guest-domUname.log.

Signed-off-by: Chunyan Liu cy...@novell.com
---
 hw/xen_console.c |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/hw/xen_console.c b/hw/xen_console.c
index c6c8163..ac3208d 100644
--- a/hw/xen_console.c
+++ b/hw/xen_console.c
@@ -36,6 +36,8 @@
 #include "qemu-char.h"
 #include "xen_backend.h"
 
+static int log_guest = 0;
+
 struct buffer {
     uint8_t *data;
     size_t consumed;
@@ -52,8 +54,24 @@ struct XenConsole {
     void *sring;
     CharDriverState *chr;
     int backlog;
+    int log_fd;
 };
 
+static int write_all(int fd, const char* buf, size_t len)
+{
+    while (len) {
+        ssize_t ret = write(fd, buf, len);
+        if (ret == -1 && errno == EINTR)
+            continue;
+        if (ret <= 0)
+            return -1;
+        len -= ret;
+        buf += ret;
+    }
+
+    return 0;
+}
+
 static void buffer_append(struct XenConsole *con)
 {
     struct buffer *buffer = &con->buffer;
@@ -81,6 +99,14 @@ static void buffer_append(struct XenConsole *con)
     intf->out_cons = cons;
     xen_be_send_notify(&con->xendev);
 
+    if (con->log_fd != -1) {
+        int logret;
+        logret = write_all(con->log_fd, buffer->data + buffer->size - size, size);
+        if (logret < 0)
+            xen_be_printf(&con->xendev, 1, "Write to log failed on domain %d: %d (%s)\n",
+                          con->xendev.dom, errno, strerror(errno));
+    }
+
     if (buffer->max_capacity &&
         buffer->size > buffer->max_capacity) {
         /* Discard the middle of the data. */
@@ -174,12 +200,36 @@ static void xencons_send(struct XenConsole *con)
     }
 }
 
+static int create_domain_log(struct XenConsole *con)
+{
+    char *logfile;
+    char *path, *domname;
+    int fd;
+
+    path = xs_get_domain_path(xenstore, con->xendev.dom);
+    domname = xenstore_read_str(path, "name");
+    free(path);
+    if (!domname)
+        return -1;
+
+    asprintf(&logfile, "/var/log/xen/console/guest-%s.log", domname);
+    qemu_free(domname);
+
+    fd = open(logfile, O_WRONLY|O_CREAT|O_APPEND, 0644);
+    free(logfile);
+    if (fd == -1)
+        xen_be_printf(&con->xendev, 1, "Failed to open log %s: %d (%s)", logfile, errno, strerror(errno));
+
+    return fd;
+}
+
 /* -------------------------------------------------------------------- */
 
 static int con_init(struct XenDevice *xendev)
 {
     struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
     char *type, *dom;
+    char *logenv = NULL;
 
     /* setup */
     dom = xs_get_domain_path(xenstore, con->xendev.dom);
@@ -198,6 +248,10 @@ static int con_init(struct XenDevice *xendev)
     else
         con->chr = serial_hds[con->xendev.dev];
 
+    logenv = getenv("XENCONSOLED_TRACE");
+    if (logenv != NULL && !strcmp(logenv, "guest")) {
+        log_guest = 1;
+    }
     return 0;
 }
 
@@ -230,6 +284,9 @@ static int con_connect(struct XenDevice *xendev)
                   con->xendev.remote_port,
                   con->xendev.local_port,
                   con->buffer.max_capacity);
+    con->log_fd = -1;
+    if (log_guest)
+        con->log_fd = create_domain_log(con);
     return 0;
 }
 
@@ -245,6 +302,12 @@ static void con_disconnect(struct XenDevice *xendev)
         munmap(con->sring, XC_PAGE_SIZE);
         con->sring = NULL;
     }
+
+    if (con->log_fd != -1) {
+        close(con->log_fd);
+        con->log_fd = -1;
+    }
+
 }
 
 static void con_event(struct XenDevice *xendev)
-- 
1.7.3.4
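The retry loop behind the patch's write_all() helper can be tried in isolation. The sketch below is a standalone variant written for illustration, not the code from the patch: it keeps the same name and return convention (0 on success, -1 on failure), restarts on EINTR, and continues after short writes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Standalone sketch of a "write everything or fail" loop.
 * Returns 0 once all len bytes are written, -1 on any other error. */
int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t ret = write(fd, buf, len);

        if (ret == -1 && errno == EINTR) {
            continue;               /* interrupted by a signal: retry */
        }
        if (ret <= 0) {
            return -1;              /* real error, errno is set by write() */
        }
        len -= (size_t)ret;         /* short write: advance and go on */
        buf += ret;
    }
    return 0;
}
```

A quick way to exercise it is to write into one end of a pipe(2) and read the bytes back from the other end; passing an invalid descriptor shows the error path.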
[Qemu-devel] [PULL] Xen Patch Queue
Hi Anthony,

This is my current patch queue for Xen patches. Please pull.

Alex

The following changes since commit eb47d7c5d96060040931c42773ee07e61e547af9:
  Peter Maydell (1):
        hw/9118.c: Implement active-low interrupt support

are available in the git repository at:

  git://repo.or.cz/qemu/agraf.git xen-next

Anthony PERARD (2):
      xen: Add xc_domain_add_to_physmap to xen_interface.
      xen: Introduce VGA sync dirty bitmap support

Stefano Stabellini (8):
      xen: fix qemu_map_cache with size != MCACHE_BUCKET_SIZE
      xen: remove qemu_map_cache_unlock
      xen: remove xen_map_block and xen_unmap_block
      exec.c: refactor cpu_physical_memory_map
      xen: mapcache performance improvements
      cirrus_vga: reset lfb_addr after a pci config write if the BAR is unmapped
      xen: only track the linear framebuffer
      xen: fix interrupt routing

Steven Smith (1):
      xen: Add the Xen platform pci device

 Makefile.target     |    2 +
 configure           |   29 -
 cpu-common.h        |    1 +
 exec.c              |   88 +++--
 hw/cirrus_vga.c     |    5 +-
 hw/hw.h             |    3 +
 hw/pc.h             |    1 -
 hw/pc_piix.c        |   10 +-
 hw/pci_ids.h        |    2 +
 hw/piix_pci.c       |   66 +-
 hw/xen_common.h     |   14 ++
 hw/xen_platform.c   |  340 +++
 trace-events        |    4 +
 xen-all.c           |  281 ++
 xen-mapcache-stub.c |    8 --
 xen-mapcache.c      |  141 ++
 xen-mapcache.h      |   16 ---
 17 files changed, 826 insertions(+), 185 deletions(-)
 create mode 100644 hw/xen_platform.c
Re: [Qemu-devel] [PATCH 3/3] xen: implement unplug protocol in xen_platform
On Thu, Jun 16, 2011 at 05:05:19PM +0100, stefano.stabell...@eu.citrix.com wrote:
> From: Stefano Stabellini stefano.stabell...@eu.citrix.com
>
> The unplug protocol is necessary to support PV drivers in the guest: the drivers expect to be able to unplug emulated disks and nics before initializing the Xen PV interfaces. It is responsibility of the guest to make sure that the unplug is done before the emulated devices or the PV interface start to be used.
>
> We use pci_for_each_device to walk the PCI bus, identify the devices and disks that we want to disable and dynamically unplug them.
>
> Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
> ---
>  hw/xen_platform.c |   63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 62 insertions(+), 1 deletions(-)
>
> diff --git a/hw/xen_platform.c b/hw/xen_platform.c
> index b167eee..9f8c843 100644
> --- a/hw/xen_platform.c
> +++ b/hw/xen_platform.c
> @@ -34,6 +34,9 @@
>  #include "xen_backend.h"
>  #include "rwhandler.h"
>  #include "trace.h"
> +#include "hw/ide/internal.h"

I'm not an expert here but it looks like you should put some code in hw/ide/xen.c and export an API from there rather than calling ide_bus_reset and tweaking PCIIDEState directly.

> +#include "hw/ide/pci.h"
> +#include "hw/pci_ids.h"
>
>  #include <xenguest.h>
>
> @@ -76,6 +79,54 @@ static void log_writeb(PCIXenPlatformState *s, char val)
>  }
>
>  /* Xen Platform, Fixed IOPort */
> +#define UNPLUG_ALL_IDE_DISKS 1
> +#define UNPLUG_ALL_NICS 2
> +#define UNPLUG_AUX_IDE_DISKS 4
> +
> +static int unplug_param;
> +
> +static void unplug_nic(PCIBus *b, PCIDevice *d)
> +{
> +    if (d->config[0xa] == 0 && d->config[0xb] == 2) {

Please use registers from pci_regs.h and pci_ids.h

> +        pci_unplug_device(&(d->qdev));

Can't you use qdev_unplug? That does other useful checks and updates system state. Also, are there non hotpluggable devices? If not you can assert on qdev_unplug failure.

> +    }
> +}
> +
> +static void pci_unplug_nics(PCIBus *bus)
> +{
> +    pci_for_each_device(bus, 0, unplug_nic);
> +}
> +
> +static void unplug_disks(PCIBus *b, PCIDevice *d)
> +{
> +    if (d->config[0xa] == 1 && d->config[0xb] == 1) {

Same comment about hardcoded constants.

> +        PCIIDEState *pci_ide = DO_UPCAST(PCIIDEState, dev, d);
> +        DriveInfo *di;
> +        int i = 0;
> +
> +        if (unplug_param & UNPLUG_AUX_IDE_DISKS)
> +            i++;
> +
> +        for (; i < 3; i++) {
> +            di = drive_get_by_index(IF_IDE, i);
> +            if (di != NULL && di->bdrv != NULL && di->bdrv->type != BDRV_TYPE_CDROM) {

line too long

> +                DeviceState *ds = bdrv_get_attached(di->bdrv);
> +                if (ds)
> +                    bdrv_detach(di->bdrv, ds);
> +                bdrv_close(di->bdrv);
> +                pci_ide->bus[di->bus].ifs[di->unit].bs = NULL;
> +                drive_put_ref(di);
> +            }
> +        }
> +        ide_bus_reset(&pci_ide->bus[0]);
> +        ide_bus_reset(&pci_ide->bus[1]);
> +    }
> +}
> +
> +static void pci_unplug_disks(PCIBus *bus)
> +{
> +    pci_for_each_device(bus, 0, unplug_disks);
> +}
>
>  static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t val)
>  {
> @@ -83,10 +134,20 @@ static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t v
>
>      switch (addr - XEN_PLATFORM_IOPORT) {
>      case 0:
> -        /* TODO: */
> +        unplug_param = val;
>          /* Unplug devices. Value is a bitmask of which devices to
>             unplug, with bit 0 the IDE devices, bit 1 the network
>             devices, and bit 2 the non-primary-master IDE devices. */
> +        if (val & UNPLUG_ALL_IDE_DISKS || val & UNPLUG_AUX_IDE_DISKS) {
> +            DPRINTF("unplug disks\n");
> +            qemu_aio_flush();
> +            bdrv_flush_all();
> +            pci_unplug_disks(s->pci_dev.bus);
> +        }
> +        if (val & UNPLUG_ALL_NICS) {
> +            DPRINTF("unplug nics\n");
> +            pci_unplug_nics(s->pci_dev.bus);
> +        }
>          break;
>      case 2:
>          switch (val) {
> --
> 1.7.2.3
Re: [Qemu-devel] [PATCH 2/3] pci: export pci_unplug_device
On Thu, Jun 16, 2011 at 05:05:18PM +0100, stefano.stabell...@eu.citrix.com wrote:
> From: Stefano Stabellini stefano.stabell...@eu.citrix.com
>
> pci_unplug_device is needed by the xen_platform device to perform dynamic nic unplug.
>
> Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

I think it's better to go through qdev, pci_unplug_device was intended as an internal API.

> ---
>  hw/pci.c |    2 +-
>  hw/pci.h |    1 +
>  2 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/hw/pci.c b/hw/pci.c
> index 1d297d6..679e976 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1692,7 +1692,7 @@ static int pci_qdev_init(DeviceState *qdev, DeviceInfo *base)
>      return 0;
>  }
>
> -static int pci_unplug_device(DeviceState *qdev)
> +int pci_unplug_device(DeviceState *qdev)
>  {
>      PCIDevice *dev = DO_UPCAST(PCIDevice, qdev, qdev);
>      PCIDeviceInfo *info = container_of(qdev->info, PCIDeviceInfo, qdev);
> diff --git a/hw/pci.h b/hw/pci.h
> index 0d288ce..868f793 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -452,6 +452,7 @@ typedef struct {
>  void pci_qdev_register(PCIDeviceInfo *info);
>  void pci_qdev_register_many(PCIDeviceInfo *info);
>
> +int pci_unplug_device(DeviceState *qdev);
>  PCIDevice *pci_create_multifunction(PCIBus *bus, int devfn, bool multifunction,
>                                      const char *name);
> --
> 1.7.2.3
Re: [Qemu-devel] [PATCH 1/3] kvm: ppc: booke206: use MMU API
On 2011-06-20 10:03, Avi Kivity wrote:
> On 06/20/2011 10:41 AM, Jan Kiszka wrote:
>>> Those hopefully shouldn't be required anymore soon - when Jan's patches make it upstream. Jan, how's progress on that front?
>>
>> I can only forward this question: Avi, what are the plans for http://thread.gmane.org/gmane.comp.emulators.kvm.devel/73917?
>
> Will apply once all comments are addressed.

Well, then go ahead :) - or did I miss a comment?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 3/3] xen: implement unplug protocol in xen_platform
On 20.06.2011 10:28, Alexander Graf wrote:
> On 16.06.2011, at 18:05, stefano.stabell...@eu.citrix.com stefano.stabell...@eu.citrix.com wrote:
>> From: Stefano Stabellini stefano.stabell...@eu.citrix.com
>>
>> The unplug protocol is necessary to support PV drivers in the guest: the drivers expect to be able to unplug emulated disks and nics before initializing the Xen PV interfaces. It is responsibility of the guest to make sure that the unplug is done before the emulated devices or the PV interface start to be used.
>>
>> We use pci_for_each_device to walk the PCI bus, identify the devices and disks that we want to disable and dynamically unplug them.
>
> Kevin, please check the block parts of this code.
> Michael, please check the PCI parts of this code.
>
> Thanks :)
>
> Alex
>
>> Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
>> ---
>>  hw/xen_platform.c |   63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 files changed, 62 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/xen_platform.c b/hw/xen_platform.c
>> index b167eee..9f8c843 100644
>> --- a/hw/xen_platform.c
>> +++ b/hw/xen_platform.c
>> @@ -34,6 +34,9 @@
>>  #include "xen_backend.h"
>>  #include "rwhandler.h"
>>  #include "trace.h"
>> +#include "hw/ide/internal.h"

Sorry, no. :-)

This is not using a proper interface, but just a hack that depends on the internal structure of the IDE emulation. It's going to break sooner or later.

It seems your problem is that IDE isn't unpluggable. I'm not entirely sure what the right solution is, maybe just adding a new xen-ide device that is used for the Xen machine and closely resembles piix4-ide, but can be hot-unplugged.

Kevin

>> +#include "hw/ide/pci.h"
>> +#include "hw/pci_ids.h"
>>
>>  #include <xenguest.h>
>>
>> @@ -76,6 +79,54 @@ static void log_writeb(PCIXenPlatformState *s, char val)
>>  }
>>
>>  /* Xen Platform, Fixed IOPort */
>> +#define UNPLUG_ALL_IDE_DISKS 1
>> +#define UNPLUG_ALL_NICS 2
>> +#define UNPLUG_AUX_IDE_DISKS 4
>> +
>> +static int unplug_param;
>> +
>> +static void unplug_nic(PCIBus *b, PCIDevice *d)
>> +{
>> +    if (d->config[0xa] == 0 && d->config[0xb] == 2) {
>> +        pci_unplug_device(&(d->qdev));
>> +    }
>> +}
>> +
>> +static void pci_unplug_nics(PCIBus *bus)
>> +{
>> +    pci_for_each_device(bus, 0, unplug_nic);
>> +}
>> +
>> +static void unplug_disks(PCIBus *b, PCIDevice *d)
>> +{
>> +    if (d->config[0xa] == 1 && d->config[0xb] == 1) {
>> +        PCIIDEState *pci_ide = DO_UPCAST(PCIIDEState, dev, d);
>> +        DriveInfo *di;
>> +        int i = 0;
>> +
>> +        if (unplug_param & UNPLUG_AUX_IDE_DISKS)
>> +            i++;
>> +
>> +        for (; i < 3; i++) {
>> +            di = drive_get_by_index(IF_IDE, i);
>> +            if (di != NULL && di->bdrv != NULL && di->bdrv->type != BDRV_TYPE_CDROM) {
>> +                DeviceState *ds = bdrv_get_attached(di->bdrv);
>> +                if (ds)
>> +                    bdrv_detach(di->bdrv, ds);
>> +                bdrv_close(di->bdrv);
>> +                pci_ide->bus[di->bus].ifs[di->unit].bs = NULL;
>> +                drive_put_ref(di);
>> +            }
>> +        }
>> +        ide_bus_reset(&pci_ide->bus[0]);
>> +        ide_bus_reset(&pci_ide->bus[1]);
>> +    }
>> +}
>> +
>> +static void pci_unplug_disks(PCIBus *bus)
>> +{
>> +    pci_for_each_device(bus, 0, unplug_disks);
>> +}
>>
>>  static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t val)
>>  {
>> @@ -83,10 +134,20 @@ static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t v
>>
>>      switch (addr - XEN_PLATFORM_IOPORT) {
>>      case 0:
>> -        /* TODO: */
>> +        unplug_param = val;
>>          /* Unplug devices. Value is a bitmask of which devices to
>>             unplug, with bit 0 the IDE devices, bit 1 the network
>>             devices, and bit 2 the non-primary-master IDE devices. */
>> +        if (val & UNPLUG_ALL_IDE_DISKS || val & UNPLUG_AUX_IDE_DISKS) {
>> +            DPRINTF("unplug disks\n");
>> +            qemu_aio_flush();
>> +            bdrv_flush_all();
>> +            pci_unplug_disks(s->pci_dev.bus);
>> +        }
>> +        if (val & UNPLUG_ALL_NICS) {
>> +            DPRINTF("unplug nics\n");
>> +            pci_unplug_nics(s->pci_dev.bus);
>> +        }
>>          break;
>>      case 2:
>>          switch (val) {
>> --
>> 1.7.2.3
Re: [Qemu-devel] [PATCH] do not send packet to nic if the packet will be dropped by nic
On 17.06.2011 03:33, Wen Congyang wrote:
> If !s->clock_enabled or !rtl8139_receiver_enabled(s), it means that the nic will drop all packets from host. So qemu will keep getting packets from host and wasting CPU on dropping packets. This seems worse than packets that should be dropped but aren't.
>
> Signed-off-by: Wen Congyang we...@cn.fujitsu.com

Which bug does this change fix? I'm still not convinced that we should do it.

> ---
>  hw/rtl8139.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/rtl8139.c b/hw/rtl8139.c
> index 2f8db58..9084678 100644
> --- a/hw/rtl8139.c
> +++ b/hw/rtl8139.c
> @@ -810,9 +810,9 @@ static int rtl8139_can_receive(VLANClientState *nc)
>
>      /* Receive (drop) packets if card is disabled. */

This comment isn't accurate any more after applying the patch.

>      if (!s->clock_enabled)
> -      return 1;
> +      return 0;
>      if (!rtl8139_receiver_enabled(s))
> -      return 1;
> +      return 0;
>
>      if (rtl8139_cp_receiver_enabled(s)) {
>          /* ??? Flow control not implemented in c+ mode.
> --
> 1.7.1

Kevin
Re: [Qemu-devel] [PATCH 1/3] kvm: ppc: booke206: use MMU API
On 06/20/2011 11:47 AM, Jan Kiszka wrote:
> On 2011-06-20 10:03, Avi Kivity wrote:
>> On 06/20/2011 10:41 AM, Jan Kiszka wrote:
>>>> Those hopefully shouldn't be required anymore soon - when Jan's patches make it upstream. Jan, how's progress on that front?
>>>
>>> I can only forward this question: Avi, what are the plans for http://thread.gmane.org/gmane.comp.emulators.kvm.devel/73917?
>>
>> Will apply once all comments are addressed.
>
> Well, then go ahead :) - or did I miss a comment?

If everyone's happy I (or rather Marcelo this week) will be happy to apply.

-- 
error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH] vmstate: Add unmigratable flag
On 2011-06-19 22:46, Cam Macdonell wrote:
> On Thu, Jun 9, 2011 at 2:39 PM, Jan Kiszka jan.kis...@web.de wrote:
>> On 2011-06-09 22:00, Anthony Liguori wrote:
>>> On 06/09/2011 11:44 AM, Jan Kiszka wrote:
>>>> A first step towards getting rid of register_device_unmigratable (ivshmem and lacking vmstate support in virtio are blocking this): Allow to register an unmigratable vmstate via qdev, i.e. tag a device declaratively.
>>>
>>> I thought part of the problem with this was that for some devices (like ivshmem), whether it can be migrated was dynamic. It depends on configuration, state, etc.
>>
>> That only applies to ivshmem (the other user is device assignment which is unconditionally unmigratable). And the ivshmem issue could easily be solved by defining two devices, ivshmem-peer (or just ivshmem) and ivshmem-master, eliminating the need for the role property.
>>
>> I don't think there will ever be a use case for a transformer device that becomes unmigratable during runtime (would be a nightmare for management apps anyway).
>>
>> If breaking the user interface of ivshmem for this is OK, I'll post a patch.
>>
>> Jan
>
> The migratability of ivshmem is not dynamic in that it doesn't change at runtime, it's set when the device is created, either role=peer or role=master is specified. So iiuc, this could work with ivshmem.

So you are fine with breaking the interface? Everyone else as well? Then I'll cook a patch to sort at least this out for 0.15.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
[Qemu-devel] Extracting TCG
Hi All,

TCG looks very interesting for generating machine code. Is there a way to extract it as a standalone library in order to use it in a JIT compiler?

Thanks
-- 
Mathieu
[Qemu-devel] [PATCH 08/11] cirrus_vga: reset lfb_addr after a pci config write if the BAR is unmapped
From: Stefano Stabellini stefano.stabell...@eu.citrix.com

If the cirrus_vga PCI BAR is unmapped then we should not only reset map_addr but also lfb_addr, otherwise we'll keep trying to map the old lfb_addr in map_linear_vram.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Acked-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/cirrus_vga.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
index 722cac7..3c5043e 100644
--- a/hw/cirrus_vga.c
+++ b/hw/cirrus_vga.c
@@ -3088,8 +3088,11 @@ static void pci_cirrus_write_config(PCIDevice *d,
     CirrusVGAState *s = &pvs->cirrus_vga;
 
     pci_default_write_config(d, address, val, len);
-    if (s->vga.map_addr && d->io_regions[0].addr == PCI_BAR_UNMAPPED)
+    if (s->vga.map_addr && d->io_regions[0].addr == PCI_BAR_UNMAPPED) {
         s->vga.map_addr = 0;
+        s->vga.lfb_addr = 0;
+        s->vga.lfb_end = 0;
+    }
     cirrus_update_memory_access(s);
 }
 
-- 
1.6.0.2
[Qemu-devel] [PATCH 09/11] xen: only track the linear framebuffer
From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Xen can only do dirty bit tracking for one memory region, so we should explicitly avoid trying to track anything but the vga vram region.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 xen-all.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index 75a82c2..fe75ddd 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -215,6 +215,7 @@ static int xen_add_to_physmap(XenIOState *state,
     int rc = 0;
     XenPhysmap *physmap = NULL;
     target_phys_addr_t pfn, start_gpfn;
+    RAMBlock *block;
 
     if (get_physmapping(state, start_addr, size)) {
         return 0;
@@ -223,6 +224,19 @@ static int xen_add_to_physmap(XenIOState *state,
         return -1;
     }
 
+    /* Xen can only handle a single dirty log region for now and we want
+     * the linear framebuffer to be that region.
+     * Avoid tracking any regions that is not videoram and avoid tracking
+     * the legacy vga region. */
+    QLIST_FOREACH(block, &ram_list.blocks, next) {
+        if (!strcmp(block->idstr, "vga.vram") && block->offset == phys_offset
+                && start_addr > 0xbffff) {
+            goto go_physmap;
+        }
+    }
+    return -1;
+
+go_physmap:
     DPRINTF("mapping vram to %llx - %llx, from %llx\n",
             start_addr, start_addr + size, phys_offset);
 
-- 
1.6.0.2
[Qemu-devel] [PATCH 06/11] exec.c: refactor cpu_physical_memory_map
From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Introduce qemu_ram_ptr_length that takes an address and a size as
parameters rather than just an address.

Refactor cpu_physical_memory_map so that we call qemu_ram_ptr_length only
once rather than calling qemu_get_ram_ptr one time per page.
This is not only more efficient but also tries to simplify the logic of
the function.
Currently we are relying on the fact that all the pages are mapped
contiguously in qemu's address space: we have a check to make sure that
the virtual address returned by qemu_get_ram_ptr from the second call on
is consecutive. Now we are making this more explicit replacing all the
calls to qemu_get_ram_ptr with a single call to qemu_ram_ptr_length
passing a size argument.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: ag...@suse.de
CC: anth...@codemonkey.ws
Signed-off-by: Alexander Graf ag...@suse.de
---
 cpu-common.h |    1 +
 exec.c       |   51 ++++++++++++++++++++++++++++++++++-----------------
 2 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index 9f59172..b027e43 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -65,6 +65,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr);
 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
 /* This should only be used for ram local to a device.  */
 void *qemu_get_ram_ptr(ram_addr_t addr);
+void *qemu_ram_ptr_length(target_phys_addr_t addr, target_phys_addr_t *size);
 /* Same but slower, to use for migration, where the order of
  * RAMBlocks must not change. */
 void *qemu_safe_ram_ptr(ram_addr_t addr);
diff --git a/exec.c b/exec.c
index e11c1dd..238c173 100644
--- a/exec.c
+++ b/exec.c
@@ -3131,6 +3131,31 @@ void *qemu_safe_ram_ptr(ram_addr_t addr)
     return NULL;
 }
 
+/* Return a host pointer to guest's ram. Similar to qemu_get_ram_ptr
+ * but takes a size argument */
+void *qemu_ram_ptr_length(target_phys_addr_t addr, target_phys_addr_t *size)
+{
+    if (xen_mapcache_enabled())
+        return qemu_map_cache(addr, *size, 1);
+    else {
+        RAMBlock *block;
+
+        QLIST_FOREACH(block, &ram_list.blocks, next) {
+            if (addr - block->offset < block->length) {
+                if (addr - block->offset + *size > block->length)
+                    *size = block->length - addr + block->offset;
+                return block->host + (addr - block->offset);
+            }
+        }
+
+        fprintf(stderr, "Bad ram offset %" PRIx64 "\n", (uint64_t)addr);
+        abort();
+
+        *size = 0;
+        return NULL;
+    }
+}
+
 void qemu_put_ram_ptr(void *addr)
 {
     trace_qemu_put_ram_ptr(addr);
@@ -3992,14 +4017,12 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
                               int is_write)
 {
     target_phys_addr_t len = *plen;
-    target_phys_addr_t done = 0;
+    target_phys_addr_t todo = 0;
     int l;
-    uint8_t *ret = NULL;
-    uint8_t *ptr;
     target_phys_addr_t page;
     unsigned long pd;
     PhysPageDesc *p;
-    unsigned long addr1;
+    target_phys_addr_t addr1 = addr;
 
     while (len > 0) {
         page = addr & TARGET_PAGE_MASK;
@@ -4014,7 +4037,7 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
         }
 
         if ((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM) {
-            if (done || bounce.buffer) {
+            if (todo || bounce.buffer) {
                 break;
             }
             bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, TARGET_PAGE_SIZE);
@@ -4023,23 +4046,17 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
             if (!is_write) {
                 cpu_physical_memory_read(addr, bounce.buffer, l);
             }
-            ptr = bounce.buffer;
-        } else {
-            addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK);
-            ptr = qemu_get_ram_ptr(addr1);
-        }
-        if (!done) {
-            ret = ptr;
-        } else if (ret + done != ptr) {
-            break;
+
+            *plen = l;
+            return bounce.buffer;
         }
 
         len -= l;
         addr += l;
-        done += l;
+        todo += l;
     }
-    *plen = done;
-    return ret;
+    *plen = todo;
+    return qemu_ram_ptr_length(addr1, plen);
 }
 
 /* Unmaps a memory region previously mapped by cpu_physical_memory_map().
-- 
1.6.0.2
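The non-Xen branch of qemu_ram_ptr_length boils down to "find the block that contains the address, then shrink the requested size so it does not run past the end of that block". A minimal standalone sketch of that clamping follows. It is an illustration under stated assumptions, not the QEMU implementation: Block and ram_ptr_length are hypothetical names, a plain array replaces the QLIST of RAMBlocks, and the sketch returns NULL where the patch calls abort().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's RAMBlock; not the real structure. */
typedef struct Block {
    uint64_t offset;   /* start of the block in the ram address space */
    uint64_t length;   /* size of the block in bytes */
    uint8_t *host;     /* host pointer to the start of the block */
} Block;

/*
 * Find the block containing addr and return a host pointer into it.
 * *size is clamped so the returned range never extends past the end of
 * the block. If no block contains addr, *size is set to 0 and NULL is
 * returned (the patch aborts in that case).
 */
static uint8_t *ram_ptr_length(const Block *blocks, size_t nblocks,
                               uint64_t addr, uint64_t *size)
{
    for (size_t i = 0; i < nblocks; i++) {
        const Block *b = &blocks[i];
        /* Unsigned subtraction: an addr below b->offset wraps around to a
         * huge value and fails the comparison, as in the patch. */
        if (addr - b->offset < b->length) {
            uint64_t off = addr - b->offset;
            if (*size > b->length - off) {
                *size = b->length - off;
            }
            return b->host + off;
        }
    }
    *size = 0;
    return NULL;
}
```

Because the size is passed by pointer, a caller such as cpu_physical_memory_map learns how much was actually mapped from the same call, which is why the patch forwards plen directly.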
[Qemu-devel] [PATCH 04/11] xen: remove qemu_map_cache_unlock
From: Stefano Stabellini stefano.stabell...@eu.citrix.com

There is no need for qemu_map_cache_unlock, just use
qemu_invalidate_entry instead.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 exec.c              |    2 +-
 xen-mapcache-stub.c |    4 ----
 xen-mapcache.c      |   33 ---------------------------------
 xen-mapcache.h      |    1 -
 4 files changed, 1 insertions(+), 39 deletions(-)

diff --git a/exec.c b/exec.c
index 09928a3..01f33bb 100644
--- a/exec.c
+++ b/exec.c
@@ -3146,7 +3146,7 @@ void qemu_put_ram_ptr(void *addr)
             xen_unmap_block(block->host, block->length);
             block->host = NULL;
         } else {
-            qemu_map_cache_unlock(addr);
+            qemu_invalidate_entry(addr);
         }
     }
 }
diff --git a/xen-mapcache-stub.c b/xen-mapcache-stub.c
index 7c14b3d..60f712b 100644
--- a/xen-mapcache-stub.c
+++ b/xen-mapcache-stub.c
@@ -22,10 +22,6 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, u
     return qemu_get_ram_ptr(phys_addr);
 }
 
-void qemu_map_cache_unlock(void *buffer)
-{
-}
-
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)
 {
     return -1;
diff --git a/xen-mapcache.c b/xen-mapcache.c
index 90fbd49..57fe24d 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -230,39 +230,6 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, u
     return mapcache->last_address_vaddr + address_offset;
 }
 
-void qemu_map_cache_unlock(void *buffer)
-{
-    MapCacheEntry *entry = NULL, *pentry = NULL;
-    MapCacheRev *reventry;
-    target_phys_addr_t paddr_index;
-    int found = 0;
-
-    QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
-        if (reventry->vaddr_req == buffer) {
-            paddr_index = reventry->paddr_index;
-            found = 1;
-            break;
-        }
-    }
-    if (!found) {
-        return;
-    }
-    QTAILQ_REMOVE(&mapcache->locked_entries, reventry, next);
-    qemu_free(reventry);
-
-    entry = &mapcache->entry[paddr_index % mapcache->nr_buckets];
-    while (entry && entry->paddr_index != paddr_index) {
-        pentry = entry;
-        entry = entry->next;
-    }
-    if (!entry) {
-        return;
-    }
-    if (entry->lock > 0) {
-        entry->lock--;
-    }
-}
-
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)
 {
     MapCacheEntry *entry = NULL, *pentry = NULL;
diff --git a/xen-mapcache.h b/xen-mapcache.h
index 339444c..b89b8f9 100644
--- a/xen-mapcache.h
+++ b/xen-mapcache.h
@@ -14,7 +14,6 @@
 
 void qemu_map_cache_init(void);
 uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock);
-void qemu_map_cache_unlock(void *phys_addr);
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr);
 void qemu_invalidate_entry(uint8_t *buffer);
 void qemu_invalidate_map_cache(void);
-- 
1.6.0.2
[Qemu-devel] [PATCH 01/11] xen: Add xc_domain_add_to_physmap to xen_interface.
From: Anthony PERARD anthony.per...@citrix.com This function will be used to support sync dirty bitmap. This come with a check against every Xen release, and special implementation for Xen version that doesn't have this specific call. This function will not be usable with Xen 3.3 because the behavior is different. Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Alexander Graf ag...@suse.de --- configure | 29 - hw/xen_common.h | 14 ++ 2 files changed, 42 insertions(+), 1 deletions(-) diff --git a/configure b/configure index 44c092a..b63b49f 100755 --- a/configure +++ b/configure @@ -1210,6 +1210,7 @@ int main(void) { xc = xc_interface_open(0, 0, 0); xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); xc_gnttab_open(NULL, 0); + xc_domain_add_to_physmap(0, 0, XENMAPSPACE_gmfn, 0, 0); return 0; } EOF @@ -1228,10 +1229,14 @@ EOF # error HVM_MAX_VCPUS not defined #endif int main(void) { + struct xen_add_to_physmap xatp = { +.domid = 0, .space = XENMAPSPACE_gmfn, .idx = 0, .gpfn = 0, + }; xs_daemon_open(); xc_interface_open(); xc_gnttab_open(); xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); + xc_memory_op(0, XENMEM_add_to_physmap, xatp); return 0; } EOF @@ -1240,7 +1245,29 @@ EOF xen_ctrl_version=400 xen=yes - # Xen 3.3.0, 3.4.0 + # Xen 3.4.0 + elif ( + cat $TMPC EOF +#include xenctrl.h +#include xs.h +int main(void) { + struct xen_add_to_physmap xatp = { +.domid = 0, .space = XENMAPSPACE_gmfn, .idx = 0, .gpfn = 0, + }; + xs_daemon_open(); + xc_interface_open(); + xc_gnttab_open(); + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); + xc_memory_op(0, XENMEM_add_to_physmap, xatp); + return 0; +} +EOF + compile_prog $xen_libs +) ; then +xen_ctrl_version=340 +xen=yes + + # Xen 3.3.0 elif ( cat $TMPC EOF #include xenctrl.h diff --git a/hw/xen_common.h b/hw/xen_common.h index a1958a0..2c79af6 100644 --- a/hw/xen_common.h +++ b/hw/xen_common.h @@ -71,6 +71,20 @@ static inline int xc_domain_populate_physmap_exact (xc_handle, domid, nr_extents, extent_order, 
mem_flags, extent_start); } +static inline int xc_domain_add_to_physmap(int xc_handle, uint32_t domid, + unsigned int space, unsigned long idx, + xen_pfn_t gpfn) +{ +struct xen_add_to_physmap xatp = { +.domid = domid, +.space = space, +.idx = idx, +.gpfn = gpfn, +}; + +return xc_memory_op(xc_handle, XENMEM_add_to_physmap, xatp); +} + /* Xen 4.1 */ #else -- 1.6.0.2
[Qemu-devel] [PATCH 10/11] xen: fix interrupt routing
From: Stefano Stabellini stefano.stabell...@eu.citrix.com Compared to the last version I only added a comment to the code. - remove i440FX-xen and i440fx_write_config_xen we don't need to intercept pci config writes to i440FX anymore; - introduce PIIX3-xen and piix3_write_config_xen we do need to intercept pci config write to the PCI-ISA bridge to update the PCI link routing; - set the number of PIIX3-xen interrupts line to 128; Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Signed-off-by: Alexander Graf ag...@suse.de --- hw/pc.h |1 - hw/pc_piix.c |6 + hw/piix_pci.c | 66 +--- 3 files changed, 35 insertions(+), 38 deletions(-) diff --git a/hw/pc.h b/hw/pc.h index 0dcbee7..6d5730b 100644 --- a/hw/pc.h +++ b/hw/pc.h @@ -176,7 +176,6 @@ struct PCII440FXState; typedef struct PCII440FXState PCII440FXState; PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size); -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size); void i440fx_init_memory_mappings(PCII440FXState *d); /* piix4.c */ diff --git a/hw/pc_piix.c b/hw/pc_piix.c index 9a22a8a..ba198de 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size, isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24); if (pci_enabled) { -if (!xen_enabled()) { -pci_bus = i440fx_init(i440fx_state, piix3_devfn, isa_irq, ram_size); -} else { -pci_bus = i440fx_xen_init(i440fx_state, piix3_devfn, isa_irq, ram_size); -} +pci_bus = i440fx_init(i440fx_state, piix3_devfn, isa_irq, ram_size); } else { pci_bus = NULL; i440fx_state = NULL; diff --git a/hw/piix_pci.c b/hw/piix_pci.c index 85a320e..3e2698d 100644 --- a/hw/piix_pci.c +++ b/hw/piix_pci.c @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState; #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */ #define PIIX_NUM_PIRQS 4ULL/* PIRQ[A-D] */ +#define XEN_PIIX_NUM_PIRQS 128ULL #define PIIX_PIRQC 0x60 typedef struct 
PIIX3State { @@ -78,6 +79,8 @@ struct PCII440FXState { #define I440FX_SMRAM0x72 static void piix3_set_irq(void *opaque, int pirq, int level); +static void piix3_write_config_xen(PCIDevice *dev, + uint32_t address, uint32_t val, int len); /* return the global irq number corresponding to a given device irq pin. We could also use the bus number to have a more precise @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev, } } -static void i440fx_write_config_xen(PCIDevice *dev, -uint32_t address, uint32_t val, int len) -{ -xen_piix_pci_write_config_client(address, val, len); -i440fx_write_config(dev, address, val, len); -} - static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id) { PCII440FXState *d = opaque; @@ -267,8 +263,21 @@ static PCIBus *i440fx_common_init(const char *device_name, d = pci_create_simple(b, 0, device_name); *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d); -piix3 = DO_UPCAST(PIIX3State, dev, - pci_create_simple_multifunction(b, -1, true, PIIX3)); +/* Xen supports additional interrupt routes from the PCI devices to + * the IOAPIC: the four pins of each PCI device on the bus are also + * connected to the IOAPIC directly. + * These additional routes can be discovered through ACPI. 
*/ +if (xen_enabled()) { +piix3 = DO_UPCAST(PIIX3State, dev, +pci_create_simple_multifunction(b, -1, true, PIIX3-xen)); +pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq, +piix3, XEN_PIIX_NUM_PIRQS); +} else { +piix3 = DO_UPCAST(PIIX3State, dev, +pci_create_simple_multifunction(b, -1, true, PIIX3)); +pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, +PIIX_NUM_PIRQS); +} piix3-pic = pic; (*pi440fx_state)-piix3 = piix3; @@ -289,21 +298,6 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, PCIBus *b; b = i440fx_common_init(i440FX, pi440fx_state, piix3_devfn, pic, ram_size); -pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, (*pi440fx_state)-piix3, - PIIX_NUM_PIRQS); - -return b; -} - -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, -qemu_irq *pic, ram_addr_t ram_size) -{ -PCIBus *b; - -b = i440fx_common_init(i440FX-xen, pi440fx_state, piix3_devfn, pic, ram_size); -pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq, - (*pi440fx_state)-piix3, PIIX_NUM_PIRQS); - return b; } @@ -365,6 +359,13 @@ static void piix3_write_config(PCIDevice *dev, } } +static void
[Qemu-devel] [PATCH 05/11] xen: remove xen_map_block and xen_unmap_block
From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Replace xen_map_block with qemu_map_cache with the appropriate locking
and size parameters.

Replace xen_unmap_block with qemu_invalidate_entry.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 exec.c              |   19 ++++---------------
 xen-mapcache-stub.c |    4 ----
 xen-mapcache.c      |   31 -------------------------------
 xen-mapcache.h      |   15 ---------------
 4 files changed, 4 insertions(+), 65 deletions(-)

diff --git a/exec.c b/exec.c
index 01f33bb..e11c1dd 100644
--- a/exec.c
+++ b/exec.c
@@ -53,6 +53,7 @@
 #endif
 #else /* !CONFIG_USER_ONLY */
 #include "xen-mapcache.h"
+#include "trace.h"
 #endif
 
 //#define DEBUG_TB_INVALIDATE
@@ -3088,7 +3089,7 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
                 if (block->offset == 0) {
                     return qemu_map_cache(addr, 0, 1);
                 } else if (block->host == NULL) {
-                    block->host = xen_map_block(block->offset, block->length);
+                    block->host = qemu_map_cache(block->offset, block->length, 1);
                 }
             }
             return block->host + (addr - block->offset);
@@ -3117,7 +3118,7 @@ void *qemu_safe_ram_ptr(ram_addr_t addr)
                 if (block->offset == 0) {
                     return qemu_map_cache(addr, 0, 1);
                 } else if (block->host == NULL) {
-                    block->host = xen_map_block(block->offset, block->length);
+                    block->host = qemu_map_cache(block->offset, block->length, 1);
                 }
             }
             return block->host + (addr - block->offset);
@@ -3135,19 +3136,7 @@ void qemu_put_ram_ptr(void *addr)
     trace_qemu_put_ram_ptr(addr);
 
     if (xen_mapcache_enabled()) {
-        RAMBlock *block;
-
-        QLIST_FOREACH(block, &ram_list.blocks, next) {
-            if (addr == block->host) {
-                break;
-            }
-        }
-        if (block && block->host) {
-            xen_unmap_block(block->host, block->length);
-            block->host = NULL;
-        } else {
-            qemu_invalidate_entry(addr);
-        }
+        qemu_invalidate_entry(block->host);
     }
 }
 
diff --git a/xen-mapcache-stub.c b/xen-mapcache-stub.c
index 60f712b..8a2380a 100644
--- a/xen-mapcache-stub.c
+++ b/xen-mapcache-stub.c
@@ -34,7 +34,3 @@ void qemu_invalidate_map_cache(void)
 void qemu_invalidate_entry(uint8_t *buffer)
 {
 }
-uint8_t *xen_map_block(target_phys_addr_t phys_addr, target_phys_addr_t size)
-{
-    return NULL;
-}
diff --git a/xen-mapcache.c b/xen-mapcache.c
index 57fe24d..fac47cd 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -362,34 +362,3 @@ void qemu_invalidate_map_cache(void)
 
     mapcache_unlock();
 }
-
-uint8_t *xen_map_block(target_phys_addr_t phys_addr, target_phys_addr_t size)
-{
-    uint8_t *vaddr_base;
-    xen_pfn_t *pfns;
-    int *err;
-    unsigned int i;
-    target_phys_addr_t nb_pfn = size >> XC_PAGE_SHIFT;
-
-    trace_xen_map_block(phys_addr, size);
-    phys_addr >>= XC_PAGE_SHIFT;
-
-    pfns = qemu_mallocz(nb_pfn * sizeof (xen_pfn_t));
-    err = qemu_mallocz(nb_pfn * sizeof (int));
-
-    for (i = 0; i < nb_pfn; i++) {
-        pfns[i] = phys_addr + i;
-    }
-
-    vaddr_base = xc_map_foreign_bulk(xen_xc, xen_domid, PROT_READ|PROT_WRITE,
-                                     pfns, err, nb_pfn);
-    if (vaddr_base == NULL) {
-        perror("xc_map_foreign_bulk");
-        exit(-1);
-    }
-
-    qemu_free(pfns);
-    qemu_free(err);
-
-    return vaddr_base;
-}
diff --git a/xen-mapcache.h b/xen-mapcache.h
index b89b8f9..6216cc3 100644
--- a/xen-mapcache.h
+++ b/xen-mapcache.h
@@ -9,27 +9,12 @@
 #ifndef XEN_MAPCACHE_H
 #define XEN_MAPCACHE_H
 
-#include <sys/mman.h>
-#include "trace.h"
-
 void qemu_map_cache_init(void);
 uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock);
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr);
 void qemu_invalidate_entry(uint8_t *buffer);
 void qemu_invalidate_map_cache(void);
-uint8_t *xen_map_block(target_phys_addr_t phys_addr, target_phys_addr_t size);
-
-static inline void xen_unmap_block(void *addr, ram_addr_t size)
-{
-    trace_xen_unmap_block(addr, size);
-
-    if (munmap(addr, size) != 0) {
-        hw_error("xen_unmap_block: %s", strerror(errno));
-    }
-}
-
-
 
 #define mapcache_lock()   ((void)0)
 #define mapcache_unlock() ((void)0)
-- 
1.6.0.2
[Qemu-devel] [PATCH] virtio-blk: Turn drive serial into a qdev property
It needs to be a qdev property, because it belongs to the drive's guest part. Precedence: commit a0fef654 and 6ced55a5. Bonus: info qtree now shows the serial number. Signed-off-by: Markus Armbruster arm...@redhat.com --- hw/s390-virtio-bus.c |4 +++- hw/s390-virtio-bus.h |1 + hw/virtio-blk.c | 29 +++-- hw/virtio-blk.h |2 ++ hw/virtio-pci.c |4 +++- hw/virtio-pci.h |1 + hw/virtio.h |3 ++- 7 files changed, 31 insertions(+), 13 deletions(-) diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c index d4a12f7..2bf4821 100644 --- a/hw/s390-virtio-bus.c +++ b/hw/s390-virtio-bus.c @@ -128,7 +128,8 @@ static int s390_virtio_blk_init(VirtIOS390Device *dev) { VirtIODevice *vdev; -vdev = virtio_blk_init((DeviceState *)dev, dev-block); +vdev = virtio_blk_init((DeviceState *)dev, dev-block, + dev-block_serial); if (!vdev) { return -1; } @@ -355,6 +356,7 @@ static VirtIOS390DeviceInfo s390_virtio_blk = { .qdev.size = sizeof(VirtIOS390Device), .qdev.props = (Property[]) { DEFINE_BLOCK_PROPERTIES(VirtIOS390Device, block), +DEFINE_PROP_STRING(serial, VirtIOS390Device, block_serial), DEFINE_PROP_END_OF_LIST(), }, }; diff --git a/hw/s390-virtio-bus.h b/hw/s390-virtio-bus.h index 0c412d0..f1bece7 100644 --- a/hw/s390-virtio-bus.h +++ b/hw/s390-virtio-bus.h @@ -42,6 +42,7 @@ typedef struct VirtIOS390Device { uint8_t feat_len; VirtIODevice *vdev; BlockConf block; +char *block_serial; NICConf nic; uint32_t host_features; virtio_serial_conf serial; diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 91e0394..6471ac8 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -28,8 +28,8 @@ typedef struct VirtIOBlock void *rq; QEMUBH *bh; BlockConf *conf; +char *serial; unsigned short sector_mask; -char sn[BLOCK_SERIAL_STRLEN]; DeviceState *qdev; } VirtIOBlock; @@ -362,8 +362,13 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req, } else if (type VIRTIO_BLK_T_GET_ID) { VirtIOBlock *s = req-dev; -memcpy(req-elem.in_sg[0].iov_base, s-sn, - MIN(req-elem.in_sg[0].iov_len, sizeof(s-sn))); 
+/* + * NB: per existing s/n string convention the string is + * terminated by '\0' only when shorter than buffer. + */ +strncpy(req-elem.in_sg[0].iov_base, +s-serial ? s-serial : , +MIN(req-elem.in_sg[0].iov_len, VIRTIO_BLK_ID_BYTES)); virtio_blk_req_complete(req, VIRTIO_BLK_S_OK); } else if (type VIRTIO_BLK_T_OUT) { qemu_iovec_init_external(req-qiov, req-elem.out_sg[1], @@ -531,7 +536,8 @@ static void virtio_blk_change_cb(void *opaque, int reason) } } -VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf) +VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf, + char **serial) { VirtIOBlock *s; int cylinders, heads, secs; @@ -547,6 +553,14 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf) return NULL; } +if (!*serial) { +/* try to fall back to value set with legacy -drive serial=... */ +dinfo = drive_get_by_blockdev(conf-bs); +if (*dinfo-serial) { +*serial = strdup(dinfo-serial); +} +} + s = (VirtIOBlock *)virtio_common_init(virtio-blk, VIRTIO_ID_BLOCK, sizeof(struct virtio_blk_config), sizeof(VirtIOBlock)); @@ -556,16 +570,11 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf) s-vdev.reset = virtio_blk_reset; s-bs = conf-bs; s-conf = conf; +s-serial = *serial; s-rq = NULL; s-sector_mask = (s-conf-logical_block_size / BDRV_SECTOR_SIZE) - 1; bdrv_guess_geometry(s-bs, cylinders, heads, secs); -/* NB: per existing s/n string convention the string is terminated - * by '\0' only when less than sizeof (s-sn) - */ -dinfo = drive_get_by_blockdev(s-bs); -strncpy(s-sn, dinfo-serial, sizeof (s-sn)); - s-vq = virtio_add_queue(s-vdev, 128, virtio_blk_handle_output); qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s); diff --git a/hw/virtio-blk.h b/hw/virtio-blk.h index fff46da..5645d2b 100644 --- a/hw/virtio-blk.h +++ b/hw/virtio-blk.h @@ -34,6 +34,8 @@ #define VIRTIO_BLK_F_WCACHE 9 /* write cache enabled */ #define VIRTIO_BLK_F_TOPOLOGY 10 /* Topology information is available */ +#define VIRTIO_BLK_ID_BYTES 
20 /* ID string length */ + struct virtio_blk_config { uint64_t capacity; diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c index c018351..a8c236e 100644 --- a/hw/virtio-pci.c +++ b/hw/virtio-pci.c @@ -710,7 +710,8 @@ static int virtio_blk_init_pci(PCIDevice *pci_dev)
[Qemu-devel] [PATCH 11/11] xen: Add the Xen platform pci device
From: Steven Smith ssm...@xensource.com Introduce a new emulated PCI device, specific to fully virtualized Xen guests. The device is necessary for PV on HVM drivers to work. Signed-off-by: Steven Smith ssm...@xensource.com Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Signed-off-by: Alexander Graf ag...@suse.de --- Makefile.target |2 + hw/hw.h |3 + hw/pc_piix.c |4 + hw/pci_ids.h |2 + hw/xen_platform.c | 340 + trace-events |3 + 6 files changed, 354 insertions(+), 0 deletions(-) create mode 100644 hw/xen_platform.c diff --git a/Makefile.target b/Makefile.target index b1a0f6d..760aa02 100644 --- a/Makefile.target +++ b/Makefile.target @@ -218,6 +218,8 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o obj-i386-$(CONFIG_XEN_MAPCACHE) += xen-mapcache.o obj-$(CONFIG_NO_XEN_MAPCACHE) += xen-mapcache-stub.o +obj-i386-$(CONFIG_XEN) += xen_platform.o + # Inter-VM PCI shared memory CONFIG_IVSHMEM = ifeq ($(CONFIG_KVM), y) diff --git a/hw/hw.h b/hw/hw.h index 56447a7..9dd7096 100644 --- a/hw/hw.h +++ b/hw/hw.h @@ -780,6 +780,9 @@ extern const VMStateDescription vmstate_ptimer; #define VMSTATE_INT32_LE(_f, _s) \ VMSTATE_SINGLE(_f, _s, 0, vmstate_info_int32_le, int32_t) +#define VMSTATE_UINT8_TEST(_f, _s, _t) \ +VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint8, uint8_t) + #define VMSTATE_UINT16_TEST(_f, _s, _t) \ VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint16, uint16_t) diff --git a/hw/pc_piix.c b/hw/pc_piix.c index ba198de..8dbeb0c 100644 --- a/hw/pc_piix.c +++ b/hw/pc_piix.c @@ -136,6 +136,10 @@ static void pc_init1(ram_addr_t ram_size, pc_vga_init(pci_enabled? 
pci_bus: NULL); +if (xen_enabled()) { +pci_create_simple(pci_bus, -1, xen-platform); +} + /* init basic PC hardware */ pc_basic_device_init(isa_irq, rtc_state, xen_enabled()); diff --git a/hw/pci_ids.h b/hw/pci_ids.h index d9457ed..d94578c 100644 --- a/hw/pci_ids.h +++ b/hw/pci_ids.h @@ -109,3 +109,5 @@ #define PCI_DEVICE_ID_INTEL_82371AB 0x7111 #define PCI_DEVICE_ID_INTEL_82371AB_20x7112 #define PCI_DEVICE_ID_INTEL_82371AB_30x7113 + +#define PCI_VENDOR_ID_XENSOURCE 0x5853 diff --git a/hw/xen_platform.c b/hw/xen_platform.c new file mode 100644 index 000..b167eee --- /dev/null +++ b/hw/xen_platform.c @@ -0,0 +1,340 @@ +/* + * XEN platform pci device, formerly known as the event channel device + * + * Copyright (c) 2003-2004 Intel Corp. + * Copyright (c) 2006 XenSource + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. 
+ */ + +#include assert.h + +#include hw.h +#include pc.h +#include pci.h +#include irq.h +#include xen_common.h +#include net.h +#include xen_backend.h +#include rwhandler.h +#include trace.h + +#include xenguest.h + +//#define DEBUG_PLATFORM + +#ifdef DEBUG_PLATFORM +#define DPRINTF(fmt, ...) do { \ +fprintf(stderr, xen_platform: fmt, ## __VA_ARGS__); \ +} while (0) +#else +#define DPRINTF(fmt, ...) do { } while (0) +#endif + +#define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */ + +typedef struct PCIXenPlatformState { +PCIDevice pci_dev; +uint8_t flags; /* used only for version_id == 2 */ +int drivers_blacklisted; +uint16_t driver_product_version; + +/* Log from guest drivers */ +char log_buffer[4096]; +int log_buffer_off; +} PCIXenPlatformState; + +#define XEN_PLATFORM_IOPORT 0x10 + +/* Send bytes to syslog */ +static void log_writeb(PCIXenPlatformState *s, char val) +{ +if (val == '\n' || s-log_buffer_off == sizeof(s-log_buffer) - 1) { +/* Flush buffer */ +s-log_buffer[s-log_buffer_off] = 0; +trace_xen_platform_log(s-log_buffer); +s-log_buffer_off = 0; +} else {
[Qemu-devel] [PATCH 03/11] xen: fix qemu_map_cache with size != MCACHE_BUCKET_SIZE
From: Stefano Stabellini stefano.stabell...@eu.citrix.com Fix the implementation of qemu_map_cache: correctly support size arguments different from 0 or MCACHE_BUCKET_SIZE. The new implementation supports locked mapcache entries with size multiple of MCACHE_BUCKET_SIZE. qemu_invalidate_entry can correctly find and unmap these large mapcache entries given that the virtual address passed to qemu_invalidate_entry is the same returned by qemu_map_cache when the locked mapcache entry was created. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Signed-off-by: Alexander Graf ag...@suse.de --- xen-mapcache.c | 77 +++ 1 files changed, 65 insertions(+), 12 deletions(-) diff --git a/xen-mapcache.c b/xen-mapcache.c index 349cc62..90fbd49 100644 --- a/xen-mapcache.c +++ b/xen-mapcache.c @@ -43,14 +43,16 @@ typedef struct MapCacheEntry { target_phys_addr_t paddr_index; uint8_t *vaddr_base; -DECLARE_BITMAP(valid_mapping, MCACHE_BUCKET_SIZE XC_PAGE_SHIFT); +unsigned long *valid_mapping; uint8_t lock; +target_phys_addr_t size; struct MapCacheEntry *next; } MapCacheEntry; typedef struct MapCacheRev { uint8_t *vaddr_req; target_phys_addr_t paddr_index; +target_phys_addr_t size; QTAILQ_ENTRY(MapCacheRev) next; } MapCacheRev; @@ -68,6 +70,15 @@ typedef struct MapCache { static MapCache *mapcache; +static inline int test_bits(int nr, int size, const unsigned long *addr) +{ +unsigned long res = find_next_zero_bit(addr, size + nr, nr); +if (res = nr + size) +return 1; +else +return 0; +} + void qemu_map_cache_init(void) { unsigned long size; @@ -115,11 +126,15 @@ static void qemu_remap_bucket(MapCacheEntry *entry, err = qemu_mallocz(nb_pfn * sizeof (int)); if (entry-vaddr_base != NULL) { -if (munmap(entry-vaddr_base, size) != 0) { +if (munmap(entry-vaddr_base, entry-size) != 0) { perror(unmap fails); exit(-1); } } +if (entry-valid_mapping != NULL) { +qemu_free(entry-valid_mapping); +entry-valid_mapping = NULL; +} for (i = 0; i nb_pfn; i++) { pfns[i] = (address_index 
(MCACHE_BUCKET_SHIFT-XC_PAGE_SHIFT)) + i; @@ -134,6 +149,9 @@ static void qemu_remap_bucket(MapCacheEntry *entry, entry-vaddr_base = vaddr_base; entry-paddr_index = address_index; +entry-size = size; +entry-valid_mapping = (unsigned long *) qemu_mallocz(sizeof(unsigned long) * +BITS_TO_LONGS(size XC_PAGE_SHIFT)); bitmap_zero(entry-valid_mapping, nb_pfn); for (i = 0; i nb_pfn; i++) { @@ -151,32 +169,47 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, u MapCacheEntry *entry, *pentry = NULL; target_phys_addr_t address_index = phys_addr MCACHE_BUCKET_SHIFT; target_phys_addr_t address_offset = phys_addr (MCACHE_BUCKET_SIZE - 1); +target_phys_addr_t __size = size; trace_qemu_map_cache(phys_addr); -if (address_index == mapcache-last_address_index !lock) { +if (address_index == mapcache-last_address_index !lock !__size) { trace_qemu_map_cache_return(mapcache-last_address_vaddr + address_offset); return mapcache-last_address_vaddr + address_offset; } +/* size is always a multiple of MCACHE_BUCKET_SIZE */ +if ((address_offset + (__size % MCACHE_BUCKET_SIZE)) MCACHE_BUCKET_SIZE) +__size += MCACHE_BUCKET_SIZE; +if (__size % MCACHE_BUCKET_SIZE) +__size += MCACHE_BUCKET_SIZE - (__size % MCACHE_BUCKET_SIZE); +if (!__size) +__size = MCACHE_BUCKET_SIZE; + entry = mapcache-entry[address_index % mapcache-nr_buckets]; -while (entry entry-lock entry-paddr_index != address_index entry-vaddr_base) { +while (entry entry-lock entry-vaddr_base +(entry-paddr_index != address_index || entry-size != __size || + !test_bits(address_offset XC_PAGE_SHIFT, size XC_PAGE_SHIFT, + entry-valid_mapping))) { pentry = entry; entry = entry-next; } if (!entry) { entry = qemu_mallocz(sizeof (MapCacheEntry)); pentry-next = entry; -qemu_remap_bucket(entry, size ? 
: MCACHE_BUCKET_SIZE, address_index); +qemu_remap_bucket(entry, __size, address_index); } else if (!entry-lock) { if (!entry-vaddr_base || entry-paddr_index != address_index || -!test_bit(address_offset XC_PAGE_SHIFT, entry-valid_mapping)) { -qemu_remap_bucket(entry, size ? : MCACHE_BUCKET_SIZE, address_index); +entry-size != __size || +!test_bits(address_offset XC_PAGE_SHIFT, size XC_PAGE_SHIFT, +entry-valid_mapping)) { +qemu_remap_bucket(entry, __size, address_index); } } -if (!test_bit(address_offset XC_PAGE_SHIFT,
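The size handling that this patch adds to qemu_map_cache rounds every request up to a whole number of buckets before looking for a matching entry. The arithmetic can be tried out in isolation with the sketch below. It is only an illustration: bucket_round_size is a hypothetical helper, the bucket size is passed in because MCACHE_BUCKET_SIZE is not defined in the excerpt, and the first comparison is assumed to be "greater than" since the operator is garbled in the archive.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a requested mapping size up to a multiple of bucket_size,
 * following the three steps in the patch:
 *   1. if the offset inside the bucket plus the tail of the request
 *      crosses a bucket boundary, reserve one more bucket;
 *   2. round the size up to the next multiple of bucket_size;
 *   3. a size of 0 means "one bucket".
 */
static uint64_t bucket_round_size(uint64_t offset_in_bucket, uint64_t size,
                                  uint64_t bucket_size)
{
    assert(bucket_size > 0 && offset_in_bucket < bucket_size);

    if (offset_in_bucket + (size % bucket_size) > bucket_size) {
        size += bucket_size;
    }
    if (size % bucket_size) {
        size += bucket_size - (size % bucket_size);
    }
    if (size == 0) {
        size = bucket_size;
    }
    return size;
}
```

With an arbitrarily chosen 64 KiB bucket, a request of 0x2000 bytes starting at offset 0xF000 straddles a boundary and is rounded to two buckets, while the same request at offset 0x100 fits in one.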
Re: [Qemu-devel] [PATCH] do not send packet to nic if the packet will be dropped by nic
At 06/20/2011 05:10 PM, Kevin Wolf wrote:
> On 17.06.2011 03:33, Wen Congyang wrote:
>> If !s->clock_enabled or !rtl8139_receiver_enabled(s), it means that
>> the nic will drop all packets from host. So qemu will keep getting
>> packets from host and wasting CPU on dropping packets. This seems
>> worse than packets that should be dropped but aren't.
>>
>> Signed-off-by: Wen Congyang we...@cn.fujitsu.com
>
> Which bug does this change fix? I'm still not convinced that we should
> do it.

Maybe not a bug fix now. As Michael S. Tsirkin said, if
rtl8139_can_receive() returns 1, qemu will keep getting packets from
host and wasting CPU on dropping packets. We can save CPU by returning 0.

>> ---
>>  hw/rtl8139.c |    4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/rtl8139.c b/hw/rtl8139.c
>> index 2f8db58..9084678 100644
>> --- a/hw/rtl8139.c
>> +++ b/hw/rtl8139.c
>> @@ -810,9 +810,9 @@ static int rtl8139_can_receive(VLANClientState *nc)
>>
>>      /* Receive (drop) packets if card is disabled.  */
>
> This comment isn't accurate any more after applying the patch.
>
>>      if (!s->clock_enabled)
>> -      return 1;
>> +      return 0;
>>      if (!rtl8139_receiver_enabled(s))
>> -      return 1;
>> +      return 0;
>>
>>      if (rtl8139_cp_receiver_enabled(s)) {
>>          /* ??? Flow control not implemented in c+ mode.
>> --
>> 1.7.1
>
> Kevin
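The two behaviours being debated in this thread can be put side by side in a small sketch. This is not the rtl8139 code: NicState and both functions are hypothetical, and rx_space_available stands in for whatever buffer check the real device performs once it is enabled. The return values follow the thread: 1 means "hand me the packet" (a disabled card then drops it), 0 means "do not deliver now".

```c
#include <stdbool.h>

/* Hypothetical, reduced device state; not the real RTL8139State. */
typedef struct NicState {
    bool clock_enabled;
    bool receiver_enabled;
    bool rx_space_available;   /* stand-in for the enabled-path buffer check */
} NicState;

/* Behaviour before the patch: a disabled card still accepts packets,
 * so each one is delivered and then dropped by the device model. */
static int can_receive_old(const NicState *s)
{
    if (!s->clock_enabled || !s->receiver_enabled) {
        return 1;
    }
    return s->rx_space_available ? 1 : 0;
}

/* Behaviour proposed by the patch: a disabled card refuses packets,
 * so the caller does not spend time delivering them. */
static int can_receive_new(const NicState *s)
{
    if (!s->clock_enabled || !s->receiver_enabled) {
        return 0;
    }
    return s->rx_space_available ? 1 : 0;
}
```

The trade-off raised in the review is visible here: the new variant saves the work of delivering packets that will be discarded, but packets that the old variant would have dropped are no longer consumed while the card is disabled.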
[Qemu-devel] [PATCH 02/11] xen: Introduce VGA sync dirty bitmap support
From: Anthony PERARD anthony.per...@citrix.com This patch introduces phys memory client for Xen. Only sync dirty_bitmap and set_memory are actually implemented. migration_log will stay empty for the moment. Xen can only log one range for bit change, so only the range in the first call will be synced. Signed-off-by: Anthony PERARD anthony.per...@citrix.com Signed-off-by: Alexander Graf ag...@suse.de --- trace-events |1 + xen-all.c| 267 ++ 2 files changed, 268 insertions(+), 0 deletions(-) diff --git a/trace-events b/trace-events index f1230f1..46a19d3 100644 --- a/trace-events +++ b/trace-events @@ -396,6 +396,7 @@ disable milkymist_vgafb_memory_write(uint32_t addr, uint32_t value) addr %08x v # xen-all.c disable xen_ram_alloc(unsigned long ram_addr, unsigned long size) requested: %#lx, size %#lx +disable xen_client_set_memory(uint64_t start_addr, unsigned long size, unsigned long phys_offset, bool log_dirty) %#PRIx64 size %#lx, offset %#lx, log_dirty %i # xen-mapcache.c disable qemu_map_cache(uint64_t phys_addr) want %#PRIx64 diff --git a/xen-all.c b/xen-all.c index 0eac202..75a82c2 100644 --- a/xen-all.c +++ b/xen-all.c @@ -13,6 +13,7 @@ #include hw/xen_common.h #include hw/xen_backend.h +#include range.h #include xen-mapcache.h #include trace.h @@ -54,6 +55,14 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu) #define BUFFER_IO_MAX_DELAY 100 +typedef struct XenPhysmap { +target_phys_addr_t start_addr; +ram_addr_t size; +target_phys_addr_t phys_offset; + +QLIST_ENTRY(XenPhysmap) list; +} XenPhysmap; + typedef struct XenIOState { shared_iopage_t *shared_page; buffered_iopage_t *buffered_io_page; @@ -66,6 +75,9 @@ typedef struct XenIOState { int send_vcpu; struct xs_handle *xenstore; +CPUPhysMemoryClient client; +QLIST_HEAD(, XenPhysmap) physmap; +const XenPhysmap *log_for_dirtybit; Notifier exit; } XenIOState; @@ -178,6 +190,256 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size) qemu_free(pfn_list); } +static XenPhysmap 
*get_physmapping(XenIOState *state, + target_phys_addr_t start_addr, ram_addr_t size) +{ +XenPhysmap *physmap = NULL; + +start_addr = TARGET_PAGE_MASK; + +QLIST_FOREACH(physmap, state-physmap, list) { +if (range_covers_byte(physmap-start_addr, physmap-size, start_addr)) { +return physmap; +} +} +return NULL; +} + +#if CONFIG_XEN_CTRL_INTERFACE_VERSION = 340 +static int xen_add_to_physmap(XenIOState *state, + target_phys_addr_t start_addr, + ram_addr_t size, + target_phys_addr_t phys_offset) +{ +unsigned long i = 0; +int rc = 0; +XenPhysmap *physmap = NULL; +target_phys_addr_t pfn, start_gpfn; + +if (get_physmapping(state, start_addr, size)) { +return 0; +} +if (size = 0) { +return -1; +} + +DPRINTF(mapping vram to %llx - %llx, from %llx\n, +start_addr, start_addr + size, phys_offset); + +pfn = phys_offset TARGET_PAGE_BITS; +start_gpfn = start_addr TARGET_PAGE_BITS; +for (i = 0; i size TARGET_PAGE_BITS; i++) { +unsigned long idx = pfn + i; +xen_pfn_t gpfn = start_gpfn + i; + +rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, idx, gpfn); +if (rc) { +DPRINTF(add_to_physmap MFN %PRI_xen_pfn to PFN % +PRI_xen_pfn failed: %d\n, idx, gpfn, rc); +return -rc; +} +} + +physmap = qemu_malloc(sizeof (XenPhysmap)); + +physmap-start_addr = start_addr; +physmap-size = size; +physmap-phys_offset = phys_offset; + +QLIST_INSERT_HEAD(state-physmap, physmap, list); + +xc_domain_pin_memory_cacheattr(xen_xc, xen_domid, + start_addr TARGET_PAGE_BITS, + (start_addr + size) TARGET_PAGE_BITS, + XEN_DOMCTL_MEM_CACHEATTR_WB); +return 0; +} + +static int xen_remove_from_physmap(XenIOState *state, + target_phys_addr_t start_addr, + ram_addr_t size) +{ +unsigned long i = 0; +int rc = 0; +XenPhysmap *physmap = NULL; +target_phys_addr_t phys_offset = 0; + +physmap = get_physmapping(state, start_addr, size); +if (physmap == NULL) { +return -1; +} + +phys_offset = physmap-phys_offset; +size = physmap-size; + +DPRINTF(unmapping vram to %llx - %llx, from %llx\n, +phys_offset, 
phys_offset + size, start_addr); + +size = TARGET_PAGE_BITS; +start_addr = TARGET_PAGE_BITS; +phys_offset = TARGET_PAGE_BITS; +for (i = 0; i size; i++) { +unsigned long idx = start_addr + i; +xen_pfn_t gpfn =
[Qemu-devel] [PATCH 07/11] xen: mapcache performance improvements
From: Stefano Stabellini stefano.stabell...@eu.citrix.com Use qemu_invalidate_entry in cpu_physical_memory_unmap. Do not lock mapcache entries in qemu_get_ram_ptr if the address falls in the ramblock with offset == 0. We don't need to do that because the callers of qemu_get_ram_ptr either try to map an entire block, other from the main ramblock, or until the end of a page to implement a single read or write in the main ramblock. If we don't lock mapcache entries in qemu_get_ram_ptr we don't need to call qemu_invalidate_entry in qemu_put_ram_ptr anymore because we can leave with few long lived block mappings requested by devices. Also move the call to qemu_ram_addr_from_mapcache at the beginning of qemu_ram_addr_from_host. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Signed-off-by: Alexander Graf ag...@suse.de --- exec.c | 28 ++-- 1 files changed, 10 insertions(+), 18 deletions(-) diff --git a/exec.c b/exec.c index 238c173..7f14332 100644 --- a/exec.c +++ b/exec.c @@ -3085,9 +3085,10 @@ void *qemu_get_ram_ptr(ram_addr_t addr) if (xen_mapcache_enabled()) { /* We need to check if the requested address is in the RAM * because we don't want to map the entire memory in QEMU. + * In that case just map until the end of the page. */ if (block-offset == 0) { -return qemu_map_cache(addr, 0, 1); +return qemu_map_cache(addr, 0, 0); } else if (block-host == NULL) { block-host = qemu_map_cache(block-offset, block-length, 1); } @@ -3114,9 +3115,10 @@ void *qemu_safe_ram_ptr(ram_addr_t addr) if (xen_mapcache_enabled()) { /* We need to check if the requested address is in the RAM * because we don't want to map the entire memory in QEMU. + * In that case just map until the end of the page. 
*/ if (block-offset == 0) { -return qemu_map_cache(addr, 0, 1); +return qemu_map_cache(addr, 0, 0); } else if (block-host == NULL) { block-host = qemu_map_cache(block-offset, block-length, 1); } @@ -3159,10 +3161,6 @@ void *qemu_ram_ptr_length(target_phys_addr_t addr, target_phys_addr_t *size) void qemu_put_ram_ptr(void *addr) { trace_qemu_put_ram_ptr(addr); - -if (xen_mapcache_enabled()) { -qemu_invalidate_entry(block-host); -} } int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr) @@ -3170,6 +3168,11 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr) RAMBlock *block; uint8_t *host = ptr; +if (xen_mapcache_enabled()) { +*ram_addr = qemu_ram_addr_from_mapcache(ptr); +return 0; +} + QLIST_FOREACH(block, ram_list.blocks, next) { /* This case append when the block is not mapped. */ if (block-host == NULL) { @@ -3181,11 +3184,6 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr) } } -if (xen_mapcache_enabled()) { -*ram_addr = qemu_ram_addr_from_mapcache(ptr); -return 0; -} - return -1; } @@ -4086,13 +4084,7 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len, } } if (xen_mapcache_enabled()) { -uint8_t *buffer1 = buffer; -uint8_t *end_buffer = buffer + len; - -while (buffer1 end_buffer) { -qemu_put_ram_ptr(buffer1); -buffer1 += TARGET_PAGE_SIZE; -} +qemu_invalidate_entry(buffer); } return; } -- 1.6.0.2
Re: [Qemu-devel] [PATCH] do not send packet to nic if the packet will be dropped by nic
On 20.06.2011 11:40, Wen Congyang wrote:
> At 06/20/2011 05:10 PM, Kevin Wolf wrote:
>> On 17.06.2011 03:33, Wen Congyang wrote:
>>> If !s->clock_enabled or !rtl8139_receiver_enabled(s), it means that
>>> the nic will drop all packets from host. So qemu will keep getting
>>> packets from host and wasting CPU on dropping packets. This seems
>>> worse than packets that should be dropped but aren't.
>>>
>>> Signed-off-by: Wen Congyang we...@cn.fujitsu.com
>>
>> Which bug does this change fix? I'm still not convinced that we should
>> do it.
>
> Maybe not a bug fix now. As Michael S. Tsirkin said, if
> rtl8139_can_receive() returns 1, qemu will keep getting packets from
> host and wasting CPU on dropping packets. We can save CPU by returning 0.

Don't we waste memory instead then because we leave the packets queued indefinitely?

Kevin
[Qemu-devel] [PATCH 0/2] Suspend (S3) support
The first patch is a slightly revised version of a patch sent before, introducing a print helper (qxl_mode_to_string) that is used by the second patch too, hence I'm sending them together. I've looked for additional places to use qxl_mode_to_string like Gerd asked before, and found just one.

The second patch is the one adding support for QXL_IO_UPDATE_MEM. This is just part of the suspend support. It requires a revised spice-server (to implement the update_mem io), and a revised driver (patches on spice-devel for the windows driver, the linux one is still to be done).

Alon Levy (2):
  qxl: interface_get_command: fix reported mode
  qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support

 hw/qxl.c |   47 ++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 44 insertions(+), 3 deletions(-)

-- 
1.7.5.2
Re: [Qemu-devel] [PATCH] do not send packet to nic if the packet will be dropped by nic
On Mon, Jun 20, 2011 at 11:52:20AM +0200, Kevin Wolf wrote:
> On 20.06.2011 11:40, Wen Congyang wrote:
>> At 06/20/2011 05:10 PM, Kevin Wolf wrote:
>>> On 17.06.2011 03:33, Wen Congyang wrote:
>>>> If !s->clock_enabled or !rtl8139_receiver_enabled(s), it means that
>>>> the nic will drop all packets from host. So qemu will keep getting
>>>> packets from host and wasting CPU on dropping packets. This seems
>>>> worse than packets that should be dropped but aren't.
>>>>
>>>> Signed-off-by: Wen Congyang we...@cn.fujitsu.com
>>>
>>> Which bug does this change fix? I'm still not convinced that we
>>> should do it.
>>
>> Maybe not a bug fix now. As Michael S. Tsirkin said, if
>> rtl8139_can_receive() returns 1, qemu will keep getting packets from
>> host and wasting CPU on dropping packets. We can save CPU by returning 0.
>
> Don't we waste memory instead then because we leave the packets queued
> indefinitely?
>
> Kevin

Yes, but the amount of wasted memory is bounded from above, so this doesn't seem too bad to me ...

-- 
MST
[Qemu-devel] [PATCH 1/2] qxl: interface_get_command: fix reported mode
report correct mode when in undefined mode.
introduces qxl_mode_to_string, and uses it in another place that looks
helpful (qxl_post_load)
---
 hw/qxl.c |   21 ++++++++++++++++++---
 1 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 1906e84..ca5c8b3 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -336,6 +336,21 @@ static void interface_get_init_info(QXLInstance *sin, QXLDevInitInfo *info)
     info->n_surfaces = NUM_SURFACES;
 }
 
+static const char *qxl_mode_to_string(int mode)
+{
+    switch (mode) {
+    case QXL_MODE_COMPAT:
+        return "compat";
+    case QXL_MODE_NATIVE:
+        return "native";
+    case QXL_MODE_UNDEFINED:
+        return "undefined";
+    case QXL_MODE_VGA:
+        return "vga";
+    }
+    return "unknown";
+}
+
 /* called from spice server thread context only */
 static int interface_get_command(QXLInstance *sin, struct QXLCommandExt *ext)
 {
@@ -364,8 +379,7 @@ static int interface_get_command(QXLInstance *sin, struct QXLCommandExt *ext)
     case QXL_MODE_COMPAT:
     case QXL_MODE_NATIVE:
     case QXL_MODE_UNDEFINED:
-        dprint(qxl, 2, "%s: %s\n", __FUNCTION__,
-               qxl->cmdflags ? "compat" : "native");
+        dprint(qxl, 2, "%s: %s\n", __FUNCTION__, qxl_mode_to_string(qxl->mode));
         ring = &qxl->ram->cmd_ring;
         if (SPICE_RING_IS_EMPTY(ring)) {
             return false;
@@ -1378,7 +1392,8 @@ static int qxl_post_load(void *opaque, int version)
 
     d->modes = (QXLModes*)((uint8_t*)d->rom + d->rom->modes_offset);
 
-    dprint(d, 1, "%s: restore mode\n", __FUNCTION__);
+    dprint(d, 1, "%s: restore mode (%s)\n", __FUNCTION__,
+           qxl_mode_to_string(d->mode));
     newmode = d->mode;
     d->mode = QXL_MODE_UNDEFINED;
     switch (newmode) {
-- 
1.7.5.2
Re: [Qemu-devel] [PATCH 0/9] AREG0 patches
On 19.06.2011 23:55, Andreas Färber wrote:
> On 19.06.2011 at 22:57, Blue Swirl wrote:
>
>> These and the stack frame patches can be found in
>> git://repo.or.cz/qemu/blueswirl.git
>>
>> Blue Swirl (9):
>>   cpu_loop_exit: avoid using AREG0
>>   sparc: fix coding style of the area to be moved
>>   sparc: move do_interrupt to helper.c
>>   x86: use caller supplied CPUState for interrupt related stuff
>>   m68k: use caller supplied CPUState for interrupt related stuff
>>   cpu-exec: unify do_interrupt call
>>   exec.h: fix coding style and change cpu_has_work to return bool
>>   Move cpu_has_work and cpu_pc_from_tb to cpu.h
>>   Remove exec-all.h include directives
>
> This is getting rather unhandy with two series... Could you please
> check that chainreplyto = true under [sendemail]? I have no other
> related options set, and it used to work via Gmail last time I tried.

Actually, chainreplyto = false is what you want, so that all patches are replies to patch 0 instead of patch n-1.

Of course, you need to send off the whole series with only a single git-send-email invocation for it to work, like

  git send-email 00*.patch

Kevin
[Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support
Add QXL_IO_UPDATE_MEM. Used to reduce vmexits from the guest when it
resets the spice server state before going to sleep.

The implementation requires an updated spice-server (0.8.2) with the new
worker function update_mem.

Cc: Yonit Halperin yhalp...@redhat.com
---
 hw/qxl.c |   26 ++++++++++++++++++++++++++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index ca5c8b3..4945d95 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1016,6 +1016,32 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
     case QXL_IO_DESTROY_ALL_SURFACES:
         d->ssd.worker->destroy_surfaces(d->ssd.worker);
         break;
+    case QXL_IO_UPDATE_MEM:
+        dprint(d, 1, "QXL_IO_UPDATE_MEM (%d) entry (%s, s#=%d, res#=%d)\n",
+            val, qxl_mode_to_string(d->mode), d->guest_surfaces.count,
+            d->num_free_res);
+        switch (val) {
+        case (QXL_UPDATE_MEM_RENDER_ALL):
+            d->ssd.worker->update_mem(d->ssd.worker);
+            break;
+        case (QXL_UPDATE_MEM_FLUSH): {
+            QXLReleaseRing *ring = &d->ram->release_ring;
+            if (ring->prod - ring->cons + 1 == ring->num_items) {
+                // TODO - return a value to the guest and let it loop?
+                fprintf(stderr,
+                    "ERROR: no flush, full release ring [p%d,%dc]\n",
+                    ring->prod, ring->cons);
+            }
+            qxl_push_free_res(d, 1 /* flush */);
+            dprint(d, 1, "QXL_IO_UPDATE_MEM exit (%s, s#=%d, res#=%d,%p)\n",
+                qxl_mode_to_string(d->mode), d->guest_surfaces.count,
+                d->num_free_res, d->last_release);
+            break;
+        }
+        default:
+            fprintf(stderr, "ERROR: unexpected value to QXL_IO_UPDATE_MEM\n");
+        }
+        break;
     default:
         fprintf(stderr, "%s: ioport=0x%x, abort()\n", __FUNCTION__, io_port);
         abort();
-- 
1.7.5.2
[Qemu-devel] [PATCH v2 0/3] Let RTC follow backward jumps of host clock immediately
Just noticed that this issue is still unfixed because my series was somehow forgotten. So I've rebased it over current master, refactored it to use the generic Notifier infrastructure and renamed it to "clock reset notifier" to avoid confusion with icount-related warping.

Please review / apply before 0.15-rc0, it fixes a relevant issue.

Original series description:

By default, we base the mc146818 RTC on the host clock (CLOCK_REALTIME). This works fine if only the frequency of the host clock is tuned (e.g. by NTP) or if it is set to a future time. However, if the host clock is set backward, e.g. because NTP obtained the correct time after the guest was already started or the admin decided to adjust the local time, we see an unpleasant effect in the guest: The RTC will stall for the period the host clock is set back.

We identified that one prominent guest affected by this is Windows, which relies on the periodic RTC interrupt for time keeping.

This series addresses the issue by detecting those warps and providing a callback mechanism to device models. The RTC is enabled to update its timers and register content immediately. Tested successfully both with hwclock in a Linux guest and by monitoring the Windows clock while fiddling with the host time.

Note that if this kind of RTC adjustment is not wanted, the user is still free to decouple the RTC from the host clock and base it on the VM clock - just like before.

Jan Kiszka (3):
  notifier: Pass data argument to callback
  qemu-timer: Introduce clock reset notifier
  mc146818rtc: Handle host clock resets

 hw/fw_cfg.c      |    2 +-
 hw/mc146818rtc.c |   20 ++++++++++++++++++++
 input.c          |    2 +-
 migration.c      |   12 ++++++------
 notify.c         |    4 ++--
 notify.h         |    4 ++--
 qemu-timer.c     |   29 ++++++++++++++++++++++++++++-
 qemu-timer.h     |    5 +++++
 ui/sdl.c         |    2 +-
 ui/spice-core.c  |    2 +-
 ui/spice-input.c |    4 ++--
 ui/vnc.c         |    4 ++--
 usb-linux.c      |    2 +-
 vl.c             |    4 ++--
 xen-all.c        |    2 +-
 15 files changed, 75 insertions(+), 23 deletions(-)
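
The detection described in the series can be illustrated with a small standalone sketch. This is not the code from the patches: the names HostClock, ResetCallback, host_clock_read and DemoRtc are hypothetical, the callback table is a fixed array instead of QEMU's NotifierList, and the current time is passed in as a parameter (standing in for a CLOCK_REALTIME readout) so the example stays deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RESET_CALLBACKS 4

/* Hypothetical callback type: gets its opaque pointer and the new time. */
typedef void (*ResetCallback)(void *opaque, int64_t now);

typedef struct HostClock {
    int64_t last;                 /* result of the previous readout */
    size_t n_callbacks;
    ResetCallback callbacks[MAX_RESET_CALLBACKS];
    void *opaques[MAX_RESET_CALLBACKS];
} HostClock;

static void host_clock_init(HostClock *clock, int64_t now)
{
    assert(clock != NULL);
    clock->last = now;            /* needed to detect the first backward jump */
    clock->n_callbacks = 0;
}

/* Returns 0 on success, -1 if the callback table is full. */
static int host_clock_register(HostClock *clock, ResetCallback cb, void *opaque)
{
    if (clock->n_callbacks == MAX_RESET_CALLBACKS) {
        return -1;
    }
    clock->callbacks[clock->n_callbacks] = cb;
    clock->opaques[clock->n_callbacks] = opaque;
    clock->n_callbacks++;
    return 0;
}

/*
 * 'now' stands in for a fresh host clock readout. If it is earlier than
 * the previous readout, every registered callback is told the new time.
 */
static int64_t host_clock_read(HostClock *clock, int64_t now)
{
    int64_t last = clock->last;
    size_t i;

    clock->last = now;
    if (now < last) {
        for (i = 0; i < clock->n_callbacks; i++) {
            clock->callbacks[i](clock->opaques[i], now);
        }
    }
    return now;
}

/* Example consumer: counts resets and remembers the time it resynced to. */
typedef struct DemoRtc {
    int resets;
    int64_t resynced_at;
} DemoRtc;

static void demo_rtc_on_reset(void *opaque, int64_t now)
{
    DemoRtc *rtc = opaque;

    rtc->resets++;
    rtc->resynced_at = now;
}
```

A forward step or a frequency adjustment never triggers the callbacks here; only a readout that is earlier than the previous one does, which matches the case the cover letter singles out.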
[Qemu-devel] [PATCH v2 3/3] mc146818rtc: Handle host clock resets
Make use of the new clock reset notifier to update the RTC whenever
rtc_clock is the host clock and that happens to jump backward. This
avoids that the RTC stalls for the period the host clock was set back.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/mc146818rtc.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
index 1c9a706..feb3b25 100644
--- a/hw/mc146818rtc.c
+++ b/hw/mc146818rtc.c
@@ -99,6 +99,7 @@ typedef struct RTCState {
     QEMUTimer *coalesced_timer;
     QEMUTimer *second_timer;
     QEMUTimer *second_timer2;
+    Notifier clock_reset_notifier;
 } RTCState;
 
 static void rtc_set_time(RTCState *s);
@@ -572,6 +573,22 @@ static const VMStateDescription vmstate_rtc = {
     }
 };
 
+static void rtc_notify_clock_reset(Notifier *notifier, void *data)
+{
+    RTCState *s = container_of(notifier, RTCState, clock_reset_notifier);
+    int64_t now = *(int64_t *)data;
+
+    rtc_set_date_from_host(&s->dev);
+    s->next_second_time = now + (get_ticks_per_sec() * 99) / 100;
+    qemu_mod_timer(s->second_timer2, s->next_second_time);
+    rtc_timer_update(s, now);
+#ifdef TARGET_I386
+    if (rtc_td_hack) {
+        rtc_coalesced_timer_update(s);
+    }
+#endif
+}
+
 static void rtc_reset(void *opaque)
 {
     RTCState *s = opaque;
@@ -608,6 +625,9 @@ static int rtc_initfn(ISADevice *dev)
     s->second_timer = qemu_new_timer_ns(rtc_clock, rtc_update_second, s);
     s->second_timer2 = qemu_new_timer_ns(rtc_clock, rtc_update_second2, s);
 
+    s->clock_reset_notifier.notify = rtc_notify_clock_reset;
+    qemu_register_clock_reset_notifier(rtc_clock, &s->clock_reset_notifier);
+
     s->next_second_time =
         qemu_get_clock_ns(rtc_clock) + (get_ticks_per_sec() * 99) / 100;
     qemu_mod_timer(s->second_timer2, s->next_second_time);
-- 
1.7.1
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support
  Hi,

> +    case QXL_IO_UPDATE_MEM:
> +        switch (val) {
> +        case (QXL_UPDATE_MEM_RENDER_ALL):
> +            d->ssd.worker->update_mem(d->ssd.worker);
> +            break;

What is the difference to one worker->stop() + worker->start() cycle?

cheers,
  Gerd
[Qemu-devel] [PATCH v2 1/3] notifier: Pass data argument to callback
This allows to pass additional information to the notifier callback which is useful if sender and receiver do not share any other distinct data structure. Will be used first for the clock reset notifier. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- hw/fw_cfg.c |2 +- input.c |2 +- migration.c | 12 ++-- notify.c |4 ++-- notify.h |4 ++-- ui/sdl.c |2 +- ui/spice-core.c |2 +- ui/spice-input.c |4 ++-- ui/vnc.c |4 ++-- usb-linux.c |2 +- vl.c |4 ++-- xen-all.c|2 +- 12 files changed, 22 insertions(+), 22 deletions(-) diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c index 85c8c3c..34e7526 100644 --- a/hw/fw_cfg.c +++ b/hw/fw_cfg.c @@ -316,7 +316,7 @@ int fw_cfg_add_file(FWCfgState *s, const char *filename, uint8_t *data, return 1; } -static void fw_cfg_machine_ready(struct Notifier* n) +static void fw_cfg_machine_ready(struct Notifier *n, void *data) { uint32_t len; FWCfgState *s = container_of(n, FWCfgState, machine_ready); diff --git a/input.c b/input.c index 5664d3a..894d57f 100644 --- a/input.c +++ b/input.c @@ -59,7 +59,7 @@ static void check_mode_change(void) if (is_absolute != current_is_absolute || has_absolute != current_has_absolute) { -notifier_list_notify(mouse_mode_notifiers); +notifier_list_notify(mouse_mode_notifiers, NULL); } current_is_absolute = is_absolute; diff --git a/migration.c b/migration.c index af3a1f2..2a15b98 100644 --- a/migration.c +++ b/migration.c @@ -124,7 +124,7 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data) } current_migration = s; -notifier_list_notify(migration_state_notifiers); +notifier_list_notify(migration_state_notifiers, NULL); return 0; } @@ -276,7 +276,7 @@ void migrate_fd_error(FdMigrationState *s) { DPRINTF(setting error state\n); s-state = MIG_STATE_ERROR; -notifier_list_notify(migration_state_notifiers); +notifier_list_notify(migration_state_notifiers, NULL); migrate_fd_cleanup(s); } @@ -334,7 +334,7 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size) monitor_resume(s-mon); } 
s-state = MIG_STATE_ERROR; -notifier_list_notify(migration_state_notifiers); +notifier_list_notify(migration_state_notifiers, NULL); } return ret; @@ -395,7 +395,7 @@ void migrate_fd_put_ready(void *opaque) state = MIG_STATE_ERROR; } s-state = state; -notifier_list_notify(migration_state_notifiers); +notifier_list_notify(migration_state_notifiers, NULL); } } @@ -415,7 +415,7 @@ void migrate_fd_cancel(MigrationState *mig_state) DPRINTF(cancelling migration\n); s-state = MIG_STATE_CANCELLED; -notifier_list_notify(migration_state_notifiers); +notifier_list_notify(migration_state_notifiers, NULL); qemu_savevm_state_cancel(s-mon, s-file); migrate_fd_cleanup(s); @@ -429,7 +429,7 @@ void migrate_fd_release(MigrationState *mig_state) if (s-state == MIG_STATE_ACTIVE) { s-state = MIG_STATE_CANCELLED; -notifier_list_notify(migration_state_notifiers); +notifier_list_notify(migration_state_notifiers, NULL); migrate_fd_cleanup(s); } qemu_free(s); diff --git a/notify.c b/notify.c index bcd3fc5..a6bac1f 100644 --- a/notify.c +++ b/notify.c @@ -29,11 +29,11 @@ void notifier_list_remove(NotifierList *list, Notifier *notifier) QTAILQ_REMOVE(list-notifiers, notifier, node); } -void notifier_list_notify(NotifierList *list) +void notifier_list_notify(NotifierList *list, void *data) { Notifier *notifier, *next; QTAILQ_FOREACH_SAFE(notifier, list-notifiers, node, next) { -notifier-notify(notifier); +notifier-notify(notifier, data); } } diff --git a/notify.h b/notify.h index b40522f..54fc57c 100644 --- a/notify.h +++ b/notify.h @@ -20,7 +20,7 @@ typedef struct Notifier Notifier; struct Notifier { -void (*notify)(Notifier *notifier); +void (*notify)(Notifier *notifier, void *data); QTAILQ_ENTRY(Notifier) node; }; @@ -38,6 +38,6 @@ void notifier_list_add(NotifierList *list, Notifier *notifier); void notifier_list_remove(NotifierList *list, Notifier *notifier); -void notifier_list_notify(NotifierList *list); +void notifier_list_notify(NotifierList *list, void *data); #endif diff --git 
a/ui/sdl.c b/ui/sdl.c index f2bd4a0..6dbc5cb 100644 --- a/ui/sdl.c +++ b/ui/sdl.c @@ -481,7 +481,7 @@ static void sdl_grab_end(void) sdl_update_caption(); } -static void sdl_mouse_mode_change(Notifier *notify) +static void sdl_mouse_mode_change(Notifier *notify, void *data) { if (kbd_mouse_is_absolute()) { if (!absolute_enabled) { diff --git a/ui/spice-core.c b/ui/spice-core.c index dd9905b..e134f04 100644 --- a/ui/spice-core.c +++
[Qemu-devel] [PATCH v2 2/3] qemu-timer: Introduce clock reset notifier
QEMU_CLOCK_HOST is based on the system time which may jump backward in case the admin or NTP adjusts it. RTC emulations and other device models can suffer in this case as timers will stall for the period the clock was tuned back. This adds a detection mechanism that checks on every host clock readout if the new time is before the last result. If that is the case a notifier list is informed. Device models interested in this event can register a notifier with the clock. Signed-off-by: Jan Kiszka jan.kis...@siemens.com --- qemu-timer.c | 29 - qemu-timer.h |5 + 2 files changed, 33 insertions(+), 1 deletions(-) diff --git a/qemu-timer.c b/qemu-timer.c index 72066c7..df323ae 100644 --- a/qemu-timer.c +++ b/qemu-timer.c @@ -150,6 +150,9 @@ struct QEMUClock { int enabled; QEMUTimer *warp_timer; + +NotifierList reset_notifiers; +int64_t last; }; struct QEMUTimer { @@ -375,9 +378,15 @@ static QEMUTimer *active_timers[QEMU_NUM_CLOCKS]; static QEMUClock *qemu_new_clock(int type) { QEMUClock *clock; + clock = qemu_mallocz(sizeof(QEMUClock)); clock-type = type; clock-enabled = 1; +notifier_list_init(clock-reset_notifiers); +/* required to detect report backward jumps */ +if (type == QEMU_CLOCK_HOST) { +clock-last = get_clock_realtime(); +} return clock; } @@ -592,6 +601,8 @@ static void qemu_run_timers(QEMUClock *clock) int64_t qemu_get_clock_ns(QEMUClock *clock) { +int64_t now, last; + switch(clock-type) { case QEMU_CLOCK_REALTIME: return get_clock(); @@ -603,10 +614,26 @@ int64_t qemu_get_clock_ns(QEMUClock *clock) return cpu_get_clock(); } case QEMU_CLOCK_HOST: -return get_clock_realtime(); +now = get_clock_realtime(); +last = clock-last; +clock-last = now; +if (now last) { +notifier_list_notify(clock-reset_notifiers, now); +} +return now; } } +void qemu_register_clock_reset_notifier(QEMUClock *clock, Notifier *notifier) +{ +notifier_list_add(clock-reset_notifiers, notifier); +} + +void qemu_unregister_clock_reset_notifier(QEMUClock *clock, Notifier *notifier) +{ 
+notifier_list_remove(clock-reset_notifiers, notifier); +} + void init_clocks(void) { rt_clock = qemu_new_clock(QEMU_CLOCK_REALTIME); diff --git a/qemu-timer.h b/qemu-timer.h index 06cbe20..0a43469 100644 --- a/qemu-timer.h +++ b/qemu-timer.h @@ -2,6 +2,7 @@ #define QEMU_TIMER_H #include qemu-common.h +#include notify.h #include time.h #include sys/time.h @@ -40,6 +41,10 @@ int64_t qemu_get_clock_ns(QEMUClock *clock); void qemu_clock_enable(QEMUClock *clock, int enabled); void qemu_clock_warp(QEMUClock *clock); +void qemu_register_clock_reset_notifier(QEMUClock *clock, Notifier *notifier); +void qemu_unregister_clock_reset_notifier(QEMUClock *clock, + Notifier *notifier); + QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale, QEMUTimerCB *cb, void *opaque); void qemu_free_timer(QEMUTimer *ts); -- 1.7.1
Re: [Qemu-devel] [PATCH v2] Optimize screendump
On 2011-06-20 10:12, Avi Kivity wrote: When running kvm-autotest, fputc() is often the second highest (sometimes #1) function showing up in a profile. This is due to fputc() locking the file for every byte written. Optimize by buffering a line's worth of pixels and writing that out in a single call. Signed-off-by: Avi Kivity a...@redhat.com --- v2: drop unportable fputc_unlocked hw/vga.c | 13 ++--- 1 files changed, 10 insertions(+), 3 deletions(-) diff --git a/hw/vga.c b/hw/vga.c index d5bc582..97c96bf 100644 --- a/hw/vga.c +++ b/hw/vga.c @@ -2349,15 +2349,19 @@ int ppm_save(const char *filename, struct DisplaySurface *ds) uint32_t v; int y, x; uint8_t r, g, b; +int ret; +char *linebuf, *pbuf; f = fopen(filename, wb); if (!f) return -1; fprintf(f, P6\n%d %d\n%d\n, ds-width, ds-height, 255); +linebuf = qemu_malloc(ds-width * 3); d1 = ds-data; for(y = 0; y ds-height; y++) { d = d1; +pbuf = linebuf; for(x = 0; x ds-width; x++) { if (ds-pf.bits_per_pixel == 32) v = *(uint32_t *)d; @@ -2369,13 +2373,16 @@ int ppm_save(const char *filename, struct DisplaySurface *ds) (ds-pf.gmax + 1); b = ((v ds-pf.bshift) ds-pf.bmax) * 256 / (ds-pf.bmax + 1); -fputc(r, f); -fputc(g, f); -fputc(b, f); +*pbuf++ = r; +*pbuf++ = g; +*pbuf++ = b; d += ds-pf.bytes_per_pixel; } d1 += ds-linesize; +ret = fwrite(linebuf, 1, pbuf - linebuf, f); +(void)ret; } +qemu_free(linebuf); fclose(f); return 0; } Unrelated to this patch, but why is this function located in vga.c and not in console.c? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
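
For illustration, the buffering idea from the patch (collect one row of converted pixels, then hand it to stdio in a single fwrite() call) can be sketched as a standalone function. This is only a sketch under assumptions, not the ppm_save() code: ppm_write_xrgb is a hypothetical name, and the source pixels are assumed to be 32-bit 0x00RRGGBB values, whereas the real function derives shifts and maxima from the surface's pixel format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Write a binary PPM (P6) image from 32-bit 0x00RRGGBB pixels
 * (assumed layout). 'stride' is the row pitch in pixels.
 * Returns 0 on success, -1 on allocation or I/O failure.
 */
static int ppm_write_xrgb(FILE *f, const uint32_t *pixels,
                          int width, int height, size_t stride)
{
    size_t row_bytes;
    uint8_t *linebuf, *p;
    int x, y;

    assert(f != NULL && pixels != NULL && width > 0 && height > 0);

    row_bytes = (size_t)width * 3;
    linebuf = malloc(row_bytes);
    if (linebuf == NULL) {
        return -1;
    }
    if (fprintf(f, "P6\n%d %d\n%d\n", width, height, 255) < 0) {
        free(linebuf);
        return -1;
    }
    for (y = 0; y < height; y++) {
        const uint32_t *src = pixels + (size_t)y * stride;

        p = linebuf;
        for (x = 0; x < width; x++) {
            uint32_t v = src[x];

            /* Fill the row buffer instead of calling fputc() per byte. */
            *p++ = (uint8_t)((v >> 16) & 0xff);
            *p++ = (uint8_t)((v >> 8) & 0xff);
            *p++ = (uint8_t)(v & 0xff);
        }
        /* One stdio call, and so one stream lock, per row. */
        if (fwrite(linebuf, 1, row_bytes, f) != row_bytes) {
            free(linebuf);
            return -1;
        }
    }
    free(linebuf);
    return 0;
}
```

Unlike the patch, which deliberately ignores the fwrite() result, this sketch reports a short write to the caller; that is a design choice of the example, not something the thread asks for.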
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support
On Mon, Jun 20, 2011 at 02:13:36PM +0200, Gerd Hoffmann wrote:
>   Hi,
>
> > +    case QXL_IO_UPDATE_MEM:
> > +        switch (val) {
> > +        case (QXL_UPDATE_MEM_RENDER_ALL):
> > +            d->ssd.worker->update_mem(d->ssd.worker);
> > +            break;
>
> What is the difference to one worker->stop() + worker->start() cycle?

this won't disconnect any clients.

> cheers,
>   Gerd
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support
On Mon, Jun 20, 2011 at 02:13:36PM +0200, Gerd Hoffmann wrote:
>   Hi,
>
> > +    case QXL_IO_UPDATE_MEM:
> > +        switch (val) {
> > +        case (QXL_UPDATE_MEM_RENDER_ALL):
> > +            d->ssd.worker->update_mem(d->ssd.worker);
> > +            break;
>
> What is the difference to one worker->stop() + worker->start() cycle?

ok, stop+start won't disconnect any clients either. But does stop render all waiting commands? I'll have to look, I don't know if it does.

> cheers,
>   Gerd
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol
On 06/18/2011 04:50 PM, Blue Swirl wrote: On Thu, Jun 16, 2011 at 5:48 PM, Corey Bryantbrynt...@us.ibm.com wrote: On 06/15/2011 03:12 PM, Blue Swirl wrote: On Tue, Jun 14, 2011 at 4:31 PM, Corey Bryantbrynt...@us.ibm.comwrote: sVirt provides SELinux MAC isolation for Qemu guest processes and their corresponding resources (image files). sVirt provides this support by labeling guests and resources with security labels that are stored in file system extended attributes. Some file systems, such as NFS, do not support the extended attribute security namespace, which is needed for image file isolation when using the sVirt SELinux security driver in libvirt. The proposed solution entails a combination of Qemu, libvirt, and SELinux patches that work together to isolate multiple guests' images when they're stored in the same NFS mount. This results in an environment where sVirt isolation and NFS image file isolation can both be provided. This patch contains the Qemu code to support this solution. I would like to solicit input from the libvirt community prior to starting the libvirt patch. Currently, Qemu opens an image file in addition to performing the necessary read and write operations. The proposed solution will move the open out of Qemu and into libvirt. Once libvirt opens an image file for the guest, it will pass the file descriptor to Qemu via a new fd: protocol. If the image file resides in an NFS mount, the following SELinux policy changes will provide image isolation: - A new SELinux boolean is created (e.g. virt_read_write_nfs) to allow Qemu (svirt_t) to only have SELinux read and write permissions on nfs_t files - Qemu (svirt_t) also gets SELinux use permissions on libvirt (virtd_t) file descriptors Following is a sample invocation of Qemu using the fd: protocol on the command line: qemu -drive file=fd:4,format=qcow2 The fd: protocol is also supported by the drive_add monitor command. 
This requires that the specified file descriptor is passed to the monitor alongside a prior getfd monitor command. There are some additional features provided by certain image types where Qemu reopens the image file. All of these scenarios will be unsupported for the fd: protocol, at least for this patch: - The -snapshot command line option - The savevm monitor command - The snapshot_blkdev monitor command - Starting Qemu with a backing file There's also native CDROM device. Did you consider adding an explicit reopen method to block layer? Thanks. Yes it looks like I overlooked CDROM reopens. I'm not sure that I'm clear on the purpose of the reopen function. Would the goal be to funnel all block layer reopens through a single function, enabling potential future support where a privileged layer of Qemu, or libvirt, performs the open? Eventually yes, but I think it would help also now by moving the checks to a single place. It's a bit orthogonal to this patch though. This would definitely simplify things, especially when reopen support is added. I'm going to defer this until then. The thought is that this support can be added in the future, but is not required for the initial fd: support. This patch was tested with the following formats: raw, cow, qcow, qcow2, qed, and vmdk, using the fd: protocol from the command line and the monitor. Tests were also run to verify existing file name support and qemu-img were not regressed. Non-valid file descriptors, fd: without format, snapshot and backing files were also tested. 
Signed-off-by: Corey Bryantcor...@linux.vnet.ibm.com --- block.c | 16 ++ block.h |1 + block/cow.c |5 +++ block/qcow.c |5 +++ block/qcow2.c |5 +++ block/qed.c |4 ++ block/raw-posix.c | 81 +++-- block/vmdk.c |5 +++ block_int.h |1 + blockdev.c| 10 ++ monitor.c |5 +++ monitor.h |1 + qemu-options.hx |8 +++-- qemu-tool.c |5 +++ 14 files changed, 140 insertions(+), 12 deletions(-) diff --git a/block.c b/block.c index 24a25d5..500db84 100644 --- a/block.c +++ b/block.c @@ -536,6 +536,10 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, char tmp_filename[PATH_MAX]; char backing_filename[PATH_MAX]; +if (bdrv_is_fd_protocol(bs)) { +return -ENOTSUP; +} + /* if snapshot, we create a temporary backing file and open it instead of opening 'filename' directly */ @@ -585,6 +589,10 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, /* Find the right image format driver */ if (!drv) { +/* format must be specified for fd: protocol */ +if
Re: [Qemu-devel] [PATCH v2] Optimize screendump
On 06/20/2011 03:33 PM, Jan Kiszka wrote:
>> --- a/hw/vga.c
>> +++ b/hw/vga.c
>> @@ -2349,15 +2349,19 @@ int ppm_save(const char *filename, struct DisplaySurface *ds)
>
> Unrelated to this patch, but why is this function located in vga.c and
> not in console.c?

It's located in omap_lcdc.c as well. But it needs to be fully generalized to be moved out (handle all PixelFormats).

-- 
error compiling committee.c: too many arguments to function
[Qemu-devel] [PATCH 1/1] fix operator precedence
Signed-off-by: Frediano Ziglio fredd...@gmail.com
---
 cmd.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/cmd.c b/cmd.c
index db2c9c4..ecca167 100644
--- a/cmd.c
+++ b/cmd.c
@@ -486,7 +486,7 @@ timestr(
             snprintf(ts, size, "%u:%02u.%02u",
                 (unsigned int) MINUTES(tv->tv_sec),
                 (unsigned int) SECONDS(tv->tv_sec),
-                (unsigned int) usec * 100);
+                (unsigned int) (usec * 100));
             return;
         }
         format |= VERBOSE_FIXED_TIME; /* fallback if hours needed */
@@ -497,9 +497,9 @@ timestr(
             (unsigned int) HOURS(tv->tv_sec),
             (unsigned int) MINUTES(tv->tv_sec),
             (unsigned int) SECONDS(tv->tv_sec),
-            (unsigned int) usec * 100);
+            (unsigned int) (usec * 100));
     } else {
-        snprintf(ts, size, "0.%04u sec", (unsigned int) usec * 1);
+        snprintf(ts, size, "0.%04u sec", (unsigned int) (usec * 1));
     }
 }
-- 
1.7.1
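
The precedence issue behind this patch: in C a cast binds tighter than multiplication, so `(unsigned int) x * 100` converts x first and multiplies afterwards. The sketch below shows the difference, assuming the operand is a floating-point fraction of a second below 1.0 (the hunks themselves don't show the declaration of usec); both function names are hypothetical.

```c
#include <assert.h>

/* Pre-patch form: the cast applies to 'frac' alone, truncating it to 0. */
static unsigned int hundredths_cast_first(double frac)
{
    assert(frac >= 0.0 && frac < 1.0);
    return (unsigned int) frac * 100;
}

/* Patched form: multiply in floating point, then convert the product. */
static unsigned int hundredths_cast_last(double frac)
{
    assert(frac >= 0.0 && frac < 1.0);
    return (unsigned int) (frac * 100);
}
```

Under that assumption the unparenthesized form always yields 0, which is why the extra parentheses change the printed value rather than just the style.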
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol
On 06/14/2011 04:31 PM, Corey Bryant wrote: - Starting Qemu with a backing file For this we could tell qemu that a file named xyz is available via fd n, via an extension of the getfd command. For example (qemu) getfd path=/images/my-image.img (qemu) getfd path=/images/template.img (qemu) drive-add path=/images/my-image.img The open() for my-image.img first looks up the name in the getfd database, and finds it, so it returns the fd from there instead of opening. It then opens the backing file (template.img) and looks it up again, and finds the second fd from the session. The result is that open()s are satisfied from the monitor, instead of the host kernel, but without reversing the request/reply nature of the monitor protocol. A similar extension could be added to the command line: qemu -drive file=fd:4,cache=none -path-alias name=/images/template.img,path=fd:5 Here the main image is opened via a fd 4; if it needs template.img, it gets shunted to fd 5. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol
On 06/20/2011 08:40 AM, Avi Kivity wrote: On 06/14/2011 04:31 PM, Corey Bryant wrote: - Starting Qemu with a backing file For this we could tell qemu that a file named xyz is available via fd n, via an extension of the getfd command. For example (qemu) getfd path=/images/my-image.img (qemu) getfd path=/images/template.img (qemu) drive-add path=/images/my-image.img The open() for my-image.img first looks up the name in the getfd database, and finds it, so it returns the fd from there instead of opening. It then opens the backing file (template.img) and looks it up again, and finds the second fd from the session. The way I've been thinking about this is: -blockdev id=hd0-back,file=fd:4,format=raw \ -blockdev file=fd:3,format=qcow2,backing=hd0-back While your proposal is clever, it makes me a little nervous about subtle security ramifications. Regards, Anthony Liguori The result is that open()s are satisfied from the monitor, instead of the host kernel, but without reversing the request/reply nature of the monitor protocol. A similar extension could be added to the command line: qemu -drive file=fd:4,cache=none -path-alias name=/images/template.img,path=fd:5 Here the main image is opened via a fd 4; if it needs template.img, it gets shunted to fd 5.
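The lookup-before-open idea discussed in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: the table, its size limit, and the names alias_register(), alias_lookup() and open_via_alias() are hypothetical and are not part of QEMU or of the posted patch.

```c
#include <fcntl.h>
#include <string.h>

#define MAX_ALIASES 16          /* arbitrary limit for the sketch */

/* One entry of the hypothetical "getfd database": path name -> fd. */
struct path_alias {
    const char *path;
    int fd;
};

static struct path_alias aliases[MAX_ALIASES];
static int nb_aliases;

/* Record that 'path' is available through the already-open 'fd'. */
static int alias_register(const char *path, int fd)
{
    if (nb_aliases == MAX_ALIASES) {
        return -1;
    }
    aliases[nb_aliases].path = path;
    aliases[nb_aliases].fd = fd;
    nb_aliases++;
    return 0;
}

/* Return the registered fd for 'path', or -1 if there is none. */
static int alias_lookup(const char *path)
{
    int i;

    for (i = 0; i < nb_aliases; i++) {
        if (strcmp(aliases[i].path, path) == 0) {
            return aliases[i].fd;
        }
    }
    return -1;
}

/* Satisfy an open from the table first; fall back to the host kernel. */
static int open_via_alias(const char *path, int flags)
{
    int fd = alias_lookup(path);

    return fd >= 0 ? fd : open(path, flags);
}
```

A caller that has registered /images/template.img on fd 5 gets 5 back when the backing file is opened by name. A stricter variant could refuse the open() fallback entirely, which may be relevant to the security concern raised in the reply.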
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support
What is the difference to one worker->stop() + worker->start() cycle? ok, stop+start won't disconnect any clients either. But does stop render all waiting commands? I'll have to look, I don't know if it does. It does. This is what qemu uses to flush all spice server state to device memory on migration. What is the reason for deleting all surfaces? cheers, Gerd
Re: [Qemu-devel] [PATCH v5 2/5] guest agent: qemu-ga daemon
On Sun, 19 Jun 2011 14:00:30 -0500 Michael Roth mdr...@linux.vnet.ibm.com wrote: On 06/17/2011 10:25 PM, Luiz Capitulino wrote: On Fri, 17 Jun 2011 16:25:32 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 06/17/2011 03:13 PM, Luiz Capitulino wrote: On Fri, 17 Jun 2011 14:21:31 -0500 Michael Rothmdr...@linux.vnet.ibm.com wrote: On 06/16/2011 01:42 PM, Luiz Capitulino wrote: On Tue, 14 Jun 2011 15:06:22 -0500 Michael Rothmdr...@linux.vnet.ibm.comwrote: This is the actual guest daemon, it listens for requests over a virtio-serial/isa-serial/unix socket channel and routes them through to dispatch routines, and writes the results back to the channel in a manner similar to QMP. A shorthand invocation: qemu-ga -d Is equivalent to: qemu-ga -c virtio-serial -p /dev/virtio-ports/org.qemu.guest_agent \ -p /var/run/qemu-guest-agent.pid -d Signed-off-by: Michael Rothmdr...@linux.vnet.ibm.com Would be nice to have a more complete description, like explaining how to do a simple test. And this can't be built... --- qemu-ga.c | 631 qga/guest-agent-core.h |4 + 2 files changed, 635 insertions(+), 0 deletions(-) create mode 100644 qemu-ga.c diff --git a/qemu-ga.c b/qemu-ga.c new file mode 100644 index 000..df08d8c --- /dev/null +++ b/qemu-ga.c @@ -0,0 +1,631 @@ +/* + * QEMU Guest Agent + * + * Copyright IBM Corp. 2011 + * + * Authors: + * Adam Litkeagli...@linux.vnet.ibm.com + * Michael Rothmdr...@linux.vnet.ibm.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ +#includestdlib.h +#includestdio.h +#includestdbool.h +#includeglib.h +#includegio/gio.h +#includegetopt.h +#includetermios.h +#includesyslog.h +#include qemu_socket.h +#include json-streamer.h +#include json-parser.h +#include qint.h +#include qjson.h +#include qga/guest-agent-core.h +#include qga-qmp-commands.h +#include module.h + +#define QGA_VIRTIO_PATH_DEFAULT /dev/virtio-ports/org.qemu.guest_agent +#define QGA_PIDFILE_DEFAULT /var/run/qemu-va.pid +#define QGA_BAUDRATE_DEFAULT B38400 /* for isa-serial channels */ +#define QGA_TIMEOUT_DEFAULT 30*1000 /* ms */ + +struct GAState { +const char *proxy_path; Where is this used? Nowhere actually. Will remove. +JSONMessageParser parser; +GMainLoop *main_loop; +guint conn_id; +GSocket *conn_sock; +GIOChannel *conn_channel; +guint listen_id; +GSocket *listen_sock; +GIOChannel *listen_channel; +const char *path; +const char *method; +bool virtio; /* fastpath to check for virtio to deal with poll() quirks */ +GACommandState *command_state; +GLogLevelFlags log_level; +FILE *log_file; +bool logging_enabled; +}; + +static void usage(const char *cmd) +{ +printf( +Usage: %s -cchannel_opts\n +QEMU Guest Agent %s\n +\n + -c, --channel channel method: one of unix-connect, virtio-serial, or\n +isa-serial (virtio-serial is the default)\n + -p, --pathchannel path (%s is the default for virtio-serial)\n + -l, --logfile set logfile path, logs to stderr by default\n + -f, --pidfile specify pidfile (default is %s)\n + -v, --verbose log extra debugging information\n + -V, --version print version information and exit\n + -d, --daemonize become a daemon\n + -h, --helpdisplay this help and exit\n +\n +Report bugs tomdr...@linux.vnet.ibm.com\n +, cmd, QGA_VERSION, QGA_VIRTIO_PATH_DEFAULT, QGA_PIDFILE_DEFAULT); +} + +static void conn_channel_close(GAState *s); + +static const char *ga_log_level_str(GLogLevelFlags level) +{ +switch (levelG_LOG_LEVEL_MASK) { +case G_LOG_LEVEL_ERROR: +return error; +case G_LOG_LEVEL_CRITICAL: +return 
critical; +case G_LOG_LEVEL_WARNING: +return warning; +case G_LOG_LEVEL_MESSAGE: +return message; +case G_LOG_LEVEL_INFO: +return info; +case G_LOG_LEVEL_DEBUG: +return debug; +default: +return user; +} +} + +bool ga_logging_enabled(GAState *s) +{ +return s-logging_enabled; +} + +void ga_disable_logging(GAState *s) +{ +s-logging_enabled = false; +} + +void ga_enable_logging(GAState *s) +{ +s-logging_enabled = true; +} Just to check I got this right, this is needed because of the fsfreeze command, correct? Isn't it better to have a more descriptive name, like fsfrozen? First I thought this was about a log file. Then I realized this was probably about
Re: [Qemu-devel] [V3 1/3] Enhance info block to display hostcache setting
Am 17.06.2011 18:37, schrieb Supriya Kannery: Enhance info block to display hostcache setting for each block device. Example: (qemu) info block ide0-hd0: type=hd removable=0 file=../rhel6-32.qcow2 ro=0 drv=qcow2 encrypted=0 Enhanced to display hostcache setting: (qemu) info block ide0-hd0: type=hd removable=0 hostcache=true file=../rhel6-32.qcow2 ro=0 drv=qcow2 encrypted=0 Signed-off-by: Supriya Kannery supri...@in.ibm.com --- block.c | 21 + qmp-commands.hx |2 ++ 2 files changed, 19 insertions(+), 4 deletions(-) Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -1694,6 +1694,14 @@ static void bdrv_print_dict(QObject *obj monitor_printf(mon, locked=%d, qdict_get_bool(bs_dict, locked)); } + if (qdict_haskey(bs_dict, open_flags)) { + int open_flags = qdict_get_int(bs_dict, open_flags); + if (open_flags BDRV_O_NOCACHE) + monitor_printf(mon, hostcache=false); + else + monitor_printf(mon, hostcache=true); Coding style requires braces. + } + if (qdict_haskey(bs_dict, inserted)) { QDict *qdict = qobject_to_qdict(qdict_get(bs_dict, inserted)); @@ -1730,13 +1738,18 @@ void bdrv_info(Monitor *mon, QObject **r QObject *bs_obj; bs_obj = qobject_from_jsonf({ 'device': %s, 'type': 'unknown', -'removable': %i, 'locked': %i }, -bs-device_name, bs-removable, -bs-locked); + 'removable': %i, 'locked': %i, + 'hostcache': %s }, + bs-device_name, bs-removable, + bs-locked, + (bs-open_flags BDRV_O_NOCACHE) ? + false : true); Don't use tabs. Kevin
Re: [Qemu-devel] [V3 2/3] Error classes for file reopen and device insertion
On 17.06.2011 18:37, Supriya Kannery wrote: New error classes defined for cases where device not inserted and file reopen failed. Signed-off-by: Supriya Kannery supri...@in.ibm.com This one has tabs, too. Kevin
Re: [Qemu-devel] [V2 3/3] Command block_set for dynamic block params change
Am 17.06.2011 18:38, schrieb Supriya Kannery: New command block_set added for dynamically changing any of the block device parameters. For now, dynamic setting of hostcache params using this command is implemented. Other block device parameters, can be integrated in similar lines. Signed-off-by: Supriya Kannery supri...@in.ibm.com Coding style is off in this one as well. Index: qemu/blockdev.c === --- qemu.orig/blockdev.c +++ qemu/blockdev.c @@ -797,3 +797,35 @@ int do_block_resize(Monitor *mon, const return 0; } + + +/* + * Handle changes to block device settings, like hostcache, + * while guest is running. +*/ +int do_block_set(Monitor *mon, const QDict *qdict, QObject **ret_data) +{ + const char *device = qdict_get_str(qdict, device); + const char *name = qdict_get_str(qdict, name); + int enable = qdict_get_bool(qdict, enable); + BlockDriverState *bs; + + bs = bdrv_find(device); + if (!bs) { + qerror_report(QERR_DEVICE_NOT_FOUND, device); + return -1; + } + + if (!(strcmp(name, hostcache))) { The bracket after ! isn't necessary. + if (bdrv_is_inserted(bs)) { + /* cache change applicable only if device inserted */ + return bdrv_change_hostcache(bs, enable); + } else { + qerror_report(QERR_DEVICE_NOT_INSERTED, device); + return -1; + } I'm not so sure about this one. Why shouldn't I change the cache mode for a device which is currently? The next thing I want to do could be inserting a medium and using it with the new cache mode. + } + + return 0; +} + Index: qemu/block.c === --- qemu.orig/block.c +++ qemu/block.c @@ -651,6 +651,33 @@ unlink_and_fail: return ret; } +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags) +{ + BlockDriver *drv = bs-drv; + int ret = 0; + + /* No need to reopen as no change in flags */ + if (bdrv_flags == bs-open_flags) + return 0; There could be other reasons for reopening besides changing flags, e.g. invalidating cached metadata. 
+ + /* Quiesce IO for the given block device */ + qemu_aio_flush(); + bdrv_flush(bs); Missing error handling. + + bdrv_close(bs); Here, too. + ret = bdrv_open(bs, bs-filename, bdrv_flags, drv); + + /* + * A failed attempt to reopen the image file must lead to 'abort()' + */ + if (ret != 0) { + qerror_report(QERR_REOPEN_FILE_FAILED, bs-filename); + abort(); + } Maybe we can retry with the old flags at least before aborting? Also I would like to see a (Linux specific) version that uses the old fd for the reopen, so that we can handle files that aren't accessible with their old name any more. This would mean adding a .bdrv_reopen callback in raw-posix. + + return ret; +} + void bdrv_close(BlockDriverState *bs) { if (bs-drv) { @@ -691,6 +718,20 @@ void bdrv_close_all(void) } } +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache) +{ + int bdrv_flags = bs-open_flags; + + /* set hostcache flags (without changing WCE/flush bits) */ + if (enable_host_cache) + bdrv_flags = ~BDRV_O_NOCACHE; + else + bdrv_flags |= BDRV_O_NOCACHE; + + /* Reopen file with changed set of flags */ + return bdrv_reopen(bs, bdrv_flags); +} Hm, interesting. Now we can get a O_DIRECT | O_SYNC mode with the monitor. We should probably expose the same functionality for the command line, too. Kevin
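The flag handling in bdrv_change_hostcache() can be illustrated with a small stand-alone sketch. The constants and the helper name below are stand-ins chosen for the example, not QEMU's definitions; the point is only that enabling the host cache clears the no-cache bit with "&= ~" while leaving every other open flag untouched:

```c
/* Stand-in flag values for the sketch (assumed, not taken from block.h). */
#define SKETCH_O_RDWR    0x0002
#define SKETCH_O_NOCACHE 0x0020

/* Return 'flags' with the host cache enabled or disabled.
 * All other bits are preserved. */
static int set_hostcache(int flags, int enable_host_cache)
{
    if (enable_host_cache) {
        flags &= ~SKETCH_O_NOCACHE;   /* clear only the no-cache bit */
    } else {
        flags |= SKETCH_O_NOCACHE;    /* set only the no-cache bit */
    }
    return flags;
}
```

Because the result equals the input when nothing changes, a caller can compare old and new flags to decide whether a reopen is needed at all, as bdrv_reopen() does in the patch.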
Re: [Qemu-devel] [PATCH] qemu-img: Add cache command line option
On 16.06.2011 16:43, Kevin Wolf wrote: On 16.06.2011 16:28, Christoph Hellwig wrote: On Wed, Jun 15, 2011 at 09:46:10AM -0400, Federico Simoncelli wrote: qemu-img currently writes disk images using writeback and filling up the cache buffers which are then flushed by the kernel preventing other processes from accessing the storage. This is particularly bad in cluster environments where time-based algorithms might be in place and accessing the storage within certain timeouts is critical. This patch adds the option to choose a cache method when writing disk images. Allowing the user to choose the mode is of course fine, but what about also choosing a good default? writethrough doesn't really make any sense for qemu-img, given that we can trivially flush the cache at the end of the operations. I'd also say that using the buffer cache doesn't make sense either, as there is little point in caching these operations. Right, we need to keep the defaults as they are. That is, unsafe for convert and writeback for everything else. The patch seems to make writeback the default for everything. Federico, are you going to fix this in a v4? Kevin
Re: [Qemu-devel] struct TimerState
Nilay writes: I am trying to understand the structures that QEMU saves when do_savevm() is invoked. Can anyone explain to me the fields that are part of the TimerState structure in qemu-timer.c? If my memory does not fail me, its main task is to capture the host time whenever the VM is started and stopped. This is later used to adapt the VM time when using icount in adaptive mode (-icount auto). I remember seeing it used somewhere else, but right now I cannot recall exactly what for. This reminds me that I've been navigating through all the time-related code in QEMU and, in order to make it easier to follow, I've started separating the routines in qemu-timer into different files (e.g., qemu-htime and qemu-vtime for routines accessing time sources in the host and in the VM). I will send the patches as soon as I finish the rewrite. Lluis -- And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer. -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support
On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote: What is the difference to one worker->stop() + worker->start() cycle? ok, stop+start won't disconnect any clients either. But does stop render all waiting commands? I'll have to look, I don't know if it does. It does. This is what qemu uses to flush all spice server state to device memory on migration. What is the reason for deleting all surfaces? Making sure all references are dropped to pci memory in devram. We would need to recreate all the surfaces after reset anyway. cheers, Gerd
Re: [Qemu-devel] [PATCH RFC 0/3] basic support for composing sysbus devices
Yeah, that's why I said, hard to do well. It makes it very hard to add new socket types. PCI, USB, IDE, SCSI, SBus, what else? APICBus? I2C? 8 socket types ought to be enough for anybody. Off the top of my head: AClink (audio), i2s (audio), SSI/SSP (synchronous serial), Firewire, rs232, CAN, FibreChannel, ISA, PS2, ADB (apple desktop bus) and probably a bunch of others I've missed. There's also a bunch of all-but extinct system architectures with interesting bus-level features (MCA, NuBus, etc.) Paul
Re: [Qemu-devel] [PATCH v2] Optimize screendump
On Mon, Jun 20, 2011 at 9:12 AM, Avi Kivity a...@redhat.com wrote: When running kvm-autotest, fputc() is often the second highest (sometimes #1) function showing up in a profile. This is due to fputc() locking the file for every byte written. Optimize by buffering a line's worth of pixels and writing that out in a single call. Signed-off-by: Avi Kivity a...@redhat.com --- v2: drop unportable fputc_unlocked hw/vga.c | 13 ++--- 1 files changed, 10 insertions(+), 3 deletions(-) Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
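The buffering approach described in the commit message might look roughly like the sketch below. It is not the code from hw/vga.c: the function name, the flat array of 32-bit xRGB pixels, and the fixed maximum value of 255 are assumptions made to keep the example self-contained. The point is that each scanline is assembled in memory and handed to stdio with one fwrite() call, so the stream is locked once per line instead of once per byte:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write 'pixels' (width * height values, 0x00RRGGBB) to 'f' as a binary PPM.
 * Returns 0 on success, -1 on error. */
static int ppm_save_buffered(FILE *f, const uint32_t *pixels,
                             int width, int height)
{
    uint8_t *linebuf;
    int x, y;

    if (width <= 0 || height <= 0) {
        return -1;
    }
    linebuf = malloc((size_t)width * 3);    /* one scanline of RGB bytes */
    if (!linebuf) {
        return -1;
    }
    if (fprintf(f, "P6\n%d %d\n%d\n", width, height, 255) < 0) {
        free(linebuf);
        return -1;
    }
    for (y = 0; y < height; y++) {
        uint8_t *p = linebuf;

        for (x = 0; x < width; x++) {
            uint32_t v = pixels[(size_t)y * width + x];

            *p++ = (v >> 16) & 0xff;        /* red */
            *p++ = (v >> 8) & 0xff;         /* green */
            *p++ = v & 0xff;                /* blue */
        }
        /* One stdio call (and one lock acquisition) per line. */
        if (fwrite(linebuf, 3, (size_t)width, f) != (size_t)width) {
            free(linebuf);
            return -1;
        }
    }
    free(linebuf);
    return 0;
}
```

This stays portable, in line with the v2 note about dropping fputc_unlocked.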
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support
On Mon, Jun 20, 2011 at 05:11:07PM +0200, Alon Levy wrote: On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote: What is the difference to one worker->stop() + worker->start() cycle? ok, stop+start won't disconnect any clients either. But does stop render all waiting commands? I'll have to look, I don't know if it does. It does. This is what qemu uses to flush all spice server state to device memory on migration. What is the reason for deleting all surfaces? Making sure all references are dropped to pci memory in devram. We would need to recreate all the surfaces after reset anyway. That's not right. The reason is that for the windows driver I don't know if this is a resolution change or a suspend. So it was easier to destroy all the surfaces and then the two cases are equal - before going to sleep / leaving the current resolution I destroy all the surfaces, when coming back I recreate the surfaces. If it's a resolution change there is no coming back stage, but since all surfaces are destroyed there is no error when the same surface ids are reused. cheers, Gerd
Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall
On 06/20/2011 06:38 PM, Daniel P. Berrange wrote: On Mon, Jun 20, 2011 at 06:31:23PM +0300, Avi Kivity wrote: On 06/20/2011 04:38 PM, Daniel Gollub wrote: Introduce panic hypercall to enable the crashing guest to notify the host. This enables the host to run some actions as soon as a guest has crashed (kernel panic). This patch series introduces the panic hypercall at the host end, as well as the hypercall for KVM paravirtualized Linux guests, by registering the hypercall to the panic_notifier_list. The basic idea is to create a KVM crashdump automatically as soon as the guest has panicked and power-cycle the VM (e.g. libvirt <on_crash/>). This would be more easily done via a panic device (I/O port or memory-mapped address) that the guest hits. It would be intercepted by qemu without any new code in kvm. However, I'm not sure I see the gain. Most enterprisey guests already contain in-guest crash dumpers which provide more information than a qemu memory dump could, since they know exact load addresses etc. and are integrated with crash analysis tools. What do you have in mind? Well libvirt can capture a core file by doing 'virsh dump $GUESTNAME'. This actually uses the QEMU monitor migration command to capture the entire QEMU memory. The 'crash' command line tool actually knows how to analyse this data format as it would a normal kernel crashdump. Interesting. I think having a way for a guest OS to notify the host that it has crashed would be useful. libvirt could automatically do a crash dump of the QEMU memory, or at least pause the guest CPUs and notify the management app of the crash, which can then decide what to do. You can also use tools like 'virt-dmesg' which uses libvirt to peek into guest memory to extract the most recent kernel dmesg logs (even if the guest OS itself has crashed and didn't manage to send them out via netconsole or something else). I agree. But let's do this via a device, this way kvm need not be changed. Do ILO cards / IPMI support something like this? 
We could follow their lead in that case. This series does need to introduce a QMP event notification upon crash, so that the crash notification can be propagated to mgmt layers above QEMU. Yes. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support
On 06/20/11 17:11, Alon Levy wrote: On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote: What is the difference to one worker->stop() + worker->start() cycle? ok, stop+start won't disconnect any clients either. But does stop render all waiting commands? I'll have to look, I don't know if it does. It does. This is what qemu uses to flush all spice server state to device memory on migration. What is the reason for deleting all surfaces? Making sure all references are dropped to pci memory in devram. Ah, because the spice server keeps a reference to the create command until the surface is destroyed, right? There is QXL_IO_DESTROY_ALL_SURFACES + worker->destroy_surfaces() ... The QXL_IO_UPDATE_MEM command does too much special stuff IMHO. I also think we don't need to extend the libspice-server API. We can add an I/O command which renders everything to device memory via stop+start. We can zap all surfaces with the existing command + worker call. We can add an I/O command to ask qxl to push the release queue head to the release ring. Comments? cheers, Gerd
Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall
On 2011-06-20 17:45, Avi Kivity wrote: This series does need to introduce a QMP event notification upon crash, so that the crash notification can be propagated to mgmt layers above QEMU. Yes. I think the best way to deal with that is to stop the VM on guest panic. There is already WIP to signal stop reasons via QMP. Maybe we need to differentiate between hypervisor and guest triggered panics (VMSTOP_GUEST_PANIC?), but the rest should come for free. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH 01/18] Don't translate pointer when in restore_sigcontext
From: Mike McCormack mj.mccorm...@samsung.com Fixes crash in i386 when user emulation base address is non-zero. 21797 rt_sigreturn(8,1082124603,1,0,1082126048,1082126248) Exit reason and status: signal 11 Signed-off-by: Mike McCormack mj.mccorm...@samsung.com Signed-off-by: Riku Voipio riku.voi...@iki.fi --- linux-user/signal.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/linux-user/signal.c b/linux-user/signal.c index 11b25be..cb7138f 100644 --- a/linux-user/signal.c +++ b/linux-user/signal.c @@ -981,8 +981,8 @@ restore_sigcontext(CPUX86State *env, struct target_sigcontext *sc, int *peax) env->regs[R_ECX] = tswapl(sc->ecx); env->eip = tswapl(sc->eip); -cpu_x86_load_seg(env, R_CS, lduw(&sc->cs) | 3); -cpu_x86_load_seg(env, R_SS, lduw(&sc->ss) | 3); +cpu_x86_load_seg(env, R_CS, lduw_p(&sc->cs) | 3); +cpu_x86_load_seg(env, R_SS, lduw_p(&sc->ss) | 3); tmpflags = tswapl(sc->eflags); env->eflags = (env->eflags & ~0x40DD5) | (tmpflags & 0x40DD5); -- 1.7.4.1
[Qemu-devel] [PATCH 06/18] m68k-semi.c: Use correct check for failure of do_brk()
From: Peter Maydell peter.mayd...@linaro.org In the m68k semihosting implementation of HOSTED_INIT_SIM, use the correct check for whether do_brk() has failed -- it does not return -1 but the previous value of the break limit. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Riku Voipio riku.voi...@iki.fi --- m68k-semi.c |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/m68k-semi.c b/m68k-semi.c index 0371089..7fde10e 100644 --- a/m68k-semi.c +++ b/m68k-semi.c @@ -370,7 +370,7 @@ void do_m68k_semihosting(CPUM68KState *env, int nr) TaskState *ts = env->opaque; /* Allocate the heap using sbrk. */ if (!ts->heap_limit) { -long ret; +abi_ulong ret; uint32_t size; uint32_t base; @@ -379,8 +379,9 @@ void do_m68k_semihosting(CPUM68KState *env, int nr) /* Try a big heap, and reduce the size if that fails. */ for (;;) { ret = do_brk(base + size); -if (ret != -1) +if (ret >= (base + size)) { break; +} size >>= 1; } ts->heap_limit = base + size; -- 1.7.4.1
[Qemu-devel] [PATCH 00/18] pending linux-user patches
From: Riku Voipio riku.voi...@iki.fi Hi, All included patches except mine have already been on the list. These patches should be ready for pull, but giving last minute chance for people to object. The following changes since commit eb47d7c5d96060040931c42773ee07e61e547af9 hw/9118.c: Implement active-low interrupt support (2011-06-15 13:23:37 +0200) are available in the git repository at: git://git.linaro.org/people/rikuvoipio/qemu.git linux-user-for-upstream Cédric VINCENT (2): linux-user: Fix the load of ELF files that have no useful symbol linux-user: Fix the computation of the requested heap size Juan Quintela (5): linuxload: id_change was a write only variable syscall: really return ret code linux-user: syscall should use sanitized arg1 flatload: end_code was only used in a debug message flatload: memp was a write-only variable Laurent ALFONSI (1): linux-user: Define AT_RANDOM to support target stack protection mechanism. Mike Frysinger (1): linux-user: add pselect6 syscall support Mike McCormack (1): Don't translate pointer when in restore_sigcontext Peter Maydell (7): linux-user: Handle images where lowest vaddr is not page aligned linux-user: Don't use MAP_FIXED in do_brk() arm-semi.c: Use correct check for failure of do_brk() m68k-semi.c: Use correct check for failure of do_brk() linux-user: Bump do_syscall() up to 8 syscall arguments linux-user/signal.c: Remove only-ever-set variable fpu_save_addr linux-user/signal.c: Remove unused fenab Riku Voipio (1): linux-user: Fix sync_file_range on 32bit mips arm-semi.c |5 +- linux-user/elfload.c | 185 + linux-user/flatload.c |8 +-- linux-user/linuxload.c | 25 +-- linux-user/main.c | 37 ++--- linux-user/qemu.h |3 +- linux-user/signal.c| 21 +++-- linux-user/syscall.c | 214 ++-- m68k-semi.c|5 +- 9 files changed, 331 insertions(+), 172 deletions(-) -- 1.7.4.1
[Qemu-devel] [PATCH 02/18] linux-user: Fix the load of ELF files that have no useful symbol
From: Cédric VINCENT cedric.vinc...@st.com This patch fixes a double free() due to realloc(syms, 0) in the loader when the ELF file has no useful symbol, as with the following example (compiled with sh4-linux-gcc -nostdlib): .text .align 1 .global _start _start: mov #1, r3 trapa #40 // syscall(__NR_exit) nop The bug appears when the log (option -d) is enabled. Signed-off-by: Cédric VINCENT cedric.vinc...@st.com Signed-off-by: Yves JANIN yves.ja...@st.com Signed-off-by: Riku Voipio riku.voi...@iki.fi --- linux-user/elfload.c | 34 +++--- 1 files changed, 19 insertions(+), 15 deletions(-) diff --git a/linux-user/elfload.c b/linux-user/elfload.c index dcfeb7a..a4aabd5 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -1643,9 +1643,9 @@ static void load_symbols(struct elfhdr *hdr, int fd, abi_ulong load_bias) { int i, shnum, nsyms, sym_idx = 0, str_idx = 0; struct elf_shdr *shdr; -char *strings; -struct syminfo *s; -struct elf_sym *syms, *new_syms; +char *strings = NULL; +struct syminfo *s = NULL; +struct elf_sym *new_syms, *syms = NULL; shnum = hdr-e_shnum; i = shnum * sizeof(struct elf_shdr); @@ -1670,24 +1670,19 @@ static void load_symbols(struct elfhdr *hdr, int fd, abi_ulong load_bias) /* Now know where the strtab and symtab are. Snarf them. */ s = malloc(sizeof(*s)); if (!s) { -return; +goto give_up; } i = shdr[str_idx].sh_size; s-disas_strtab = strings = malloc(i); if (!strings || pread(fd, strings, i, shdr[str_idx].sh_offset) != i) { -free(s); -free(strings); -return; +goto give_up; } i = shdr[sym_idx].sh_size; syms = malloc(i); if (!syms || pread(fd, syms, i, shdr[sym_idx].sh_offset) != i) { -free(s); -free(strings); -free(syms); -return; +goto give_up; } nsyms = i / sizeof(struct elf_sym); @@ -1710,16 +1705,18 @@ static void load_symbols(struct elfhdr *hdr, int fd, abi_ulong load_bias) } } +/* No useful symbol. */ +if (nsyms == 0) { +goto give_up; +} + /* Attempt to free the storage associated with the local symbols that we threw away. 
Whether or not this has any effect on the memory allocation depends on the malloc implementation and how many symbols we managed to discard. */ new_syms = realloc(syms, nsyms * sizeof(*syms)); if (new_syms == NULL) { -free(s); -free(syms); -free(strings); -return; +goto give_up; } syms = new_syms; @@ -1734,6 +1731,13 @@ static void load_symbols(struct elfhdr *hdr, int fd, abi_ulong load_bias) s-lookup_symbol = lookup_symbolxx; s-next = syminfos; syminfos = s; + +return; + +give_up: +free(s); +free(strings); +free(syms); } int load_elf_binary(struct linux_binprm * bprm, struct target_pt_regs * regs, -- 1.7.4.1
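The single-exit cleanup this patch moves to can be shown in isolation. The sketch below is not the loader code; compact_items() is a hypothetical helper that shrinks an array and uses one give_up label so the buffer is released exactly once. The zero-count case is handled before realloc() is reached, because realloc(ptr, 0) may free ptr and return NULL on some C libraries, and treating that NULL as an ordinary failure followed by free(ptr) is the double free the commit message describes:

```c
#include <stdlib.h>

/* Shrink 'items' to its first 'kept' entries.
 * Returns the (possibly moved) array, or NULL after freeing 'items'
 * when nothing is kept or the reallocation fails. */
static int *compact_items(int *items, size_t kept)
{
    int *shrunk;

    if (kept == 0) {
        goto give_up;           /* never reach realloc(items, 0) */
    }
    shrunk = realloc(items, kept * sizeof(*items));
    if (shrunk == NULL) {
        goto give_up;           /* 'items' is still valid here */
    }
    return shrunk;

give_up:
    free(items);                /* the only place the buffer is freed */
    return NULL;
}
```

Initializing the pointers to NULL, as the patch does for strings, s and syms, is what makes a shared give_up label safe, since free(NULL) is a no-op.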
[Qemu-devel] [PATCH 05/18] arm-semi.c: Use correct check for failure of do_brk()
From: Peter Maydell peter.mayd...@linaro.org In the ARM semihosting implementation of SYS_HEAPINFO, use the correct check for whether do_brk() has failed -- it does not return -1 but the previous value of the break limit. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Riku Voipio riku.voi...@iki.fi --- arm-semi.c |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/arm-semi.c b/arm-semi.c index e9e6f89..5a62d03 100644 --- a/arm-semi.c +++ b/arm-semi.c @@ -440,15 +440,16 @@ uint32_t do_arm_semihosting(CPUState *env) /* Some C libraries assume the heap immediately follows .bss, so allocate it using sbrk. */ if (!ts->heap_limit) { -long ret; +abi_ulong ret; ts->heap_base = do_brk(0); limit = ts->heap_base + ARM_ANGEL_HEAP_SIZE; /* Try a big heap, and reduce the size if that fails. */ for (;;) { ret = do_brk(limit); -if (ret != -1) +if (ret >= limit) { break; +} limit = (ts->heap_base >> 1) + (limit >> 1); } ts->heap_limit = limit; -- 1.7.4.1
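To see why comparing against -1 never detects a failure, here is a small stand-alone model of the retry loop. fake_do_brk() and its two state variables are invented for the sketch and only imitate the documented behaviour (success returns the new break, failure returns the old one); pick_heap_limit() mirrors the shape of the loop in the patch:

```c
#include <stdint.h>

/* Toy model state: current break and the highest break that can succeed. */
static uint32_t fake_brk_cur = 0x10000;
static uint32_t fake_brk_max = 0x50000;

/* Stand-in for do_brk(): 0 queries the break; otherwise the break moves
 * only if the request fits, and the resulting break is always returned. */
static uint32_t fake_do_brk(uint32_t new_brk)
{
    if (new_brk != 0 && new_brk <= fake_brk_max) {
        fake_brk_cur = new_brk;
    }
    return fake_brk_cur;
}

/* Try a big heap and move the limit halfway back toward the base
 * until the break really reaches the requested limit. */
static uint32_t pick_heap_limit(uint32_t heap_size)
{
    uint32_t base = fake_do_brk(0);
    uint32_t limit = base + heap_size;

    for (;;) {
        uint32_t ret = fake_do_brk(limit);

        if (ret >= limit) {     /* success: break is at or past the limit */
            break;
        }
        limit = (base >> 1) + (limit >> 1);
    }
    return limit;
}
```

In this model a request for 0x100000 bytes is reduced twice and settles at 0x50000. With the old "ret != -1" test the loop would have exited on the first, failed attempt and recorded a heap limit that was never mapped.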
[Qemu-devel] [PATCH 12/18] linux-user: syscall should use sanitized arg1
From: Juan Quintela quint...@redhat.com Looking at the other architectures, we should be using how, not arg1. Signed-off-by: Juan Quintela quint...@redhat.com [peter.mayd...@linaro.org: remove unnecessary initialisation of how] Signed-off-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Riku Voipio riku.voi...@iki.fi --- linux-user/syscall.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 57d9233..1c0503f 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -7181,7 +7181,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, case TARGET_NR_osf_sigprocmask: { abi_ulong mask; -int how = arg1; +int how; sigset_t set, oldset; switch(arg1) { @@ -7200,7 +7200,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, } mask = arg2; target_to_host_old_sigset(&set, &mask); -sigprocmask(arg1, &set, &oldset); +sigprocmask(how, &set, &oldset); host_to_target_old_sigset(&mask, &oldset); ret = mask; } -- 1.7.4.1
[Qemu-devel] [PATCH 07/18] linux-user: Fix the computation of the requested heap size
From: Cédric VINCENT cedric.vinc...@st.com There were two remaining bugs in the previous implementation of do_brk(): 1. the value of new_alloc_size was one page too large when the requested brk was aligned on a host page boundary. 2. no new pages should be (re-)allocated when the requested brk is in the range of the pages that were already allocated previously (for the same purpose). Technically these pages are never unmapped in the current implementation. The problem/fix can be reproduced/validated with the following test case: #include <unistd.h> /* syscall(2), */ #include <sys/syscall.h> /* SYS_brk, */ #include <stdio.h> /* puts(3), */ #include <stdlib.h> /* exit(3), EXIT_*, */ int main() { int current_brk = 0; int new_brk; int failure = 0; void test(int increment) { static int test_number = 0; test_number++; new_brk = syscall(SYS_brk, current_brk + increment); if (new_brk == current_brk) { printf("test %d fails\n", test_number); failure++; } current_brk = new_brk; } /* Initialization. */ test(0); /* Does QEMU overlap host pages? */ test(HOST_PAGE_SIZE); test(HOST_PAGE_SIZE); /* Does QEMU allocate the same host page twice? */ test(-HOST_PAGE_SIZE); test(HOST_PAGE_SIZE); if (!failure) { printf("success\n"); exit(EXIT_SUCCESS); } else { exit(EXIT_FAILURE); } } Signed-off-by: Cédric VINCENT cedric.vinc...@st.com Reviewed-by: Christophe Guillon christophe.guil...@st.com Signed-off-by: Riku Voipio riku.voi...@iki.fi --- linux-user/syscall.c | 11 ++- 1 files changed, 6 insertions(+), 5 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index b975730..be27f53 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -709,16 +709,17 @@ char *target_strerror(int err) static abi_ulong target_brk; static abi_ulong target_original_brk; +static abi_ulong brk_page; void target_set_brk(abi_ulong new_brk) { target_original_brk = target_brk = HOST_PAGE_ALIGN(new_brk); +brk_page = HOST_PAGE_ALIGN(target_brk); } /* do_brk() must return target values and target errnos. 
*/ abi_long do_brk(abi_ulong new_brk) { -abi_ulong brk_page; abi_long mapped_addr; intnew_alloc_size; @@ -727,9 +728,8 @@ abi_long do_brk(abi_ulong new_brk) if (new_brk target_original_brk) return target_brk; -brk_page = HOST_PAGE_ALIGN(target_brk); - -/* If the new brk is less than this, set it and we're done... */ +/* If the new brk is less than the highest page reserved to the + * target heap allocation, set it and we're done... */ if (new_brk brk_page) { target_brk = new_brk; return target_brk; @@ -741,13 +741,14 @@ abi_long do_brk(abi_ulong new_brk) * itself); instead we treat mapped but at wrong address as * a failure and unmap again. */ -new_alloc_size = HOST_PAGE_ALIGN(new_brk - brk_page + 1); +new_alloc_size = HOST_PAGE_ALIGN(new_brk - brk_page); mapped_addr = get_errno(target_mmap(brk_page, new_alloc_size, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE, 0, 0)); if (mapped_addr == brk_page) { target_brk = new_brk; +brk_page = HOST_PAGE_ALIGN(target_brk); return target_brk; } else if (mapped_addr != -1) { /* Mapped but at wrong address, meaning there wasn't actually -- 1.7.4.1
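The first bug in the do_brk() fix is a plain off-by-one in the size passed to the page-align macro. The following is only a sketch of that arithmetic, not QEMU code: sketch_page_align() is a hypothetical stand-in for HOST_PAGE_ALIGN, and a 4 KiB host page size is assumed.

```c
#include <stdint.h>

/* Assumed page size for this sketch; QEMU takes it from the host. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/* Round x up to a multiple of SKETCH_PAGE_SIZE
 * (hypothetical stand-in for HOST_PAGE_ALIGN). */
uintptr_t sketch_page_align(uintptr_t x)
{
    return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Size as computed before the fix: the "+ 1" spills into one extra
 * page whenever new_brk - brk_page is already a page multiple. */
uintptr_t alloc_size_before(uintptr_t new_brk, uintptr_t brk_page)
{
    return sketch_page_align(new_brk - brk_page + 1);
}

/* Size as computed after the fix. */
uintptr_t alloc_size_after(uintptr_t new_brk, uintptr_t brk_page)
{
    return sketch_page_align(new_brk - brk_page);
}
```

With brk_page at 0x10000 and a requested brk of 0x11000, the old formula asks for two pages and the new one for a single page; for an unaligned request such as 0x10800 both agree on one page.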
[Qemu-devel] [PATCH 13/18] flatload: end_code was only used in a debug message
From: Juan Quintela quint...@redhat.com

Just unfold its definition at its only use.

Signed-off-by: Juan Quintela quint...@redhat.com
[peter.mayd...@linaro.org: fixed typo in the debug code,
 added parentheses to fix precedence issue]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/flatload.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index cd7af7c..6fb78f5 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -384,7 +384,7 @@ static int load_flat_file(struct linux_binprm * bprm,
     abi_ulong reloc = 0, rp;
     int i, rev, relocs = 0;
     abi_ulong fpos;
-    abi_ulong start_code, end_code;
+    abi_ulong start_code;
     abi_ulong indx_len;
 
     hdr = ((struct flat_hdr *) bprm->buf); /* exec-header */
@@ -552,11 +552,10 @@ static int load_flat_file(struct linux_binprm * bprm,
 
     /* The main program needs a little extra setup in the task structure */
     start_code = textpos + sizeof (struct flat_hdr);
-    end_code = textpos + text_len;
 
     DBG_FLT("%s %s: TEXT=%x-%x DATA=%x-%x BSS=%x-%x\n",
             id ? "Lib" : "Load", bprm->filename,
-            (int) start_code, (int) end_code,
+            (int) start_code, (int) (textpos + text_len),
             (int) datapos,
             (int) (datapos + data_len),
             (int) (datapos + data_len),
-- 
1.7.4.1
[Qemu-devel] [PATCH 04/18] linux-user: Don't use MAP_FIXED in do_brk()
From: Peter Maydell peter.mayd...@linaro.org

Since mmap() with MAP_FIXED will map over the top of existing mappings,
it's a bad idea to use it to implement brk(), because brk() with a
large size is likely to overwrite important things like qemu itself
or the host libc. So we drop MAP_FIXED and handle "mapped but at
different address" as an error case instead.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |   29 -
 1 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5cb27c7..b975730 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -735,23 +735,34 @@ abi_long do_brk(abi_ulong new_brk)
         return target_brk;
     }
 
-    /* We need to allocate more memory after the brk... */
+    /* We need to allocate more memory after the brk... Note that
+     * we don't use MAP_FIXED because that will map over the top of
+     * any existing mapping (like the one with the host libc or qemu
+     * itself); instead we treat "mapped but at wrong address" as
+     * a failure and unmap again.
+     */
     new_alloc_size = HOST_PAGE_ALIGN(new_brk - brk_page + 1);
     mapped_addr = get_errno(target_mmap(brk_page, new_alloc_size,
                                         PROT_READ|PROT_WRITE,
-                                        MAP_ANON|MAP_FIXED|MAP_PRIVATE, 0, 0));
+                                        MAP_ANON|MAP_PRIVATE, 0, 0));
+
+    if (mapped_addr == brk_page) {
+        target_brk = new_brk;
+        return target_brk;
+    } else if (mapped_addr != -1) {
+        /* Mapped but at wrong address, meaning there wasn't actually
+         * enough space for this brk.
+         */
+        target_munmap(mapped_addr, new_alloc_size);
+        mapped_addr = -1;
+    }
 
 #if defined(TARGET_ALPHA)
     /* We (partially) emulate OSF/1 on Alpha, which requires we
        return a proper errno, not an unchanged brk value.  */
-    if (is_error(mapped_addr)) {
-        return -TARGET_ENOMEM;
-    }
+    return -TARGET_ENOMEM;
 #endif
-
-    if (!is_error(mapped_addr)) {
-        target_brk = new_brk;
-    }
+    /* For everything else, return the previous break. */
     return target_brk;
 }
 
-- 
1.7.4.1
[Qemu-devel] [PATCH 08/18] linux-user: add pselect6 syscall support
From: Mike Frysinger vap...@gentoo.org Some architectures (like Blackfin) only implement pselect6 (and skip select/newselect). So add support for it. Signed-off-by: Mike Frysinger vap...@gentoo.org Signed-off-by: Riku Voipio riku.voi...@iki.fi --- linux-user/syscall.c | 149 +++-- 1 files changed, 130 insertions(+), 19 deletions(-) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index be27f53..362cc63 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -550,6 +550,15 @@ _syscall5(int, sys_ppoll, struct pollfd *, fds, nfds_t, nfds, size_t, sigsetsize) #endif +#if defined(TARGET_NR_pselect6) +#ifndef __NR_pselect6 +# define __NR_pselect6 -1 +#endif +#define __NR_sys_pselect6 __NR_pselect6 +_syscall6(int, sys_pselect6, int, nfds, fd_set *, readfds, fd_set *, writefds, + fd_set *, exceptfds, struct timespec *, timeout, void *, sig); +#endif + extern int personality(int); extern int flock(int, int); extern int setfsuid(int); @@ -799,6 +808,20 @@ static inline abi_long copy_from_user_fdset(fd_set *fds, return 0; } +static inline abi_ulong copy_from_user_fdset_ptr(fd_set *fds, fd_set **fds_ptr, + abi_ulong target_fds_addr, + int n) +{ +if (target_fds_addr) { +if (copy_from_user_fdset(fds, target_fds_addr, n)) +return -TARGET_EFAULT; +*fds_ptr = fds; +} else { +*fds_ptr = NULL; +} +return 0; +} + static inline abi_long copy_to_user_fdset(abi_ulong target_fds_addr, const fd_set *fds, int n) @@ -964,6 +987,7 @@ static inline abi_long copy_to_user_mq_attr(abi_ulong target_mq_attr_addr, } #endif +#if defined(TARGET_NR_select) || defined(TARGET_NR__newselect) /* do_select() must return target values and target errnos. 
*/ static abi_long do_select(int n, abi_ulong rfd_addr, abi_ulong wfd_addr, @@ -974,26 +998,17 @@ static abi_long do_select(int n, struct timeval tv, *tv_ptr; abi_long ret; -if (rfd_addr) { -if (copy_from_user_fdset(rfds, rfd_addr, n)) -return -TARGET_EFAULT; -rfds_ptr = rfds; -} else { -rfds_ptr = NULL; +ret = copy_from_user_fdset_ptr(rfds, rfds_ptr, rfd_addr, n); +if (ret) { +return ret; } -if (wfd_addr) { -if (copy_from_user_fdset(wfds, wfd_addr, n)) -return -TARGET_EFAULT; -wfds_ptr = wfds; -} else { -wfds_ptr = NULL; +ret = copy_from_user_fdset_ptr(wfds, wfds_ptr, wfd_addr, n); +if (ret) { +return ret; } -if (efd_addr) { -if (copy_from_user_fdset(efds, efd_addr, n)) -return -TARGET_EFAULT; -efds_ptr = efds; -} else { -efds_ptr = NULL; +ret = copy_from_user_fdset_ptr(efds, efds_ptr, efd_addr, n); +if (ret) { +return ret; } if (target_tv_addr) { @@ -1020,6 +1035,7 @@ static abi_long do_select(int n, return ret; } +#endif static abi_long do_pipe2(int host_pipe[], int flags) { @@ -5581,7 +5597,102 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, #endif #ifdef TARGET_NR_pselect6 case TARGET_NR_pselect6: - goto unimplemented_nowarn; +{ +abi_long rfd_addr, wfd_addr, efd_addr, n, ts_addr; +fd_set rfds, wfds, efds; +fd_set *rfds_ptr, *wfds_ptr, *efds_ptr; +struct timespec ts, *ts_ptr; + +/* + * The 6th arg is actually two args smashed together, + * so we cannot use the C library. 
+ */ +sigset_t set; +struct { +sigset_t *set; +size_t size; +} sig, *sig_ptr; + +abi_ulong arg_sigset, arg_sigsize, *arg7; +target_sigset_t *target_sigset; + +n = arg1; +rfd_addr = arg2; +wfd_addr = arg3; +efd_addr = arg4; +ts_addr = arg5; + +ret = copy_from_user_fdset_ptr(rfds, rfds_ptr, rfd_addr, n); +if (ret) { +goto fail; +} +ret = copy_from_user_fdset_ptr(wfds, wfds_ptr, wfd_addr, n); +if (ret) { +goto fail; +} +ret = copy_from_user_fdset_ptr(efds, efds_ptr, efd_addr, n); +if (ret) { +goto fail; +} + +/* + * This takes a timespec, and not a timeval, so we cannot + * use the do_select() helper ... + */ +if (ts_addr) { +if (target_to_host_timespec(ts, ts_addr)) { +goto efault; +} +ts_ptr = ts; +} else { +ts_ptr = NULL; +} + +/*
[Qemu-devel] [PATCH 11/18] syscall: really return ret code
From: Juan Quintela quint...@redhat.com

We assign ret with the error code, but then return 0 unconditionally.

Signed-off-by: Juan Quintela quint...@redhat.com
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |    8
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 362cc63..57d9233 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -3779,10 +3779,10 @@ static abi_long do_get_thread_area(CPUX86State *env, abi_ulong ptr)
 #ifndef TARGET_ABI32
 static abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
 {
-    abi_long ret;
+    abi_long ret = 0;
     abi_ulong val;
     int idx;
-    
+
     switch(code) {
     case TARGET_ARCH_SET_GS:
     case TARGET_ARCH_SET_FS:
@@ -3801,13 +3801,13 @@ static abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
             idx = R_FS;
         val = env->segs[idx].base;
         if (put_user(val, addr, abi_ulong))
-            return -TARGET_EFAULT;
+            ret = -TARGET_EFAULT;
         break;
     default:
         ret = -TARGET_EINVAL;
         break;
     }
-    return 0;
+    return ret;
 }
 #endif
 
-- 
1.7.4.1
[Qemu-devel] [PATCH 03/18] linux-user: Handle images where lowest vaddr is not page aligned
From: Peter Maydell peter.mayd...@linaro.org Fix a bug in the linux-user ELF loader code where it was not correctly handling images where the lowest vaddr to be loaded was not page aligned. The problem was that the code to probe for a suitable guest base address was changing the 'loaddr' variable (by rounding it to a page boundary), which meant that the load bias would then be incorrectly calculated unless loaddr happened to already be page-aligned. Binaries generated by gcc with the default linker script do start with a loadable segment at a page-aligned vaddr, so were unaffected. This bug was noticed with a binary created by the Google Go toolchain for ARM. We fix the bug by refactoring the probe for guest base code out into its own self-contained function. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Riku Voipio riku.voi...@iki.fi --- linux-user/elfload.c | 130 -- 1 files changed, 73 insertions(+), 57 deletions(-) diff --git a/linux-user/elfload.c b/linux-user/elfload.c index a4aabd5..a13eb7b 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -1288,6 +1288,78 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc, return sp; } +static void probe_guest_base(const char *image_name, + abi_ulong loaddr, abi_ulong hiaddr) +{ +/* Probe for a suitable guest base address, if the user has not set + * it explicitly, and set guest_base appropriately. + * In case of error we will print a suitable message and exit. + */ +#if defined(CONFIG_USE_GUEST_BASE) +const char *errmsg; +if (!have_guest_base !reserved_va) { +unsigned long host_start, real_start, host_size; + +/* Round addresses to page boundaries. 
*/ +loaddr = qemu_host_page_mask; +hiaddr = HOST_PAGE_ALIGN(hiaddr); + +if (loaddr mmap_min_addr) { +host_start = HOST_PAGE_ALIGN(mmap_min_addr); +} else { +host_start = loaddr; +if (host_start != loaddr) { +errmsg = Address overflow loading ELF binary; +goto exit_errmsg; +} +} +host_size = hiaddr - loaddr; +while (1) { +/* Do not use mmap_find_vma here because that is limited to the + guest address space. We are going to make the + guest address space fit whatever we're given. */ +real_start = (unsigned long) +mmap((void *)host_start, host_size, PROT_NONE, + MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE, -1, 0); +if (real_start == (unsigned long)-1) { +goto exit_perror; +} +if (real_start == host_start) { +break; +} +/* That address didn't work. Unmap and try a different one. + The address the host picked because is typically right at + the top of the host address space and leaves the guest with + no usable address space. Resort to a linear search. We + already compensated for mmap_min_addr, so this should not + happen often. Probably means we got unlucky and host + address space randomization put a shared library somewhere + inconvenient. */ +munmap((void *)real_start, host_size); +host_start += qemu_host_page_size; +if (host_start == loaddr) { +/* Theoretically possible if host doesn't have any suitably + aligned areas. Normally the first mmap will fail. */ +errmsg = Unable to find space for application; +goto exit_errmsg; +} +} +qemu_log(Relocating guest address space from 0x + TARGET_ABI_FMT_lx to 0x%lx\n, + loaddr, real_start); +guest_base = real_start - loaddr; +} +return; + +exit_perror: +errmsg = strerror(errno); +exit_errmsg: +fprintf(stderr, %s: %s\n, image_name, errmsg); +exit(-1); +#endif +} + + /* Load an ELF image into the address space. IMAGE_NAME is the filename of the image, to use in error messages. @@ -1373,63 +1445,7 @@ static void load_elf_image(const char *image_name, int image_fd, /* This is the main executable. 
Make sure that the low address does not conflict with MMAP_MIN_ADDR or the QEMU application itself. */ -#if defined(CONFIG_USE_GUEST_BASE) -/* - * In case where user has not explicitly set the guest_base, we - * probe here that should we set it automatically. - */ -if (!have_guest_base !reserved_va) { -unsigned long host_start, real_start, host_size; - -/* Round addresses to page boundaries. */ -loaddr = qemu_host_page_mask; -hiaddr = HOST_PAGE_ALIGN(hiaddr); - -
[Qemu-devel] [PATCH 10/18] linuxload: id_change was a write only variable
From: Juan Quintela quint...@redhat.com

Signed-off-by: Juan Quintela quint...@redhat.com
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/linuxload.c |   25 +
 1 files changed, 1 insertions(+), 24 deletions(-)

diff --git a/linux-user/linuxload.c b/linux-user/linuxload.c
index ac8c486..62ebc7e 100644
--- a/linux-user/linuxload.c
+++ b/linux-user/linuxload.c
@@ -26,22 +26,6 @@ abi_long memcpy_to_target(abi_ulong dest, const void *src,
     return 0;
 }
 
-static int in_group_p(gid_t g)
-{
-    /* return TRUE if we're in the specified group, FALSE otherwise */
-    int ngroup;
-    int i;
-    gid_t grouplist[NGROUPS];
-
-    ngroup = getgroups(NGROUPS, grouplist);
-    for(i = 0; i < ngroup; i++) {
-        if(grouplist[i] == g) {
-            return 1;
-        }
-    }
-    return 0;
-}
-
 static int count(char ** vec)
 {
     int i;
@@ -57,7 +41,7 @@ static int prepare_binprm(struct linux_binprm *bprm)
 {
     struct stat st;
     int mode;
-    int retval, id_change;
+    int retval;
 
     if(fstat(bprm->fd, &st) < 0) {
         return(-errno);
@@ -73,14 +57,10 @@ static int prepare_binprm(struct linux_binprm *bprm)
 
     bprm->e_uid = geteuid();
     bprm->e_gid = getegid();
-    id_change = 0;
 
     /* Set-uid? */
     if(mode & S_ISUID) {
         bprm->e_uid = st.st_uid;
-        if(bprm->e_uid != geteuid()) {
-            id_change = 1;
-        }
     }
 
     /* Set-gid? */
@@ -91,9 +71,6 @@ static int prepare_binprm(struct linux_binprm *bprm)
      */
     if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
         bprm->e_gid = st.st_gid;
-        if (!in_group_p(bprm->e_gid)) {
-            id_change = 1;
-        }
     }
 
     retval = read(bprm->fd, bprm->buf, BPRM_BUF_SIZE);
-- 
1.7.4.1
[Qemu-devel] [PATCH 15/18] linux-user: Bump do_syscall() up to 8 syscall arguments
From: Peter Maydell peter.mayd...@linaro.org On 32 bit MIPS a few syscalls have 7 arguments, and so to call them via NR_syscall the guest needs to be able to pass 8 arguments to do_syscall(). Raise the number of arguments do_syscall() takes accordingly. This fixes some gcc 4.6 compiler warnings about arg7 and arg8 variables being set and never used. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Signed-off-by: Riku Voipio riku.voi...@iki.fi --- linux-user/main.c| 37 - linux-user/qemu.h|3 ++- linux-user/syscall.c |8 +--- 3 files changed, 31 insertions(+), 17 deletions(-) diff --git a/linux-user/main.c b/linux-user/main.c index 71dd253..1293450 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -319,7 +319,8 @@ void cpu_loop(CPUX86State *env) env-regs[R_EDX], env-regs[R_ESI], env-regs[R_EDI], - env-regs[R_EBP]); + env-regs[R_EBP], + 0, 0); break; #ifndef TARGET_ABI32 case EXCP_SYSCALL: @@ -331,7 +332,8 @@ void cpu_loop(CPUX86State *env) env-regs[R_EDX], env-regs[10], env-regs[8], - env-regs[9]); + env-regs[9], + 0, 0); env-eip = env-exception_next_eip; break; #endif @@ -735,7 +737,8 @@ void cpu_loop(CPUARMState *env) env-regs[2], env-regs[3], env-regs[4], - env-regs[5]); + env-regs[5], + 0, 0); } } else { goto error; @@ -831,7 +834,8 @@ void cpu_loop(CPUState *env) env-regs[2], env-regs[3], env-regs[4], - env-regs[5]); + env-regs[5], + 0, 0); } } else { goto error; @@ -1018,7 +1022,8 @@ void cpu_loop (CPUSPARCState *env) ret = do_syscall (env, env-gregs[1], env-regwptr[0], env-regwptr[1], env-regwptr[2], env-regwptr[3], - env-regwptr[4], env-regwptr[5]); + env-regwptr[4], env-regwptr[5], + 0, 0); if ((abi_ulong)ret = (abi_ulong)(-515)) { #if defined(TARGET_SPARC64) !defined(TARGET_ABI32) env-xcc |= PSR_CARRY; @@ -1611,7 +1616,7 @@ void cpu_loop(CPUPPCState *env) env-crf[0] = ~0x1; ret = do_syscall(env, env-gpr[0], env-gpr[3], env-gpr[4], env-gpr[5], env-gpr[6], env-gpr[7], - env-gpr[8]); + env-gpr[8], 0, 0); if (ret == 
(uint32_t)(-TARGET_QEMU_ESIGRETURN)) { /* Returning from a successful sigreturn syscall. Avoid corrupting register state. */ @@ -2072,7 +2077,7 @@ void cpu_loop(CPUMIPSState *env) env-active_tc.gpr[5], env-active_tc.gpr[6], env-active_tc.gpr[7], - arg5, arg6/*, arg7, arg8*/); + arg5, arg6, arg7, arg8); } if (ret == -TARGET_QEMU_ESIGRETURN) { /* Returning from a successful sigreturn syscall. @@ -2160,7 +2165,8 @@ void cpu_loop (CPUState *env) env-gregs[6], env-gregs[7], env-gregs[0], - env-gregs[1]); + env-gregs[1], + 0, 0); env-gregs[0] = ret; break; case EXCP_INTERRUPT: @@ -2229,7 +2235,8 @@ void cpu_loop (CPUState *env) env-regs[12], env-regs[13], env-pregs[7], - env-pregs[11]); + env-pregs[11], + 0, 0); env-regs[10] =
[Qemu-devel] [PATCHv4] qemu-img: Add cache command line option
qemu-img currently writes disk images using writeback and filling up the cache buffers which are then flushed by the kernel preventing other processes from accessing the storage. This is particularly bad in cluster environments where time-based algorithms might be in place and accessing the storage within certain timeouts is critical. This patch adds the option to choose a cache method when writing disk images. Signed-off-by: Federico Simoncelli fsimo...@redhat.com --- qemu-img-cmds.hx |6 ++-- qemu-img.c | 80 +- 2 files changed, 70 insertions(+), 16 deletions(-) diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx index 3072d38..2b70618 100644 --- a/qemu-img-cmds.hx +++ b/qemu-img-cmds.hx @@ -22,13 +22,13 @@ STEXI ETEXI DEF(commit, img_commit, -commit [-f fmt] filename) +commit [-f fmt] [-t cache] filename) STEXI @item commit [-f @var{fmt}] @var{filename} ETEXI DEF(convert, img_convert, -convert [-c] [-p] [-f fmt] [-O output_fmt] [-o options] [-s snapshot_name] filename [filename2 [...]] output_filename) +convert [-c] [-p] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] filename [filename2 [...]] output_filename) STEXI @item convert [-c] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] @var{output_filename} ETEXI @@ -46,7 +46,7 @@ STEXI ETEXI DEF(rebase, img_rebase, -rebase [-f fmt] [-p] [-u] -b backing_file [-F backing_fmt] filename) +rebase [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F backing_fmt] filename) STEXI @item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename} ETEXI diff --git a/qemu-img.c b/qemu-img.c index 4f162d1..f904e32 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -40,6 +40,7 @@ typedef struct img_cmd_t { /* Default to cache=writeback as data integrity is not important for qemu-tcg. 
*/ #define BDRV_O_FLAGS BDRV_O_CACHE_WB +#define BDRV_DEFAULT_CACHE writeback static void format_print(void *opaque, const char *name) { @@ -64,6 +65,8 @@ static void help(void) Command parameters:\n 'filename' is a disk image filename\n 'fmt' is the disk image format. It is guessed automatically in most cases\n + 'cache' is the cache mode used to write the output disk image, the valid\n + options are: 'none', 'writeback' (default), 'writethrough' and 'unsafe'\n 'size' is the disk image size in bytes. Optional suffixes\n 'k' or 'K' (kilobyte, 1024), 'M' (megabyte, 1024k), 'G' (gigabyte, 1024M)\n and T (terabyte, 1024G) are supported. 'b' is ignored.\n @@ -180,6 +183,27 @@ static int read_password(char *buf, int buf_size) } #endif +static int set_cache_flag(const char *mode, int *flags) +{ +*flags = ~BDRV_O_CACHE_MASK; + +if (!strcmp(mode, none) || !strcmp(mode, off)) { +*flags |= BDRV_O_CACHE_WB; +*flags |= BDRV_O_NOCACHE; +} else if (!strcmp(mode, writeback)) { +*flags |= BDRV_O_CACHE_WB; +} else if (!strcmp(mode, unsafe)) { +*flags |= BDRV_O_CACHE_WB; +*flags |= BDRV_O_NO_FLUSH; +} else if (!strcmp(mode, writethrough)) { +/* this is the default */ +} else { +return -1; +} + +return 0; +} + static int print_block_option_help(const char *filename, const char *fmt) { BlockDriver *drv, *proto_drv; @@ -441,13 +465,14 @@ static int img_check(int argc, char **argv) static int img_commit(int argc, char **argv) { -int c, ret; -const char *filename, *fmt; +int c, ret, flags; +const char *filename, *fmt, *cache; BlockDriverState *bs; fmt = NULL; +cache = BDRV_DEFAULT_CACHE; for(;;) { -c = getopt(argc, argv, f:h); +c = getopt(argc, argv, f:ht:); if (c == -1) { break; } @@ -459,6 +484,9 @@ static int img_commit(int argc, char **argv) case 'f': fmt = optarg; break; +case 't': +cache = optarg; +break; } } if (optind = argc) { @@ -466,7 +494,14 @@ static int img_commit(int argc, char **argv) } filename = argv[optind++]; -bs = bdrv_new_open(filename, fmt, BDRV_O_FLAGS | 
BDRV_O_RDWR); +flags = BDRV_O_RDWR; +ret = set_cache_flag(cache, flags); +if (ret 0) { +error_report(Invalid cache option: %s\n, cache); +return -1; +} + +bs = bdrv_new_open(filename, fmt, flags); if (!bs) { return 1; } @@ -591,8 +626,8 @@ static int compare_sectors(const uint8_t *buf1, const uint8_t *buf2, int n, static int img_convert(int argc, char **argv) { int c, ret = 0, n, n1, bs_n, bs_i, compress, cluster_size, cluster_sectors; -int progress = 0; -const char *fmt, *out_fmt, *out_baseimg, *out_filename; +int progress = 0, flags; +
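The mode-to-flags mapping that set_cache_flag() performs can be exercised on its own. The sketch below mirrors the branches of the patch, but the SKETCH_O_* bit values are made up for illustration; QEMU's real BDRV_O_* constants are defined in its block layer headers and are not reproduced here.

```c
#include <string.h>

/* Hypothetical flag bits for this sketch only. */
enum {
    SKETCH_O_NOCACHE    = 1 << 0,
    SKETCH_O_CACHE_WB   = 1 << 1,
    SKETCH_O_NO_FLUSH   = 1 << 2,
    SKETCH_O_CACHE_MASK = SKETCH_O_NOCACHE | SKETCH_O_CACHE_WB |
                          SKETCH_O_NO_FLUSH
};

/* Clear the cache bits in *flags, then set the ones that correspond
 * to mode. Returns 0 on success, -1 for an unknown mode string. */
int sketch_set_cache_flag(const char *mode, int *flags)
{
    *flags &= ~SKETCH_O_CACHE_MASK;

    if (!strcmp(mode, "none") || !strcmp(mode, "off")) {
        *flags |= SKETCH_O_CACHE_WB | SKETCH_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        *flags |= SKETCH_O_CACHE_WB;
    } else if (!strcmp(mode, "unsafe")) {
        *flags |= SKETCH_O_CACHE_WB | SKETCH_O_NO_FLUSH;
    } else if (!strcmp(mode, "writethrough")) {
        /* no cache bits set */
    } else {
        return -1;
    }
    return 0;
}
```

Because the cache bits are cleared first, unrelated bits already present in *flags (such as the read-write flag set by the caller) survive the call, and a previous cache choice never leaks into the new one.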
[Qemu-devel] [PATCH 09/18] linux-user: Define AT_RANDOM to support target stack protection mechanism.
From: Laurent ALFONSI laurent.alfo...@st.com

Note that the support for the command-line argument requires:

     1. add the new field uint8_t rand_bytes[16] to struct
        image_info since only the variable info lives both in
        main() and in create_elf_tables()

     2. write a dedicated parser to convert the command-line to fill
        rand_bytes[]

These two steps aren't really hard to achieve but I finally think they
are a little bit overkill regarding the purpose of these 16 bytes.
Maybe we could always fill the 16 bytes pointed to by AT_RANDOM with
zero if we really want to get reproducibility.

Regards,
Cédric.

8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----

The dynamic linker from the GNU C library v2.10+ uses the ELF
auxiliary vector AT_RANDOM [1] as a pointer to 16 bytes with random
values to initialize the stack protection mechanism.  Technically the
emulated GNU dynamic linker crashes due to a NULL pointer
dereference if it is built with stack protection enabled and if
AT_RANDOM is not defined by the QEMU ELF loader.

[1] This ELF auxiliary vector was introduced in Linux v2.6.29.

This patch can be tested with the code below:

    #include <elf.h>       /* Elf*_auxv_t, AT_RANDOM, */
    #include <stdio.h>     /* printf(3), */
    #include <stdlib.h>    /* exit(3), EXIT_*, */
    #include <stdint.h>    /* uint8_t, */
    #include <string.h>    /* memcpy(3), */

    #if defined(__LP64__) || defined(__ILP64__) || defined(__LLP64__)
    #    define Elf_auxv_t Elf64_auxv_t
    #else
    #    define Elf_auxv_t Elf32_auxv_t
    #endif

    int main(int argc, char* argv[], char* envp[])
    {
        Elf_auxv_t *auxv;

        /* *envp = NULL marks end of envp. */
        while (*envp++ != NULL);

        /* auxv->a_type = AT_NULL marks the end of auxv. */
        for (auxv = (Elf_auxv_t *)envp; auxv->a_type != AT_NULL; auxv++) {
            if (auxv->a_type == AT_RANDOM) {
                int i;
                uint8_t rand_bytes[16];

                printf("AT_RANDOM is: 0x%x\n", auxv->a_un.a_val);
                memcpy(rand_bytes, (const uint8_t *)auxv->a_un.a_val, sizeof(rand_bytes));
                printf("it points to: ");
                for (i = 0; i < 16; i++) {
                    printf("0x%02x ", rand_bytes[i]);
                }
                printf("\n");
                exit(EXIT_SUCCESS);
            }
        }
        exit(EXIT_FAILURE);
    }

Changes introduced in v2 and v3:

    * Fix typos + thinko (AT_RANDOM is used for stack canary, not for
      ASLR)

    * AT_RANDOM points to 16 random bytes stored inside the user
      stack.

    * Add a small test program.

Signed-off-by: Cédric VINCENT cedric.vinc...@st.com
Signed-off-by: Laurent ALFONSI laurent.alfo...@st.com
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/elfload.c |   21 -
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a13eb7b..b2746f2 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -927,7 +927,7 @@ struct exec
 #define TARGET_ELF_PAGESTART(_v) ((_v) & ~(unsigned long)(TARGET_ELF_EXEC_PAGESIZE-1))
 #define TARGET_ELF_PAGEOFFSET(_v) ((_v) & (TARGET_ELF_EXEC_PAGESIZE-1))
 
-#define DLINFO_ITEMS 12
+#define DLINFO_ITEMS 13
 
 static inline void memcpy_fromfs(void * to, const void * from, unsigned long n)
 {
@@ -1202,6 +1202,9 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
 {
     abi_ulong sp;
     int size;
+    int i;
+    abi_ulong u_rand_bytes;
+    uint8_t k_rand_bytes[16];
     abi_ulong u_platform;
     const char *k_platform;
     const int n = sizeof(elf_addr_t);
@@ -1231,6 +1234,20 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
         /* FIXME - check return value of memcpy_to_target() for failure */
         memcpy_to_target(sp, k_platform, len);
     }
+
+    /*
+     * Generate 16 random bytes for userspace PRNG seeding (not
+     * cryptically secure but it's not the aim of QEMU).
+     */
+    srand((unsigned int) time(NULL));
+    for (i = 0; i < 16; i++) {
+        k_rand_bytes[i] = rand();
+    }
+    sp -= 16;
+    u_rand_bytes = sp;
+    /* FIXME - check return value of memcpy_to_target() for failure */
+    memcpy_to_target(sp, k_rand_bytes, 16);
+
     /*
      * Force 16 byte _final_ alignment here for generality.
      */
@@ -1271,6 +1288,8 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
     NEW_AUX_ENT(AT_EGID, (abi_ulong) getegid());
     NEW_AUX_ENT(AT_HWCAP, (abi_ulong) ELF_HWCAP);
     NEW_AUX_ENT(AT_CLKTCK, (abi_ulong) sysconf(_SC_CLK_TCK));
+    NEW_AUX_ENT(AT_RANDOM, (abi_ulong) u_rand_bytes);
+
     if (k_platform)
         NEW_AUX_ENT(AT_PLATFORM, u_platform);
 #ifdef ARCH_DLINFO
-- 
1.7.4.1
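On C libraries that provide getauxval() (glibc 2.16 and later, which is newer than this patch), the AT_RANDOM entry can be checked without walking past envp by hand. The following is a sketch under that assumption, not part of the patch; copy_at_random() is a hypothetical helper name.

```c
#include <elf.h>       /* AT_RANDOM */
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>  /* getauxval(3) */

/* Hypothetical helper: copy the 16 bytes that AT_RANDOM points to
 * into out. Returns 0 on success, or -1 if the auxiliary vector has
 * no AT_RANDOM entry (getauxval() returns 0 in that case). */
int copy_at_random(uint8_t out[16])
{
    unsigned long addr = getauxval(AT_RANDOM);

    if (addr == 0) {
        return -1;
    }
    memcpy(out, (const void *)addr, 16);
    return 0;
}
```

Run under a linux-user QEMU that lacks this patch, the helper would be expected to return -1; with the patch applied it should return 0 and fill the buffer with the bytes the loader placed on the guest stack.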
[Qemu-devel] [PATCH 18/18] linux-user: Fix sync_file_range on 32bit mips
From: Riku Voipio riku.voi...@iki.fi

As noticed while looking at the "Bump do_syscall() up to 8 syscall
arguments" patch, sync_file_range uses a pad argument on 32bit mips.
Deal with it by reading the correct arguments when on mips.

Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |    5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index aa11a2c..beb482c 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7842,8 +7842,13 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #if defined(TARGET_NR_sync_file_range)
     case TARGET_NR_sync_file_range:
 #if TARGET_ABI_BITS == 32
+#if defined(TARGET_MIPS)
+        ret = get_errno(sync_file_range(arg1, target_offset64(arg3, arg4),
+                                        target_offset64(arg5, arg6), arg7));
+#else
         ret = get_errno(sync_file_range(arg1, target_offset64(arg2, arg3),
                                         target_offset64(arg4, arg5), arg6));
+#endif /* !TARGET_MIPS */
 #else
         ret = get_errno(sync_file_range(arg1, arg2, arg3, arg4));
 #endif
-- 
1.7.4.1
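The effect of the pad argument on where the two 64-bit values sit can be shown with a small decoder. This is only a sketch: sketch_offset64() is a hypothetical stand-in for QEMU's target_offset64() and assumes the low word comes first (the real helper depends on target endianness), and the pad is assumed to exist because the MIPS o32 convention starts 64-bit arguments on an even argument slot.

```c
#include <stdint.h>

/* Decoded sync_file_range() arguments. */
struct sketch_sfr_args {
    int fd;
    uint64_t offset;
    uint64_t nbytes;
    unsigned int flags;
};

/* Hypothetical stand-in for target_offset64(): join two 32-bit
 * argument words, first word low (little-endian order assumed). */
uint64_t sketch_offset64(uint32_t w0, uint32_t w1)
{
    return ((uint64_t)w1 << 32) | w0;
}

/* Generic 32-bit layout: args[0] is arg1, args[1] is arg2, ... */
struct sketch_sfr_args decode_generic32(const uint32_t args[7])
{
    struct sketch_sfr_args r;
    r.fd = (int)args[0];
    r.offset = sketch_offset64(args[1], args[2]);
    r.nbytes = sketch_offset64(args[3], args[4]);
    r.flags = args[5];
    return r;
}

/* 32-bit MIPS layout: arg2 is padding, everything else shifts by one. */
struct sketch_sfr_args decode_mips_o32(const uint32_t args[7])
{
    struct sketch_sfr_args r;
    r.fd = (int)args[0];
    r.offset = sketch_offset64(args[2], args[3]);
    r.nbytes = sketch_offset64(args[4], args[5]);
    r.flags = args[6];
    return r;
}
```

Since the flags end up in the seventh argument on 32-bit MIPS, this case only became expressible once do_syscall() was widened to 8 arguments.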
[Qemu-devel] [PATCH 17/18] linux-user/signal.c: Remove unused fenab
From: Peter Maydell peter.mayd...@linaro.org

Remove fenab as it is only written, never used. Add a FIXME comment
about the discrepancy between our behaviour and that of the Linux
kernel for this routine.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/signal.c |    7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 4edd974..7d168e1 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -2228,7 +2228,6 @@ void sparc64_set_context(CPUSPARCState *env)
     target_mc_gregset_t *grp;
     abi_ulong pc, npc, tstate;
     abi_ulong fp, i7, w_addr;
-    unsigned char fenab;
     int err;
     unsigned int i;
 
@@ -2293,7 +2292,11 @@ void sparc64_set_context(CPUSPARCState *env)
     if (put_user(i7, w_addr + offsetof(struct target_reg_window, ins[7]),
                  abi_ulong) != 0)
         goto do_sigsegv;
-    err |= __get_user(fenab, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_enab));
+    /* FIXME this does not match how the kernel handles the FPU in
+     * its sparc64_set_context implementation. In particular the FPU
+     * is only restored if fenab is non-zero in:
+     *   __get_user(fenab, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_enab));
+     */
     err |= __get_user(env->fprs, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fprs));
     {
         uint32_t *src, *dst;
-- 
1.7.4.1
[Qemu-devel] [PATCH 14/18] flatload: memp was a write-only variable
From: Juan Quintela quint...@redhat.com

Signed-off-by: Juan Quintela quint...@redhat.com
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/flatload.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index 6fb78f5..1062da3 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -379,7 +379,6 @@ static int load_flat_file(struct linux_binprm * bprm,
     abi_long result;
     abi_ulong realdatastart = 0;
     abi_ulong text_len, data_len, bss_len, stack_len, flags;
-    abi_ulong memp = 0; /* for finding the brk area */
     abi_ulong extra;
     abi_ulong reloc = 0, rp;
     int i, rev, relocs = 0;
@@ -491,7 +490,6 @@ static int load_flat_file(struct linux_binprm * bprm,
         }
 
         reloc = datapos + (ntohl(hdr->reloc_start) - text_len);
-        memp = realdatastart;
 
     } else {
 
@@ -506,7 +504,6 @@ static int load_flat_file(struct linux_binprm * bprm,
         realdatastart = textpos + ntohl(hdr->data_start);
         datapos = realdatastart + indx_len;
         reloc = (textpos + ntohl(hdr->reloc_start) + indx_len);
-        memp = textpos;
 
 #ifdef CONFIG_BINFMT_ZFLAT
 #error code needs checking
-- 
1.7.4.1
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol
On 06/20/2011 04:50 PM, Anthony Liguori wrote:
> On 06/20/2011 08:40 AM, Avi Kivity wrote:
>> On 06/14/2011 04:31 PM, Corey Bryant wrote:
>>> - Starting Qemu with a backing file
>>
>> For this we could tell qemu that a file named xyz is available via
>> fd n, via an extension of the getfd command.
>>
>> For example
>>
>> (qemu) getfd path=/images/my-image.img
>> (qemu) getfd path=/images/template.img
>> (qemu) drive-add path=/images/my-image.img
>>
>> The open() for my-image.img first looks up the name in the getfd
>> database, and finds it, so it returns the fd from there instead of
>> opening. It then opens the backing file (template.img) and looks it
>> up again, and finds the second fd from the session.
>
> The way I've been thinking about this is:
>
> -blockdev id=hd0-back,file=fd:4,format=raw \
> -blockdev file=fd:3,format=qcow2,backing=hd0-back
>
> While your proposal is clever, it makes me a little nervous about
> subtle security ramifications.

It would need careful explanation in the management tool author's
guide, yes.

The main advantage is generality. It doesn't assume that a file format
has just one backing file, and doesn't require new syntax wherever a
file is referred to indirectly.

-- 
error compiling committee.c: too many arguments to function
[Qemu-devel] [PATCH 16/18] linux-user/signal.c: Remove only-ever-set variable fpu_save_addr
From: Peter Maydell peter.mayd...@linaro.org

Move the access of fpu_save into the commented out skeleton code for
restoring FPU registers on SPARC sigreturn, thus silencing a gcc
4.6 "variable set but never used" warning. (This doesn't affect the
calculation of 'err' because in fact __get_user() can never fail.)

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/signal.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index cb7138f..4edd974 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -2080,7 +2080,6 @@ long do_sigreturn(CPUState *env)
     uint32_t up_psr, pc, npc;
     target_sigset_t set;
     sigset_t host_set;
-    abi_ulong fpu_save_addr;
     int err, i;
 
     sf_addr = env->regwptr[UREG_FP];
@@ -2120,10 +2119,11 @@ long do_sigreturn(CPUState *env)
         err |= __get_user(env->regwptr[i + UREG_I0], &sf->info.si_regs.u_regs[i+8]);
     }
 
-    err |= __get_user(fpu_save_addr, &sf->fpu_save);
-
-    //if (fpu_save)
-    //        err |= restore_fpu_state(env, fpu_save);
+    /* FIXME: implement FPU save/restore:
+     * __get_user(fpu_save, &sf->fpu_save);
+     * if (fpu_save)
+     *        err |= restore_fpu_state(env, fpu_save);
+     */
 
     /* This is pretty much atomic, no amount locking would prevent
      * the races which exist anyways.
-- 
1.7.4.1
Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall
On Monday, June 20, 2011 05:45:36 pm Avi Kivity wrote: However, I'm not sure I see the gain. Most enterprisey guests already contain in-guest crash dumpers which provide more information than a qemu memory dump could, since they know exact load addresses etc. and are integrated with crash analysis tools. What do you have in mind?

Right, kexec/kdump already works perfectly inside the guest. But:
- in the field a lot of people still manage to set up VM guests without kexec/kdump properly set up (even though most enterprisey distributions try hard to set this up out-of-the-box .. still people manage to not have kexec/kdump loaded once they run into a crash).
- you don't have to reserve disk space for a crashdump for each guest, e.g. if you run 4 guests with 60 GB of memory each you would lose some 4*60 GB of space ... just for the (rare) case that each of those guests could write a crashdump, uncompressed ...
- legacy distributions: no or buggy kexec
- maybe writing a crashdump+reboot with QEMU/libvirt is faster than with in-guest kexec/kdump? (haven't tested yet)
- single place on the VM-host to collect coredumps

Well libvirt can capture a core file by doing 'virsh dump $GUESTNAME'. This actually uses the QEMU monitor migration command to capture the entirety of QEMU memory. The 'crash' command line tool actually knows how to analyse this data format as it would a normal kernel crashdump.

Interesting. Right. I'm using the kvmdump support of the crash utility now and then ... it could be more often. But unfortunately the people who run KVM in a production environment with some strict service-level-agreement often just reboot, due to time pressure, or run out of disk space in the guest, or just forgot that they got told to always do virsh dump on a freeze or crash.

I think having a way for a guest OS to notify the host that it has crashed would be useful. libvirt could automatically do a crash dump of the QEMU memory, or at least pause the guest CPUs and notify the management app of the crash, which can then decide what to do. You can also use tools like 'virt-dmesg' which uses libvirt to peek into guest memory to extract the most recent kernel dmesg logs (even if the guest OS itself has crashed and didn't manage to send them out via netconsole or something else).

I agree. But let's do this via a device, this way kvm need not be changed.

Is a device reliable enough if the guest kernel crashes? Do you mean something like a hardware watchdog?

Do ILO cards / IPMI support something like this? We could follow their lead in that case.

The only two things which came to my mind are:
* NMI (aka. ipmitool diag) - already available in qemu/kvm - but requires in-guest kexec/kdump
* Hardware-Watchdog (also available in qemu/libvirt)

lguest and xen have something similar. They also have a hypercall which gets called by a function registered in the panic_notifier_list. Not quite sure if you want to follow their lead.

Something I forgot to mention: This panic hypercall could also sit within an external kernel module ... to support (legacy) distributions.

This series does need to introduce a QMP event notification upon crash, so that the crash notification can be propagated to mgmt layers above QEMU.

Yes. Already done. I posted the QEMU relevant changes as a separate series to the KVM list ... since the initial implementation is KVM specific (KVM hypercall)

Best Regards, Daniel -- Daniel Gollub Linux Consultant Developer Tel.: +49-160 47 73 970 Mail: gol...@b1-systems.de B1 Systems GmbH Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537
[Qemu-devel] unix domain socket communication with guests
I'm trying to set up a unix domain socket with a guest on one end and the host on the other, where the server is running on and bound to the socket on the guest. I've been able to get the reverse, where the server is running on the host, this way: qemu-kvm -kernel kernel -initrd initrd -hda root -device virtio-serial -serial stdio -chardev socket,path=/home/uckelman/projects/lightbox/supermin/foo,id=channel0,name=org.libguestfs.channel.0 But, when I try to bind(2) on the guest, I get an Address already in use error. Adding the server,nowait options to -chardev doesn't help---I still get the same error. What am I doing wrong here?
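One general property of bind(2) that may be relevant to the error above: on Linux, binding an AF_UNIX socket to a path that already exists in the filesystem fails with EADDRINUSE, even if nothing is listening there. The following is only a minimal sketch of that behaviour; try_bind_unix() is a hypothetical helper name, not part of QEMU or libguestfs, and it says nothing about how the guest side of a virtio-serial channel should be opened.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: try to bind a fresh AF_UNIX stream socket to path.
 * Returns 0 on success or a negative errno value on failure. The socket is
 * closed again either way; on success the socket file stays on disk. */
static int try_bind_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd, ret = 0;

    if (strlen(path) >= sizeof(addr.sun_path)) {
        return -ENAMETOOLONG;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) {
        return -errno;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        ret = -errno;   /* -EADDRINUSE if the path already exists */
    }
    close(fd);
    return ret;
}
```

Calling the helper twice on the same path succeeds the first time and returns -EADDRINUSE the second time, because the socket file created by the first bind is still present until it is unlinked.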
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support
On Mon, Jun 20, 2011 at 05:50:32PM +0200, Gerd Hoffmann wrote: On 06/20/11 17:11, Alon Levy wrote: On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote: What is the difference to one worker->stop() + worker->start() cycle? ok, stop+start won't disconnect any clients either. But does stop render all waiting commands? I'll have to look, I don't know if it does. It does. This is what qemu uses to flush all spice server state to device memory on migration. What is the reason for deleting all surfaces? Making sure all references are dropped to pci memory in devram. Ah, because the spice server keeps a reference to the create command until the surface is destroyed, right? Actually right, so my correction stands corrected. There is QXL_IO_DESTROY_ALL_SURFACES + worker->destroy_surfaces() ... Regarding QXL_IO_DESTROY_ALL_SURFACES, it destroys the primary surface too, which is a little special, that's another difference - update_mem destroys everything except the primary. I know I tried to destroy the primary but it didn't work right, don't recall why right now, so I guess I'll have to retry. The QXL_IO_UPDATE_MEM command does too much special stuff IMHO. I also think we don't need to extend the libspice-server API. We can add an I/O command which renders everything to device memory via stop+start. We can zap all surfaces with the existing command + Yes, start+stop work nicely, didn't realize (saw it before, assumed it wouldn't be good enough), just need to destroy the surfaces too. worker call. We can add an I/O command to ask qxl to push the release queue head to the release ring. So you suggest to replace QXL_IO_UPDATE_MEM with what, two io commands instead of using the val parameter? QXL_IO_UPDATE_MEM QXL_IO_FLUSH_RELEASE ? Comments? cheers, Gerd
Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall
On 06/20/2011 07:26 PM, Daniel Gollub wrote: I agree. But let's do this via a device, this way kvm need not be changed. Is a device reliable enough if the guest kernel crashes? Do you mean something like a hardware watchdog? I'm proposing a 1:1 equivalent. Instead of issuing a hypercall that tells the host about the panic, write to an I/O port that tells the host about the panic. Do ILO cards / IPMI support something like this? We could follow their lead in that case. The only two things which came to my mind are: * NMI (aka. ipmitool diag) - already available in qemu/kvm - but requires in-guest kexec/kdump * Hardware-Watchdog (also available in qemu/libvirt) A watchdog has the advantage that it also detects lockups. In fact you could implement the panic device via the existing watchdogs. Simply program the timer for the minimum interval and *don't* service the interrupt. This would work for non-virt setups as well as another way to issue a reset. lguest and xen have something similar. They also have a hypercall which gets called by a function registered in the panic_notifier_list. Not quite sure if you want to follow their lead. We could do the same, except s/hypercall/writel/. Something I forgot to mention: This panic hypercall could also sit within an external kernel module ... to support (legacy) distributions. Yes. This series does need to introduce a QMP event notification upon crash, so that the crash notification can be propagated to mgmt layers above QEMU. Yes. Already done. I posted the QEMU relevant changes as a separate series to the KVM list ... since the initial implementation is KVM specific (KVM hypercall) -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 1/1] fix operator precedence
On Mon, Jun 20, 2011 at 2:25 PM, Frediano Ziglio fredd...@gmail.com wrote: Signed-off-by: Frediano Ziglio fredd...@gmail.com --- cmd.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) Thanks for the patch! cmd.c:timestr() has tabs for indentation but your patch uses spaces. I applied your changes manually to the trivial-patches tree: http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches For more info on the trivial patches tree, see http://wiki.qemu.org/Contribute/TrivialPatches. Please make sure whitespace remains unmodified in the future so that your patches apply, this is often a mail client issue. Try git-send-email(1), it does the right thing. Stefan
Re: [Qemu-devel] High speed polling
Thank you for your reply. I am still a novice with Qemu so pardon me if I don't make any sense. I tried --enable-io-thread. I get the error: cpus.o: In function `qemu_kvm_eat_signal': cpus.c:(.text+0x111a): undefined reference to `kvm_on_sigbus_vcpu' so I assume it requires KVM. I'm not using KVM because I don't have full control over the host I am running on. I have 8 host processors running 4 Qemu copies (1 vcpu each) plus my network simulator. I have tried polling via a call in vl.c:mainloop and via qemu_mod_timer(). There doesn't appear to be much difference. The guest is a full-blown x86_64 OS. I am polling to minimize latency. I am looking at other ways to tolerate the current latency in case I can't do much better. Clay On 06/15/11 01:22, Stefan Hajnoczi wrote: On Tue, Jun 14, 2011 at 11:32 PM, Clay Andreasenc...@cray.com wrote: I have a network device simulation that I am connecting to multiple instances of Qemu (nodes) via a shared memory queue. It works pretty well as long as all of the nodes are initiating communication but when one node is passive, it must poll to get packets. So far the fastest I have been able to get it to poll is about every 2M emulated clocks. This is with CONFIG_HIGH_RES_TIMERS and CONFIG_NO_HZ on the host. I also set MIN_TIMER_REARM_NS in qemu-timer.c to 10. Is there some way to increase the polling rate by about an order of magnitude? Without more details it's hard to say what is going on: Running an x86 guest? Are you using ./configure --enable-io-thread? It sounds like you may not be using KVM? How many vcpus are running on the host in total compared to the number of logical CPUs on the host? You haven't given details on how you are polling in the guest. Are you running a polling loop in ring 0 or is the guest running a full-blown OS and polling from userspace? Why are you polling in the first place - to minimize latency? Stefan
Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall
On 06/20/2011 08:13 PM, Jan Kiszka wrote: A watchdog has the advantage that it also detects lockups. In fact you could implement the panic device via the existing watchdogs. Simply program the timer for the minimum interval and *don't* service the interrupt. This would work for non-virt setups as well as another way to issue a reset. If you manage to bring down the other guest CPUs fast enough. Otherwise, they may corrupt your crashdump before the host had a chance to collect all pieces. Synchronous signaling to the hypervisor is a bit safer. You could NMI-IPI them. But I agree a synchronous signal is better (note it's not race-free itself). -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall
On 2011-06-20 18:34, Avi Kivity wrote: Do ILO cards / IPMI support something like this? We could follow their lead in that case. The only two things which came to my mind are: * NMI (aka. ipmitool diag) - already available in qemu/kvm - but requires in-guest kexec/kdump * Hardware-Watchdog (also available in qemu/libvirt) A watchdog has the advantage that it also detects lockups. In fact you could implement the panic device via the existing watchdogs. Simply program the timer for the minimum interval and *don't* service the interrupt. This would work for non-virt setups as well as another way to issue a reset. If you manage to bring down the other guest CPUs fast enough. Otherwise, they may corrupt your crashdump before the host had a chance to collect all pieces. Synchronous signaling to the hypervisor is a bit safer. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] Support logging xen-guest console
On Mon, 20 Jun 2011, Chunyan Liu wrote:

Add code to support logging xen-domU console, as what xenconsoled does. Log info will be saved in /var/log/xen/console/guest-domUname.log.

Signed-off-by: Chunyan Liu cy...@novell.com
---
 hw/xen_console.c | 63 ++
 1 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/hw/xen_console.c b/hw/xen_console.c
index c6c8163..ac3208d 100644
--- a/hw/xen_console.c
+++ b/hw/xen_console.c
@@ -36,6 +36,8 @@
 #include "qemu-char.h"
 #include "xen_backend.h"

+static int log_guest = 0;
+
 struct buffer {
     uint8_t *data;
     size_t consumed;
@@ -52,8 +54,24 @@ struct XenConsole {
     void *sring;
     CharDriverState *chr;
     int backlog;
+    int log_fd;
 };

+static int write_all(int fd, const char* buf, size_t len)
+{
+    while (len) {
+        ssize_t ret = write(fd, buf, len);
+        if (ret == -1 && errno == EINTR)
+            continue;
+        if (ret <= 0)
+            return -1;
+        len -= ret;
+        buf += ret;
+    }
+
+    return 0;
+}
+

If I am not mistaken ret == 0 doesn't always mean an error on write.

 static void buffer_append(struct XenConsole *con)
 {
     struct buffer *buffer = &con->buffer;
@@ -81,6 +99,14 @@ static void buffer_append(struct XenConsole *con)
     intf->out_cons = cons;
     xen_be_send_notify(&con->xendev);

+    if (con->log_fd != -1) {
+        int logret;
+        logret = write_all(con->log_fd, buffer->data + buffer->size - size, size);
+        if (logret < 0)
+            xen_be_printf(&con->xendev, 1, "Write to log failed on domain %d: %d (%s)\n",
+                          con->xendev.dom, errno, strerror(errno));
+    }

code style: you need brackets around the xen_be_printf statement

     if (buffer->max_capacity &&
         buffer->size > buffer->max_capacity) {
         /* Discard the middle of the data. */
@@ -174,12 +200,36 @@ static void xencons_send(struct XenConsole *con)
     }
 }

+static int create_domain_log(struct XenConsole *con)
+{
+    char *logfile;
+    char *path, *domname;
+    int fd;
+
+    path = xs_get_domain_path(xenstore, con->xendev.dom);
+    domname = xenstore_read_str(path, "name");
+    free(path);
+    if (!domname)
+        return -1;
+
+    asprintf(&logfile, "/var/log/xen/console/guest-%s.log", domname);
+    qemu_free(domname);
+
+    fd = open(logfile, O_WRONLY|O_CREAT|O_APPEND, 0644);
+    free(logfile);
+    if (fd == -1)
+        xen_be_printf(&con->xendev, 1, "Failed to open log %s: %d (%s)", logfile, errno, strerror(errno));
+
+    return fd;
+}
+

What if the console subdirectory is missing? Maybe we should create the directory automatically here.

 /* */
 static int con_init(struct XenDevice *xendev)
 {
     struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
     char *type, *dom;
+    char *logenv = NULL;

     /* setup */
     dom = xs_get_domain_path(xenstore, con->xendev.dom);
@@ -198,6 +248,10 @@ static int con_init(struct XenDevice *xendev)
     else
         con->chr = serial_hds[con->xendev.dev];

+    logenv = getenv("XENCONSOLED_TRACE");
+    if (logenv != NULL && !strcmp(logenv, "guest")) {
+        log_guest = 1;
+    }
     return 0;
 }

please check the length of logenv before using strcmp on it
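To illustrate the review comment about write(2) returning 0, here is a minimal sketch of a write_all() that treats only -1 as an error. It is an assumption about what a revised helper could look like, not the code that was eventually merged; the only name taken from the patch is write_all itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write exactly len bytes from buf to fd.
 * Returns 0 on success, -1 on error with errno set by write(2). */
static int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t ret = write(fd, buf, len);

        if (ret == -1) {
            if (errno == EINTR) {
                continue;   /* interrupted before any data was written */
            }
            return -1;      /* genuine error, errno describes it */
        }
        /* ret == 0 is not treated as an error: no progress, try again. */
        len -= (size_t)ret;
        buf += ret;
    }
    return 0;
}
```

The trade-off is that a descriptor which keeps returning 0 would make this loop spin, so a caller that worries about that case may prefer to bound the number of retries.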
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol
On 06/20/2011 12:35 PM, Avi Kivity wrote: On 06/20/2011 04:50 PM, Anthony Liguori wrote: On 06/20/2011 08:40 AM, Avi Kivity wrote: On 06/14/2011 04:31 PM, Corey Bryant wrote: - Starting Qemu with a backing file For this we could tell qemu that a file named xyz is available via fd n, via an extension of the getfd command. For example (qemu) getfd path=/images/my-image.img (qemu) getfd path=/images/template.img (qemu) drive-add path=/images/my-image.img The open() for my-image.img first looks up the name in the getfd database, and finds it, so it returns the fd from there instead of opening. It then opens the backing file (template.img) and looks it up again, and finds the second fd from the session. The way I've been thinking about this is: -blockdev id=hd0-back,file=fd:4,format=raw \ -blockdev file=fd:3,format=qcow2,backing=hd0-back While your proposal is clever, it makes me a little nervous about subtle security ramifications. It would need careful explanation in the management tool author's guide, yes. The main advantage is generality. It doesn't assume that a file format has just one backing file, and doesn't require new syntax wherever a file is referred to indirectly. FWIW, with blockdev, we need options to control this all anyway. If you go back to my QCFG proposal, the parameters would actually be format specific, so if we had: -block file=fd:4,format=fancypantsformat,part0=hd0-back.part1,part1=hd0-back.part2... Regards, Anthony Liguori
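For readers unfamiliar with the file=fd:N notation used in the examples above, the following is a rough sketch of how such a name could be turned into a descriptor number. parse_fd_filename() is a hypothetical helper written for illustration only; it is an assumption, not the parsing code from the fd: protocol patch under discussion.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not QEMU's implementation: parse "fd:<n>" and return
 * the descriptor number, or -1 if the string is not a valid fd: name. */
static int parse_fd_filename(const char *filename)
{
    const char *prefix = "fd:";
    const char *num;
    char *end;
    long val;

    if (strncmp(filename, prefix, strlen(prefix)) != 0) {
        return -1;          /* ordinary path, not an fd: name */
    }
    num = filename + strlen(prefix);
    if (*num < '0' || *num > '9') {
        return -1;          /* reject empty, signed or space-padded numbers */
    }
    errno = 0;
    val = strtol(num, &end, 10);
    if (errno != 0 || *end != '\0' || val > INT_MAX) {
        return -1;          /* overflow or trailing garbage */
    }
    return (int)val;
}
```

With this sketch, "fd:4" yields 4, while an ordinary path such as /images/my-image.img yields -1 and would fall through to a normal open().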
Re: [Qemu-devel] [PATCH 00/12] [uq/master] Import linux headers and some cleanups
On Wed, Jun 08, 2011 at 04:10:54PM +0200, Jan Kiszka wrote: Licensing of the virtio headers is now clarified. So we can finally resolve the clumsy and constantly buggy #ifdef'ery around old KVM and virtio headers. Recent example: current qemu-kvm does not build against 2.6.32 headers. This series introduces an import mechanism for all required Linux headers so that the appropriate versions can be kept safely inside the QEMU tree. I've incorporated all the valuable review comments on the first version and rebased the result over current uq/master after rebasing that one over current QEMU master. Please note that I had no chance to test-build PPC or s390. Beside the header topic, this series also includes a few assorted KVM cleanup patches so that my queue is empty again. Applied all, thanks.
[Qemu-devel] REMINDER: Participation Requested: Survey about Open-Source Software Development
Hi, Apologies for any inconvenience and thank you to those who have already completed the survey. We will keep the survey open for another couple of weeks. But, we do hope you will consider responding to the email request below (sent 2 weeks ago). Thanks, Dr. Jeffrey Carver Assistant Professor University of Alabama (v) 205-348-9829 (f) 205-348-0219 http://www.cs.ua.edu/~carver -Original Message- From: Jeffrey Carver [mailto:opensourcesur...@cs.ua.edu] Sent: Monday, June 13, 2011 11:45 AM To: 'qemu-devel@nongnu.org' Subject: Participation Requested: Survey about Open-Source Software Development Hi, Drs. Jeffrey Carver, Rosanna Guadagno, Debra McCallum, and Mr. Amiangshu Bosu, University of Alabama, and Dr. Lorin Hochstein, University of Southern California, are conducting a survey of open-source software developers. This survey seeks to understand how developers on distributed, virtual teams, like open-source projects, interact with each other to accomplish their tasks. You must be at least 19 years of age to complete the survey. The survey should take approximately 15 minutes to complete. If you are actively participating as a developer, please consider completing our survey. Here is the link to the survey: http://goo.gl/HQnux We apologize for inconvenience and if you receive multiple copies of this email. This survey has been approved by The University of Alabama IRB board. Thanks, Dr. Jeffrey Carver Assistant Professor University of Alabama (v) 205-348-9829 (f) 205-348-0219 http://www.cs.ua.edu/~carver
Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support
On Mon, Jun 20, 2011 at 06:32:30PM +0200, Alon Levy wrote: On Mon, Jun 20, 2011 at 05:50:32PM +0200, Gerd Hoffmann wrote: On 06/20/11 17:11, Alon Levy wrote: On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote: What is the difference to one worker->stop() + worker->start() cycle? ok, stop+start won't disconnect any clients either. But does stop render all waiting commands? I'll have to look, I don't know if it does. It does. This is what qemu uses to flush all spice server state to device memory on migration. What is the reason for deleting all surfaces? Making sure all references are dropped to pci memory in devram. Ah, because the spice server keeps a reference to the create command until the surface is destroyed, right? Actually right, so my correction stands corrected. There is QXL_IO_DESTROY_ALL_SURFACES + worker->destroy_surfaces() ... Regarding QXL_IO_DESTROY_ALL_SURFACES, it destroys the primary surface too, which is a little special, that's another difference - update_mem destroys everything except the primary. I know I tried to destroy the primary but it didn't work right, don't recall why right now, so I guess I'll have to retry. The QXL_IO_UPDATE_MEM command does too much special stuff IMHO. I also think we don't need to extend the libspice-server API. We can add an I/O command which renders everything to device memory via stop+start. We can zap all surfaces with the existing command + Yes, start+stop work nicely, didn't realize (saw it before, assumed it wouldn't be good enough), just need to destroy the surfaces too.

ok, it all works nicely except with the current driver patches I get a double destroy for the primary surface. Removing it with the following patch makes everything (resolution change/suspend/hibernate) work. I would really suggest we remove that PANIC_ON, besides of course fixing the driver patches (I'll do a v2 for the affected patches, the last series of qxl, I didn't cc you since I didn't assume you'd want to review, but you probably saw it). Something like:

diff --git a/server/red_worker.c b/server/red_worker.c
index f0a8dfc..3b53a3f 100644
--- a/server/red_worker.c
+++ b/server/red_worker.c
@@ -9684,7 +9684,11 @@ static inline void handle_dev_destroy_primary_surface(RedWorker *worker)
     receive_data(worker->channel, &surface_id, sizeof(uint32_t));

     PANIC_ON(surface_id != 0);
-    PANIC_ON(!worker->surfaces[surface_id].context.canvas);
+
+    if (!worker->surfaces[surface_id].context.canvas) {
+        red_printf("warning: double destroy of primary surface\n");
+        goto end;
+    }

     if (worker->cursor) {
         red_release_cursor(worker, worker->cursor);
@@ -9711,6 +9715,7 @@ static inline void handle_dev_destroy_primary_surface(RedWorker *worker)
     worker->cursor_position.x = worker->cursor_position.y = 0;
     worker->cursor_trail_length = worker->cursor_trail_frequency = 0;

+end:
     message = RED_WORKER_MESSAGE_READY;
     write_message(worker->channel, &message);
 }

worker call. We can add an I/O command to ask qxl to push the release queue head to the release ring. So you suggest to replace QXL_IO_UPDATE_MEM with what, two io commands instead of using the val parameter? QXL_IO_UPDATE_MEM QXL_IO_FLUSH_RELEASE ? Comments? cheers, Gerd
Re: [Qemu-devel] [PATCH 14/18] TCG/PPC: use TCG_REG_CALL_STACK instead of TCG_REG_R1
On Mon, Jun 20, 2011 at 1:14 AM, malc av1...@comtv.ru wrote: On Mon, 20 Jun 2011, Blue Swirl wrote: Use TCG_REG_CALL_STACK instead of TCG_REG_R1 etc. for consistency. You spell it TCG_REG_CALL_STACK in the subject/comment but REG_CALL_STACK in the patch, which suggests that it was never even compile tested. Actually I seem to have used both versions. I didn't compile test, but to make matters even worse, I didn't even read any reference manuals or ABI descriptions for any of these patches but based all this on bits gathered from */tcg-target.[ch]. But is the patch otherwise OK? ;-)
Re: [Qemu-devel] [PATCH RFC 0/3] basic support for composing sysbus devices
On Mon, Jun 20, 2011 at 6:23 PM, Paul Brook p...@codesourcery.com wrote: Yeah, that's why I said, hard to do well. It makes it very hard to add new socket types. PCI, USB, IDE, SCSI, SBus, what else? APICBus? I2C? 8 socket types ought to be enough for anybody. Off the top of my head: AClink (audio), i2s (audio), SSI/SSP (synchronous serial), Firewire, rs232, CAN, FibreChannel, ISA, PS2, ADB (apple desktop bus) and probably a bunch of others I've missed. There's also a bunch of all-but extinct system architectures with interesting bus-level features (MCA, NuBus, etc.) Are these really buses with identifiable sockets? For example, it's not possible to enumerate the users of ISA bus or RS-232.