date:20110620

Re: [Qemu-devel] [PATCH 12/12] Add disk_size field to BlockDriverState structure

2011-06-20 Thread Stefan Hajnoczi

On Mon, Jun 20, 2011 at 6:37 AM, Fam Zheng famc...@gmail.com wrote:
> Is there any difference between bdrv_getlength and
> bdrv_get_allocated_file_size for bs->file? If not, I can simplify it by
> reusing it in two raw devices.

Yes, the two functions are different:

POSIX sparse files (files with holes) take up less space on disk than
their file size.  For example a 1 GB file where you've only written
the last byte and never touched any other blocks will only take up one
block - the rest will be unallocated.  So bdrv_getlength() == 1 GB and
bdrv_get_allocated_file_size() == 4 KB (or whatever the file system
block size is).

You can look at this using the stat(1) command and dd(1) to only write
the last byte of a file.
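
The same experiment can be sketched in C. This is only an illustration, not QEMU code: sparse_sizes() is a made-up helper, and it assumes st_blocks is counted in 512-byte units, as it is on Linux.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create `path` as a file of `length` bytes in which only the last byte
 * is written, then report its apparent size and its allocated size.
 * The file is removed again.  Returns 0 on success, -1 on error.
 */
static int sparse_sizes(const char *path, off_t length,
                        off_t *apparent, off_t *allocated)
{
    struct stat st;
    int fd;

    assert(length > 0);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        return -1;
    }
    /* Seek past the hole and write a single byte at the very end. */
    if (lseek(fd, length - 1, SEEK_SET) == (off_t)-1 ||
        write(fd, "", 1) != 1 ||
        fstat(fd, &st) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);
    unlink(path);

    *apparent = st.st_size;                  /* the file length */
    *allocated = (off_t)st.st_blocks * 512;  /* assumed 512-byte units */
    return 0;
}
```

On a file system with hole support, `apparent` comes back as the full length while `allocated` is typically one block or so; the exact number depends on the file system and on whether delayed allocation has happened yet.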

Stefan

[Qemu-devel] [PATCH v3] linux-user: Define AT_RANDOM to support target stack protection mechanism.

2011-06-20 Thread Cédric VINCENT

From: Laurent ALFONSI laurent.alfo...@st.com

Note that the support for the command-line argument requires:

 1. add the new field uint8_t rand_bytes[16] to struct
image_info since only the variable info lives both in
main() and in create_elf_tables()

 2. write a dedicated parser to convert the command-line to fill
rand_bytes[]

These two steps aren't really hard to achieve but I finally think they
are a little bit overkill regarding the purpose of these 16 bytes.
Maybe we could always fill the 16 bytes pointed to by AT_RANDOM with
zero if we really want to get reproducibility.

Regards,
Cédric.

8<8<8<8<8<8<8<8<8<8<8<8<

The dynamic linker from the GNU C library v2.10+ uses the ELF
auxiliary vector AT_RANDOM [1] as a pointer to 16 bytes with random
values to initialize the stack protection mechanism.  Technically the
emulated GNU dynamic linker crashes due to a NULL pointer
dereference if it is built with stack protection enabled and if
AT_RANDOM is not defined by the QEMU ELF loader.

[1] This ELF auxiliary vector was introduced in Linux v2.6.29.

This patch can be tested with the code below:

#include <elf.h>    /* Elf*_auxv_t, AT_RANDOM, */
#include <stdio.h>  /* printf(3), */
#include <stdlib.h> /* exit(3), EXIT_*, */
#include <stdint.h> /* uint8_t, uintptr_t, */
#include <string.h> /* memcpy(3), */

#if defined(__LP64__) || defined(__ILP64__) || defined(__LLP64__)
#define Elf_auxv_t Elf64_auxv_t
#else
#define Elf_auxv_t Elf32_auxv_t
#endif

int main(int argc, char* argv[], char* envp[])
{
    Elf_auxv_t *auxv;

    /* *envp = NULL marks end of envp. */
    while (*envp++ != NULL);

    /* auxv->a_type = AT_NULL marks the end of auxv. */
    for (auxv = (Elf_auxv_t *)envp; auxv->a_type != AT_NULL; auxv++) {
        if (auxv->a_type == AT_RANDOM) {
            int i;
            uint8_t rand_bytes[16];

            printf("AT_RANDOM is: 0x%lx\n", (unsigned long)auxv->a_un.a_val);
            memcpy(rand_bytes, (const uint8_t *)(uintptr_t)auxv->a_un.a_val,
                   sizeof(rand_bytes));
            printf("it points to: ");
            for (i = 0; i < 16; i++) {
                printf("0x%02x ", rand_bytes[i]);
            }
            printf("\n");
            exit(EXIT_SUCCESS);
        }
    }
    exit(EXIT_FAILURE);
}

Changes introduced in v2 and v3:

* Fix typos + thinko (AT_RANDOM is used for stack canary, not for
  ASLR)

* AT_RANDOM points to 16 random bytes stored inside the user
  stack.

* Add a small test program.
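
On guests with glibc 2.16 or later, the same check can probably be written more compactly with getauxval(3) instead of walking past envp by hand. A minimal sketch, assuming that function is available; read_at_random() is a hypothetical helper name, not part of this patch:

```c
#include <assert.h>
#include <elf.h>       /* AT_RANDOM */
#include <stdint.h>    /* uint8_t, uintptr_t */
#include <string.h>    /* memcpy(3) */
#include <sys/auxv.h>  /* getauxval(3), glibc 2.16+ */

/*
 * Copy the 16 bytes AT_RANDOM points to into `out`.
 * Returns 0 on success, -1 if the loader did not provide AT_RANDOM.
 */
static int read_at_random(uint8_t out[16])
{
    unsigned long addr = getauxval(AT_RANDOM);

    assert(out != NULL);
    if (addr == 0) {
        return -1;   /* entry missing from the auxiliary vector */
    }
    memcpy(out, (const void *)(uintptr_t)addr, 16);
    return 0;
}
```

A return value of -1 under linux-user would indicate that the ELF loader still does not define AT_RANDOM.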

Signed-off-by: Cédric VINCENT cedric.vinc...@st.com
Signed-off-by: Laurent ALFONSI laurent.alfo...@st.com
---
 linux-user/elfload.c |   21 ++++++++++++++++++++-
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index dcfeb7a..23c69d9 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -927,7 +927,7 @@ struct exec
 #define TARGET_ELF_PAGESTART(_v) ((_v) & ~(unsigned long)(TARGET_ELF_EXEC_PAGESIZE-1))
 #define TARGET_ELF_PAGEOFFSET(_v) ((_v) & (TARGET_ELF_EXEC_PAGESIZE-1))
 
-#define DLINFO_ITEMS 12
+#define DLINFO_ITEMS 13
 
 static inline void memcpy_fromfs(void * to, const void * from, unsigned long n)
 {
@@ -1202,6 +1202,9 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
 {
     abi_ulong sp;
     int size;
+    int i;
+    abi_ulong u_rand_bytes;
+    uint8_t k_rand_bytes[16];
     abi_ulong u_platform;
     const char *k_platform;
     const int n = sizeof(elf_addr_t);
@@ -1231,6 +1234,20 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
         /* FIXME - check return value of memcpy_to_target() for failure */
         memcpy_to_target(sp, k_platform, len);
     }
+
+    /*
+     * Generate 16 random bytes for userspace PRNG seeding (not
+     * cryptographically secure but it's not the aim of QEMU).
+     */
+    srand((unsigned int) time(NULL));
+    for (i = 0; i < 16; i++) {
+        k_rand_bytes[i] = rand();
+    }
+    sp -= 16;
+    u_rand_bytes = sp;
+    /* FIXME - check return value of memcpy_to_target() for failure */
+    memcpy_to_target(sp, k_rand_bytes, 16);
+
     /*
      * Force 16 byte _final_ alignment here for generality.
      */
@@ -1271,6 +1288,8 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
     NEW_AUX_ENT(AT_EGID, (abi_ulong) getegid());
     NEW_AUX_ENT(AT_HWCAP, (abi_ulong) ELF_HWCAP);
     NEW_AUX_ENT(AT_CLKTCK, (abi_ulong) sysconf(_SC_CLK_TCK));
+    NEW_AUX_ENT(AT_RANDOM, (abi_ulong) u_rand_bytes);
+
     if (k_platform)
         NEW_AUX_ENT(AT_PLATFORM, u_platform);
 #ifdef ARCH_DLINFO
-- 
1.7.5.1

Re: [Qemu-devel] [PATCH 1/3] kvm: ppc: booke206: use MMU API

2011-06-20 Thread Jan Kiszka

On 2011-06-18 01:28, Alexander Graf wrote:
 
> On 17.06.2011, at 22:39, Scott Wood wrote:
>
>> Share the TLB array with KVM.  This allows us to set the initial TLB
>> both on initial boot and reset, is useful for debugging, and could
>> eventually be used to support migration.
>>
>> Signed-off-by: Scott Wood scottw...@freescale.com
>> ---
>>  hw/ppce500_mpc8544ds.c |    2 +
>>  target-ppc/cpu.h       |    2 +
>>  target-ppc/kvm.c       |   85
>>  3 files changed, 89 insertions(+), 0 deletions(-)
>>
>> diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
>> index 5ac8843..3cdeb43 100644
>> --- a/hw/ppce500_mpc8544ds.c
>> +++ b/hw/ppce500_mpc8544ds.c
>> @@ -192,6 +192,8 @@ static void mmubooke_create_initial_mapping(CPUState *env,
>>      tlb->mas2 = va & TARGET_PAGE_MASK;
>>      tlb->mas7_3 = pa & TARGET_PAGE_MASK;
>>      tlb->mas7_3 |= MAS3_UR | MAS3_UW | MAS3_UX | MAS3_SR | MAS3_SW | MAS3_SX;
>> +
>> +    env->tlb_dirty = true;
>>  }
>>
>>  static void mpc8544ds_cpu_reset(void *opaque)
>> diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
>> index 46d86be..8191ed2 100644
>> --- a/target-ppc/cpu.h
>> +++ b/target-ppc/cpu.h
>> @@ -921,6 +921,8 @@ struct CPUPPCState {
>>      ppc_tlb_t tlb;   /* TLB is optional. Allocate them only if needed */
>>      /* 403 dedicated access protection registers */
>>      target_ulong pb[4];
>> +    bool tlb_dirty;   /* Set to non-zero when modifying TLB */
>> +    bool kvm_sw_tlb;  /* non-zero if KVM SW TLB API is active */
>>  #endif
>>
>>  /* Other registers */
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index e7b1b10..9a88fc9 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -122,6 +122,51 @@ static int kvm_arch_sync_sregs(CPUState *cenv)
>>      return kvm_vcpu_ioctl(cenv, KVM_SET_SREGS, &sregs);
>>  }
>>
>> +static int kvm_booke206_tlb_init(CPUState *env)
>> +{
>> +#if defined(KVM_CAP_SW_TLB) && defined(KVM_MMU_FSL_BOOKE_NOHV)
>
> Those hopefully shouldn't be required anymore soon - when Jan's patches make
> it upstream. Jan, how's progress on that front?

I can only forward this question: Avi, what are the plans for
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/73917?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] KVM call agenda for June 21

2011-06-20 Thread Juan Quintela


Please send in any agenda items you are interested in covering.

thanks,
-juan

Re: [Qemu-devel] [PATCH 1/3] kvm: ppc: booke206: use MMU API

2011-06-20 Thread Avi Kivity


On 06/20/2011 10:41 AM, Jan Kiszka wrote:


>> Those hopefully shouldn't be required anymore soon - when Jan's patches make
>> it upstream. Jan, how's progress on that front?
>
> I can only forward this question: Avi, what are the plans for
> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/73917?


Will apply once all comments are addressed.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] Optimize screendump

2011-06-20 Thread Avi Kivity


On 06/19/2011 08:00 PM, Alexander Graf wrote:

> On 19.06.2011, at 18:04, Avi Kivity wrote:
>
>> On 06/19/2011 06:53 PM, Andreas Färber wrote:
>>> On 19.06.2011 at 17:46, Avi Kivity wrote:
>>>
>>>> On 06/19/2011 06:22 PM, Stefan Hajnoczi wrote:
>>>>> I wonder if this will break non-Linux platforms.  Perhaps buffer an
>>>>> entire row of pixels instead and only fwrite(3) at the end of the
>>>>> outer loop.
>>>>
>>>> That's how I wrote this in the first place.  Since the consensus is
>>>> against these functions, I'll submit that version instead.
>>>
>>> Maybe add a qemu_fputc_unlocked() and do a configure check for it?
>>
>> Good idea.  I'll try that, unless people disagree.
>
> Writing by row should be faster and pretty straight forward, no?


I don't see how it's faster, but I guess I'll do that, it's a local 
issue and is best addressed locally.


--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH v2] Optimize screendump

2011-06-20 Thread Avi Kivity

When running kvm-autotest, fputc() is often the second highest (sometimes #1)
function showing up in a profile.  This is due to fputc() locking the file
for every byte written.

Optimize by buffering a line's worth of pixels and writing that out in a
single call.
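
For readers who want to try the idea outside QEMU, here is a standalone sketch of the row-buffered approach. It is not the patch itself: ppm_write_rows() is a hypothetical helper, the pixel format is assumed to be packed 0x00RRGGBB, plain malloc() stands in for qemu_malloc(), and unlike the patch it treats a short fwrite() as an error.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Write a width x height image of packed 0x00RRGGBB pixels as binary PPM
 * (P6), issuing one fwrite() per row instead of three fputc() per pixel.
 * Returns 0 on success, -1 on allocation or I/O failure.
 */
static int ppm_write_rows(FILE *f, const uint32_t *pixels, int width, int height)
{
    uint8_t *linebuf;
    int x, y, ret = 0;

    assert(width > 0 && height > 0);
    linebuf = malloc((size_t)width * 3);
    if (!linebuf) {
        return -1;
    }
    if (fprintf(f, "P6\n%d %d\n%d\n", width, height, 255) < 0) {
        ret = -1;
    }
    for (y = 0; y < height && ret == 0; y++) {
        uint8_t *p = linebuf;
        size_t len;

        for (x = 0; x < width; x++) {
            uint32_t v = pixels[(size_t)y * width + x];

            *p++ = (v >> 16) & 0xff;  /* red */
            *p++ = (v >> 8) & 0xff;   /* green */
            *p++ = v & 0xff;          /* blue */
        }
        /* One locked stdio call per row; a short write is an error. */
        len = (size_t)(p - linebuf);
        if (fwrite(linebuf, 1, len, f) != len) {
            ret = -1;
        }
    }
    free(linebuf);
    return ret;
}
```

The stream lock is then taken once per row rather than three times per pixel, which is the saving the profile points at.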

Signed-off-by: Avi Kivity a...@redhat.com
---

v2: drop unportable fputc_unlocked

 hw/vga.c |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/vga.c b/hw/vga.c
index d5bc582..97c96bf 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -2349,15 +2349,19 @@ int ppm_save(const char *filename, struct DisplaySurface *ds)
     uint32_t v;
     int y, x;
     uint8_t r, g, b;
+    int ret;
+    char *linebuf, *pbuf;
 
     f = fopen(filename, "wb");
     if (!f)
         return -1;
     fprintf(f, "P6\n%d %d\n%d\n",
             ds->width, ds->height, 255);
+    linebuf = qemu_malloc(ds->width * 3);
     d1 = ds->data;
     for(y = 0; y < ds->height; y++) {
         d = d1;
+        pbuf = linebuf;
         for(x = 0; x < ds->width; x++) {
             if (ds->pf.bits_per_pixel == 32)
                 v = *(uint32_t *)d;
@@ -2369,13 +2373,16 @@ int ppm_save(const char *filename, struct DisplaySurface *ds)
                 (ds->pf.gmax + 1);
             b = ((v >> ds->pf.bshift) & ds->pf.bmax) * 256 /
                 (ds->pf.bmax + 1);
-            fputc(r, f);
-            fputc(g, f);
-            fputc(b, f);
+            *pbuf++ = r;
+            *pbuf++ = g;
+            *pbuf++ = b;
             d += ds->pf.bytes_per_pixel;
         }
         d1 += ds->linesize;
+        ret = fwrite(linebuf, 1, pbuf - linebuf, f);
+        (void)ret;
     }
+    qemu_free(linebuf);
     fclose(f);
     return 0;
 }
-- 
1.7.5.3

[Qemu-devel] [PATCH] Support logging xen-guest console

2011-06-20 Thread Chunyan Liu

Add code to support logging xen-domU console, as what xenconsoled does. Log info
will be saved in /var/log/xen/console/guest-domUname.log.

Signed-off-by: Chunyan Liu cy...@novell.com
---
 hw/xen_console.c |   63 ++
 1 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/hw/xen_console.c b/hw/xen_console.c
index c6c8163..ac3208d 100644
--- a/hw/xen_console.c
+++ b/hw/xen_console.c
@@ -36,6 +36,8 @@
 #include "qemu-char.h"
 #include "xen_backend.h"
 
+static int log_guest = 0;
+
 struct buffer {
     uint8_t *data;
     size_t consumed;
@@ -52,8 +54,24 @@ struct XenConsole {
     void              *sring;
     CharDriverState   *chr;
     int               backlog;
+    int               log_fd;
 };
 
+static int write_all(int fd, const char* buf, size_t len)
+{
+    while (len) {
+        ssize_t ret = write(fd, buf, len);
+        if (ret == -1 && errno == EINTR)
+            continue;
+        if (ret <= 0)
+            return -1;
+        len -= ret;
+        buf += ret;
+    }
+
+    return 0;
+}
+
 static void buffer_append(struct XenConsole *con)
 {
     struct buffer *buffer = &con->buffer;
@@ -81,6 +99,14 @@ static void buffer_append(struct XenConsole *con)
     intf->out_cons = cons;
     xen_be_send_notify(&con->xendev);
 
+    if (con->log_fd != -1) {
+        int logret;
+        logret = write_all(con->log_fd, buffer->data + buffer->size - size, size);
+        if (logret < 0)
+            xen_be_printf(&con->xendev, 1, "Write to log failed on domain %d: %d (%s)\n",
+                      con->xendev.dom, errno, strerror(errno));
+     }
+
     if (buffer->max_capacity &&
         buffer->size > buffer->max_capacity) {
         /* Discard the middle of the data. */
@@ -174,12 +200,36 @@ static void xencons_send(struct XenConsole *con)
     }
 }
 
+static int create_domain_log(struct XenConsole *con)
+{
+    char *logfile;
+    char *path, *domname;
+    int fd;
+
+    path = xs_get_domain_path(xenstore, con->xendev.dom);
+    domname = xenstore_read_str(path, "name");
+    free(path);
+    if (!domname)
+        return -1;
+
+    asprintf(&logfile, "/var/log/xen/console/guest-%s.log", domname);
+    qemu_free(domname);
+
+    fd = open(logfile, O_WRONLY|O_CREAT|O_APPEND, 0644);
+    free(logfile);
+    if (fd == -1)
+        xen_be_printf(&con->xendev, 1,  "Failed to open log %s: %d (%s)", logfile, errno, strerror(errno));
+
+    return fd;
+}
+
 /*  */
 
 static int con_init(struct XenDevice *xendev)
 {
     struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
     char *type, *dom;
+    char *logenv = NULL;
 
     /* setup */
     dom = xs_get_domain_path(xenstore, con->xendev.dom);
@@ -198,6 +248,10 @@ static int con_init(struct XenDevice *xendev)
     else
         con->chr = serial_hds[con->xendev.dev];
 
+    logenv = getenv("XENCONSOLED_TRACE");
+    if (logenv != NULL && !strcmp(logenv, "guest")) {
+        log_guest = 1;
+    }
     return 0;
 }
 
@@ -230,6 +284,9 @@ static int con_connect(struct XenDevice *xendev)
                   con->xendev.remote_port,
                   con->xendev.local_port,
                   con->buffer.max_capacity);
+    con->log_fd = -1;
+    if (log_guest)
+         con->log_fd = create_domain_log(con);
     return 0;
 }
 
@@ -245,6 +302,12 @@ static void con_disconnect(struct XenDevice *xendev)
         munmap(con->sring, XC_PAGE_SIZE);
         con->sring = NULL;
     }
+
+    if (con->log_fd != -1) {
+        close(con->log_fd);
+        con->log_fd = -1;
+    }
+
 }
 
 static void con_event(struct XenDevice *xendev)
-- 
1.7.3.4
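
A note on create_domain_log(): as posted, logfile is freed before the failure message formats it, and the asprintf() result is not checked. A standalone sketch of one possible ordering follows. It is an illustration only: open_guest_log() is a hypothetical helper, the log directory is a parameter, snprintf() into a local buffer stands in for asprintf(), and fprintf(stderr, ...) stands in for xen_be_printf().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all `len` bytes, retrying on EINTR and on short writes. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len) {
        ssize_t ret = write(fd, buf, len);

        if (ret == -1 && errno == EINTR) {
            continue;
        }
        if (ret <= 0) {
            return -1;
        }
        len -= (size_t)ret;
        buf += ret;
    }
    return 0;
}

/*
 * Open <dir>/guest-<domname>.log for appending.  The path lives in a
 * local buffer, so it is still valid when the failure is reported.
 * Returns the descriptor, or -1 on error.
 */
static int open_guest_log(const char *dir, const char *domname)
{
    char logfile[4096];
    int n, fd;

    assert(dir != NULL && domname != NULL);
    n = snprintf(logfile, sizeof(logfile), "%s/guest-%s.log", dir, domname);
    if (n < 0 || (size_t)n >= sizeof(logfile)) {
        return -1;   /* formatting failed or path truncated */
    }
    fd = open(logfile, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1) {
        fprintf(stderr, "Failed to open log %s: %d (%s)\n",
                logfile, errno, strerror(errno));
    }
    return fd;
}
```

With a heap-allocated path the equivalent change would be to report the error first and free the string afterwards.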

[Qemu-devel] [PULL] Xen Patch Queue

2011-06-20 Thread Alexander Graf

Hi Anthony,

This is my current patch queue for Xen patches.

Please pull.


Alex

The following changes since commit eb47d7c5d96060040931c42773ee07e61e547af9:
  Peter Maydell (1):
hw/9118.c: Implement active-low interrupt support

are available in the git repository at:

  git://repo.or.cz/qemu/agraf.git xen-next

Anthony PERARD (2):
  xen: Add xc_domain_add_to_physmap to xen_interface.
  xen: Introduce VGA sync dirty bitmap support

Stefano Stabellini (8):
  xen: fix qemu_map_cache with size != MCACHE_BUCKET_SIZE
  xen: remove qemu_map_cache_unlock
  xen: remove xen_map_block and xen_unmap_block
  exec.c: refactor cpu_physical_memory_map
  xen: mapcache performance improvements
  cirrus_vga: reset lfb_addr after a pci config write if the BAR is unmapped
  xen: only track the linear framebuffer
  xen: fix interrupt routing

Steven Smith (1):
  xen: Add the Xen platform pci device

 Makefile.target |2 +
 configure   |   29 -
 cpu-common.h|1 +
 exec.c  |   88 +++---
 hw/cirrus_vga.c |5 +-
 hw/hw.h |3 +
 hw/pc.h |1 -
 hw/pc_piix.c|   10 +-
 hw/pci_ids.h|2 +
 hw/piix_pci.c   |   66 +-
 hw/xen_common.h |   14 ++
 hw/xen_platform.c   |  340 +++
 trace-events|4 +
 xen-all.c   |  281 ++
 xen-mapcache-stub.c |8 --
 xen-mapcache.c  |  141 ++
 xen-mapcache.h  |   16 ---
 17 files changed, 826 insertions(+), 185 deletions(-)
 create mode 100644 hw/xen_platform.c

Re: [Qemu-devel] [PATCH 3/3] xen: implement unplug protocol in xen_platform

2011-06-20 Thread Michael S. Tsirkin

On Thu, Jun 16, 2011 at 05:05:19PM +0100, stefano.stabell...@eu.citrix.com wrote:
> From: Stefano Stabellini stefano.stabell...@eu.citrix.com
>
> The unplug protocol is necessary to support PV drivers in the guest: the
> drivers expect to be able to unplug emulated disks and nics before
> initializing the Xen PV interfaces.
> It is responsibility of the guest to make sure that the unplug is done
> before the emulated devices or the PV interface start to be used.
>
> We use pci_for_each_device to walk the PCI bus, identify the devices and
> disks that we want to disable and dynamically unplug them.
>
> Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
> ---
>  hw/xen_platform.c |   63
>  1 files changed, 62 insertions(+), 1 deletions(-)
>
> diff --git a/hw/xen_platform.c b/hw/xen_platform.c
> index b167eee..9f8c843 100644
> --- a/hw/xen_platform.c
> +++ b/hw/xen_platform.c
> @@ -34,6 +34,9 @@
>  #include "xen_backend.h"
>  #include "rwhandler.h"
>  #include "trace.h"
> +#include "hw/ide/internal.h"

I'm not an expert here but it looks like
you should put some code in hw/ide/xen.c
and export an API from there rather
than calling ide_bus_reset and tweaking
PCIIDEState directly.

> +#include "hw/ide/pci.h"
> +#include "hw/pci_ids.h"
>
>  #include <xenguest.h>
>
> @@ -76,6 +79,54 @@ static void log_writeb(PCIXenPlatformState *s, char val)
>  }
>
>  /* Xen Platform, Fixed IOPort */
> +#define UNPLUG_ALL_IDE_DISKS 1
> +#define UNPLUG_ALL_NICS 2
> +#define UNPLUG_AUX_IDE_DISKS 4
> +
> +static int unplug_param;
> +
> +static void unplug_nic(PCIBus *b, PCIDevice *d)
> +{
> +    if (d->config[0xa] == 0 && d->config[0xb] == 2) {

Please use registers from pci_regs.h and pci_ids.h

> +        pci_unplug_device(&(d->qdev));

Can't you use qdev_unplug?
That does other useful checks and updates system state.
Also, are there non hotpluggable devices?
If not you can assert on qdev_unplug failure.

> +    }
> +}
> +
> +static void pci_unplug_nics(PCIBus *bus)
> +{
> +    pci_for_each_device(bus, 0, unplug_nic);
> +}
> +
> +static void unplug_disks(PCIBus *b, PCIDevice *d)
> +{
> +    if (d->config[0xa] == 1 && d->config[0xb] == 1) {

Same comment about hardcoded constants.

> +        PCIIDEState *pci_ide = DO_UPCAST(PCIIDEState, dev, d);
> +        DriveInfo *di;
> +        int i = 0;
> +
> +        if (unplug_param & UNPLUG_AUX_IDE_DISKS)
> +            i++;
> +
> +        for (; i < 3; i++) {
> +            di = drive_get_by_index(IF_IDE, i);
> +            if (di != NULL && di->bdrv != NULL && di->bdrv->type != BDRV_TYPE_CDROM) {

line too long

> +                DeviceState *ds = bdrv_get_attached(di->bdrv);
> +                if (ds)
> +                    bdrv_detach(di->bdrv, ds);
> +                bdrv_close(di->bdrv);
> +                pci_ide->bus[di->bus].ifs[di->unit].bs = NULL;
> +                drive_put_ref(di);
> +            }
> +        }
> +        ide_bus_reset(&pci_ide->bus[0]);
> +        ide_bus_reset(&pci_ide->bus[1]);
> +    }
> +}
> +
> +static void pci_unplug_disks(PCIBus *bus)
> +{
> +    pci_for_each_device(bus, 0, unplug_disks);
> +}
>
>  static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t val)
>  {
> @@ -83,10 +134,20 @@ static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t v
>
>      switch (addr - XEN_PLATFORM_IOPORT) {
>      case 0:
> -        /* TODO: */
> +        unplug_param = val;
>          /* Unplug devices.  Value is a bitmask of which devices to
>             unplug, with bit 0 the IDE devices, bit 1 the network
>             devices, and bit 2 the non-primary-master IDE devices. */
> +        if (val & UNPLUG_ALL_IDE_DISKS || val & UNPLUG_AUX_IDE_DISKS) {
> +            DPRINTF("unplug disks\n");
> +            qemu_aio_flush();
> +            bdrv_flush_all();
> +            pci_unplug_disks(s->pci_dev.bus);
> +        }
> +        if (val & UNPLUG_ALL_NICS) {
> +            DPRINTF("unplug nics\n");
> +            pci_unplug_nics(s->pci_dev.bus);
> +        }
>          break;
>      case 2:
>          switch (val) {
> --
> 1.7.2.3
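
As an aside, the unplug bitmask could be decoded in one place instead of being stashed in the global unplug_param. A small sketch under that assumption; struct unplug_request and decode_unplug() are hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UNPLUG_ALL_IDE_DISKS 1   /* bit 0 */
#define UNPLUG_ALL_NICS      2   /* bit 1 */
#define UNPLUG_AUX_IDE_DISKS 4   /* bit 2 */

struct unplug_request {
    bool disks;       /* some IDE disks have to be unplugged */
    bool nics;        /* all emulated NICs have to be unplugged */
    int first_disk;   /* index of the first IDE drive to unplug */
};

/* Decode the value the guest wrote to the fixed I/O port. */
static struct unplug_request decode_unplug(uint32_t val)
{
    struct unplug_request req;

    req.disks = (val & (UNPLUG_ALL_IDE_DISKS | UNPLUG_AUX_IDE_DISKS)) != 0;
    req.nics = (val & UNPLUG_ALL_NICS) != 0;
    /* As in the quoted loop: bit 2 skips index 0, the primary master. */
    req.first_disk = (val & UNPLUG_AUX_IDE_DISKS) ? 1 : 0;
    return req;
}
```

The result could then be handed to the bus walker as an argument.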

Re: [Qemu-devel] [PATCH 2/3] pci: export pci_unplug_device

2011-06-20 Thread Michael S. Tsirkin

On Thu, Jun 16, 2011 at 05:05:18PM +0100, stefano.stabell...@eu.citrix.com wrote:
> From: Stefano Stabellini stefano.stabell...@eu.citrix.com
>
> pci_unplug_device is needed by the xen_platform device to perform dynamic
> nic unplug.
>
> Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

I think it's better to go through qdev, pci_unplug_device
was intended as an internal API.

> ---
>  hw/pci.c |    2 +-
>  hw/pci.h |    1 +
>  2 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/hw/pci.c b/hw/pci.c
> index 1d297d6..679e976 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1692,7 +1692,7 @@ static int pci_qdev_init(DeviceState *qdev, DeviceInfo *base)
>      return 0;
>  }
>
> -static int pci_unplug_device(DeviceState *qdev)
> +int pci_unplug_device(DeviceState *qdev)
>  {
>      PCIDevice *dev = DO_UPCAST(PCIDevice, qdev, qdev);
>      PCIDeviceInfo *info = container_of(qdev->info, PCIDeviceInfo, qdev);
> diff --git a/hw/pci.h b/hw/pci.h
> index 0d288ce..868f793 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -452,6 +452,7 @@ typedef struct {
>
>  void pci_qdev_register(PCIDeviceInfo *info);
>  void pci_qdev_register_many(PCIDeviceInfo *info);
> +int pci_unplug_device(DeviceState *qdev);
>
>  PCIDevice *pci_create_multifunction(PCIBus *bus, int devfn, bool multifunction,
>                                      const char *name);
> --
> 1.7.2.3

Re: [Qemu-devel] [PATCH 1/3] kvm: ppc: booke206: use MMU API

2011-06-20 Thread Jan Kiszka

On 2011-06-20 10:03, Avi Kivity wrote:
> On 06/20/2011 10:41 AM, Jan Kiszka wrote:
>>
>>> Those hopefully shouldn't be required anymore soon - when Jan's patches
>>> make it upstream. Jan, how's progress on that front?
>>
>> I can only forward this question: Avi, what are the plans for
>> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/73917?
>
> Will apply once all comments are addressed.

Well, then go ahead :) - or did I miss a comment?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 3/3] xen: implement unplug protocol in xen_platform

2011-06-20 Thread Kevin Wolf

On 20.06.2011 10:28, Alexander Graf wrote:
>
> On 16.06.2011, at 18:05, stefano.stabell...@eu.citrix.com stefano.stabell...@eu.citrix.com wrote:
>
>> From: Stefano Stabellini stefano.stabell...@eu.citrix.com
>>
>> The unplug protocol is necessary to support PV drivers in the guest: the
>> drivers expect to be able to unplug emulated disks and nics before
>> initializing the Xen PV interfaces.
>> It is responsibility of the guest to make sure that the unplug is done
>> before the emulated devices or the PV interface start to be used.
>>
>> We use pci_for_each_device to walk the PCI bus, identify the devices and
>> disks that we want to disable and dynamically unplug them.
>
> Kevin, please check the block parts of this code.
> Michael, please check the PCI parts of this code.
>
> Thanks :)
>
> Alex
>
>>
>> Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
>> ---
>>  hw/xen_platform.c |   63
>>  1 files changed, 62 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/xen_platform.c b/hw/xen_platform.c
>> index b167eee..9f8c843 100644
>> --- a/hw/xen_platform.c
>> +++ b/hw/xen_platform.c
>> @@ -34,6 +34,9 @@
>>  #include "xen_backend.h"
>>  #include "rwhandler.h"
>>  #include "trace.h"
>> +#include "hw/ide/internal.h"

Sorry, no. :-)

This is not using a proper interface, but just a hack that depends on
the internal structure of the IDE emulation. It's going to break sooner
or later.

It seems your problem is that IDE isn't unpluggable. I'm not entirely
sure what the right solution is, maybe just adding a new xen-ide device
that is used for the Xen machine and closely resembles piix4-ide, but
can be hot-unplugged.

Kevin

>> +#include "hw/ide/pci.h"
>> +#include "hw/pci_ids.h"
>>
>>  #include <xenguest.h>
>>
>> @@ -76,6 +79,54 @@ static void log_writeb(PCIXenPlatformState *s, char val)
>>  }
>>
>>  /* Xen Platform, Fixed IOPort */
>> +#define UNPLUG_ALL_IDE_DISKS 1
>> +#define UNPLUG_ALL_NICS 2
>> +#define UNPLUG_AUX_IDE_DISKS 4
>> +
>> +static int unplug_param;
>> +
>> +static void unplug_nic(PCIBus *b, PCIDevice *d)
>> +{
>> +    if (d->config[0xa] == 0 && d->config[0xb] == 2) {
>> +        pci_unplug_device(&(d->qdev));
>> +    }
>> +}
>> +
>> +static void pci_unplug_nics(PCIBus *bus)
>> +{
>> +    pci_for_each_device(bus, 0, unplug_nic);
>> +}
>> +
>> +static void unplug_disks(PCIBus *b, PCIDevice *d)
>> +{
>> +    if (d->config[0xa] == 1 && d->config[0xb] == 1) {
>> +        PCIIDEState *pci_ide = DO_UPCAST(PCIIDEState, dev, d);
>> +        DriveInfo *di;
>> +        int i = 0;
>> +
>> +        if (unplug_param & UNPLUG_AUX_IDE_DISKS)
>> +            i++;
>> +
>> +        for (; i < 3; i++) {
>> +            di = drive_get_by_index(IF_IDE, i);
>> +            if (di != NULL && di->bdrv != NULL && di->bdrv->type != BDRV_TYPE_CDROM) {
>> +                DeviceState *ds = bdrv_get_attached(di->bdrv);
>> +                if (ds)
>> +                    bdrv_detach(di->bdrv, ds);
>> +                bdrv_close(di->bdrv);
>> +                pci_ide->bus[di->bus].ifs[di->unit].bs = NULL;
>> +                drive_put_ref(di);
>> +            }
>> +        }
>> +        ide_bus_reset(&pci_ide->bus[0]);
>> +        ide_bus_reset(&pci_ide->bus[1]);
>> +    }
>> +}
>> +
>> +static void pci_unplug_disks(PCIBus *bus)
>> +{
>> +    pci_for_each_device(bus, 0, unplug_disks);
>> +}
>>
>>  static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t val)
>>  {
>> @@ -83,10 +134,20 @@ static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t v
>>
>>      switch (addr - XEN_PLATFORM_IOPORT) {
>>      case 0:
>> -        /* TODO: */
>> +        unplug_param = val;
>>          /* Unplug devices.  Value is a bitmask of which devices to
>>             unplug, with bit 0 the IDE devices, bit 1 the network
>>             devices, and bit 2 the non-primary-master IDE devices. */
>> +        if (val & UNPLUG_ALL_IDE_DISKS || val & UNPLUG_AUX_IDE_DISKS) {
>> +            DPRINTF("unplug disks\n");
>> +            qemu_aio_flush();
>> +            bdrv_flush_all();
>> +            pci_unplug_disks(s->pci_dev.bus);
>> +        }
>> +        if (val & UNPLUG_ALL_NICS) {
>> +            DPRINTF("unplug nics\n");
>> +            pci_unplug_nics(s->pci_dev.bus);
>> +        }
>>          break;
>>      case 2:
>>          switch (val) {
>> --
>> 1.7.2.3

Re: [Qemu-devel] [PATCH] do not send packet to nic if the packet will be dropped by nic

2011-06-20 Thread Kevin Wolf

On 17.06.2011 03:33, Wen Congyang wrote:
> If !s->clock_enabled or !rtl8139_receiver_enabled(s), it means that
> the nic will drop all packets from host. So qemu will keep getting
> packets from host and wasting CPU on dropping packets. This seems
> worse than packets that should be dropped but aren't.
>
> Signed-off-by: Wen Congyang we...@cn.fujitsu.com

Which bug does this change fix? I'm still not convinced that we should
do it.

> ---
>  hw/rtl8139.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/rtl8139.c b/hw/rtl8139.c
> index 2f8db58..9084678 100644
> --- a/hw/rtl8139.c
> +++ b/hw/rtl8139.c
> @@ -810,9 +810,9 @@ static int rtl8139_can_receive(VLANClientState *nc)
>
>      /* Receive (drop) packets if card is disabled.  */

This comment isn't accurate any more after applying the patch.

>      if (!s->clock_enabled)
> -      return 1;
> +      return 0;
>      if (!rtl8139_receiver_enabled(s))
> -      return 1;
> +      return 0;
>
>      if (rtl8139_cp_receiver_enabled(s)) {
>          /* ??? Flow control not implemented in c+ mode.
> -- 1.7.1

Kevin

Re: [Qemu-devel] [PATCH 1/3] kvm: ppc: booke206: use MMU API

2011-06-20 Thread Avi Kivity


On 06/20/2011 11:47 AM, Jan Kiszka wrote:

> On 2011-06-20 10:03, Avi Kivity wrote:
>> On 06/20/2011 10:41 AM, Jan Kiszka wrote:
>>>
>>>> Those hopefully shouldn't be required anymore soon - when Jan's patches
>>>> make it upstream. Jan, how's progress on that front?
>>>
>>> I can only forward this question: Avi, what are the plans for
>>> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/73917?
>>
>> Will apply once all comments are addressed.
>
> Well, then go ahead :) - or did I miss a comment?


If everyone's happy I (or rather Marcelo this week) will be happy to apply.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] vmstate: Add unmigratable flag

2011-06-20 Thread Jan Kiszka

On 2011-06-19 22:46, Cam Macdonell wrote:
> On Thu, Jun 9, 2011 at 2:39 PM, Jan Kiszka jan.kis...@web.de wrote:
>> On 2011-06-09 22:00, Anthony Liguori wrote:
>>> On 06/09/2011 11:44 AM, Jan Kiszka wrote:
>>>> A first step towards getting rid of register_device_unmigratable
>>>> (ivshmem and lacking vmstate support in virtio are blocking this):
>>>>
>>>> Allow to register an unmigratable vmstate via qdev, i.e. tag a device
>>>> declaratively.
>>>
>>> I thought part of the problem with this was that for some devices (like
>>> ivshmem), whether it can be migrated was dynamic.  It depends on
>>> configuration, state, etc.
>>
>> That only applies to ivshmem (the other user is device assignment which
>> is unconditionally unmigratable). And the ivshmem issue could easily be
>> solved by defining two devices, ivshmem-peer (or just ivshmem) and
>> ivshmem-master, eliminating the need for the role property.
>>
>> I don't think there will ever be a use case for a transformer device
>> that becomes unmigratable during runtime (would be a nightmare for
>> management apps anyway).
>>
>> If breaking the user interface of ivshmem for this is OK, I'll post a patch.
>>
>> Jan
>>
>
> The migratability of ivshmem is not dynamic in that it doesn't change
> at runtime, it's set when the device is created, either role=peer or
> role=master is specified.  So iiuc, this could work with ivshmem.

So you are fine with breaking the interface? Everyone else as well? Then
I'll cook a patch to sort at least this out for 0.15.
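
To make the idea concrete, here is a deliberately simplified, standalone model of a declarative flag. None of it is the actual qdev or vmstate code: struct vmstate_desc and find_migration_blocker() are hypothetical, and it assumes the peer role is the one that blocks migration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a device's vmstate description. */
struct vmstate_desc {
    const char *name;
    bool unmigratable;   /* fixed in the static description */
};

/* Splitting ivshmem by role gives each device a fixed answer
 * (assumption: only the peer role blocks migration). */
static const struct vmstate_desc ivshmem_master = { "ivshmem-master", false };
static const struct vmstate_desc ivshmem_peer   = { "ivshmem",        true  };

/* Return the first device that blocks migration, or NULL if none does. */
static const struct vmstate_desc *
find_migration_blocker(const struct vmstate_desc *const *devs, size_t n)
{
    size_t i;

    assert(devs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (devs[i]->unmigratable) {
            return devs[i];
        }
    }
    return NULL;
}
```

With the flag in the description, the migration code only has to look at static data and no device needs to register itself as unmigratable at init time.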

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Extracting TCG

2011-06-20 Thread Mathieu SUEN

Hi All,

TCG looks very interesting for generating machine code.
Is there a way to extract it as a standalone library in order to use
it for a JIT compiler?

Thanks

--
Mathieu

[Qemu-devel] [PATCH 08/11] cirrus_vga: reset lfb_addr after a pci config write if the BAR is unmapped

2011-06-20 Thread Alexander Graf

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

If the cirrus_vga PCI BAR is unmapped then we should not only reset
map_addr but also lfb_addr, otherwise we'll keep trying to map
the old lfb_addr in map_linear_vram.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Acked-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/cirrus_vga.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
index 722cac7..3c5043e 100644
--- a/hw/cirrus_vga.c
+++ b/hw/cirrus_vga.c
@@ -3088,8 +3088,11 @@ static void pci_cirrus_write_config(PCIDevice *d,
     CirrusVGAState *s = &pvs->cirrus_vga;
 
     pci_default_write_config(d, address, val, len);
-    if (s->vga.map_addr && d->io_regions[0].addr == PCI_BAR_UNMAPPED)
+    if (s->vga.map_addr && d->io_regions[0].addr == PCI_BAR_UNMAPPED) {
         s->vga.map_addr = 0;
+        s->vga.lfb_addr = 0;
+        s->vga.lfb_end = 0;
+    }
     cirrus_update_memory_access(s);
 }
 
-- 
1.6.0.2

[Qemu-devel] [PATCH 09/11] xen: only track the linear framebuffer

2011-06-20 Thread Alexander Graf

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Xen can only do dirty bit tracking for one memory region, so we should
explicitly avoid trying to track anything but the vga vram region.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 xen-all.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index 75a82c2..fe75ddd 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -215,6 +215,7 @@ static int xen_add_to_physmap(XenIOState *state,
 int rc = 0;
 XenPhysmap *physmap = NULL;
 target_phys_addr_t pfn, start_gpfn;
+RAMBlock *block;
 
 if (get_physmapping(state, start_addr, size)) {
 return 0;
@@ -223,6 +224,19 @@ static int xen_add_to_physmap(XenIOState *state,
 return -1;
 }
 
+/* Xen can only handle a single dirty log region for now and we want
+ * the linear framebuffer to be that region.
+ * Avoid tracking any regions that is not videoram and avoid tracking
+ * the legacy vga region. */
+QLIST_FOREACH(block, ram_list.blocks, next) {
+if (!strcmp(block-idstr, vga.vram)  block-offset == phys_offset
+ start_addr  0xb) {
+goto go_physmap;
+}
+}
+return -1;
+
+go_physmap:
 DPRINTF(mapping vram to %llx - %llx, from %llx\n,
 start_addr, start_addr + size, phys_offset);
 
-- 
1.6.0.2

[Qemu-devel] [PATCH 06/11] exec.c: refactor cpu_physical_memory_map

2011-06-20 Thread Alexander Graf

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Introduce qemu_ram_ptr_length that takes an address and a size as
parameters rather than just an address.

Refactor cpu_physical_memory_map so that we call qemu_ram_ptr_length only
once rather than calling qemu_get_ram_ptr one time per page.
This is not only more efficient but also tries to simplify the logic of
the function.
Currently we are relying on the fact that all the pages are mapped
contiguously in qemu's address space: we have a check to make sure that
the virtual address returned by qemu_get_ram_ptr from the second call on
is consecutive. Now we are making this more explicit replacing all the
calls to qemu_get_ram_ptr with a single call to qemu_ram_ptr_length
passing a size argument.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
CC: ag...@suse.de
CC: anth...@codemonkey.ws
Signed-off-by: Alexander Graf ag...@suse.de
---
 cpu-common.h |1 +
 exec.c   |   51 ++-
 2 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index 9f59172..b027e43 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -65,6 +65,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr);
 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
 /* This should only be used for ram local to a device.  */
 void *qemu_get_ram_ptr(ram_addr_t addr);
+void *qemu_ram_ptr_length(target_phys_addr_t addr, target_phys_addr_t *size);
 /* Same but slower, to use for migration, where the order of
  * RAMBlocks must not change. */
 void *qemu_safe_ram_ptr(ram_addr_t addr);
diff --git a/exec.c b/exec.c
index e11c1dd..238c173 100644
--- a/exec.c
+++ b/exec.c
@@ -3131,6 +3131,31 @@ void *qemu_safe_ram_ptr(ram_addr_t addr)
 return NULL;
 }
 
+/* Return a host pointer to guest's ram. Similar to qemu_get_ram_ptr
+ * but takes a size argument */
+void *qemu_ram_ptr_length(target_phys_addr_t addr, target_phys_addr_t *size)
+{
+    if (xen_mapcache_enabled())
+        return qemu_map_cache(addr, *size, 1);
+    else {
+        RAMBlock *block;
+
+        QLIST_FOREACH(block, &ram_list.blocks, next) {
+            if (addr - block->offset < block->length) {
+                if (addr - block->offset + *size > block->length)
+                    *size = block->length - addr + block->offset;
+                return block->host + (addr - block->offset);
+            }
+        }
+
+        fprintf(stderr, "Bad ram offset %" PRIx64 "\n", (uint64_t)addr);
+        abort();
+
+        *size = 0;
+        return NULL;
+    }
+}
+
 void qemu_put_ram_ptr(void *addr)
 {
 trace_qemu_put_ram_ptr(addr);
@@ -3992,14 +4017,12 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
   int is_write)
 {
 target_phys_addr_t len = *plen;
-target_phys_addr_t done = 0;
+target_phys_addr_t todo = 0;
 int l;
-uint8_t *ret = NULL;
-uint8_t *ptr;
 target_phys_addr_t page;
 unsigned long pd;
 PhysPageDesc *p;
-unsigned long addr1;
+target_phys_addr_t addr1 = addr;
 
 while (len  0) {
 page = addr  TARGET_PAGE_MASK;
@@ -4014,7 +4037,7 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
 }
 
 if ((pd  ~TARGET_PAGE_MASK) != IO_MEM_RAM) {
-if (done || bounce.buffer) {
+if (todo || bounce.buffer) {
 break;
 }
 bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, TARGET_PAGE_SIZE);
@@ -4023,23 +4046,17 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
 if (!is_write) {
 cpu_physical_memory_read(addr, bounce.buffer, l);
 }
-ptr = bounce.buffer;
-} else {
-addr1 = (pd  TARGET_PAGE_MASK) + (addr  ~TARGET_PAGE_MASK);
-ptr = qemu_get_ram_ptr(addr1);
-}
-if (!done) {
-ret = ptr;
-} else if (ret + done != ptr) {
-break;
+
+*plen = l;
+return bounce.buffer;
 }
 
 len -= l;
 addr += l;
-done += l;
+todo += l;
 }
-*plen = done;
-return ret;
+*plen = todo;
+return qemu_ram_ptr_length(addr1, plen);
 }
 
 /* Unmaps a memory region previously mapped by cpu_physical_memory_map().
-- 
1.6.0.2
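
The length clamping described in the commit message above (one lookup that
takes an address and a size, rather than one lookup per page) can be
sketched outside of QEMU as follows. This is only an illustration under
stated assumptions: DemoBlock and demo_ram_ptr_length are hypothetical
names, a plain array stands in for QEMU's RAM block list, and the Xen
mapcache path is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block; only the fields used here. */
typedef struct DemoBlock {
    uint64_t offset;   /* guest RAM offset where the block starts */
    uint64_t length;   /* block length in bytes */
    uint8_t *host;     /* host memory backing the block */
} DemoBlock;

/*
 * Find the block that contains addr and clamp *size so that the returned
 * range does not run past the end of that block.  Returns the host
 * pointer, or NULL with *size set to 0 when no block contains addr.
 */
static uint8_t *demo_ram_ptr_length(const DemoBlock *blocks, size_t nblocks,
                                    uint64_t addr, uint64_t *size)
{
    assert(size != NULL);

    for (size_t i = 0; i < nblocks; i++) {
        const DemoBlock *b = &blocks[i];

        /* Unsigned subtraction wraps when addr < offset, so this single
         * comparison checks offset <= addr < offset + length. */
        if (addr - b->offset < b->length) {
            uint64_t off = addr - b->offset;

            if (*size > b->length - off) {
                *size = b->length - off;   /* clamp to the end of the block */
            }
            return b->host + off;
        }
    }

    *size = 0;
    return NULL;
}
```

A caller passes the requested length in *size and must re-read it
afterwards, since the usable contiguous length may be shorter than what
was asked for.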

[Qemu-devel] [PATCH 04/11] xen: remove qemu_map_cache_unlock

2011-06-20 Thread Alexander Graf

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

There is no need for qemu_map_cache_unlock, just use
qemu_invalidate_entry instead.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 exec.c  |2 +-
 xen-mapcache-stub.c |4 
 xen-mapcache.c  |   33 -
 xen-mapcache.h  |1 -
 4 files changed, 1 insertions(+), 39 deletions(-)

diff --git a/exec.c b/exec.c
index 09928a3..01f33bb 100644
--- a/exec.c
+++ b/exec.c
@@ -3146,7 +3146,7 @@ void qemu_put_ram_ptr(void *addr)
             xen_unmap_block(block->host, block->length);
             block->host = NULL;
         } else {
-            qemu_map_cache_unlock(addr);
+            qemu_invalidate_entry(addr);
         }
     }
 }
diff --git a/xen-mapcache-stub.c b/xen-mapcache-stub.c
index 7c14b3d..60f712b 100644
--- a/xen-mapcache-stub.c
+++ b/xen-mapcache-stub.c
@@ -22,10 +22,6 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, 
target_phys_addr_t size, u
 return qemu_get_ram_ptr(phys_addr);
 }
 
-void qemu_map_cache_unlock(void *buffer)
-{
-}
-
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)
 {
 return -1;
diff --git a/xen-mapcache.c b/xen-mapcache.c
index 90fbd49..57fe24d 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -230,39 +230,6 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, 
target_phys_addr_t size, u
 return mapcache-last_address_vaddr + address_offset;
 }
 
-void qemu_map_cache_unlock(void *buffer)
-{
-    MapCacheEntry *entry = NULL, *pentry = NULL;
-    MapCacheRev *reventry;
-    target_phys_addr_t paddr_index;
-    int found = 0;
-
-    QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
-        if (reventry->vaddr_req == buffer) {
-            paddr_index = reventry->paddr_index;
-            found = 1;
-            break;
-        }
-    }
-    if (!found) {
-        return;
-    }
-    QTAILQ_REMOVE(&mapcache->locked_entries, reventry, next);
-    qemu_free(reventry);
-
-    entry = &mapcache->entry[paddr_index % mapcache->nr_buckets];
-    while (entry && entry->paddr_index != paddr_index) {
-        pentry = entry;
-        entry = entry->next;
-    }
-    if (!entry) {
-        return;
-    }
-    if (entry->lock > 0) {
-        entry->lock--;
-    }
-}
-
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)
 {
 MapCacheEntry *entry = NULL, *pentry = NULL;
diff --git a/xen-mapcache.h b/xen-mapcache.h
index 339444c..b89b8f9 100644
--- a/xen-mapcache.h
+++ b/xen-mapcache.h
@@ -14,7 +14,6 @@
 
 void qemu_map_cache_init(void);
 uint8_t  *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t 
size, uint8_t lock);
-void qemu_map_cache_unlock(void *phys_addr);
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr);
 void qemu_invalidate_entry(uint8_t *buffer);
 void qemu_invalidate_map_cache(void);
-- 
1.6.0.2

[Qemu-devel] [PATCH 01/11] xen: Add xc_domain_add_to_physmap to xen_interface.

2011-06-20 Thread Alexander Graf

From: Anthony PERARD anthony.per...@citrix.com

This function will be used to support sync dirty bitmap.

This comes with a check against every Xen release, and a special
implementation for Xen versions that do not have this specific call.

This function will not be usable with Xen 3.3 because the behavior is
different.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 configure   |   29 -
 hw/xen_common.h |   14 ++
 2 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 44c092a..b63b49f 100755
--- a/configure
+++ b/configure
@@ -1210,6 +1210,7 @@ int main(void) {
   xc = xc_interface_open(0, 0, 0);
   xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
   xc_gnttab_open(NULL, 0);
+  xc_domain_add_to_physmap(0, 0, XENMAPSPACE_gmfn, 0, 0);
   return 0;
 }
 EOF
@@ -1228,10 +1229,14 @@ EOF
 # error HVM_MAX_VCPUS not defined
 #endif
 int main(void) {
+  struct xen_add_to_physmap xatp = {
+.domid = 0, .space = XENMAPSPACE_gmfn, .idx = 0, .gpfn = 0,
+  };
   xs_daemon_open();
   xc_interface_open();
   xc_gnttab_open();
   xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
+  xc_memory_op(0, XENMEM_add_to_physmap, xatp);
   return 0;
 }
 EOF
@@ -1240,7 +1245,29 @@ EOF
 xen_ctrl_version=400
 xen=yes
 
-  # Xen 3.3.0, 3.4.0
+  # Xen 3.4.0
+  elif (
+  cat  $TMPC EOF
+#include xenctrl.h
+#include xs.h
+int main(void) {
+  struct xen_add_to_physmap xatp = {
+.domid = 0, .space = XENMAPSPACE_gmfn, .idx = 0, .gpfn = 0,
+  };
+  xs_daemon_open();
+  xc_interface_open();
+  xc_gnttab_open();
+  xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
+  xc_memory_op(0, XENMEM_add_to_physmap, xatp);
+  return 0;
+}
+EOF
+  compile_prog  $xen_libs
+) ; then
+xen_ctrl_version=340
+xen=yes
+
+  # Xen 3.3.0
   elif (
   cat  $TMPC EOF
 #include xenctrl.h
diff --git a/hw/xen_common.h b/hw/xen_common.h
index a1958a0..2c79af6 100644
--- a/hw/xen_common.h
+++ b/hw/xen_common.h
@@ -71,6 +71,20 @@ static inline int xc_domain_populate_physmap_exact
 (xc_handle, domid, nr_extents, extent_order, mem_flags, extent_start);
 }
 
+static inline int xc_domain_add_to_physmap(int xc_handle, uint32_t domid,
+                                           unsigned int space, unsigned long idx,
+                                           xen_pfn_t gpfn)
+{
+    struct xen_add_to_physmap xatp = {
+        .domid = domid,
+        .space = space,
+        .idx = idx,
+        .gpfn = gpfn,
+    };
+
+    return xc_memory_op(xc_handle, XENMEM_add_to_physmap, &xatp);
+}
+
 
 /* Xen 4.1 */
 #else
-- 
1.6.0.2

[Qemu-devel] [PATCH 10/11] xen: fix interrupt routing

2011-06-20 Thread Alexander Graf

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Compared to the last version I only added a comment to the code.

- remove i440FX-xen and i440fx_write_config_xen
we don't need to intercept pci config writes to i440FX anymore;

- introduce PIIX3-xen and piix3_write_config_xen
we do need to intercept pci config write to the PCI-ISA bridge to update
the PCI link routing;

- set the number of PIIX3-xen interrupts line to 128;

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/pc.h   |1 -
 hw/pc_piix.c  |6 +
 hw/piix_pci.c |   66 +---
 3 files changed, 35 insertions(+), 38 deletions(-)

diff --git a/hw/pc.h b/hw/pc.h
index 0dcbee7..6d5730b 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -176,7 +176,6 @@ struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
 
 PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq 
*pic, ram_addr_t ram_size);
-PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, 
qemu_irq *pic, ram_addr_t ram_size);
 void i440fx_init_memory_mappings(PCII440FXState *d);
 
 /* piix4.c */
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 9a22a8a..ba198de 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
 isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
 
 if (pci_enabled) {
-if (!xen_enabled()) {
-pci_bus = i440fx_init(i440fx_state, piix3_devfn, isa_irq, 
ram_size);
-} else {
-pci_bus = i440fx_xen_init(i440fx_state, piix3_devfn, isa_irq, 
ram_size);
-}
+pci_bus = i440fx_init(i440fx_state, piix3_devfn, isa_irq, ram_size);
 } else {
 pci_bus = NULL;
 i440fx_state = NULL;
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 85a320e..3e2698d 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
 
 #define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
 #define PIIX_NUM_PIRQS  4ULL/* PIRQ[A-D] */
+#define XEN_PIIX_NUM_PIRQS  128ULL
 #define PIIX_PIRQC  0x60
 
 typedef struct PIIX3State {
@@ -78,6 +79,8 @@ struct PCII440FXState {
 #define I440FX_SMRAM0x72
 
 static void piix3_set_irq(void *opaque, int pirq, int level);
+static void piix3_write_config_xen(PCIDevice *dev,
+   uint32_t address, uint32_t val, int len);
 
 /* return the global irq number corresponding to a given device irq
pin. We could also use the bus number to have a more precise
@@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
 }
 }
 
-static void i440fx_write_config_xen(PCIDevice *dev,
-uint32_t address, uint32_t val, int len)
-{
-xen_piix_pci_write_config_client(address, val, len);
-i440fx_write_config(dev, address, val, len);
-}
-
 static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
 {
 PCII440FXState *d = opaque;
@@ -267,8 +263,21 @@ static PCIBus *i440fx_common_init(const char *device_name,
 d = pci_create_simple(b, 0, device_name);
 *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
 
-piix3 = DO_UPCAST(PIIX3State, dev,
-  pci_create_simple_multifunction(b, -1, true, PIIX3));
+/* Xen supports additional interrupt routes from the PCI devices to
+ * the IOAPIC: the four pins of each PCI device on the bus are also
+ * connected to the IOAPIC directly.
+ * These additional routes can be discovered through ACPI. */
+if (xen_enabled()) {
+piix3 = DO_UPCAST(PIIX3State, dev,
+pci_create_simple_multifunction(b, -1, true, PIIX3-xen));
+pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
+piix3, XEN_PIIX_NUM_PIRQS);
+} else {
+piix3 = DO_UPCAST(PIIX3State, dev,
+pci_create_simple_multifunction(b, -1, true, PIIX3));
+pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3,
+PIIX_NUM_PIRQS);
+}
 piix3-pic = pic;
 
 (*pi440fx_state)-piix3 = piix3;
@@ -289,21 +298,6 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int 
*piix3_devfn,
 PCIBus *b;
 
 b = i440fx_common_init(i440FX, pi440fx_state, piix3_devfn, pic, 
ram_size);
-pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, (*pi440fx_state)-piix3,
- PIIX_NUM_PIRQS);
-
-return b;
-}
-
-PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
-qemu_irq *pic, ram_addr_t ram_size)
-{
-PCIBus *b;
-
-b = i440fx_common_init(i440FX-xen, pi440fx_state, piix3_devfn, pic, 
ram_size);
-pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
- (*pi440fx_state)-piix3, PIIX_NUM_PIRQS);
-
 return b;
 }
 
@@ -365,6 +359,13 @@ static void piix3_write_config(PCIDevice *dev,
 }
 }
 
+static void

[Qemu-devel] [PATCH 05/11] xen: remove xen_map_block and xen_unmap_block

2011-06-20 Thread Alexander Graf

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Replace xen_map_block with qemu_map_cache with the appropriate locking
and size parameters.
Replace xen_unmap_block with qemu_invalidate_entry.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 exec.c  |   19 ---
 xen-mapcache-stub.c |4 
 xen-mapcache.c  |   31 ---
 xen-mapcache.h  |   15 ---
 4 files changed, 4 insertions(+), 65 deletions(-)

diff --git a/exec.c b/exec.c
index 01f33bb..e11c1dd 100644
--- a/exec.c
+++ b/exec.c
@@ -53,6 +53,7 @@
 #endif
 #else /* !CONFIG_USER_ONLY */
 #include xen-mapcache.h
+#include trace.h
 #endif
 
 //#define DEBUG_TB_INVALIDATE
@@ -3088,7 +3089,7 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 if (block-offset == 0) {
 return qemu_map_cache(addr, 0, 1);
 } else if (block-host == NULL) {
-block-host = xen_map_block(block-offset, block-length);
+block-host = qemu_map_cache(block-offset, block-length, 
1);
 }
 }
 return block-host + (addr - block-offset);
@@ -3117,7 +3118,7 @@ void *qemu_safe_ram_ptr(ram_addr_t addr)
 if (block-offset == 0) {
 return qemu_map_cache(addr, 0, 1);
 } else if (block-host == NULL) {
-block-host = xen_map_block(block-offset, block-length);
+block-host = qemu_map_cache(block-offset, block-length, 
1);
 }
 }
 return block-host + (addr - block-offset);
@@ -3135,19 +3136,7 @@ void qemu_put_ram_ptr(void *addr)
 trace_qemu_put_ram_ptr(addr);
 
 if (xen_mapcache_enabled()) {
-RAMBlock *block;
-
-QLIST_FOREACH(block, ram_list.blocks, next) {
-if (addr == block-host) {
-break;
-}
-}
-if (block  block-host) {
-xen_unmap_block(block-host, block-length);
-block-host = NULL;
-} else {
-qemu_invalidate_entry(addr);
-}
+qemu_invalidate_entry(block-host);
 }
 }
 
diff --git a/xen-mapcache-stub.c b/xen-mapcache-stub.c
index 60f712b..8a2380a 100644
--- a/xen-mapcache-stub.c
+++ b/xen-mapcache-stub.c
@@ -34,7 +34,3 @@ void qemu_invalidate_map_cache(void)
 void qemu_invalidate_entry(uint8_t *buffer)
 {
 }
-uint8_t *xen_map_block(target_phys_addr_t phys_addr, target_phys_addr_t size)
-{
-return NULL;
-}
diff --git a/xen-mapcache.c b/xen-mapcache.c
index 57fe24d..fac47cd 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -362,34 +362,3 @@ void qemu_invalidate_map_cache(void)
 
 mapcache_unlock();
 }
-
-uint8_t *xen_map_block(target_phys_addr_t phys_addr, target_phys_addr_t size)
-{
-uint8_t *vaddr_base;
-xen_pfn_t *pfns;
-int *err;
-unsigned int i;
-target_phys_addr_t nb_pfn = size  XC_PAGE_SHIFT;
-
-trace_xen_map_block(phys_addr, size);
-phys_addr = XC_PAGE_SHIFT;
-
-pfns = qemu_mallocz(nb_pfn * sizeof (xen_pfn_t));
-err = qemu_mallocz(nb_pfn * sizeof (int));
-
-for (i = 0; i  nb_pfn; i++) {
-pfns[i] = phys_addr + i;
-}
-
-vaddr_base = xc_map_foreign_bulk(xen_xc, xen_domid, PROT_READ|PROT_WRITE,
- pfns, err, nb_pfn);
-if (vaddr_base == NULL) {
-perror(xc_map_foreign_bulk);
-exit(-1);
-}
-
-qemu_free(pfns);
-qemu_free(err);
-
-return vaddr_base;
-}
diff --git a/xen-mapcache.h b/xen-mapcache.h
index b89b8f9..6216cc3 100644
--- a/xen-mapcache.h
+++ b/xen-mapcache.h
@@ -9,27 +9,12 @@
 #ifndef XEN_MAPCACHE_H
 #define XEN_MAPCACHE_H
 
-#include sys/mman.h
-#include trace.h
-
 void qemu_map_cache_init(void);
 uint8_t  *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t 
size, uint8_t lock);
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr);
 void qemu_invalidate_entry(uint8_t *buffer);
 void qemu_invalidate_map_cache(void);
 
-uint8_t *xen_map_block(target_phys_addr_t phys_addr, target_phys_addr_t size);
-
-static inline void xen_unmap_block(void *addr, ram_addr_t size)
-{
-trace_xen_unmap_block(addr, size);
-
-if (munmap(addr, size) != 0) {
-hw_error(xen_unmap_block: %s, strerror(errno));
-}
-}
-
-
 #define mapcache_lock()   ((void)0)
 #define mapcache_unlock() ((void)0)
 
-- 
1.6.0.2

[Qemu-devel] [PATCH] virtio-blk: Turn drive serial into a qdev property

2011-06-20 Thread Markus Armbruster

It needs to be a qdev property, because it belongs to the drive's
guest part.  Precedent: commits a0fef654 and 6ced55a5.

Bonus: info qtree now shows the serial number.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 hw/s390-virtio-bus.c |4 +++-
 hw/s390-virtio-bus.h |1 +
 hw/virtio-blk.c  |   29 +++--
 hw/virtio-blk.h  |2 ++
 hw/virtio-pci.c  |4 +++-
 hw/virtio-pci.h  |1 +
 hw/virtio.h  |3 ++-
 7 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index d4a12f7..2bf4821 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -128,7 +128,8 @@ static int s390_virtio_blk_init(VirtIOS390Device *dev)
 {
 VirtIODevice *vdev;
 
-vdev = virtio_blk_init((DeviceState *)dev, dev-block);
+vdev = virtio_blk_init((DeviceState *)dev, dev-block,
+   dev-block_serial);
 if (!vdev) {
 return -1;
 }
@@ -355,6 +356,7 @@ static VirtIOS390DeviceInfo s390_virtio_blk = {
 .qdev.size = sizeof(VirtIOS390Device),
 .qdev.props = (Property[]) {
 DEFINE_BLOCK_PROPERTIES(VirtIOS390Device, block),
+DEFINE_PROP_STRING(serial, VirtIOS390Device, block_serial),
 DEFINE_PROP_END_OF_LIST(),
 },
 };
diff --git a/hw/s390-virtio-bus.h b/hw/s390-virtio-bus.h
index 0c412d0..f1bece7 100644
--- a/hw/s390-virtio-bus.h
+++ b/hw/s390-virtio-bus.h
@@ -42,6 +42,7 @@ typedef struct VirtIOS390Device {
 uint8_t feat_len;
 VirtIODevice *vdev;
 BlockConf block;
+char *block_serial;
 NICConf nic;
 uint32_t host_features;
 virtio_serial_conf serial;
diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 91e0394..6471ac8 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -28,8 +28,8 @@ typedef struct VirtIOBlock
 void *rq;
 QEMUBH *bh;
 BlockConf *conf;
+char *serial;
 unsigned short sector_mask;
-char sn[BLOCK_SERIAL_STRLEN];
 DeviceState *qdev;
 } VirtIOBlock;
 
@@ -362,8 +362,13 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req,
     } else if (type & VIRTIO_BLK_T_GET_ID) {
         VirtIOBlock *s = req->dev;
 
-        memcpy(req->elem.in_sg[0].iov_base, s->sn,
-               MIN(req->elem.in_sg[0].iov_len, sizeof(s->sn)));
+        /*
+         * NB: per existing s/n string convention the string is
+         * terminated by '\0' only when shorter than buffer.
+         */
+        strncpy(req->elem.in_sg[0].iov_base,
+                s->serial ? s->serial : "",
+                MIN(req->elem.in_sg[0].iov_len, VIRTIO_BLK_ID_BYTES));
         virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
     } else if (type & VIRTIO_BLK_T_OUT) {
         qemu_iovec_init_external(&req->qiov, &req->elem.out_sg[1],
@@ -531,7 +536,8 @@ static void virtio_blk_change_cb(void *opaque, int reason)
 }
 }
 
-VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf)
+VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf,
+  char **serial)
 {
 VirtIOBlock *s;
 int cylinders, heads, secs;
@@ -547,6 +553,14 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf 
*conf)
 return NULL;
 }
 
+if (!*serial) {
+/* try to fall back to value set with legacy -drive serial=... */
+dinfo = drive_get_by_blockdev(conf-bs);
+if (*dinfo-serial) {
+*serial = strdup(dinfo-serial);
+}
+}
+
 s = (VirtIOBlock *)virtio_common_init(virtio-blk, VIRTIO_ID_BLOCK,
   sizeof(struct virtio_blk_config),
   sizeof(VirtIOBlock));
@@ -556,16 +570,11 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf 
*conf)
 s-vdev.reset = virtio_blk_reset;
 s-bs = conf-bs;
 s-conf = conf;
+s-serial = *serial;
 s-rq = NULL;
 s-sector_mask = (s-conf-logical_block_size / BDRV_SECTOR_SIZE) - 1;
 bdrv_guess_geometry(s-bs, cylinders, heads, secs);
 
-/* NB: per existing s/n string convention the string is terminated
- * by '\0' only when less than sizeof (s-sn)
- */
-dinfo = drive_get_by_blockdev(s-bs);
-strncpy(s-sn, dinfo-serial, sizeof (s-sn));
-
 s-vq = virtio_add_queue(s-vdev, 128, virtio_blk_handle_output);
 
 qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
diff --git a/hw/virtio-blk.h b/hw/virtio-blk.h
index fff46da..5645d2b 100644
--- a/hw/virtio-blk.h
+++ b/hw/virtio-blk.h
@@ -34,6 +34,8 @@
 #define VIRTIO_BLK_F_WCACHE 9   /* write cache enabled */
 #define VIRTIO_BLK_F_TOPOLOGY   10  /* Topology information is available */
 
+#define VIRTIO_BLK_ID_BYTES 20  /* ID string length */
+
 struct virtio_blk_config
 {
 uint64_t capacity;
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index c018351..a8c236e 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -710,7 +710,8 @@ static int virtio_blk_init_pci(PCIDevice *pci_dev)
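
The serial-number convention noted in the patch comment above (the ID
string is terminated by '\0' only when it is shorter than the buffer) is
exactly what strncpy() gives you. The sketch below shows it in isolation.
It is not the driver code: demo_copy_serial and DEMO_ID_BYTES are
hypothetical names, with DEMO_ID_BYTES set to 20 to mirror
VIRTIO_BLK_ID_BYTES from the patch.

```c
#include <assert.h>
#include <string.h>

#define DEMO_ID_BYTES 20  /* mirrors VIRTIO_BLK_ID_BYTES from the patch */

/*
 * Copy a serial string into a guest-visible ID buffer of dst_len bytes.
 * At most DEMO_ID_BYTES are written.  A NULL serial is treated as the
 * empty string.  The result is NUL-terminated (and NUL-padded) only when
 * the serial is shorter than the number of bytes written; a longer serial
 * is truncated without a terminator.  Returns the number of bytes written.
 */
static size_t demo_copy_serial(char *dst, size_t dst_len, const char *serial)
{
    size_t n = dst_len < DEMO_ID_BYTES ? dst_len : DEMO_ID_BYTES;

    assert(dst != NULL);
    strncpy(dst, serial ? serial : "", n);
    return n;
}
```

A consumer of such a buffer therefore has to treat it as a fixed-width
field of up to 20 bytes and must not assume a terminating NUL.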

[Qemu-devel] [PATCH 11/11] xen: Add the Xen platform pci device

2011-06-20 Thread Alexander Graf

From: Steven Smith ssm...@xensource.com

Introduce a new emulated PCI device, specific to fully virtualized Xen
guests.  The device is necessary for PV on HVM drivers to work.

Signed-off-by: Steven Smith ssm...@xensource.com
Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 Makefile.target   |2 +
 hw/hw.h   |3 +
 hw/pc_piix.c  |4 +
 hw/pci_ids.h  |2 +
 hw/xen_platform.c |  340 +
 trace-events  |3 +
 6 files changed, 354 insertions(+), 0 deletions(-)
 create mode 100644 hw/xen_platform.c

diff --git a/Makefile.target b/Makefile.target
index b1a0f6d..760aa02 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -218,6 +218,8 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
 obj-i386-$(CONFIG_XEN_MAPCACHE) += xen-mapcache.o
 obj-$(CONFIG_NO_XEN_MAPCACHE) += xen-mapcache-stub.o
 
+obj-i386-$(CONFIG_XEN) += xen_platform.o
+
 # Inter-VM PCI shared memory
 CONFIG_IVSHMEM =
 ifeq ($(CONFIG_KVM), y)
diff --git a/hw/hw.h b/hw/hw.h
index 56447a7..9dd7096 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -780,6 +780,9 @@ extern const VMStateDescription vmstate_ptimer;
 #define VMSTATE_INT32_LE(_f, _s)   \
 VMSTATE_SINGLE(_f, _s, 0, vmstate_info_int32_le, int32_t)
 
+#define VMSTATE_UINT8_TEST(_f, _s, _t)   \
+VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint8, uint8_t)
+
 #define VMSTATE_UINT16_TEST(_f, _s, _t)   \
 VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint16, uint16_t)
 
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index ba198de..8dbeb0c 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -136,6 +136,10 @@ static void pc_init1(ram_addr_t ram_size,
 
 pc_vga_init(pci_enabled? pci_bus: NULL);
 
+if (xen_enabled()) {
+pci_create_simple(pci_bus, -1, xen-platform);
+}
+
 /* init basic PC hardware */
 pc_basic_device_init(isa_irq, rtc_state, xen_enabled());
 
diff --git a/hw/pci_ids.h b/hw/pci_ids.h
index d9457ed..d94578c 100644
--- a/hw/pci_ids.h
+++ b/hw/pci_ids.h
@@ -109,3 +109,5 @@
 #define PCI_DEVICE_ID_INTEL_82371AB  0x7111
 #define PCI_DEVICE_ID_INTEL_82371AB_20x7112
 #define PCI_DEVICE_ID_INTEL_82371AB_30x7113
+
+#define PCI_VENDOR_ID_XENSOURCE  0x5853
diff --git a/hw/xen_platform.c b/hw/xen_platform.c
new file mode 100644
index 000..b167eee
--- /dev/null
+++ b/hw/xen_platform.c
@@ -0,0 +1,340 @@
+/*
+ * XEN platform pci device, formerly known as the event channel device
+ *
+ * Copyright (c) 2003-2004 Intel Corp.
+ * Copyright (c) 2006 XenSource
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include assert.h
+
+#include hw.h
+#include pc.h
+#include pci.h
+#include irq.h
+#include xen_common.h
+#include net.h
+#include xen_backend.h
+#include rwhandler.h
+#include trace.h
+
+#include xenguest.h
+
+//#define DEBUG_PLATFORM
+
+#ifdef DEBUG_PLATFORM
+#define DPRINTF(fmt, ...) do { \
+fprintf(stderr, xen_platform:  fmt, ## __VA_ARGS__); \
+} while (0)
+#else
+#define DPRINTF(fmt, ...) do { } while (0)
+#endif
+
+#define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */
+
+typedef struct PCIXenPlatformState {
+PCIDevice  pci_dev;
+uint8_t flags; /* used only for version_id == 2 */
+int drivers_blacklisted;
+uint16_t driver_product_version;
+
+/* Log from guest drivers */
+char log_buffer[4096];
+int log_buffer_off;
+} PCIXenPlatformState;
+
+#define XEN_PLATFORM_IOPORT 0x10
+
+/* Send bytes to syslog */
+static void log_writeb(PCIXenPlatformState *s, char val)
+{
+if (val == '\n' || s-log_buffer_off == sizeof(s-log_buffer) - 1) {
+/* Flush buffer */
+s-log_buffer[s-log_buffer_off] = 0;
+trace_xen_platform_log(s-log_buffer);
+s-log_buffer_off = 0;
+} else {

[Qemu-devel] [PATCH 03/11] xen: fix qemu_map_cache with size != MCACHE_BUCKET_SIZE

2011-06-20 Thread Alexander Graf

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Fix the implementation of qemu_map_cache: correctly support size
arguments different from 0 or MCACHE_BUCKET_SIZE.
The new implementation supports locked mapcache entries with size
multiple of MCACHE_BUCKET_SIZE. qemu_invalidate_entry can correctly
find and unmap these large mapcache entries given that the virtual
address passed to qemu_invalidate_entry is the same returned by
qemu_map_cache when the locked mapcache entry was created.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 xen-mapcache.c |   77 +++
 1 files changed, 65 insertions(+), 12 deletions(-)

diff --git a/xen-mapcache.c b/xen-mapcache.c
index 349cc62..90fbd49 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -43,14 +43,16 @@
 typedef struct MapCacheEntry {
 target_phys_addr_t paddr_index;
 uint8_t *vaddr_base;
-DECLARE_BITMAP(valid_mapping, MCACHE_BUCKET_SIZE  XC_PAGE_SHIFT);
+unsigned long *valid_mapping;
 uint8_t lock;
+target_phys_addr_t size;
 struct MapCacheEntry *next;
 } MapCacheEntry;
 
 typedef struct MapCacheRev {
 uint8_t *vaddr_req;
 target_phys_addr_t paddr_index;
+target_phys_addr_t size;
 QTAILQ_ENTRY(MapCacheRev) next;
 } MapCacheRev;
 
@@ -68,6 +70,15 @@ typedef struct MapCache {
 
 static MapCache *mapcache;
 
+static inline int test_bits(int nr, int size, const unsigned long *addr)
+{
+    unsigned long res = find_next_zero_bit(addr, size + nr, nr);
+    if (res >= nr + size)
+        return 1;
+    else
+        return 0;
+}
+
 void qemu_map_cache_init(void)
 {
 unsigned long size;
@@ -115,11 +126,15 @@ static void qemu_remap_bucket(MapCacheEntry *entry,
 err = qemu_mallocz(nb_pfn * sizeof (int));
 
 if (entry-vaddr_base != NULL) {
-if (munmap(entry-vaddr_base, size) != 0) {
+if (munmap(entry-vaddr_base, entry-size) != 0) {
 perror(unmap fails);
 exit(-1);
 }
 }
+if (entry-valid_mapping != NULL) {
+qemu_free(entry-valid_mapping);
+entry-valid_mapping = NULL;
+}
 
 for (i = 0; i  nb_pfn; i++) {
 pfns[i] = (address_index  (MCACHE_BUCKET_SHIFT-XC_PAGE_SHIFT)) + i;
@@ -134,6 +149,9 @@ static void qemu_remap_bucket(MapCacheEntry *entry,
 
 entry-vaddr_base = vaddr_base;
 entry-paddr_index = address_index;
+entry-size = size;
+entry-valid_mapping = (unsigned long *) qemu_mallocz(sizeof(unsigned 
long) *
+BITS_TO_LONGS(size  XC_PAGE_SHIFT));
 
 bitmap_zero(entry-valid_mapping, nb_pfn);
 for (i = 0; i  nb_pfn; i++) {
@@ -151,32 +169,47 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, 
target_phys_addr_t size, u
 MapCacheEntry *entry, *pentry = NULL;
 target_phys_addr_t address_index  = phys_addr  MCACHE_BUCKET_SHIFT;
 target_phys_addr_t address_offset = phys_addr  (MCACHE_BUCKET_SIZE - 1);
+target_phys_addr_t __size = size;
 
 trace_qemu_map_cache(phys_addr);
 
-if (address_index == mapcache-last_address_index  !lock) {
+if (address_index == mapcache-last_address_index  !lock  !__size) {
 trace_qemu_map_cache_return(mapcache-last_address_vaddr + 
address_offset);
 return mapcache-last_address_vaddr + address_offset;
 }
 
+    /* size is always a multiple of MCACHE_BUCKET_SIZE */
+    if ((address_offset + (__size % MCACHE_BUCKET_SIZE)) > MCACHE_BUCKET_SIZE)
+        __size += MCACHE_BUCKET_SIZE;
+    if (__size % MCACHE_BUCKET_SIZE)
+        __size += MCACHE_BUCKET_SIZE - (__size % MCACHE_BUCKET_SIZE);
+    if (!__size)
+        __size = MCACHE_BUCKET_SIZE;
+
 entry = mapcache-entry[address_index % mapcache-nr_buckets];
 
-while (entry  entry-lock  entry-paddr_index != address_index  
entry-vaddr_base) {
+while (entry  entry-lock  entry-vaddr_base 
+(entry-paddr_index != address_index || entry-size != __size ||
+ !test_bits(address_offset  XC_PAGE_SHIFT, size  XC_PAGE_SHIFT,
+ entry-valid_mapping))) {
 pentry = entry;
 entry = entry-next;
 }
 if (!entry) {
 entry = qemu_mallocz(sizeof (MapCacheEntry));
 pentry-next = entry;
-qemu_remap_bucket(entry, size ? : MCACHE_BUCKET_SIZE, address_index);
+qemu_remap_bucket(entry, __size, address_index);
 } else if (!entry-lock) {
 if (!entry-vaddr_base || entry-paddr_index != address_index ||
-!test_bit(address_offset  XC_PAGE_SHIFT, entry-valid_mapping)) {
-qemu_remap_bucket(entry, size ? : MCACHE_BUCKET_SIZE, 
address_index);
+entry-size != __size ||
+!test_bits(address_offset  XC_PAGE_SHIFT, size  
XC_PAGE_SHIFT,
+entry-valid_mapping)) {
+qemu_remap_bucket(entry, __size, address_index);
 }
 }
 
-if (!test_bit(address_offset  XC_PAGE_SHIFT,
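
[Editor's sketch] The two pieces of arithmetic in the hunk above (checking that a run of bitmap bits is set, and rounding a request up to whole buckets) can be tried in isolation. The following is a minimal standalone sketch, not the QEMU code: BUCKET_SIZE is a hypothetical stand-in for MCACHE_BUCKET_SIZE, the helper names are invented for illustration, and the `>` comparison in the first rounding step is an assumption, since the operator is missing from the archived text.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for MCACHE_BUCKET_SIZE; the real value depends on the build. */
#define BUCKET_SIZE (1UL << 20)
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if bits [nr, nr + size) are all set in the bitmap, 0 otherwise.
 * A plain loop replaces find_next_zero_bit() so the sketch has no dependencies. */
static int all_bits_set(size_t nr, size_t size, const unsigned long *bitmap)
{
    assert(bitmap != NULL);
    for (size_t i = nr; i < nr + size; i++) {
        unsigned long word = bitmap[i / SKETCH_BITS_PER_LONG];
        if (!((word >> (i % SKETCH_BITS_PER_LONG)) & 1UL)) {
            return 0;
        }
    }
    return 1;
}

/* Round a request up to whole buckets, mirroring the three steps in the hunk. */
static unsigned long round_to_buckets(unsigned long offset, unsigned long size)
{
    /* The request spills past the end of the first bucket (assumed '>'). */
    if (offset + (size % BUCKET_SIZE) > BUCKET_SIZE) {
        size += BUCKET_SIZE;
    }
    /* Pad a partial bucket up to a full one. */
    if (size % BUCKET_SIZE) {
        size += BUCKET_SIZE - (size % BUCKET_SIZE);
    }
    /* A zero-sized request still maps one bucket. */
    if (!size) {
        size = BUCKET_SIZE;
    }
    return size;
}
```

With these definitions, a zero-sized request rounds to one bucket, and an 8 KiB request that starts 4 KiB before a bucket boundary rounds to two buckets.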

Re: [Qemu-devel] [PATCH] do not send packet to nic if the packet will be dropped by nic

2011-06-20 Thread Wen Congyang

At 06/20/2011 05:10 PM, Kevin Wolf Write:
 Am 17.06.2011 03:33, schrieb Wen Congyang:
 If !s-clock_enabled or !rtl8139_receiver_enabled(s), it means that
 the nic will drop all packets from host. So qemu will keep getting
 packets from host and wasting CPU on dropping packets. This seems
 worse than packets that should be dropped but aren't.

 Signed-off-by: Wen Congyang we...@cn.fujitsu.com
 
 Which bug does this change fix? I'm still not convinced that we should
 do it.

Maybe not a bug fix now. As Michael S. Tsirkin said, if rtl8139_can_receive()
returns 1, qemu will keep getting packets from host and wasting CPU on
dropping packets. We can save CPU by returning 0.

 
 ---
  hw/rtl8139.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

 diff --git a/hw/rtl8139.c b/hw/rtl8139.c
 index 2f8db58..9084678 100644
 --- a/hw/rtl8139.c
 +++ b/hw/rtl8139.c
 @@ -810,9 +810,9 @@ static int rtl8139_can_receive(VLANClientState *nc)
  
  /* Receive (drop) packets if card is disabled.  */
 
 This comment isn't accurate any more after applying the patch.
 
  if (!s-clock_enabled)
 -  return 1;
 +  return 0;
  if (!rtl8139_receiver_enabled(s))
 -  return 1;
 +  return 0;
  
  if (rtl8139_cp_receiver_enabled(s)) {
  /* ??? Flow control not implemented in c+ mode.
 -- 1.7.1 
 
 Kevin

[Qemu-devel] [PATCH 02/11] xen: Introduce VGA sync dirty bitmap support

2011-06-20 Thread Alexander Graf

From: Anthony PERARD anthony.per...@citrix.com

This patch introduces phys memory client for Xen.

Only sync dirty_bitmap and set_memory are actually implemented.
migration_log will stay empty for the moment.

Xen can only log one range for bit change, so only the range in the
first call will be synced.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 trace-events |1 +
 xen-all.c|  267 ++
 2 files changed, 268 insertions(+), 0 deletions(-)

diff --git a/trace-events b/trace-events
index f1230f1..46a19d3 100644
--- a/trace-events
+++ b/trace-events
@@ -396,6 +396,7 @@ disable milkymist_vgafb_memory_write(uint32_t addr, 
uint32_t value) addr %08x v
 
 # xen-all.c
 disable xen_ram_alloc(unsigned long ram_addr, unsigned long size) "requested: %#lx, size %#lx"
+disable xen_client_set_memory(uint64_t start_addr, unsigned long size, unsigned long phys_offset, bool log_dirty) "%#"PRIx64" size %#lx, offset %#lx, log_dirty %i"
 
 # xen-mapcache.c
 disable qemu_map_cache(uint64_t phys_addr) "want %#"PRIx64""
diff --git a/xen-all.c b/xen-all.c
index 0eac202..75a82c2 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -13,6 +13,7 @@
 #include "hw/xen_common.h"
 #include "hw/xen_backend.h"
 
+#include "range.h"
 #include "xen-mapcache.h"
 #include "trace.h"
 
@@ -54,6 +55,14 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t 
*shared_page, int vcpu)
 
 #define BUFFER_IO_MAX_DELAY  100
 
+typedef struct XenPhysmap {
+target_phys_addr_t start_addr;
+ram_addr_t size;
+target_phys_addr_t phys_offset;
+
+QLIST_ENTRY(XenPhysmap) list;
+} XenPhysmap;
+
 typedef struct XenIOState {
 shared_iopage_t *shared_page;
 buffered_iopage_t *buffered_io_page;
@@ -66,6 +75,9 @@ typedef struct XenIOState {
 int send_vcpu;
 
 struct xs_handle *xenstore;
+CPUPhysMemoryClient client;
+QLIST_HEAD(, XenPhysmap) physmap;
+const XenPhysmap *log_for_dirtybit;
 
 Notifier exit;
 } XenIOState;
@@ -178,6 +190,256 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size)
 qemu_free(pfn_list);
 }
 
+static XenPhysmap *get_physmapping(XenIOState *state,
+                                   target_phys_addr_t start_addr, ram_addr_t size)
+{
+    XenPhysmap *physmap = NULL;
+
+    start_addr &= TARGET_PAGE_MASK;
+
+    QLIST_FOREACH(physmap, &state->physmap, list) {
+        if (range_covers_byte(physmap->start_addr, physmap->size, start_addr)) {
+            return physmap;
+        }
+    }
+    return NULL;
+}
+
+#if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 340
+static int xen_add_to_physmap(XenIOState *state,
+                              target_phys_addr_t start_addr,
+                              ram_addr_t size,
+                              target_phys_addr_t phys_offset)
+{
+    unsigned long i = 0;
+    int rc = 0;
+    XenPhysmap *physmap = NULL;
+    target_phys_addr_t pfn, start_gpfn;
+
+    if (get_physmapping(state, start_addr, size)) {
+        return 0;
+    }
+    if (size <= 0) {
+        return -1;
+    }
+
+    DPRINTF("mapping vram to %llx - %llx, from %llx\n",
+            start_addr, start_addr + size, phys_offset);
+
+    pfn = phys_offset >> TARGET_PAGE_BITS;
+    start_gpfn = start_addr >> TARGET_PAGE_BITS;
+    for (i = 0; i < size >> TARGET_PAGE_BITS; i++) {
+        unsigned long idx = pfn + i;
+        xen_pfn_t gpfn = start_gpfn + i;
+
+        rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, idx, gpfn);
+        if (rc) {
+            DPRINTF("add_to_physmap MFN %"PRI_xen_pfn" to PFN %"
+                    PRI_xen_pfn" failed: %d\n", idx, gpfn, rc);
+            return -rc;
+        }
+    }
+
+    physmap = qemu_malloc(sizeof (XenPhysmap));
+
+    physmap->start_addr = start_addr;
+    physmap->size = size;
+    physmap->phys_offset = phys_offset;
+
+    QLIST_INSERT_HEAD(&state->physmap, physmap, list);
+
+    xc_domain_pin_memory_cacheattr(xen_xc, xen_domid,
+                                   start_addr >> TARGET_PAGE_BITS,
+                                   (start_addr + size) >> TARGET_PAGE_BITS,
+                                   XEN_DOMCTL_MEM_CACHEATTR_WB);
+    return 0;
+}
+
+static int xen_remove_from_physmap(XenIOState *state,
+                                   target_phys_addr_t start_addr,
+                                   ram_addr_t size)
+{
+    unsigned long i = 0;
+    int rc = 0;
+    XenPhysmap *physmap = NULL;
+    target_phys_addr_t phys_offset = 0;
+
+    physmap = get_physmapping(state, start_addr, size);
+    if (physmap == NULL) {
+        return -1;
+    }
+
+    phys_offset = physmap->phys_offset;
+    size = physmap->size;
+
+    DPRINTF("unmapping vram to %llx - %llx, from %llx\n",
+            phys_offset, phys_offset + size, start_addr);
+
+    size >>= TARGET_PAGE_BITS;
+    start_addr >>= TARGET_PAGE_BITS;
+    phys_offset >>= TARGET_PAGE_BITS;
+    for (i = 0; i < size; i++) {
+        unsigned long idx = start_addr + i;
+        xen_pfn_t gpfn =

[Qemu-devel] [PATCH 07/11] xen: mapcache performance improvements

2011-06-20 Thread Alexander Graf

From: Stefano Stabellini stefano.stabell...@eu.citrix.com

Use qemu_invalidate_entry in cpu_physical_memory_unmap.

Do not lock mapcache entries in qemu_get_ram_ptr if the address falls in
the ramblock with offset == 0. We don't need to do that because the
callers of qemu_get_ram_ptr either try to map an entire block, other
than the main ramblock, or until the end of a page to implement a single
read or write in the main ramblock.
If we don't lock mapcache entries in qemu_get_ram_ptr we don't need to
call qemu_invalidate_entry in qemu_put_ram_ptr anymore because we can
live with a few long-lived block mappings requested by devices.

Also move the call to qemu_ram_addr_from_mapcache at the beginning of
qemu_ram_addr_from_host.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 exec.c |   28 ++--
 1 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/exec.c b/exec.c
index 238c173..7f14332 100644
--- a/exec.c
+++ b/exec.c
@@ -3085,9 +3085,10 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
             if (xen_mapcache_enabled()) {
                 /* We need to check if the requested address is in the RAM
                  * because we don't want to map the entire memory in QEMU.
+                 * In that case just map until the end of the page.
                  */
                 if (block->offset == 0) {
-                    return qemu_map_cache(addr, 0, 1);
+                    return qemu_map_cache(addr, 0, 0);
                 } else if (block->host == NULL) {
                     block->host = qemu_map_cache(block->offset, block->length, 1);
                 }
@@ -3114,9 +3115,10 @@ void *qemu_safe_ram_ptr(ram_addr_t addr)
             if (xen_mapcache_enabled()) {
                 /* We need to check if the requested address is in the RAM
                  * because we don't want to map the entire memory in QEMU.
+                 * In that case just map until the end of the page.
                  */
                 if (block->offset == 0) {
-                    return qemu_map_cache(addr, 0, 1);
+                    return qemu_map_cache(addr, 0, 0);
                 } else if (block->host == NULL) {
                     block->host = qemu_map_cache(block->offset, block->length, 1);
                 }
@@ -3159,10 +3161,6 @@ void *qemu_ram_ptr_length(target_phys_addr_t addr, target_phys_addr_t *size)
 void qemu_put_ram_ptr(void *addr)
 {
     trace_qemu_put_ram_ptr(addr);
-
-    if (xen_mapcache_enabled()) {
-        qemu_invalidate_entry(block->host);
-    }
 }
 
 int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
@@ -3170,6 +3168,11 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
     RAMBlock *block;
     uint8_t *host = ptr;
 
+    if (xen_mapcache_enabled()) {
+        *ram_addr = qemu_ram_addr_from_mapcache(ptr);
+        return 0;
+    }
+
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         /* This case append when the block is not mapped. */
         if (block->host == NULL) {
@@ -3181,11 +3184,6 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
         }
     }
 
-    if (xen_mapcache_enabled()) {
-        *ram_addr = qemu_ram_addr_from_mapcache(ptr);
-        return 0;
-    }
-
     return -1;
 }
 
@@ -4086,13 +4084,7 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
             }
         }
         if (xen_mapcache_enabled()) {
-            uint8_t *buffer1 = buffer;
-            uint8_t *end_buffer = buffer + len;
-
-            while (buffer1 < end_buffer) {
-                qemu_put_ram_ptr(buffer1);
-                buffer1 += TARGET_PAGE_SIZE;
-            }
+            qemu_invalidate_entry(buffer);
         }
         return;
     }
-- 
1.6.0.2

Re: [Qemu-devel] [PATCH] do not send packet to nic if the packet will be dropped by nic

2011-06-20 Thread Kevin Wolf

Am 20.06.2011 11:40, schrieb Wen Congyang:
 At 06/20/2011 05:10 PM, Kevin Wolf Write:
 Am 17.06.2011 03:33, schrieb Wen Congyang:
 If !s-clock_enabled or !rtl8139_receiver_enabled(s), it means that
 the nic will drop all packets from host. So qemu will keep getting
 packets from host and wasting CPU on dropping packets. This seems
 worse than packets that should be dropped but aren't.

 Signed-off-by: Wen Congyang we...@cn.fujitsu.com

 Which bug does this change fix? I'm still not convinced that we should
 do it.
 
 Maybe not a bug fix now. As Michael S. Tsirkin said, if rtl8139_can_receive()
 returns 1, qemu will keep getting packets from host and wasting CPU on
 dropping packets. We can save CPU by return 0.

Don't we waste memory instead then because we leave the packets queued
indefinitely?

Kevin

[Qemu-devel] [PATCH 0/2] Suspend (S3) support

2011-06-20 Thread Alon Levy

The first patch is a slightly revised version of a patch sent before. It
introduces a print helper (qxl_mode_to_string) that is used by the second
patch too, hence I'm sending them together. I've looked for additional places
to use qxl_mode_to_string like Gerd asked before, and found just one.

The second patch is the one adding support for QXL_IO_UPDATE_MEM.

This is just part of the suspend support. It requires a revised spice-server
(to implement the update_mem io), and a revised driver (patches on spice-devel
for the windows driver, the linux to-be-done).

Alon Levy (2):
  qxl: interface_get_command: fix reported mode
  qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support

 hw/qxl.c |   47 ---
 1 files changed, 44 insertions(+), 3 deletions(-)

-- 
1.7.5.2

Re: [Qemu-devel] [PATCH] do not send packet to nic if the packet will be dropped by nic

2011-06-20 Thread Michael S. Tsirkin

On Mon, Jun 20, 2011 at 11:52:20AM +0200, Kevin Wolf wrote:
 Am 20.06.2011 11:40, schrieb Wen Congyang:
  At 06/20/2011 05:10 PM, Kevin Wolf Write:
  Am 17.06.2011 03:33, schrieb Wen Congyang:
  If !s-clock_enabled or !rtl8139_receiver_enabled(s), it means that
  the nic will drop all packets from host. So qemu will keep getting
  packets from host and wasting CPU on dropping packets. This seems
  worse than packets that should be dropped but aren't.
 
  Signed-off-by: Wen Congyang we...@cn.fujitsu.com
 
  Which bug does this change fix? I'm still not convinced that we should
  do it.
  
  Maybe not a bug fix now. As Michael S. Tsirkin said, if 
  rtl8139_can_receive()
  returns 1, qemu will keep getting packets from host and wasting CPU on
  dropping packets. We can save CPU by return 0.
 
 Don't we waste memory instead then because we leave the packets queued
 indefinitely?
 
 Kevin

Yes, but the amount of wasted memory is bounded from above,
so this doesn't seem too bad to me ...

-- 
MST
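
[Editor's sketch] The trade-off discussed in this thread (CPU spent dropping packets versus memory spent holding them) can be modelled with a small bounded queue. This is an illustrative sketch only: the types, the names, the QUEUE_MAX cap and the drop-when-full policy are all assumptions made for the example, and none of it is taken from QEMU's net layer.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 4   /* hypothetical cap on held packets; not a QEMU constant */

struct sketch_nic {
    int receiver_enabled;   /* what the can_receive() callback reports */
    size_t delivered;       /* packets handed to the device model */
};

struct sketch_queue {
    size_t pending;         /* packets held back while the device is not ready */
    size_t dropped;         /* packets discarded because the queue was full */
};

/* Mirrors the idea of returning 0 while the card cannot take packets. */
static int sketch_can_receive(const struct sketch_nic *nic)
{
    return nic->receiver_enabled;
}

/* Deliver one packet, or hold it (up to QUEUE_MAX) while the device is off. */
static void sketch_send(struct sketch_queue *q, struct sketch_nic *nic)
{
    assert(q != NULL && nic != NULL);
    if (sketch_can_receive(nic)) {
        nic->delivered++;
    } else if (q->pending < QUEUE_MAX) {
        q->pending++;
    } else {
        q->dropped++;
    }
}

/* Called once the device is enabled again: drain whatever was held back. */
static void sketch_flush(struct sketch_queue *q, struct sketch_nic *nic)
{
    while (q->pending > 0 && sketch_can_receive(nic)) {
        q->pending--;
        nic->delivered++;
    }
}
```

In this model the memory cost of returning 0 never exceeds QUEUE_MAX packets, which is the "bounded from above" property referred to in the reply.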

[Qemu-devel] [PATCH 1/2] qxl: interface_get_command: fix reported mode

2011-06-20 Thread Alon Levy

report correct mode when in undefined mode.
introduces qxl_mode_to_string, and uses it in another place that looks
helpful (qxl_post_load)
---
 hw/qxl.c |   21 ++---
 1 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 1906e84..ca5c8b3 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -336,6 +336,21 @@ static void interface_get_init_info(QXLInstance *sin, QXLDevInitInfo *info)
     info->n_surfaces = NUM_SURFACES;
 }
 
+static const char *qxl_mode_to_string(int mode)
+{
+    switch (mode) {
+    case QXL_MODE_COMPAT:
+        return "compat";
+    case QXL_MODE_NATIVE:
+        return "native";
+    case QXL_MODE_UNDEFINED:
+        return "undefined";
+    case QXL_MODE_VGA:
+        return "vga";
+    }
+    return "unknown";
+}
+
 /* called from spice server thread context only */
 static int interface_get_command(QXLInstance *sin, struct QXLCommandExt *ext)
 {
@@ -364,8 +379,7 @@ static int interface_get_command(QXLInstance *sin, struct QXLCommandExt *ext)
     case QXL_MODE_COMPAT:
     case QXL_MODE_NATIVE:
     case QXL_MODE_UNDEFINED:
-        dprint(qxl, 2, "%s: %s\n", __FUNCTION__,
-               qxl->cmdflags ? "compat" : "native");
+        dprint(qxl, 2, "%s: %s\n", __FUNCTION__, qxl_mode_to_string(qxl->mode));
         ring = &qxl->ram->cmd_ring;
         if (SPICE_RING_IS_EMPTY(ring)) {
             return false;
@@ -1378,7 +1392,8 @@ static int qxl_post_load(void *opaque, int version)
 
     d->modes = (QXLModes*)((uint8_t*)d->rom + d->rom->modes_offset);
 
-    dprint(d, 1, "%s: restore mode\n", __FUNCTION__);
+    dprint(d, 1, "%s: restore mode (%s)\n", __FUNCTION__,
+           qxl_mode_to_string(d->mode));
     newmode = d->mode;
     d->mode = QXL_MODE_UNDEFINED;
     switch (newmode) {
-- 
1.7.5.2

Re: [Qemu-devel] [PATCH 0/9] AREG0 patches

2011-06-20 Thread Kevin Wolf

Am 19.06.2011 23:55, schrieb Andreas Färber:
 Am 19.06.2011 um 22:57 schrieb Blue Swirl:
 
 These and the stack frame patches can be found in
 git://repo.or.cz/qemu/blueswirl.git

 Blue Swirl (9):
  cpu_loop_exit: avoid using AREG0
  sparc: fix coding style of the area to be moved
  sparc: move do_interrupt to helper.c
  x86: use caller supplied CPUState for interrupt related stuff
  m68k: use caller supplied CPUState for interrupt related stuff
  cpu-exec: unify do_interrupt call
  exec.h: fix coding style and change cpu_has_work to return bool
  Move cpu_has_work and cpu_pc_from_tb to cpu.h
  Remove exec-all.h include directives
 
 This is getting rather unhandy with two series...
 
 Could you please check that chainreplyto = true under [sendemail]? I  
 have no other related options set, and it used to work via Gmail last  
 time I tried.

Actually, chainreplyto = false is what you want, so that all patches are
replies to patch 0 instead of patch n-1.

Of course, you need to send off the whole series with only a single
git-send-email invocation for it to work, like git send-email 00*.patch

Kevin

[Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support

2011-06-20 Thread Alon Levy

Add QXL_IO_UPDATE_MEM. Used to reduce vmexits from the guest when it resets the
spice server state before going to sleep.

The implementation requires an updated spice-server (0.8.2) with the new
worker function update_mem.

Cc: Yonit Halperin yhalp...@redhat.com
---
 hw/qxl.c |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index ca5c8b3..4945d95 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1016,6 +1016,32 @@ static void ioport_write(void *opaque, uint32_t addr, uint32_t val)
     case QXL_IO_DESTROY_ALL_SURFACES:
         d->ssd.worker->destroy_surfaces(d->ssd.worker);
         break;
+    case QXL_IO_UPDATE_MEM:
+        dprint(d, 1, "QXL_IO_UPDATE_MEM (%d) entry (%s, s#=%d, res#=%d)\n",
+            val, qxl_mode_to_string(d->mode), d->guest_surfaces.count,
+            d->num_free_res);
+        switch (val) {
+        case (QXL_UPDATE_MEM_RENDER_ALL):
+            d->ssd.worker->update_mem(d->ssd.worker);
+            break;
+        case (QXL_UPDATE_MEM_FLUSH): {
+            QXLReleaseRing *ring = &d->ram->release_ring;
+            if (ring->prod - ring->cons + 1 == ring->num_items) {
+                // TODO - return a value to the guest and let it loop?
+                fprintf(stderr,
+                    "ERROR: no flush, full release ring [p%d,%dc]\n",
+                    ring->prod, ring->cons);
+            }
+            qxl_push_free_res(d, 1 /* flush */);
+            dprint(d, 1, "QXL_IO_UPDATE_MEM exit (%s, s#=%d, res#=%d,%p)\n",
+                qxl_mode_to_string(d->mode), d->guest_surfaces.count,
+                d->num_free_res, d->last_release);
+            break;
+        }
+        default:
+            fprintf(stderr, "ERROR: unexpected value to QXL_IO_UPDATE_MEM\n");
+        }
+        break;
     default:
         fprintf(stderr, "%s: ioport=0x%x, abort()\n", __FUNCTION__, io_port);
         abort();
-- 
1.7.5.2

[Qemu-devel] [PATCH v2 0/3] Let RTC follow backward jumps of host clock immediately

2011-06-20 Thread Jan Kiszka

Just noticed that this issue is still unfixed because my series was
somehow forgotten. So I've rebased it over current master, refactored it
to use the generic Notifier infrastructure and renamed it to clock
reset notifier to avoid confusion with icount related warping. Please
review / apply before 0.15-rc0, it fixes a relevant issue.

Original series description:

By default, we base the mc146818 RTC on the host clock (CLOCK_REALTIME).
This works fine if only the frequency of the host clock is tuned (e.g.
by NTP) or if it is set to a future time. However, if the host is tuned
backward, e.g. because NTP obtained the correct time after the guest was
already started or the admin decided to tune the local time, we see an
unpleasant effect in the guest: The RTC will stall for the period the
host clock is set back. We identified that one prominent guest affected
by this is Windows which relies on the periodic RTC interrupt for time
keeping.

This series addresses the issue by detecting those warps and providing a
callback mechanism to device models. The RTC is enabled to update its
timers and register content immediately. Tested successfully both with
hwclock in a Linux guest and by monitoring the Windows clock while
fiddling with the host time.

Note that if this kind of RTC adjustment is not wanted, the user is
still free to decouple the RTC from the host clock and base it on the
VM clock - just like before.

Jan Kiszka (3):
  notifier: Pass data argument to callback
  qemu-timer: Introduce clock reset notifier
  mc146818rtc: Handle host clock resets

 hw/fw_cfg.c  |2 +-
 hw/mc146818rtc.c |   20 
 input.c  |2 +-
 migration.c  |   12 ++--
 notify.c |4 ++--
 notify.h |4 ++--
 qemu-timer.c |   29 -
 qemu-timer.h |5 +
 ui/sdl.c |2 +-
 ui/spice-core.c  |2 +-
 ui/spice-input.c |4 ++--
 ui/vnc.c |4 ++--
 usb-linux.c  |2 +-
 vl.c |4 ++--
 xen-all.c|2 +-
 15 files changed, 75 insertions(+), 23 deletions(-)

[Qemu-devel] [PATCH v2 3/3] mc146818rtc: Handle host clock resets

2011-06-20 Thread Jan Kiszka

Make use of the new clock reset notifier to update the RTC whenever
rtc_clock is the host clock and that happens to jump backward. This
avoids that the RTC stalls for the period the host clock was set back.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/mc146818rtc.c |   20 
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
index 1c9a706..feb3b25 100644
--- a/hw/mc146818rtc.c
+++ b/hw/mc146818rtc.c
@@ -99,6 +99,7 @@ typedef struct RTCState {
     QEMUTimer *coalesced_timer;
     QEMUTimer *second_timer;
     QEMUTimer *second_timer2;
+    Notifier clock_reset_notifier;
 } RTCState;
 
 static void rtc_set_time(RTCState *s);
@@ -572,6 +573,22 @@ static const VMStateDescription vmstate_rtc = {
     }
 };
 
+static void rtc_notify_clock_reset(Notifier *notifier, void *data)
+{
+    RTCState *s = container_of(notifier, RTCState, clock_reset_notifier);
+    int64_t now = *(int64_t *)data;
+
+    rtc_set_date_from_host(&s->dev);
+    s->next_second_time = now + (get_ticks_per_sec() * 99) / 100;
+    qemu_mod_timer(s->second_timer2, s->next_second_time);
+    rtc_timer_update(s, now);
+#ifdef TARGET_I386
+    if (rtc_td_hack) {
+        rtc_coalesced_timer_update(s);
+    }
+#endif
+}
+
 static void rtc_reset(void *opaque)
 {
     RTCState *s = opaque;
@@ -608,6 +625,9 @@ static int rtc_initfn(ISADevice *dev)
     s->second_timer = qemu_new_timer_ns(rtc_clock, rtc_update_second, s);
     s->second_timer2 = qemu_new_timer_ns(rtc_clock, rtc_update_second2, s);
 
+    s->clock_reset_notifier.notify = rtc_notify_clock_reset;
+    qemu_register_clock_reset_notifier(rtc_clock, &s->clock_reset_notifier);
+
     s->next_second_time =
         qemu_get_clock_ns(rtc_clock) + (get_ticks_per_sec() * 99) / 100;
     qemu_mod_timer(s->second_timer2, s->next_second_time);
-- 
1.7.1

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support

2011-06-20 Thread Gerd Hoffmann


  Hi,


+case QXL_IO_UPDATE_MEM:
+switch (val) {
+case (QXL_UPDATE_MEM_RENDER_ALL):
+d-ssd.worker-update_mem(d-ssd.worker);
+break;


What is the difference to one worker-stop() + worker-start() cycle?

cheers,
  Gerd

[Qemu-devel] [PATCH v2 1/3] notifier: Pass data argument to callback

2011-06-20 Thread Jan Kiszka

This allows passing additional information to the notifier callback,
which is useful if sender and receiver do not share any other distinct
data structure.

Will be used first for the clock reset notifier.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/fw_cfg.c  |2 +-
 input.c  |2 +-
 migration.c  |   12 ++--
 notify.c |4 ++--
 notify.h |4 ++--
 ui/sdl.c |2 +-
 ui/spice-core.c  |2 +-
 ui/spice-input.c |4 ++--
 ui/vnc.c |4 ++--
 usb-linux.c  |2 +-
 vl.c |4 ++--
 xen-all.c|2 +-
 12 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 85c8c3c..34e7526 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -316,7 +316,7 @@ int fw_cfg_add_file(FWCfgState *s,  const char *filename, uint8_t *data,
     return 1;
 }
 
-static void fw_cfg_machine_ready(struct Notifier* n)
+static void fw_cfg_machine_ready(struct Notifier *n, void *data)
 {
     uint32_t len;
     FWCfgState *s = container_of(n, FWCfgState, machine_ready);
diff --git a/input.c b/input.c
index 5664d3a..894d57f 100644
--- a/input.c
+++ b/input.c
@@ -59,7 +59,7 @@ static void check_mode_change(void)
 
     if (is_absolute != current_is_absolute ||
         has_absolute != current_has_absolute) {
-        notifier_list_notify(&mouse_mode_notifiers);
+        notifier_list_notify(&mouse_mode_notifiers, NULL);
     }
 
     current_is_absolute = is_absolute;
diff --git a/migration.c b/migration.c
index af3a1f2..2a15b98 100644
--- a/migration.c
+++ b/migration.c
@@ -124,7 +124,7 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data)
     }
 
     current_migration = s;
-    notifier_list_notify(&migration_state_notifiers);
+    notifier_list_notify(&migration_state_notifiers, NULL);
     return 0;
 }
 
@@ -276,7 +276,7 @@ void migrate_fd_error(FdMigrationState *s)
 {
     DPRINTF("setting error state\n");
     s->state = MIG_STATE_ERROR;
-    notifier_list_notify(&migration_state_notifiers);
+    notifier_list_notify(&migration_state_notifiers, NULL);
     migrate_fd_cleanup(s);
 }
 
@@ -334,7 +334,7 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
             monitor_resume(s->mon);
         }
         s->state = MIG_STATE_ERROR;
-        notifier_list_notify(&migration_state_notifiers);
+        notifier_list_notify(&migration_state_notifiers, NULL);
     }
 
     return ret;
@@ -395,7 +395,7 @@ void migrate_fd_put_ready(void *opaque)
             state = MIG_STATE_ERROR;
         }
         s->state = state;
-        notifier_list_notify(&migration_state_notifiers);
+        notifier_list_notify(&migration_state_notifiers, NULL);
     }
 }
 
@@ -415,7 +415,7 @@ void migrate_fd_cancel(MigrationState *mig_state)
     DPRINTF("cancelling migration\n");
 
     s->state = MIG_STATE_CANCELLED;
-    notifier_list_notify(&migration_state_notifiers);
+    notifier_list_notify(&migration_state_notifiers, NULL);
     qemu_savevm_state_cancel(s->mon, s->file);
 
     migrate_fd_cleanup(s);
@@ -429,7 +429,7 @@ void migrate_fd_release(MigrationState *mig_state)

     if (s->state == MIG_STATE_ACTIVE) {
         s->state = MIG_STATE_CANCELLED;
-        notifier_list_notify(&migration_state_notifiers);
+        notifier_list_notify(&migration_state_notifiers, NULL);
         migrate_fd_cleanup(s);
     }
     qemu_free(s);
diff --git a/notify.c b/notify.c
index bcd3fc5..a6bac1f 100644
--- a/notify.c
+++ b/notify.c
@@ -29,11 +29,11 @@ void notifier_list_remove(NotifierList *list, Notifier *notifier)
     QTAILQ_REMOVE(&list->notifiers, notifier, node);
 }
 
-void notifier_list_notify(NotifierList *list)
+void notifier_list_notify(NotifierList *list, void *data)
 {
     Notifier *notifier, *next;
 
     QTAILQ_FOREACH_SAFE(notifier, &list->notifiers, node, next) {
-        notifier->notify(notifier);
+        notifier->notify(notifier, data);
     }
 }
diff --git a/notify.h b/notify.h
index b40522f..54fc57c 100644
--- a/notify.h
+++ b/notify.h
@@ -20,7 +20,7 @@ typedef struct Notifier Notifier;
 
 struct Notifier
 {
-void (*notify)(Notifier *notifier);
+void (*notify)(Notifier *notifier, void *data);
 QTAILQ_ENTRY(Notifier) node;
 };
 
@@ -38,6 +38,6 @@ void notifier_list_add(NotifierList *list, Notifier 
*notifier);
 
 void notifier_list_remove(NotifierList *list, Notifier *notifier);
 
-void notifier_list_notify(NotifierList *list);
+void notifier_list_notify(NotifierList *list, void *data);
 
 #endif
diff --git a/ui/sdl.c b/ui/sdl.c
index f2bd4a0..6dbc5cb 100644
--- a/ui/sdl.c
+++ b/ui/sdl.c
@@ -481,7 +481,7 @@ static void sdl_grab_end(void)
 sdl_update_caption();
 }
 
-static void sdl_mouse_mode_change(Notifier *notify)
+static void sdl_mouse_mode_change(Notifier *notify, void *data)
 {
 if (kbd_mouse_is_absolute()) {
 if (!absolute_enabled) {
diff --git a/ui/spice-core.c b/ui/spice-core.c
index dd9905b..e134f04 100644
--- a/ui/spice-core.c
+++

[Qemu-devel] [PATCH v2 2/3] qemu-timer: Introduce clock reset notifier

2011-06-20 Thread Jan Kiszka

QEMU_CLOCK_HOST is based on the system time which may jump backward in
case the admin or NTP adjusts it. RTC emulations and other device models
can suffer in this case as timers will stall for the period the clock
was tuned back.

This adds a detection mechanism that checks on every host clock readout
if the new time is before the last result. If that is the case a
notifier list is informed. Device models interested in this event can
register a notifier with the clock.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 qemu-timer.c |   29 -
 qemu-timer.h |5 +
 2 files changed, 33 insertions(+), 1 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 72066c7..df323ae 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -150,6 +150,9 @@ struct QEMUClock {
     int enabled;
 
     QEMUTimer *warp_timer;
+
+    NotifierList reset_notifiers;
+    int64_t last;
 };
 
 struct QEMUTimer {
@@ -375,9 +378,15 @@ static QEMUTimer *active_timers[QEMU_NUM_CLOCKS];
 static QEMUClock *qemu_new_clock(int type)
 {
     QEMUClock *clock;
+
     clock = qemu_mallocz(sizeof(QEMUClock));
     clock->type = type;
     clock->enabled = 1;
+    notifier_list_init(&clock->reset_notifiers);
+    /* required to detect & report backward jumps */
+    if (type == QEMU_CLOCK_HOST) {
+        clock->last = get_clock_realtime();
+    }
     return clock;
 }
 
@@ -592,6 +601,8 @@ static void qemu_run_timers(QEMUClock *clock)
 
 int64_t qemu_get_clock_ns(QEMUClock *clock)
 {
+    int64_t now, last;
+
     switch(clock->type) {
     case QEMU_CLOCK_REALTIME:
         return get_clock();
@@ -603,10 +614,26 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
             return cpu_get_clock();
         }
     case QEMU_CLOCK_HOST:
-        return get_clock_realtime();
+        now = get_clock_realtime();
+        last = clock->last;
+        clock->last = now;
+        if (now < last) {
+            notifier_list_notify(&clock->reset_notifiers, &now);
+        }
+        return now;
     }
 }
 
+void qemu_register_clock_reset_notifier(QEMUClock *clock, Notifier *notifier)
+{
+    notifier_list_add(&clock->reset_notifiers, notifier);
+}
+
+void qemu_unregister_clock_reset_notifier(QEMUClock *clock, Notifier *notifier)
+{
+    notifier_list_remove(&clock->reset_notifiers, notifier);
+}
+
 void init_clocks(void)
 {
     rt_clock = qemu_new_clock(QEMU_CLOCK_REALTIME);
diff --git a/qemu-timer.h b/qemu-timer.h
index 06cbe20..0a43469 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -2,6 +2,7 @@
 #define QEMU_TIMER_H
 
 #include "qemu-common.h"
+#include "notify.h"
 #include <time.h>
 #include <sys/time.h>
 
@@ -40,6 +41,10 @@ int64_t qemu_get_clock_ns(QEMUClock *clock);
 void qemu_clock_enable(QEMUClock *clock, int enabled);
 void qemu_clock_warp(QEMUClock *clock);
 
+void qemu_register_clock_reset_notifier(QEMUClock *clock, Notifier *notifier);
+void qemu_unregister_clock_reset_notifier(QEMUClock *clock,
+  Notifier *notifier);
+
 QEMUTimer *qemu_new_timer(QEMUClock *clock, int scale,
   QEMUTimerCB *cb, void *opaque);
 void qemu_free_timer(QEMUTimer *ts);
-- 
1.7.1
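
[Editor's sketch] The detection scheme described in this patch (remember the last host clock sample, compare each new sample against it, and call registered notifiers with the new time when it went backward) can be exercised outside QEMU. The following is a minimal standalone sketch under stated assumptions: all Sketch* names are invented, a singly linked list stands in for NotifierList, and the time samples are passed in by the caller instead of being read from the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *notifier, void *data);
    SketchNotifier *next;
};

typedef struct SketchClock {
    int64_t last;                     /* previous sample, used to spot backward jumps */
    SketchNotifier *reset_notifiers;  /* simple singly linked list of listeners */
} SketchClock;

static void sketch_register(SketchClock *clock, SketchNotifier *n)
{
    assert(clock != NULL && n != NULL);
    n->next = clock->reset_notifiers;
    clock->reset_notifiers = n;
}

/* Feed one time sample; notify listeners if it is earlier than the previous one. */
static int64_t sketch_read_clock(SketchClock *clock, int64_t now)
{
    int64_t last = clock->last;

    clock->last = now;
    if (now < last) {
        for (SketchNotifier *n = clock->reset_notifiers; n; n = n->next) {
            n->notify(n, &now);
        }
    }
    return now;
}

/* Example listener: embeds the notifier and recovers its own state from it. */
typedef struct SketchRTC {
    int resets_seen;
    int64_t resync_time;
    SketchNotifier clock_reset_notifier;
} SketchRTC;

static void sketch_rtc_on_reset(SketchNotifier *notifier, void *data)
{
    /* Same idea as container_of(): step back from the member to its owner. */
    SketchRTC *rtc = (SketchRTC *)((char *)notifier -
                                   offsetof(SketchRTC, clock_reset_notifier));

    rtc->resets_seen++;
    rtc->resync_time = *(int64_t *)data;
}
```

Feeding the samples 150 and then 120 into a clock that started at 100 triggers the listener exactly once, with 120 as the resynchronisation time.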

Re: [Qemu-devel] [PATCH v2] Optimize screendump

2011-06-20 Thread Jan Kiszka

On 2011-06-20 10:12, Avi Kivity wrote:
> When running kvm-autotest, fputc() is often the second highest (sometimes #1)
> function showing up in a profile.  This is due to fputc() locking the file
> for every byte written.
>
> Optimize by buffering a line's worth of pixels and writing that out in a
> single call.
>
> Signed-off-by: Avi Kivity a...@redhat.com
> ---
>
> v2: drop unportable fputc_unlocked
>
>  hw/vga.c |   13 ++++++++++---
>  1 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/hw/vga.c b/hw/vga.c
> index d5bc582..97c96bf 100644
> --- a/hw/vga.c
> +++ b/hw/vga.c
> @@ -2349,15 +2349,19 @@ int ppm_save(const char *filename, struct DisplaySurface *ds)
>      uint32_t v;
>      int y, x;
>      uint8_t r, g, b;
> +    int ret;
> +    char *linebuf, *pbuf;
>  
>      f = fopen(filename, "wb");
>      if (!f)
>          return -1;
>      fprintf(f, "P6\n%d %d\n%d\n",
>              ds->width, ds->height, 255);
> +    linebuf = qemu_malloc(ds->width * 3);
>      d1 = ds->data;
>      for(y = 0; y < ds->height; y++) {
>          d = d1;
> +        pbuf = linebuf;
>          for(x = 0; x < ds->width; x++) {
>              if (ds->pf.bits_per_pixel == 32)
>                  v = *(uint32_t *)d;
> @@ -2369,13 +2373,16 @@ int ppm_save(const char *filename, struct DisplaySurface *ds)
>                      (ds->pf.gmax + 1);
>              b = ((v >> ds->pf.bshift) & ds->pf.bmax) * 256 /
>                      (ds->pf.bmax + 1);
> -            fputc(r, f);
> -            fputc(g, f);
> -            fputc(b, f);
> +            *pbuf++ = r;
> +            *pbuf++ = g;
> +            *pbuf++ = b;
>              d += ds->pf.bytes_per_pixel;
>          }
>          d1 += ds->linesize;
> +        ret = fwrite(linebuf, 1, pbuf - linebuf, f);
> +        (void)ret;
>      }
> +    qemu_free(linebuf);
>      fclose(f);
>      return 0;
>  }

Unrelated to this patch, but why is this function located in vga.c and
not in console.c?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
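
[Editor's sketch] The optimization in the quoted patch (fill a per-row buffer, then issue one fwrite() per row instead of three fputc() calls per pixel) can be shown in a standalone form. This is a sketch under assumptions, not the hw/vga.c code: the function name is invented, pixels are assumed to be packed 0x00RRGGBB values, and the stride is given in pixels. Unlike the patch, it checks the fwrite() result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write a width x height image of 0x00RRGGBB pixels as binary PPM (P6),
 * converting each row into a buffer and writing it with a single fwrite().
 * 'stride' is the distance between rows, in pixels.
 * Returns 0 on success, -1 on allocation or I/O failure. */
static int sketch_ppm_write(FILE *f, const uint32_t *pixels,
                            int width, int height, size_t stride)
{
    assert(f != NULL && pixels != NULL && width > 0 && height > 0);

    size_t row_bytes = (size_t)width * 3;
    uint8_t *linebuf = malloc(row_bytes);
    if (!linebuf) {
        return -1;
    }
    if (fprintf(f, "P6\n%d %d\n%d\n", width, height, 255) < 0) {
        free(linebuf);
        return -1;
    }
    for (int y = 0; y < height; y++) {
        const uint32_t *src = pixels + (size_t)y * stride;
        uint8_t *p = linebuf;

        for (int x = 0; x < width; x++) {
            uint32_t v = src[x];
            *p++ = (uint8_t)((v >> 16) & 0xff);  /* red */
            *p++ = (uint8_t)((v >> 8) & 0xff);   /* green */
            *p++ = (uint8_t)(v & 0xff);          /* blue */
        }
        /* One locked stdio call per row instead of three per pixel. */
        if (fwrite(linebuf, 1, row_bytes, f) != row_bytes) {
            free(linebuf);
            return -1;
        }
    }
    free(linebuf);
    return 0;
}
```

For a 2x2 image this produces an 11-byte header followed by 12 bytes of pixel data.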

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support

2011-06-20 Thread Alon Levy

On Mon, Jun 20, 2011 at 02:13:36PM +0200, Gerd Hoffmann wrote:
   Hi,
 
 +case QXL_IO_UPDATE_MEM:
 +switch (val) {
 +case (QXL_UPDATE_MEM_RENDER_ALL):
 +d-ssd.worker-update_mem(d-ssd.worker);
 +break;
 
 What is the difference to one worker-stop() + worker-start() cycle?

this won't disconnect any clients.

 
 cheers,
   Gerd

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support

2011-06-20 Thread Alon Levy

On Mon, Jun 20, 2011 at 02:13:36PM +0200, Gerd Hoffmann wrote:
   Hi,
 
 +case QXL_IO_UPDATE_MEM:
 +switch (val) {
 +case (QXL_UPDATE_MEM_RENDER_ALL):
 +d->ssd.worker->update_mem(d->ssd.worker);
 +break;
 
 What is the difference to one worker->stop() + worker->start() cycle?
 

ok, stop+start won't disconnect any clients either. But does stop render all 
waiting commands?
I'll have to look, I don't know if it does.

 cheers,
   Gerd

Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol

2011-06-20 Thread Corey Bryant




On 06/18/2011 04:50 PM, Blue Swirl wrote:

On Thu, Jun 16, 2011 at 5:48 PM, Corey Bryantbrynt...@us.ibm.com  wrote:



On 06/15/2011 03:12 PM, Blue Swirl wrote:


On Tue, Jun 14, 2011 at 4:31 PM, Corey Bryantbrynt...@us.ibm.comwrote:



  sVirt provides SELinux MAC isolation for Qemu guest processes and
their
  corresponding resources (image files). sVirt provides this support
  by labeling guests and resources with security labels that are stored
  in file system extended attributes. Some file systems, such as NFS, do
  not support the extended attribute security namespace, which is needed
  for image file isolation when using the sVirt SELinux security driver
  in libvirt.

  The proposed solution entails a combination of Qemu, libvirt, and
  SELinux patches that work together to isolate multiple guests' images
  when they're stored in the same NFS mount. This results in an
  environment where sVirt isolation and NFS image file isolation can
both
  be provided.

  This patch contains the Qemu code to support this solution. I would
  like to solicit input from the libvirt community prior to starting
  the libvirt patch.

  Currently, Qemu opens an image file in addition to performing the
  necessary read and write operations. The proposed solution will move
  the open out of Qemu and into libvirt. Once libvirt opens an image
  file for the guest, it will pass the file descriptor to Qemu via a
  new fd: protocol.

  If the image file resides in an NFS mount, the following SELinux
policy
  changes will provide image isolation:

- A new SELinux boolean is created (e.g. virt_read_write_nfs) to
  allow Qemu (svirt_t) to only have SELinux read and write
  permissions on nfs_t files

- Qemu (svirt_t) also gets SELinux use permissions on libvirt
  (virtd_t) file descriptors

  Following is a sample invocation of Qemu using the fd: protocol on
  the command line:

  qemu -drive file=fd:4,format=qcow2
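
A minimal sketch of how a filename such as fd:4 might be turned into a descriptor number. The helper name parse_fd_spec is hypothetical and is not taken from the patch, which may well parse the string differently:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the patch): extract N from "fd:N".
 * Returns the descriptor number, or -1 if the string is not a valid spec. */
static int parse_fd_spec(const char *filename)
{
    const char *prefix = "fd:";
    const char *digits;
    char *end;
    long val;

    if (strncmp(filename, prefix, strlen(prefix)) != 0) {
        return -1;                  /* not an fd: filename */
    }
    digits = filename + strlen(prefix);
    if (*digits < '0' || *digits > '9') {
        return -1;                  /* reject empty, signed or padded input */
    }
    errno = 0;
    val = strtol(digits, &end, 10);
    if (errno != 0 || *end != '\0' || val > INT_MAX) {
        return -1;                  /* overflow or trailing garbage */
    }
    return (int)val;
}
```

With this sketch, "fd:4" yields 4, while "fd:", "fd:4x" and an ordinary path are all rejected with -1.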

  The fd: protocol is also supported by the drive_add monitor command.
  This requires that the specified file descriptor is passed to the
  monitor alongside a prior getfd monitor command.

  There are some additional features provided by certain image types
  where Qemu reopens the image file. All of these scenarios will be
  unsupported for the fd: protocol, at least for this patch:

- The -snapshot command line option
- The savevm monitor command
- The snapshot_blkdev monitor command
- Starting Qemu with a backing file


There's also native CDROM device. Did you consider adding an explicit
reopen method to block layer?


Thanks. Yes it looks like I overlooked CDROM reopens.

I'm not sure that I'm clear on the purpose of the reopen function. Would the
goal be to funnel all block layer reopens through a single function,
enabling potential future support where a privileged layer of Qemu, or
libvirt, performs the open?


Eventually yes, but I think it would help also now by moving the
checks to a single place. It's a bit orthogonal to this patch though.



This would definitely simplify things, especially when reopen support is 
added.  I'm going to defer this until then.



  The thought is that this support can be added in the future, but is
  not required for the initial fd: support.

  This patch was tested with the following formats: raw, cow, qcow,
  qcow2, qed, and vmdk, using the fd: protocol from the command line
  and the monitor. Tests were also run to verify existing file name
  support and qemu-img were not regressed. Non-valid file descriptors,
  fd: without format, snapshot and backing files were also tested.

  Signed-off-by: Corey Bryantcor...@linux.vnet.ibm.com
  ---
block.c   |   16 ++
block.h   |1 +
block/cow.c   |5 +++
block/qcow.c  |5 +++
block/qcow2.c |5 +++
block/qed.c   |4 ++
block/raw-posix.c |   81
+++--
block/vmdk.c  |5 +++
block_int.h   |1 +
blockdev.c|   10 ++
monitor.c |5 +++
monitor.h |1 +
qemu-options.hx   |8 +++--
qemu-tool.c   |5 +++
14 files changed, 140 insertions(+), 12 deletions(-)

  diff --git a/block.c b/block.c
  index 24a25d5..500db84 100644
  --- a/block.c
  +++ b/block.c
  @@ -536,6 +536,10 @@ int bdrv_open(BlockDriverState *bs, const char
*filename, int flags,
   char tmp_filename[PATH_MAX];
   char backing_filename[PATH_MAX];

  +if (bdrv_is_fd_protocol(bs)) {
  +return -ENOTSUP;
  +}
  +
   /* if snapshot, we create a temporary backing file and open
it
  instead of opening 'filename' directly */

  @@ -585,6 +589,10 @@ int bdrv_open(BlockDriverState *bs, const char
*filename, int flags,

   /* Find the right image format driver */
   if (!drv) {
  +/* format must be specified for fd: protocol */
  +if

Re: [Qemu-devel] [PATCH v2] Optimize screendump

2011-06-20 Thread Avi Kivity


On 06/20/2011 03:33 PM, Jan Kiszka wrote:

  --- a/hw/vga.c
  +++ b/hw/vga.c
  @@ -2349,15 +2349,19 @@ int ppm_save(const char *filename, struct 
DisplaySurface *ds)



Unrelated to this patch, but why is this function located in vga.c and
not in console.c?


It's located in omap_lcdc.c  as well.  But it needs to be fully 
generalized to be moved out (handle all PixelFormats).


--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 1/1] fix operator precedence

2011-06-20 Thread Frediano Ziglio

Signed-off-by: Frediano Ziglio fredd...@gmail.com
---
 cmd.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/cmd.c b/cmd.c
index db2c9c4..ecca167 100644
--- a/cmd.c
+++ b/cmd.c
@@ -486,7 +486,7 @@ timestr(
             snprintf(ts, size, "%u:%02u.%02u",
                 (unsigned int) MINUTES(tv->tv_sec),
                 (unsigned int) SECONDS(tv->tv_sec),
-                (unsigned int) usec * 100);
+                (unsigned int) (usec * 100));
             return;
         }
         format |= VERBOSE_FIXED_TIME;   /* fallback if hours needed */
@@ -497,9 +497,9 @@ timestr(
             (unsigned int) HOURS(tv->tv_sec),
             (unsigned int) MINUTES(tv->tv_sec),
             (unsigned int) SECONDS(tv->tv_sec),
-            (unsigned int) usec * 100);
+            (unsigned int) (usec * 100));
} else {
-        snprintf(ts, size, "0.%04u sec", (unsigned int) usec * 10000);
+        snprintf(ts, size, "0.%04u sec", (unsigned int) (usec * 10000));
}
 }

-- 
1.7.1
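
For readers skimming the archive, a small sketch of why the parentheses matter. In C a cast binds tighter than '*', so the unparenthesized form casts usec first and multiplies afterwards. The function names are made up for illustration, and the sketch assumes usec holds a fraction of a second in [0, 1), which the diff context suggests but does not show:

```c
/* A cast has higher precedence than '*', so here the cast applies to
 * usec alone: the fraction is truncated to 0 before the multiplication. */
static unsigned int hundredths_unparenthesized(double usec)
{
    return (unsigned int) usec * 100;
}

/* With parentheses the multiplication happens in floating point and
 * only the final result is truncated. */
static unsigned int hundredths_parenthesized(double usec)
{
    return (unsigned int) (usec * 100);
}
```

For usec = 0.25 the first form yields 0 and the second yields 25.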

Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol

2011-06-20 Thread Avi Kivity


On 06/14/2011 04:31 PM, Corey Bryant wrote:

   - Starting Qemu with a backing file



For this we could tell qemu that a file named xyz is available via fd 
n, via an extension of the getfd command.


For example

  (qemu) getfd path=/images/my-image.img
  (qemu) getfd path=/images/template.img
  (qemu) drive-add path=/images/my-image.img

The open() for my-image.img first looks up the name in the getfd 
database, and finds it, so it returns the fd from there instead of 
opening.  It then opens the backing file (template.img) and looks it 
up again, and finds the second fd from the session.


The result is that open()s are satisfied from the monitor, instead of 
the host kernel, but without reversing the request/reply nature of the 
monitor protocol.
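
A rough sketch of the lookup idea, assuming a simple in-memory table. The struct and function names are hypothetical; nothing like this exists in the monitor code being discussed:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: "this path is already open as descriptor fd". */
struct fd_alias {
    const char *path;
    int fd;
};

/* Return the descriptor registered for path, or -1 if none is registered.
 * A caller would fall back to a real open() on -1. */
static int lookup_fd_alias(const struct fd_alias *table, size_t n,
                           const char *path)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(table[i].path, path) == 0) {
            return table[i].fd;
        }
    }
    return -1;
}
```

Opening my-image.img and then its backing file template.img would be two lookups against the same table, each satisfied without touching the host file system.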


A similar extension could be added to the command line:

  qemu -drive file=fd:4,cache=none -path-alias 
name=/images/template.img,path=fd:5


Here the main image is opened via a fd 4; if it needs template.img, it 
gets shunted to fd 5.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol

2011-06-20 Thread Anthony Liguori


On 06/20/2011 08:40 AM, Avi Kivity wrote:

On 06/14/2011 04:31 PM, Corey Bryant wrote:

- Starting Qemu with a backing file



For this we could tell qemu that a file named xyz is available via fd
n, via an extension of the getfd command.

For example

(qemu) getfd path=/images/my-image.img
(qemu) getfd path=/images/template.img
(qemu) drive-add path=/images/my-image.img

The open() for my-image.img first looks up the name in the getfd
database, and finds it, so it returns the fd from there instead of
opening. It then opens the backing file (template.img) and looks it up
again, and finds the second fd from the session.


The way I've been thinking about this is:

 -blockdev id=hd0-back,file=fd:4,format=raw \
 -blockdev file=fd:3,format=qcow2,backing=hd0-back

While your proposal is clever, it makes me a little nervous about subtle 
security ramifications.


Regards,

Anthony Liguori



The result is that open()s are satisfied from the monitor, instead of
the host kernel, but without reversing the request/reply nature of the
monitor protocol.

A similar extension could be added to the command line:

qemu -drive file=fd:4,cache=none -path-alias
name=/images/template.img,path=fd:5

Here the main image is opened via a fd 4; if it needs template.img, it
gets shunted to fd 5.

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support

2011-06-20 Thread Gerd Hoffmann


What is the difference to one worker->stop() + worker->start() cycle?



ok, stop+start won't disconnect any clients either. But does stop render all 
waiting commands?
I'll have to look, I don't know if it does.


It does.  This is what qemu uses to flush all spice server state to 
device memory on migration.


What is the reason for deleting all surfaces?

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v5 2/5] guest agent: qemu-ga daemon

2011-06-20 Thread Luiz Capitulino

On Sun, 19 Jun 2011 14:00:30 -0500
Michael Roth mdr...@linux.vnet.ibm.com wrote:

 On 06/17/2011 10:25 PM, Luiz Capitulino wrote:
  On Fri, 17 Jun 2011 16:25:32 -0500
  Michael Rothmdr...@linux.vnet.ibm.com  wrote:
 
  On 06/17/2011 03:13 PM, Luiz Capitulino wrote:
  On Fri, 17 Jun 2011 14:21:31 -0500
  Michael Rothmdr...@linux.vnet.ibm.com   wrote:
 
  On 06/16/2011 01:42 PM, Luiz Capitulino wrote:
  On Tue, 14 Jun 2011 15:06:22 -0500
  Michael Rothmdr...@linux.vnet.ibm.comwrote:
 
  This is the actual guest daemon, it listens for requests over a
  virtio-serial/isa-serial/unix socket channel and routes them through
  to dispatch routines, and writes the results back to the channel in
  a manner similar to QMP.
 
  A shorthand invocation:
 
   qemu-ga -d
 
  Is equivalent to:
 
   qemu-ga -c virtio-serial -p 
  /dev/virtio-ports/org.qemu.guest_agent \
   -p /var/run/qemu-guest-agent.pid -d
 
  Signed-off-by: Michael Rothmdr...@linux.vnet.ibm.com
 
  Would be nice to have a more complete description, like explaining how 
  to
  do a simple test.
 
  And this can't be built...
 
  ---
  qemu-ga.c  |  631 
  
  qga/guest-agent-core.h |4 +
  2 files changed, 635 insertions(+), 0 deletions(-)
  create mode 100644 qemu-ga.c
 
  diff --git a/qemu-ga.c b/qemu-ga.c
  new file mode 100644
  index 000..df08d8c
  --- /dev/null
  +++ b/qemu-ga.c
  @@ -0,0 +1,631 @@
  +/*
  + * QEMU Guest Agent
  + *
  + * Copyright IBM Corp. 2011
  + *
  + * Authors:
  + *  Adam Litkeagli...@linux.vnet.ibm.com
  + *  Michael Rothmdr...@linux.vnet.ibm.com
  + *
  + * This work is licensed under the terms of the GNU GPL, version 2 or 
  later.
  + * See the COPYING file in the top-level directory.
  + */
  +#include <stdlib.h>
  +#include <stdio.h>
  +#include <stdbool.h>
  +#include <glib.h>
  +#include <gio/gio.h>
  +#include <getopt.h>
  +#include <termios.h>
  +#include <syslog.h>
  +#include "qemu_socket.h"
  +#include "json-streamer.h"
  +#include "json-parser.h"
  +#include "qint.h"
  +#include "qjson.h"
  +#include "qga/guest-agent-core.h"
  +#include "qga-qmp-commands.h"
  +#include "module.h"
  +
  +#define QGA_VIRTIO_PATH_DEFAULT "/dev/virtio-ports/org.qemu.guest_agent"
  +#define QGA_PIDFILE_DEFAULT "/var/run/qemu-va.pid"
  +#define QGA_BAUDRATE_DEFAULT B38400 /* for isa-serial channels */
  +#define QGA_TIMEOUT_DEFAULT 30*1000 /* ms */
  +
  +struct GAState {
  +const char *proxy_path;
 
  Where is this used?
 
 
  Nowhere actually. Will remove.
 
  +JSONMessageParser parser;
  +GMainLoop *main_loop;
  +guint conn_id;
  +GSocket *conn_sock;
  +GIOChannel *conn_channel;
  +guint listen_id;
  +GSocket *listen_sock;
  +GIOChannel *listen_channel;
  +const char *path;
  +const char *method;
  +bool virtio; /* fastpath to check for virtio to deal with poll() 
  quirks */
  +GACommandState *command_state;
  +GLogLevelFlags log_level;
  +FILE *log_file;
  +bool logging_enabled;
  +};
  +
  +static void usage(const char *cmd)
  +{
  +printf(
  +Usage: %s -cchannel_opts\n
  +QEMU Guest Agent %s\n
  +\n
  +  -c, --channel channel method: one of unix-connect, 
  virtio-serial, or\n
  +isa-serial (virtio-serial is the default)\n
  +  -p, --pathchannel path (%s is the default for 
  virtio-serial)\n
  +  -l, --logfile set logfile path, logs to stderr by default\n
  +  -f, --pidfile specify pidfile (default is %s)\n
  +  -v, --verbose log extra debugging information\n
  +  -V, --version print version information and exit\n
  +  -d, --daemonize   become a daemon\n
  +  -h, --helpdisplay this help and exit\n
  +\n
  +Report bugs tomdr...@linux.vnet.ibm.com\n
  +, cmd, QGA_VERSION, QGA_VIRTIO_PATH_DEFAULT, QGA_PIDFILE_DEFAULT);
  +}
  +
  +static void conn_channel_close(GAState *s);
  +
  +static const char *ga_log_level_str(GLogLevelFlags level)
  +{
  +    switch (level & G_LOG_LEVEL_MASK) {
  +    case G_LOG_LEVEL_ERROR:
  +        return "error";
  +    case G_LOG_LEVEL_CRITICAL:
  +        return "critical";
  +    case G_LOG_LEVEL_WARNING:
  +        return "warning";
  +    case G_LOG_LEVEL_MESSAGE:
  +        return "message";
  +    case G_LOG_LEVEL_INFO:
  +        return "info";
  +    case G_LOG_LEVEL_DEBUG:
  +        return "debug";
  +    default:
  +        return "user";
  +    }
  +}
  +
  +bool ga_logging_enabled(GAState *s)
  +{
  +    return s->logging_enabled;
  +}
  +
  +void ga_disable_logging(GAState *s)
  +{
  +    s->logging_enabled = false;
  +}
  +
  +void ga_enable_logging(GAState *s)
  +{
  +    s->logging_enabled = true;
  +}
 
  Just to check I got this right, this is needed because of the fsfreeze
  command, correct? Isn't it better to have a more descriptive name, like
  fsfrozen?
 
  First I thought this was about a log file. Then I realized this was
  probably about

Re: [Qemu-devel] [V3 1/3] Enhance info block to display hostcache setting

2011-06-20 Thread Kevin Wolf

Am 17.06.2011 18:37, schrieb Supriya Kannery:
 Enhance info block to display hostcache setting for each
 block device.
 
 Example:
 (qemu) info block
 ide0-hd0: type=hd removable=0 file=../rhel6-32.qcow2 ro=0 drv=qcow2
 encrypted=0
 
 Enhanced to display hostcache setting:
 (qemu) info block
 ide0-hd0: type=hd removable=0 hostcache=true file=../rhel6-32.qcow2 
 ro=0 drv=qcow2 encrypted=0
 
 Signed-off-by: Supriya Kannery supri...@in.ibm.com
 
 ---
  block.c |   21 +
  qmp-commands.hx |2 ++
  2 files changed, 19 insertions(+), 4 deletions(-)
 
 Index: qemu/block.c
 ===
 --- qemu.orig/block.c
 +++ qemu/block.c
 @@ -1694,6 +1694,14 @@ static void bdrv_print_dict(QObject *obj
  monitor_printf(mon,  locked=%d, qdict_get_bool(bs_dict, locked));
  }
  
 + if (qdict_haskey(bs_dict, "open_flags")) {
 + int open_flags = qdict_get_int(bs_dict, "open_flags");
 + if (open_flags & BDRV_O_NOCACHE)
 + monitor_printf(mon, " hostcache=false");
 + else
 + monitor_printf(mon, " hostcache=true");

Coding style requires braces.

 + }
 +
  if (qdict_haskey(bs_dict, inserted)) {
  QDict *qdict = qobject_to_qdict(qdict_get(bs_dict, inserted));
  
 @@ -1730,13 +1738,18 @@ void bdrv_info(Monitor *mon, QObject **r
  QObject *bs_obj;
  
  bs_obj = qobject_from_jsonf("{ 'device': %s, 'type': 'unknown', "
 -"'removable': %i, 'locked': %i }",
 -bs->device_name, bs->removable,
 -bs->locked);
 +"'removable': %i, 'locked': %i, "
 +"'hostcache': %s }",
 +bs->device_name, bs->removable,
 +bs->locked,
 +(bs->open_flags & BDRV_O_NOCACHE) ?
 +"false" : "true");

Don't use tabs.

Kevin

Re: [Qemu-devel] [V3 2/3] Error classes for file reopen and device insertion

2011-06-20 Thread Kevin Wolf

Am 17.06.2011 18:37, schrieb Supriya Kannery:
 New error classes defined for cases where device not inserted
 and file reopen failed.
 
 Signed-off-by: Supriya Kannery supri...@in.ibm.com

This one has tabs, too.

Kevin

Re: [Qemu-devel] [V2 3/3] Command block_set for dynamic block params change

2011-06-20 Thread Kevin Wolf

Am 17.06.2011 18:38, schrieb Supriya Kannery:
 New command block_set added for dynamically changing any of the block 
 device parameters. For now, dynamic setting of hostcache params using this 
 command is implemented. Other block device parameters, can be integrated 
 in similar lines.
 
 Signed-off-by: Supriya Kannery supri...@in.ibm.com

Coding style is off in this one as well.

 Index: qemu/blockdev.c
 ===
 --- qemu.orig/blockdev.c
 +++ qemu/blockdev.c
 @@ -797,3 +797,35 @@ int do_block_resize(Monitor *mon, const 
  
  return 0;
  }
 +
 +
 +/*
 + * Handle changes to block device settings, like hostcache,
 + * while guest is running.
 +*/
 +int do_block_set(Monitor *mon, const QDict *qdict, QObject **ret_data)
 +{
 + const char *device = qdict_get_str(qdict, "device");
 + const char *name = qdict_get_str(qdict, "name");
 + int enable = qdict_get_bool(qdict, "enable");
 + BlockDriverState *bs;
 +
 + bs = bdrv_find(device);
 + if (!bs) {
 + qerror_report(QERR_DEVICE_NOT_FOUND, device);
 + return -1;
 + }
 +
 + if (!(strcmp(name, "hostcache"))) {

The bracket after ! isn't necessary.

 + if (bdrv_is_inserted(bs)) {
 + /* cache change applicable only if device inserted */
 + return bdrv_change_hostcache(bs, enable);
 + } else {
 + qerror_report(QERR_DEVICE_NOT_INSERTED, device);
 + return -1;
 + }

I'm not so sure about this one. Why shouldn't I change the cache mode
for a device which is currently empty? The next thing I want to do could be
inserting a medium and using it with the new cache mode.

 + }
 +
 + return 0;
 +}
 +
 Index: qemu/block.c
 ===
 --- qemu.orig/block.c
 +++ qemu/block.c
 @@ -651,6 +651,33 @@ unlink_and_fail:
  return ret;
  }
  
 +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags)
 +{
 + BlockDriver *drv = bs->drv;
 + int ret = 0;
 +
 + /* No need to reopen as no change in flags */
 + if (bdrv_flags == bs->open_flags)
 + return 0;

There could be other reasons for reopening besides changing flags, e.g.
invalidating cached metadata.

 +
 + /* Quiesce IO for the given block device */
 + qemu_aio_flush();
 + bdrv_flush(bs);

Missing error handling.

 +
 + bdrv_close(bs);

Here, too.

 + ret = bdrv_open(bs, bs->filename, bdrv_flags, drv);
 +
 + /*
 + * A failed attempt to reopen the image file must lead to 'abort()'
 + */
 + if (ret != 0) {
 + qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename);
 + abort();
 + }

Maybe we can retry with the old flags at least before aborting?

Also I would like to see a (Linux specific) version that uses the old fd
for the reopen, so that we can handle files that aren't accessible with
their old name any more. This would mean adding a .bdrv_reopen callback
in raw-posix.

 +
 + return ret;
 +}
 +
  void bdrv_close(BlockDriverState *bs)
  {
  if (bs-drv) {
 @@ -691,6 +718,20 @@ void bdrv_close_all(void)
  }
  }
  
 +int bdrv_change_hostcache(BlockDriverState *bs, bool enable_host_cache)
 +{
 + int bdrv_flags = bs->open_flags;
 +
 + /* set hostcache flags (without changing WCE/flush bits) */
 + if (enable_host_cache)
 + bdrv_flags &= ~BDRV_O_NOCACHE;
 + else
 + bdrv_flags |= BDRV_O_NOCACHE;
 +
 + /* Reopen file with changed set of flags */
 + return bdrv_reopen(bs, bdrv_flags);
 +}

Hm, interesting. Now we can get a O_DIRECT | O_SYNC mode with the
monitor. We should probably expose the same functionality for the
command line, too.

Kevin
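
As an aside for archive readers, the flag handling discussed above boils down to clearing or setting a single bit while leaving the others alone. A minimal sketch, using a made-up constant EX_O_NOCACHE as a stand-in (the real BDRV_O_NOCACHE value lives in QEMU's headers and is not assumed here):

```c
#include <stdbool.h>

/* Illustrative value only; a stand-in for the real BDRV_O_NOCACHE. */
#define EX_O_NOCACHE 0x0020

/* Return flags with the no-cache bit updated and every other bit kept. */
static int set_hostcache(int flags, bool enable_host_cache)
{
    if (enable_host_cache) {
        flags &= ~EX_O_NOCACHE;   /* clear the bit: use the host page cache */
    } else {
        flags |= EX_O_NOCACHE;    /* set the bit: bypass the host page cache */
    }
    return flags;
}
```

Note that a plain assignment of ~EX_O_NOCACHE instead of "&=" would overwrite every other open flag, which is why the compound operator matters.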

Re: [Qemu-devel] [PATCH] qemu-img: Add cache command line option

2011-06-20 Thread Kevin Wolf

Am 16.06.2011 16:43, schrieb Kevin Wolf:
 Am 16.06.2011 16:28, schrieb Christoph Hellwig:
 On Wed, Jun 15, 2011 at 09:46:10AM -0400, Federico Simoncelli wrote:
 qemu-img currently writes disk images using writeback and filling
 up the cache buffers which are then flushed by the kernel preventing
 other processes from accessing the storage.
 This is particularly bad in cluster environments where time-based
 algorithms might be in place and accessing the storage within
 certain timeouts is critical.
 This patch adds the option to choose a cache method when writing
 disk images.

 Allowing the user to choose the mode is of course fine, but what about also
 choosing a good default?  writethrough doesn't really make any sense
 for qemu-img, given that we can trivially flush the cache at the end
 of the operations.  I'd also say that using the buffer cache doesn't
 make sense either, as there is little point in caching these operations.
 
 Right, we need to keep the defaults as they are. That is, for convert
 unsafe and for everything else writeback. The patch seems to make
 writeback the default for everything.

Federico, are you going to fix this in a v4?

Kevin

Re: [Qemu-devel] struct TimerState

2011-06-20 Thread Lluís

Nilay  writes:

 I am trying to understand the structures that QEMU saves when do_savevm()
 is invoked. Can anyone explain to me the fields that are part of the
 TimerState structure in qemu-timer.c?

If my memory does not fail me, its main task is to capture the host
time whenever the VM is started and stopped.

This is later used to adapt the VM time when using icount in adaptive
mode (-icount=auto). I remember seeing it used somewhere else, but right
now I cannot recall exactly what for.

This reminds me that I've been navigating through all the time-related
code in QEMU and, in order to make it easier to follow, I've started
separating the routines in qemu-timer into different files (e.g.,
qemu-htime and qemu-vtime for routines accessing time sources in the
host and in the VM). I will send the patches as soon as I finish the
rewrite.


Lluis

-- 
 And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer.
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support

2011-06-20 Thread Alon Levy

On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote:
 What is the difference to one worker->stop() + worker->start() cycle?
 
 
 ok, stop+start won't disconnect any clients either. But does stop render all 
 waiting commands?
 I'll have to look, I don't know if it does.
 
 It does.  This is what qemu uses to flush all spice server state to
 device memory on migration.
 
 What is the reason for deleting all surfaces?

Making sure all references are dropped to pci memory in devram. We would need 
to recreate all
the surfaces after reset anyway.

 
 cheers,
   Gerd

Re: [Qemu-devel] [PATCH RFC 0/3] basic support for composing sysbus devices

2011-06-20 Thread Paul Brook

  Yeah, that's why I said, hard to do well.  It makes it very hard to add
  new socket types.
 
 PCI, USB, IDE, SCSI, SBus, what else? APICBus? I2C? 8 socket types
 ought to be enough for anybody.

Off the top of my head: AClink (audio), i2s (audio), SSI/SSP (synchronous 
serial), Firewire, rs232, CAN, FibreChannel, ISA, PS2, ADB (apple desktop bus) 
and probably a bunch of others I've missed.  There's also a bunch of all-but 
extinct system architectures with interesting bus-level features (MCA, NuBus, 
etc.)

Paul

Re: [Qemu-devel] [PATCH v2] Optimize screendump

2011-06-20 Thread Stefan Hajnoczi

On Mon, Jun 20, 2011 at 9:12 AM, Avi Kivity a...@redhat.com wrote:
 When running kvm-autotest, fputc() is often the second highest (sometimes #1)
 function showing up in a profile.  This is due to fputc() locking the file
 for every byte written.

 Optimize by buffering a line's worth of pixels and writing that out in a
 single call.

 Signed-off-by: Avi Kivity a...@redhat.com
 ---

 v2: drop unportable fputc_unlocked

  hw/vga.c |   13 ++---
  1 files changed, 10 insertions(+), 3 deletions(-)

Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
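
For reference, a self-contained sketch of the buffering technique the patch describes: collect one scanline in memory, then hand it to stdio with a single fwrite() instead of three fputc() calls per pixel. The function name save_ppm and the packed 0x00RRGGBB pixel layout are assumptions for illustration, not the vga.c code:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write a binary PPM (P6) image, one fwrite() per scanline.
 * pixels holds width * height values laid out as 0x00RRGGBB (assumed).
 * Returns 0 on success, -1 on bad arguments or I/O failure. */
static int save_ppm(FILE *f, const uint32_t *pixels, int width, int height)
{
    uint8_t *linebuf, *p;
    size_t len;
    int x, y;

    if (width <= 0 || height <= 0) {
        return -1;
    }
    if (fprintf(f, "P6\n%d %d\n%d\n", width, height, 255) < 0) {
        return -1;
    }
    linebuf = malloc((size_t)width * 3);
    if (!linebuf) {
        return -1;
    }
    for (y = 0; y < height; y++) {
        p = linebuf;
        for (x = 0; x < width; x++) {
            uint32_t v = pixels[(size_t)y * width + x];
            *p++ = (v >> 16) & 0xff;    /* red */
            *p++ = (v >> 8) & 0xff;     /* green */
            *p++ = v & 0xff;            /* blue */
        }
        len = (size_t)(p - linebuf);
        if (fwrite(linebuf, 1, len, f) != len) {   /* one call per line */
            free(linebuf);
            return -1;
        }
    }
    free(linebuf);
    return 0;
}
```

Unlike the patch, this sketch checks the fwrite() result rather than discarding it; that is a design choice of the example, not a claim about what the patch should do.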

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support

2011-06-20 Thread Alon Levy

On Mon, Jun 20, 2011 at 05:11:07PM +0200, Alon Levy wrote:
 On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote:
  What is the difference to one worker->stop() + worker->start() cycle?
  
  
  ok, stop+start won't disconnect any clients either. But does stop render 
  all waiting commands?
  I'll have to look, I don't know if it does.
  
  It does.  This is what qemu uses to flush all spice server state to
  device memory on migration.
  
  What is the reason for deleting all surfaces?
 
 Making sure all references are dropped to pci memory in devram. We would need 
 to recreate all
 the surfaces after reset anyway.

That's not right. The reason is that for the windows driver I don't know
if this is a resolution change or a suspend. So it was easier to destroy all 
the surfaces and then the two cases are equal - before going to sleep / leaving 
the current resolution I destroy all the surfaces, when coming back I recreate 
the surfaces. If it's a resolution change there is no coming back stage, but 
since all surfaces are destroyed there is no error when the same surface id's 
are reused.

 
  
  cheers,
Gerd

Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall

2011-06-20 Thread Avi Kivity


On 06/20/2011 06:38 PM, Daniel P. Berrange wrote:

On Mon, Jun 20, 2011 at 06:31:23PM +0300, Avi Kivity wrote:
  On 06/20/2011 04:38 PM, Daniel Gollub wrote:
  Introduce panic hypercall to enable the crashing guest to notify the
  host. This enables the host to run some actions as soon as a guest
  has crashed (kernel panic).
  
  This patch series introduces the panic hypercall at the host end.
  As well as the hypercall for KVM paravirtualized Linux guests, by
  registering the hypercall to the panic_notifier_list.
  
  The basic idea is to create KVM crashdump automatically as soon as the
  guest panicked and power-cycle the VM (e.g. libvirt <on_crash/>).

  This would be more easily done via a panic device (I/O port or
  memory-mapped address) that the guest hits.  It would be intercepted
  by qemu without any new code in kvm.

  However, I'm not sure I see the gain.  Most enterprisey guests
  already contain in-guest crash dumpers which provide more
  information than a qemu memory dump could, since they know exact
  load addresses etc. and are integrated with crash analysis tools.
  What do you have in mind?

Well libvirt can capture a core file by doing 'virsh dump $GUESTNAME'.
This actually uses the QEMU monitor migration command to capture the
entire of QEMU memory. The 'crash' command line tool actually knows
how to analyse this data format as it would a normal kernel crashdump.


Interesting.


I think having a way for a guest OS to notify the host that it has
crashed would be useful. libvirt could automatically do a crash
dump of the QEMU memory, or at least pause the guest CPUs and notify
the management app of the crash, which can then decide what to do.
You can also use tools like 'virt-dmesg' which uses libvirt to peek
into guest memory to extract the most recent kernel dmesg logs (even
if the guest OS itself is crashed & didn't manage to send them out
via netconsole or something else).


I agree.  But let's do this via a device, this way kvm need not be changed.

Do ILO cards / IPMI support something like this?  We could follow their 
lead in that case.



This series does need to introduce a QMP event notification upon
crash, so that the crash notification can be propagated to mgmt
layers above QEMU.


Yes.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3S4 support

2011-06-20 Thread Gerd Hoffmann


On 06/20/11 17:11, Alon Levy wrote:

On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote:

What is the difference to one worker->stop() + worker->start() cycle?



ok, stop+start won't disconnect any clients either. But does stop render all 
waiting commands?
I'll have to look, I don't know if it does.


It does.  This is what qemu uses to flush all spice server state to
device memory on migration.

What is the reason for deleting all surfaces?


Making sure all references are dropped to pci memory in devram.


Ah, because the spice server keeps a reference to the create command 
until the surface is destroyed, right?


There is QXL_IO_DESTROY_ALL_SURFACES + worker->destroy_surfaces() ...

The QXL_IO_UPDATE_MEM command does too much special stuff IMHO.
I also think we don't need to extend the libspice-server API.

We can add an I/O command which renders everything to device memory via 
stop+start.  We can zap all surfaces with the existing command + worker 
call.  We can add an I/O command to ask qxl to push the release queue 
head to the release ring.


Comments?

cheers,
  Gerd

Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall

2011-06-20 Thread Jan Kiszka

On 2011-06-20 17:45, Avi Kivity wrote:
 This series does need to introduce a QMP event notification upon
 crash, so that the crash notification can be propagated to mgmt
 layers above QEMU.
 
 Yes.

I think the best way to deal with that is to stop the VM on guest panic.
There is already WIP to signal stop reasons via QMP. Maybe we need to
differentiate between hypervisor and guest triggered panics
(VMSTOP_GUEST_PANIC?), but the rest should come for free.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH 01/18] Don't translate pointer when in restore_sigcontext

2011-06-20 Thread riku . voipio

From: Mike McCormack mj.mccorm...@samsung.com

Fixes crash in i386 when user emulation base address is non-zero.

21797 rt_sigreturn(8,1082124603,1,0,1082126048,1082126248)Exit reason and 
status: signal 11

Signed-off-by: Mike McCormack mj.mccorm...@samsung.com
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/signal.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 11b25be..cb7138f 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -981,8 +981,8 @@ restore_sigcontext(CPUX86State *env, struct target_sigcontext *sc, int *peax)
         env->regs[R_ECX] = tswapl(sc->ecx);
         env->eip = tswapl(sc->eip);
 
-        cpu_x86_load_seg(env, R_CS, lduw(&sc->cs) | 3);
-        cpu_x86_load_seg(env, R_SS, lduw(&sc->ss) | 3);
+        cpu_x86_load_seg(env, R_CS, lduw_p(&sc->cs) | 3);
+        cpu_x86_load_seg(env, R_SS, lduw_p(&sc->ss) | 3);
 
         tmpflags = tswapl(sc->eflags);
         env->eflags = (env->eflags & ~0x40DD5) | (tmpflags & 0x40DD5);
-- 
1.7.4.1

[Qemu-devel] [PATCH 06/18] m68k-semi.c: Use correct check for failure of do_brk()

2011-06-20 Thread riku . voipio

From: Peter Maydell peter.mayd...@linaro.org

In the m68k semihosting implementation of HOSTED_INIT_SIM, use the correct
check for whether do_brk() has failed -- it does not return -1 but the
previous value of the break limit.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 m68k-semi.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/m68k-semi.c b/m68k-semi.c
index 0371089..7fde10e 100644
--- a/m68k-semi.c
+++ b/m68k-semi.c
@@ -370,7 +370,7 @@ void do_m68k_semihosting(CPUM68KState *env, int nr)
         TaskState *ts = env->opaque;
         /* Allocate the heap using sbrk.  */
         if (!ts->heap_limit) {
-            long ret;
+            abi_ulong ret;
             uint32_t size;
             uint32_t base;
 
@@ -379,8 +379,9 @@ void do_m68k_semihosting(CPUM68KState *env, int nr)
             /* Try a big heap, and reduce the size if that fails.  */
             for (;;) {
                 ret = do_brk(base + size);
-                if (ret != -1)
+                if (ret >= (base + size)) {
                     break;
+                }
                 size >>= 1;
             }
             ts->heap_limit = base + size;
-- 
1.7.4.1
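
A small model of the retry loop, to show why comparing against -1 never detects failure. The brk model below is a stand-in written for this example (it is not QEMU's do_brk()); it follows the behaviour described in the commit message: on failure the old break is returned unchanged.

```c
#include <stdint.h>

typedef uint32_t abi_ulong_ex;   /* stand-in for QEMU's abi_ulong */

/* Toy break model: requests above limit are refused. */
struct brk_model {
    abi_ulong_ex cur;     /* current break */
    abi_ulong_ex limit;   /* highest break the model will grant */
};

/* On success returns the new break; on failure returns the old break.
 * It never returns -1. */
static abi_ulong_ex model_brk(struct brk_model *m, abi_ulong_ex new_brk)
{
    if (new_brk <= m->limit) {
        m->cur = new_brk;
    }
    return m->cur;
}

/* Try a big heap and halve the request until the break actually moves.
 * Returns the size that was granted (0 if nothing could be allocated). */
static uint32_t alloc_heap(struct brk_model *m, uint32_t size)
{
    abi_ulong_ex base = m->cur;

    while (size > 0) {
        if (model_brk(m, base + size) >= base + size) {
            break;              /* the break reached the requested address */
        }
        size >>= 1;             /* request refused: try half as much */
    }
    return size;
}
```

With a base of 0x1000 and a limit of 0x31000, a 1 MB request is refused three times and finally granted at 0x20000 bytes. A "ret != -1" check would have accepted the very first, refused attempt.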

[Qemu-devel] [PATCH 00/18] pending linux-user patches

2011-06-20 Thread riku . voipio

From: Riku Voipio riku.voi...@iki.fi

Hi,

All included patches except mine have already been on the list. These patches
should be ready for pull, but giving last minute chance for people to object.

The following changes since commit eb47d7c5d96060040931c42773ee07e61e547af9

  hw/9118.c: Implement active-low interrupt support (2011-06-15 13:23:37 +0200)


are available in the git repository at:
  git://git.linaro.org/people/rikuvoipio/qemu.git linux-user-for-upstream

Cédric VINCENT (2):
  linux-user: Fix the load of ELF files that have no useful symbol
  linux-user: Fix the computation of the requested heap size

Juan Quintela (5):
  linuxload: id_change was a write only variable
  syscall: really return ret code
  linux-user: syscall should use sanitized arg1
  flatload: end_code was only used in a debug message
  flatload: memp was a write-only variable

Laurent ALFONSI (1):
  linux-user: Define AT_RANDOM to support target stack protection
mechanism.

Mike Frysinger (1):
  linux-user: add pselect6 syscall support

Mike McCormack (1):
  Don't translate pointer when in restore_sigcontext

Peter Maydell (7):
  linux-user: Handle images where lowest vaddr is not page aligned
  linux-user: Don't use MAP_FIXED in do_brk()
  arm-semi.c: Use correct check for failure of do_brk()
  m68k-semi.c: Use correct check for failure of do_brk()
  linux-user: Bump do_syscall() up to 8 syscall arguments
  linux-user/signal.c: Remove only-ever-set variable fpu_save_addr
  linux-user/signal.c: Remove unused fenab

Riku Voipio (1):
  linux-user: Fix sync_file_range on 32bit mips

 arm-semi.c |5 +-
 linux-user/elfload.c   |  185 +
 linux-user/flatload.c  |8 +--
 linux-user/linuxload.c |   25 +--
 linux-user/main.c  |   37 ++---
 linux-user/qemu.h  |3 +-
 linux-user/signal.c|   21 +++--
 linux-user/syscall.c   |  214 ++--
 m68k-semi.c|5 +-
 9 files changed, 331 insertions(+), 172 deletions(-)

-- 
1.7.4.1

[Qemu-devel] [PATCH 02/18] linux-user: Fix the load of ELF files that have no useful symbol

2011-06-20 Thread riku . voipio

From: Cédric VINCENT cedric.vinc...@st.com

This patch fixes a double free() due to realloc(syms, 0) in the
loader when the ELF file has no useful symbol, as with the following
example (compiled with sh4-linux-gcc -nostdlib):

.text
.align 1
.global _start
_start:
mov #1, r3
trapa   #40 // syscall(__NR_exit)
nop

The bug appears when the log (option -d) is enabled.
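
For illustration only (not part of the patch), a minimal sketch of the pattern the fix relies on: never call realloc() with a size of 0, and release the buffer in exactly one place. shrink_ints() is a hypothetical helper. The assumption is that, on implementations such as glibc, realloc(p, 0) frees p and returns NULL, so treating that NULL as a failure and calling free(p) again frees the block twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: shrink a heap buffer of ints to its first `keep`
 * elements.  Returns the shrunk buffer, or NULL after freeing the input
 * exactly once when nothing is kept or when realloc() fails. */
static int *shrink_ints(int *buf, size_t keep)
{
    int *smaller;

    if (keep == 0) {
        /* Nothing useful left: skip realloc(buf, 0) entirely. */
        goto give_up;
    }

    smaller = realloc(buf, keep * sizeof(*buf));
    if (smaller == NULL) {
        /* realloc() failed with a non-zero size, so buf is still valid. */
        goto give_up;
    }
    return smaller;

give_up:
    free(buf);
    return NULL;
}
```

Routing every error path through a single label keeps the ownership rule easy to check: the buffer is either returned or freed once.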

Signed-off-by: Cédric VINCENT cedric.vinc...@st.com
Signed-off-by: Yves JANIN yves.ja...@st.com
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/elfload.c |   34 +++---
 1 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index dcfeb7a..a4aabd5 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1643,9 +1643,9 @@ static void load_symbols(struct elfhdr *hdr, int fd, abi_ulong load_bias)
 {
 int i, shnum, nsyms, sym_idx = 0, str_idx = 0;
 struct elf_shdr *shdr;
-char *strings;
-struct syminfo *s;
-struct elf_sym *syms, *new_syms;
+char *strings = NULL;
+struct syminfo *s = NULL;
+struct elf_sym *new_syms, *syms = NULL;
 
 shnum = hdr->e_shnum;
 i = shnum * sizeof(struct elf_shdr);
@@ -1670,24 +1670,19 @@ static void load_symbols(struct elfhdr *hdr, int fd, abi_ulong load_bias)
 /* Now know where the strtab and symtab are.  Snarf them.  */
 s = malloc(sizeof(*s));
 if (!s) {
-return;
+goto give_up;
 }
 
 i = shdr[str_idx].sh_size;
 s->disas_strtab = strings = malloc(i);
 if (!strings || pread(fd, strings, i, shdr[str_idx].sh_offset) != i) {
-free(s);
-free(strings);
-return;
+goto give_up;
 }
 
 i = shdr[sym_idx].sh_size;
 syms = malloc(i);
 if (!syms || pread(fd, syms, i, shdr[sym_idx].sh_offset) != i) {
-free(s);
-free(strings);
-free(syms);
-return;
+goto give_up;
 }
 
 nsyms = i / sizeof(struct elf_sym);
@@ -1710,16 +1705,18 @@ static void load_symbols(struct elfhdr *hdr, int fd, abi_ulong load_bias)
 }
 }
 
+/* No useful symbol.  */
+if (nsyms == 0) {
+goto give_up;
+}
+
 /* Attempt to free the storage associated with the local symbols
that we threw away.  Whether or not this has any effect on the
memory allocation depends on the malloc implementation and how
many symbols we managed to discard.  */
 new_syms = realloc(syms, nsyms * sizeof(*syms));
 if (new_syms == NULL) {
-free(s);
-free(syms);
-free(strings);
-return;
+goto give_up;
 }
 syms = new_syms;
 
@@ -1734,6 +1731,13 @@ static void load_symbols(struct elfhdr *hdr, int fd, abi_ulong load_bias)
 s->lookup_symbol = lookup_symbolxx;
 s->next = syminfos;
 syminfos = s;
+
+return;
+
+give_up:
+free(s);
+free(strings);
+free(syms);
 }
 
 int load_elf_binary(struct linux_binprm * bprm, struct target_pt_regs * regs,
-- 
1.7.4.1

[Qemu-devel] [PATCH 05/18] arm-semi.c: Use correct check for failure of do_brk()

2011-06-20 Thread riku . voipio

From: Peter Maydell peter.mayd...@linaro.org

In the ARM semihosting implementation of SYS_HEAPINFO, use the correct
check for whether do_brk() has failed -- it does not return -1 but the
previous value of the break limit.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 arm-semi.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arm-semi.c b/arm-semi.c
index e9e6f89..5a62d03 100644
--- a/arm-semi.c
+++ b/arm-semi.c
@@ -440,15 +440,16 @@ uint32_t do_arm_semihosting(CPUState *env)
 /* Some C libraries assume the heap immediately follows .bss, so
allocate it using sbrk.  */
 if (!ts->heap_limit) {
-long ret;
+abi_ulong ret;
 
 ts->heap_base = do_brk(0);
 limit = ts->heap_base + ARM_ANGEL_HEAP_SIZE;
 /* Try a big heap, and reduce the size if that fails.  */
 for (;;) {
 ret = do_brk(limit);
-if (ret != -1)
+if (ret >= limit) {
 break;
+}
 limit = (ts->heap_base >> 1) + (limit >> 1);
 }
 ts->heap_limit = limit;
-- 
1.7.4.1

[Qemu-devel] [PATCH 12/18] linux-user: syscall should use sanitized arg1

2011-06-20 Thread riku . voipio

From: Juan Quintela quint...@redhat.com

Looking at the other architectures, we should be using 'how', not 'arg1'.
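
For illustration only (not part of the patch), a minimal sketch of why the translated value matters. The GUEST_SIG_* encodings and both helper names are hypothetical; the real values depend on the target ABI. The guest's "how" is mapped to the host constant first, and only the mapped value is handed to sigprocmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/* Hypothetical guest-side encodings of the "how" argument. */
enum { GUEST_SIG_BLOCK = 1, GUEST_SIG_UNBLOCK = 2, GUEST_SIG_SETMASK = 3 };

/* Translate a guest "how" into the host constant.
 * Returns 0 on success, -1 if the guest value is not recognised. */
static int guest_how_to_host(int guest_how, int *host_how)
{
    switch (guest_how) {
    case GUEST_SIG_BLOCK:
        *host_how = SIG_BLOCK;
        return 0;
    case GUEST_SIG_UNBLOCK:
        *host_how = SIG_UNBLOCK;
        return 0;
    case GUEST_SIG_SETMASK:
        *host_how = SIG_SETMASK;
        return 0;
    default:
        return -1;
    }
}

/* Apply a guest request using only the translated value; passing the
 * raw guest value to the host would select the wrong operation whenever
 * the two encodings differ. */
static int guest_sigprocmask(int guest_how, const sigset_t *set,
                             sigset_t *oldset)
{
    int how;

    if (guest_how_to_host(guest_how, &how) < 0) {
        return -1;
    }
    return sigprocmask(how, set, oldset);
}
```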

Signed-off-by: Juan Quintela quint...@redhat.com
[peter.mayd...@linaro.org: remove unnecessary initialisation of how]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 57d9233..1c0503f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7181,7 +7181,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 case TARGET_NR_osf_sigprocmask:
 {
 abi_ulong mask;
-int how = arg1;
+int how;
 sigset_t set, oldset;
 
 switch(arg1) {
@@ -7200,7 +7200,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 }
 mask = arg2;
 target_to_host_old_sigset(&set, &mask);
-sigprocmask(arg1, &set, &oldset);
+sigprocmask(how, &set, &oldset);
 host_to_target_old_sigset(&mask, &oldset);
 ret = mask;
 }
-- 
1.7.4.1

[Qemu-devel] [PATCH 07/18] linux-user: Fix the computation of the requested heap size

2011-06-20 Thread riku . voipio

From: Cédric VINCENT cedric.vinc...@st.com

There were two remaining bugs in the previous implementation of
do_brk():

1. the value of new_alloc_size was one page too large when the
   requested brk was aligned on a host page boundary.

2. no new pages should be (re-)allocated when the requested brk is
   in the range of the pages that were already allocated
   previously (for the same purpose).  Technically these pages are
   never unmapped in the current implementation.
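
For illustration only (not part of the patch), the arithmetic behind the first bug can be sketched as follows. The 4 KB page size and the helper names are assumptions for the example. When the requested brk is already page-aligned, the extra "+ 1" pushes the size into the next page.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed host page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_ALIGN(x) \
    (((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Size computed by the previous implementation. */
static uint32_t alloc_size_old(uint32_t new_brk, uint32_t brk_page)
{
    return SKETCH_PAGE_ALIGN(new_brk - brk_page + 1);
}

/* Size computed once the "+ 1" is dropped. */
static uint32_t alloc_size_new(uint32_t new_brk, uint32_t brk_page)
{
    return SKETCH_PAGE_ALIGN(new_brk - brk_page);
}
```

For brk_page = 0x10000 and new_brk = 0x11000 the old formula asks for two pages while one is enough; for an unaligned new_brk such as 0x10800 both formulas agree.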

The problem/fix can be reproduced/validated with the following test
case:

#include <unistd.h>       /* syscall(2),      */
#include <sys/syscall.h>  /* SYS_brk,         */
#include <stdio.h>        /* puts(3),         */
#include <stdlib.h>       /* exit(3), EXIT_*, */

int main()
{
int current_brk = 0;
int new_brk;
int failure = 0;

void test(int increment) {
static int test_number = 0;
test_number++;

new_brk = syscall(SYS_brk, current_brk + increment);
if (new_brk == current_brk) {
printf("test %d fails\n", test_number);
failure++;
}

current_brk = new_brk;
}

/* Initialization.  */
test(0);

/* Does QEMU overlap host pages?  */
test(HOST_PAGE_SIZE);
test(HOST_PAGE_SIZE);

/* Does QEMU allocate the same host page twice?  */
test(-HOST_PAGE_SIZE);
test(HOST_PAGE_SIZE);

if (!failure) {
printf("success\n");
exit(EXIT_SUCCESS);
}
else {
exit(EXIT_FAILURE);
}
}

Signed-off-by: Cédric VINCENT cedric.vinc...@st.com
Reviewed-by: Christophe Guillon christophe.guil...@st.com
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b975730..be27f53 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -709,16 +709,17 @@ char *target_strerror(int err)
 
 static abi_ulong target_brk;
 static abi_ulong target_original_brk;
+static abi_ulong brk_page;
 
 void target_set_brk(abi_ulong new_brk)
 {
 target_original_brk = target_brk = HOST_PAGE_ALIGN(new_brk);
+brk_page = HOST_PAGE_ALIGN(target_brk);
 }
 
 /* do_brk() must return target values and target errnos. */
 abi_long do_brk(abi_ulong new_brk)
 {
-abi_ulong brk_page;
 abi_long mapped_addr;
 int new_alloc_size;
 
@@ -727,9 +728,8 @@ abi_long do_brk(abi_ulong new_brk)
 if (new_brk < target_original_brk)
 return target_brk;
 
-brk_page = HOST_PAGE_ALIGN(target_brk);
-
-/* If the new brk is less than this, set it and we're done... */
+/* If the new brk is less than the highest page reserved to the
+ * target heap allocation, set it and we're done... */
 if (new_brk < brk_page) {
target_brk = new_brk;
return target_brk;
@@ -741,13 +741,14 @@ abi_long do_brk(abi_ulong new_brk)
  * itself); instead we treat mapped but at wrong address as
  * a failure and unmap again.
  */
-new_alloc_size = HOST_PAGE_ALIGN(new_brk - brk_page + 1);
+new_alloc_size = HOST_PAGE_ALIGN(new_brk - brk_page);
 mapped_addr = get_errno(target_mmap(brk_page, new_alloc_size,
 PROT_READ|PROT_WRITE,
 MAP_ANON|MAP_PRIVATE, 0, 0));
 
 if (mapped_addr == brk_page) {
 target_brk = new_brk;
+brk_page = HOST_PAGE_ALIGN(target_brk);
 return target_brk;
 } else if (mapped_addr != -1) {
 /* Mapped but at wrong address, meaning there wasn't actually
-- 
1.7.4.1

[Qemu-devel] [PATCH 13/18] flatload: end_code was only used in a debug message

2011-06-20 Thread riku . voipio

From: Juan Quintela quint...@redhat.com

Just unfold its definition at its only use.

Signed-off-by: Juan Quintela quint...@redhat.com
[peter.mayd...@linaro.org: fixed typo in the debug code,
added parentheses to fix precedence issue]
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/flatload.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index cd7af7c..6fb78f5 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -384,7 +384,7 @@ static int load_flat_file(struct linux_binprm * bprm,
 abi_ulong reloc = 0, rp;
 int i, rev, relocs = 0;
 abi_ulong fpos;
-abi_ulong start_code, end_code;
+abi_ulong start_code;
 abi_ulong indx_len;
 
 hdr = ((struct flat_hdr *) bprm->buf); /* exec-header */
@@ -552,11 +552,10 @@ static int load_flat_file(struct linux_binprm * bprm,
 
 /* The main program needs a little extra setup in the task structure */
 start_code = textpos + sizeof (struct flat_hdr);
-end_code = textpos + text_len;
 
 DBG_FLT("%s %s: TEXT=%x-%x DATA=%x-%x BSS=%x-%x\n",
 id ? "Lib" : "Load", bprm->filename,
-(int) start_code, (int) end_code,
+(int) start_code, (int) (textpos + text_len),
 (int) datapos,
 (int) (datapos + data_len),
 (int) (datapos + data_len),
-- 
1.7.4.1

[Qemu-devel] [PATCH 04/18] linux-user: Don't use MAP_FIXED in do_brk()

2011-06-20 Thread riku . voipio

From: Peter Maydell peter.mayd...@linaro.org

Since mmap() with MAP_FIXED will map over the top of existing mappings,
it's a bad idea to use it to implement brk(), because brk() with a
large size is likely to overwrite important things like qemu itself
or the host libc. So we drop MAP_FIXED and handle mapped but at
different address as an error case instead.
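
For illustration only (not part of the patch), a minimal host-side sketch of the same idea. map_exactly_at() is a hypothetical helper. Without MAP_FIXED the address is only a hint, so an existing mapping is never overwritten, and a result at any other address is unmapped and reported as a failure.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: try to map `len` anonymous bytes exactly at
 * `want`.  Returns 0 on success, -1 if mmap() failed or the kernel
 * placed the mapping somewhere else. */
static int map_exactly_at(void *want, size_t len)
{
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (got == MAP_FAILED) {
        return -1;
    }
    if (got != want) {
        /* Mapped, but at the wrong address: undo it and report failure
         * instead of clobbering whatever already lives at `want`. */
        munmap(got, len);
        return -1;
    }
    return 0;
}
```

Asking for an address that is already in use is reported as an error, and the existing mapping keeps its contents.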

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |   29 -
 1 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5cb27c7..b975730 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -735,23 +735,34 @@ abi_long do_brk(abi_ulong new_brk)
return target_brk;
 }
 
-/* We need to allocate more memory after the brk... */
+/* We need to allocate more memory after the brk... Note that
+ * we don't use MAP_FIXED because that will map over the top of
+ * any existing mapping (like the one with the host libc or qemu
+ * itself); instead we treat mapped but at wrong address as
+ * a failure and unmap again.
+ */
 new_alloc_size = HOST_PAGE_ALIGN(new_brk - brk_page + 1);
 mapped_addr = get_errno(target_mmap(brk_page, new_alloc_size,
 PROT_READ|PROT_WRITE,
-MAP_ANON|MAP_FIXED|MAP_PRIVATE, 0, 0));
+MAP_ANON|MAP_PRIVATE, 0, 0));
+
+if (mapped_addr == brk_page) {
+target_brk = new_brk;
+return target_brk;
+} else if (mapped_addr != -1) {
+/* Mapped but at wrong address, meaning there wasn't actually
+ * enough space for this brk.
+ */
+target_munmap(mapped_addr, new_alloc_size);
+mapped_addr = -1;
+}
 
 #if defined(TARGET_ALPHA)
 /* We (partially) emulate OSF/1 on Alpha, which requires we
return a proper errno, not an unchanged brk value.  */
-if (is_error(mapped_addr)) {
-return -TARGET_ENOMEM;
-}
+return -TARGET_ENOMEM;
 #endif
-
-if (!is_error(mapped_addr)) {
-   target_brk = new_brk;
-}
+/* For everything else, return the previous break. */
 return target_brk;
 }
 
-- 
1.7.4.1

[Qemu-devel] [PATCH 08/18] linux-user: add pselect6 syscall support

2011-06-20 Thread riku . voipio

From: Mike Frysinger vap...@gentoo.org

Some architectures (like Blackfin) only implement pselect6 (and skip
select/newselect).  So add support for it.

Signed-off-by: Mike Frysinger vap...@gentoo.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |  149 +++--
 1 files changed, 130 insertions(+), 19 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index be27f53..362cc63 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -550,6 +550,15 @@ _syscall5(int, sys_ppoll, struct pollfd *, fds, nfds_t, nfds,
   size_t, sigsetsize)
 #endif
 
+#if defined(TARGET_NR_pselect6)
+#ifndef __NR_pselect6
+# define __NR_pselect6 -1
+#endif
+#define __NR_sys_pselect6 __NR_pselect6
+_syscall6(int, sys_pselect6, int, nfds, fd_set *, readfds, fd_set *, writefds,
+  fd_set *, exceptfds, struct timespec *, timeout, void *, sig);
+#endif
+
 extern int personality(int);
 extern int flock(int, int);
 extern int setfsuid(int);
@@ -799,6 +808,20 @@ static inline abi_long copy_from_user_fdset(fd_set *fds,
 return 0;
 }
 
+static inline abi_ulong copy_from_user_fdset_ptr(fd_set *fds, fd_set **fds_ptr,
+ abi_ulong target_fds_addr,
+ int n)
+{
+if (target_fds_addr) {
+if (copy_from_user_fdset(fds, target_fds_addr, n))
+return -TARGET_EFAULT;
+*fds_ptr = fds;
+} else {
+*fds_ptr = NULL;
+}
+return 0;
+}
+
 static inline abi_long copy_to_user_fdset(abi_ulong target_fds_addr,
   const fd_set *fds,
   int n)
@@ -964,6 +987,7 @@ static inline abi_long copy_to_user_mq_attr(abi_ulong target_mq_attr_addr,
 }
 #endif
 
+#if defined(TARGET_NR_select) || defined(TARGET_NR__newselect)
 /* do_select() must return target values and target errnos. */
 static abi_long do_select(int n,
   abi_ulong rfd_addr, abi_ulong wfd_addr,
@@ -974,26 +998,17 @@ static abi_long do_select(int n,
 struct timeval tv, *tv_ptr;
 abi_long ret;
 
-if (rfd_addr) {
-if (copy_from_user_fdset(&rfds, rfd_addr, n))
-return -TARGET_EFAULT;
-rfds_ptr = &rfds;
-} else {
-rfds_ptr = NULL;
+ret = copy_from_user_fdset_ptr(&rfds, &rfds_ptr, rfd_addr, n);
+if (ret) {
+return ret;
 }
-if (wfd_addr) {
-if (copy_from_user_fdset(&wfds, wfd_addr, n))
-return -TARGET_EFAULT;
-wfds_ptr = &wfds;
-} else {
-wfds_ptr = NULL;
+ret = copy_from_user_fdset_ptr(&wfds, &wfds_ptr, wfd_addr, n);
+if (ret) {
+return ret;
 }
-if (efd_addr) {
-if (copy_from_user_fdset(&efds, efd_addr, n))
-return -TARGET_EFAULT;
-efds_ptr = &efds;
-} else {
-efds_ptr = NULL;
+ret = copy_from_user_fdset_ptr(&efds, &efds_ptr, efd_addr, n);
+if (ret) {
+return ret;
 }
 
 if (target_tv_addr) {
@@ -1020,6 +1035,7 @@ static abi_long do_select(int n,
 
 return ret;
 }
+#endif
 
 static abi_long do_pipe2(int host_pipe[], int flags)
 {
@@ -5581,7 +5597,102 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #ifdef TARGET_NR_pselect6
 case TARGET_NR_pselect6:
-   goto unimplemented_nowarn;
+{
+abi_long rfd_addr, wfd_addr, efd_addr, n, ts_addr;
+fd_set rfds, wfds, efds;
+fd_set *rfds_ptr, *wfds_ptr, *efds_ptr;
+struct timespec ts, *ts_ptr;
+
+/*
+ * The 6th arg is actually two args smashed together,
+ * so we cannot use the C library.
+ */
+sigset_t set;
+struct {
+sigset_t *set;
+size_t size;
+} sig, *sig_ptr;
+
+abi_ulong arg_sigset, arg_sigsize, *arg7;
+target_sigset_t *target_sigset;
+
+n = arg1;
+rfd_addr = arg2;
+wfd_addr = arg3;
+efd_addr = arg4;
+ts_addr = arg5;
+
+ret = copy_from_user_fdset_ptr(&rfds, &rfds_ptr, rfd_addr, n);
+if (ret) {
+goto fail;
+}
+ret = copy_from_user_fdset_ptr(&wfds, &wfds_ptr, wfd_addr, n);
+if (ret) {
+goto fail;
+}
+ret = copy_from_user_fdset_ptr(&efds, &efds_ptr, efd_addr, n);
+if (ret) {
+goto fail;
+}
+
+/*
+ * This takes a timespec, and not a timeval, so we cannot
+ * use the do_select() helper ...
+ */
+if (ts_addr) {
+if (target_to_host_timespec(&ts, ts_addr)) {
+goto efault;
+}
+ts_ptr = &ts;
+} else {
+ts_ptr = NULL;
+}
+
+/*

[Qemu-devel] [PATCH 11/18] syscall: really return ret code

2011-06-20 Thread riku . voipio

From: Juan Quintela quint...@redhat.com

We assign ret with the error code, but then return 0 unconditionally.

Signed-off-by: Juan Quintela quint...@redhat.com
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 362cc63..57d9233 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -3779,10 +3779,10 @@ static abi_long do_get_thread_area(CPUX86State *env, abi_ulong ptr)
 #ifndef TARGET_ABI32
 static abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
 {
-abi_long ret;
+abi_long ret = 0;
 abi_ulong val;
 int idx;
-
+
 switch(code) {
 case TARGET_ARCH_SET_GS:
 case TARGET_ARCH_SET_FS:
@@ -3801,13 +3801,13 @@ static abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
 idx = R_FS;
 val = env->segs[idx].base;
 if (put_user(val, addr, abi_ulong))
-return -TARGET_EFAULT;
+ret = -TARGET_EFAULT;
 break;
 default:
 ret = -TARGET_EINVAL;
 break;
 }
-return 0;
+return ret;
 }
 #endif
 
-- 
1.7.4.1

[Qemu-devel] [PATCH 03/18] linux-user: Handle images where lowest vaddr is not page aligned

2011-06-20 Thread riku . voipio

From: Peter Maydell peter.mayd...@linaro.org

Fix a bug in the linux-user ELF loader code where it was not correctly
handling images where the lowest vaddr to be loaded was not page aligned.
The problem was that the code to probe for a suitable guest base address
was changing the 'loaddr' variable (by rounding it to a page boundary),
which meant that the load bias would then be incorrectly calculated
unless loaddr happened to already be page-aligned.

Binaries generated by gcc with the default linker script do start with
a loadable segment at a page-aligned vaddr, so were unaffected. This
bug was noticed with a binary created by the Google Go toolchain for ARM.

We fix the bug by refactoring the probe for guest base code out into
its own self-contained function.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/elfload.c |  130 --
 1 files changed, 73 insertions(+), 57 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a4aabd5..a13eb7b 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1288,6 +1288,78 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
 return sp;
 }
 
+static void probe_guest_base(const char *image_name,
+ abi_ulong loaddr, abi_ulong hiaddr)
+{
+/* Probe for a suitable guest base address, if the user has not set
+ * it explicitly, and set guest_base appropriately.
+ * In case of error we will print a suitable message and exit.
+ */
+#if defined(CONFIG_USE_GUEST_BASE)
+const char *errmsg;
+if (!have_guest_base && !reserved_va) {
+unsigned long host_start, real_start, host_size;
+
+/* Round addresses to page boundaries.  */
+loaddr &= qemu_host_page_mask;
+hiaddr = HOST_PAGE_ALIGN(hiaddr);
+
+if (loaddr < mmap_min_addr) {
+host_start = HOST_PAGE_ALIGN(mmap_min_addr);
+} else {
+host_start = loaddr;
+if (host_start != loaddr) {
+errmsg = "Address overflow loading ELF binary";
+goto exit_errmsg;
+}
+}
+host_size = hiaddr - loaddr;
+while (1) {
+/* Do not use mmap_find_vma here because that is limited to the
+   guest address space.  We are going to make the
+   guest address space fit whatever we're given.  */
+real_start = (unsigned long)
+mmap((void *)host_start, host_size, PROT_NONE,
+ MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE, -1, 0);
+if (real_start == (unsigned long)-1) {
+goto exit_perror;
+}
+if (real_start == host_start) {
+break;
+}
+/* That address didn't work.  Unmap and try a different one.
+   The address the host picked because is typically right at
+   the top of the host address space and leaves the guest with
+   no usable address space.  Resort to a linear search.  We
+   already compensated for mmap_min_addr, so this should not
+   happen often.  Probably means we got unlucky and host
+   address space randomization put a shared library somewhere
+   inconvenient.  */
+munmap((void *)real_start, host_size);
+host_start += qemu_host_page_size;
+if (host_start == loaddr) {
+/* Theoretically possible if host doesn't have any suitably
+   aligned areas.  Normally the first mmap will fail.  */
+errmsg = "Unable to find space for application";
+goto exit_errmsg;
+}
+}
+qemu_log("Relocating guest address space from 0x"
+ TARGET_ABI_FMT_lx " to 0x%lx\n",
+ loaddr, real_start);
+guest_base = real_start - loaddr;
+}
+return;
+
+exit_perror:
+errmsg = strerror(errno);
+exit_errmsg:
+fprintf(stderr, "%s: %s\n", image_name, errmsg);
+exit(-1);
+#endif
+}
+
+
 /* Load an ELF image into the address space.
 
IMAGE_NAME is the filename of the image, to use in error messages.
@@ -1373,63 +1445,7 @@ static void load_elf_image(const char *image_name, int image_fd,
 /* This is the main executable.  Make sure that the low
address does not conflict with MMAP_MIN_ADDR or the
QEMU application itself.  */
-#if defined(CONFIG_USE_GUEST_BASE)
-/*
- * In case where user has not explicitly set the guest_base, we
- * probe here that should we set it automatically.
- */
-if (!have_guest_base && !reserved_va) {
-unsigned long host_start, real_start, host_size;
-
-/* Round addresses to page boundaries.  */
-loaddr &= qemu_host_page_mask;
-hiaddr = HOST_PAGE_ALIGN(hiaddr);
-
-

[Qemu-devel] [PATCH 10/18] linuxload: id_change was a write only variable

2011-06-20 Thread riku . voipio

From: Juan Quintela quint...@redhat.com

Signed-off-by: Juan Quintela quint...@redhat.com
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/linuxload.c |   25 +
 1 files changed, 1 insertions(+), 24 deletions(-)

diff --git a/linux-user/linuxload.c b/linux-user/linuxload.c
index ac8c486..62ebc7e 100644
--- a/linux-user/linuxload.c
+++ b/linux-user/linuxload.c
@@ -26,22 +26,6 @@ abi_long memcpy_to_target(abi_ulong dest, const void *src,
 return 0;
 }
 
-static int in_group_p(gid_t g)
-{
-/* return TRUE if we're in the specified group, FALSE otherwise */
-int ngroup;
-int i;
-gid_t  grouplist[NGROUPS];
-
-ngroup = getgroups(NGROUPS, grouplist);
-for(i = 0; i < ngroup; i++) {
-   if(grouplist[i] == g) {
-   return 1;
-   }
-}
-return 0;
-}
-
 static int count(char ** vec)
 {
 int i;
@@ -57,7 +41,7 @@ static int prepare_binprm(struct linux_binprm *bprm)
 {
 struct stat st;
 int mode;
-int retval, id_change;
+int retval;
 
 if(fstat(bprm->fd, &st) < 0) {
return(-errno);
@@ -73,14 +57,10 @@ static int prepare_binprm(struct linux_binprm *bprm)
 
 bprm->e_uid = geteuid();
 bprm->e_gid = getegid();
-id_change = 0;
 
 /* Set-uid? */
 if(mode & S_ISUID) {
 bprm->e_uid = st.st_uid;
-   if(bprm->e_uid != geteuid()) {
-   id_change = 1;
-   }
 }
 
 /* Set-gid? */
@@ -91,9 +71,6 @@ static int prepare_binprm(struct linux_binprm *bprm)
  */
 if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
 bprm->e_gid = st.st_gid;
-   if (!in_group_p(bprm->e_gid)) {
-   id_change = 1;
-   }
 }
 
 retval = read(bprm->fd, bprm->buf, BPRM_BUF_SIZE);
-- 
1.7.4.1

[Qemu-devel] [PATCH 15/18] linux-user: Bump do_syscall() up to 8 syscall arguments

2011-06-20 Thread riku . voipio

From: Peter Maydell peter.mayd...@linaro.org

On 32 bit MIPS a few syscalls have 7 arguments, and so to call
them via NR_syscall the guest needs to be able to pass 8 arguments
to do_syscall(). Raise the number of arguments do_syscall() takes
accordingly.

This fixes some gcc 4.6 compiler warnings about arg7 and arg8
variables being set and never used.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/main.c|   37 -
 linux-user/qemu.h|3 ++-
 linux-user/syscall.c |8 +---
 3 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 71dd253..1293450 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -319,7 +319,8 @@ void cpu_loop(CPUX86State *env)
   env->regs[R_EDX],
   env->regs[R_ESI],
   env->regs[R_EDI],
-  env->regs[R_EBP]);
+  env->regs[R_EBP],
+  0, 0);
 break;
 #ifndef TARGET_ABI32
 case EXCP_SYSCALL:
@@ -331,7 +332,8 @@ void cpu_loop(CPUX86State *env)
   env->regs[R_EDX],
   env->regs[10],
   env->regs[8],
-  env->regs[9]);
+  env->regs[9],
+  0, 0);
 env->eip = env->exception_next_eip;
 break;
 #endif
@@ -735,7 +737,8 @@ void cpu_loop(CPUARMState *env)
   env->regs[2],
   env->regs[3],
   env->regs[4],
-  env->regs[5]);
+  env->regs[5],
+  0, 0);
 }
 } else {
 goto error;
@@ -831,7 +834,8 @@ void cpu_loop(CPUState *env)
   env->regs[2],
   env->regs[3],
   env->regs[4],
-  env->regs[5]);
+  env->regs[5],
+  0, 0);
 }
 } else {
 goto error;
@@ -1018,7 +1022,8 @@ void cpu_loop (CPUSPARCState *env)
 ret = do_syscall (env, env->gregs[1],
   env->regwptr[0], env->regwptr[1],
   env->regwptr[2], env->regwptr[3],
-  env->regwptr[4], env->regwptr[5]);
+  env->regwptr[4], env->regwptr[5],
+  0, 0);
 if ((abi_ulong)ret >= (abi_ulong)(-515)) {
 #if defined(TARGET_SPARC64) && !defined(TARGET_ABI32)
 env->xcc |= PSR_CARRY;
@@ -1611,7 +1616,7 @@ void cpu_loop(CPUPPCState *env)
 env->crf[0] &= ~0x1;
 ret = do_syscall(env, env->gpr[0], env->gpr[3], env->gpr[4],
  env->gpr[5], env->gpr[6], env->gpr[7],
- env->gpr[8]);
+ env->gpr[8], 0, 0);
 if (ret == (uint32_t)(-TARGET_QEMU_ESIGRETURN)) {
 /* Returning from a successful sigreturn syscall.
Avoid corrupting register state.  */
@@ -2072,7 +2077,7 @@ void cpu_loop(CPUMIPSState *env)
  env->active_tc.gpr[5],
  env->active_tc.gpr[6],
  env->active_tc.gpr[7],
- arg5, arg6/*, arg7, arg8*/);
+ arg5, arg6, arg7, arg8);
 }
 if (ret == -TARGET_QEMU_ESIGRETURN) {
 /* Returning from a successful sigreturn syscall.
@@ -2160,7 +2165,8 @@ void cpu_loop (CPUState *env)
  env->gregs[6],
  env->gregs[7],
  env->gregs[0],
- env->gregs[1]);
+ env->gregs[1],
+ 0, 0);
 env->gregs[0] = ret;
 break;
 case EXCP_INTERRUPT:
@@ -2229,7 +2235,8 @@ void cpu_loop (CPUState *env)
  env->regs[12], 
  env->regs[13], 
  env->pregs[7], 
- env->pregs[11]);
+ env->pregs[11],
+ 0, 0);
 env->regs[10] =

[Qemu-devel] [PATCHv4] qemu-img: Add cache command line option

2011-06-20 Thread Federico Simoncelli

qemu-img currently writes disk images in writeback mode, filling up
the cache buffers. The kernel then flushes them, which prevents
other processes from accessing the storage in the meantime.
This is particularly bad in cluster environments where time-based
algorithms might be in place and accessing the storage within
certain timeouts is critical.
This patch adds an option to choose the cache mode used when writing
disk images.

Signed-off-by: Federico Simoncelli fsimo...@redhat.com
---
 qemu-img-cmds.hx |6 ++--
 qemu-img.c   |   80 +-
 2 files changed, 70 insertions(+), 16 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 3072d38..2b70618 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -22,13 +22,13 @@ STEXI
 ETEXI
 
 DEF("commit", img_commit,
-"commit [-f fmt] filename")
+"commit [-f fmt] [-t cache] filename")
 STEXI
 @item commit [-f @var{fmt}] @var{filename}
 ETEXI
 
 DEF("convert", img_convert,
-"convert [-c] [-p] [-f fmt] [-O output_fmt] [-o options] [-s snapshot_name] filename [filename2 [...]] output_filename")
+"convert [-c] [-p] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] filename [filename2 [...]] output_filename")
 STEXI
 @item convert [-c] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] @var{output_filename}
 ETEXI
@@ -46,7 +46,7 @@ STEXI
 ETEXI
 
 DEF("rebase", img_rebase,
-"rebase [-f fmt] [-p] [-u] -b backing_file [-F backing_fmt] filename")
+"rebase [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F backing_fmt] filename")
 STEXI
 @item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
 ETEXI
diff --git a/qemu-img.c b/qemu-img.c
index 4f162d1..f904e32 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -40,6 +40,7 @@ typedef struct img_cmd_t {
 
 /* Default to cache=writeback as data integrity is not important for qemu-tcg. */
 #define BDRV_O_FLAGS BDRV_O_CACHE_WB
+#define BDRV_DEFAULT_CACHE "writeback"
 
 static void format_print(void *opaque, const char *name)
 {
@@ -64,6 +65,8 @@ static void help(void)
 "Command parameters:\n"
 "  'filename' is a disk image filename\n"
 "  'fmt' is the disk image format. It is guessed automatically in most cases\n"
+"  'cache' is the cache mode used to write the output disk image, the valid\n"
+"    options are: 'none', 'writeback' (default), 'writethrough' and 'unsafe'\n"
 "  'size' is the disk image size in bytes. Optional suffixes\n"
 "    'k' or 'K' (kilobyte, 1024), 'M' (megabyte, 1024k), 'G' (gigabyte, 1024M)\n"
 "    and T (terabyte, 1024G) are supported. 'b' is ignored.\n"
@@ -180,6 +183,27 @@ static int read_password(char *buf, int buf_size)
 }
 #endif
 
+static int set_cache_flag(const char *mode, int *flags)
+{
+*flags &= ~BDRV_O_CACHE_MASK;
+
+if (!strcmp(mode, "none") || !strcmp(mode, "off")) {
+*flags |= BDRV_O_CACHE_WB;
+*flags |= BDRV_O_NOCACHE;
+} else if (!strcmp(mode, "writeback")) {
+*flags |= BDRV_O_CACHE_WB;
+} else if (!strcmp(mode, "unsafe")) {
+*flags |= BDRV_O_CACHE_WB;
+*flags |= BDRV_O_NO_FLUSH;
+} else if (!strcmp(mode, "writethrough")) {
+/* this is the default */
+} else {
+return -1;
+}
+
+return 0;
+}
+
 static int print_block_option_help(const char *filename, const char *fmt)
 {
 BlockDriver *drv, *proto_drv;
@@ -441,13 +465,14 @@ static int img_check(int argc, char **argv)
 
 static int img_commit(int argc, char **argv)
 {
-int c, ret;
-const char *filename, *fmt;
+int c, ret, flags;
+const char *filename, *fmt, *cache;
 BlockDriverState *bs;
 
 fmt = NULL;
+cache = BDRV_DEFAULT_CACHE;
 for(;;) {
-c = getopt(argc, argv, "f:h");
+c = getopt(argc, argv, "f:ht:");
 if (c == -1) {
 break;
 }
@@ -459,6 +484,9 @@ static int img_commit(int argc, char **argv)
 case 'f':
 fmt = optarg;
 break;
+case 't':
+cache = optarg;
+break;
 }
 }
 if (optind >= argc) {
@@ -466,7 +494,14 @@ static int img_commit(int argc, char **argv)
 }
 filename = argv[optind++];
 
-bs = bdrv_new_open(filename, fmt, BDRV_O_FLAGS | BDRV_O_RDWR);
+flags = BDRV_O_RDWR;
+ret = set_cache_flag(cache, &flags);
+if (ret < 0) {
+error_report("Invalid cache option: %s\n", cache);
+return -1;
+}
+
+bs = bdrv_new_open(filename, fmt, flags);
 if (!bs) {
 return 1;
 }
@@ -591,8 +626,8 @@ static int compare_sectors(const uint8_t *buf1, const uint8_t *buf2, int n,
 static int img_convert(int argc, char **argv)
 {
 int c, ret = 0, n, n1, bs_n, bs_i, compress, cluster_size, cluster_sectors;
-int progress = 0;
-const char *fmt, *out_fmt, *out_baseimg, *out_filename;
+int progress = 0, flags;
+

[Qemu-devel] [PATCH 09/18] linux-user: Define AT_RANDOM to support target stack protection mechanism.

2011-06-20 Thread riku . voipio

From: Laurent ALFONSI laurent.alfo...@st.com

Note that the support for the command-line argument requires:

 1. add the new field uint8_t rand_bytes[16] to struct
image_info since only the variable info lives both in
main() and in create_elf_tables()

 2. write a dedicated parser to convert the command-line to fill
rand_bytes[]

These two steps aren't hard to achieve, but in the end I think they
are overkill given the purpose of these 16 bytes.
If we really want reproducibility, maybe we could always fill the
16 bytes pointed to by AT_RANDOM with zeros.

Regards,
Cédric.

888888888888

The dynamic linker from the GNU C library v2.10+ uses the ELF
auxiliary vector AT_RANDOM [1] as a pointer to 16 bytes with random
values to initialize the stack protection mechanism.  Technically the
emulated GNU dynamic linker crashes due to a NULL pointer
dereference if it is built with stack protection enabled and if
AT_RANDOM is not defined by the QEMU ELF loader.

[1] This ELF auxiliary vector was introduced in Linux v2.6.29.

This patch can be tested with the code below:

#include <elf.h>    /* Elf*_auxv_t, AT_RANDOM, */
#include <stdio.h>  /* printf(3), */
#include <stdlib.h> /* exit(3), EXIT_*, */
#include <stdint.h> /* uint8_t, */
#include <string.h> /* memcpy(3), */

#if defined(__LP64__) || defined(__ILP64__) || defined(__LLP64__)
#define Elf_auxv_t Elf64_auxv_t
#else
#define Elf_auxv_t Elf32_auxv_t
#endif

int main(int argc, char* argv[], char* envp[])
{
    Elf_auxv_t *auxv;

    /* *envp = NULL marks end of envp. */
    while (*envp++ != NULL);

    /* auxv->a_type = AT_NULL marks the end of auxv. */
    for (auxv = (Elf_auxv_t *)envp; auxv->a_type != AT_NULL; auxv++) {
        if (auxv->a_type == AT_RANDOM) {
            int i;
            uint8_t rand_bytes[16];

            /* a_val is wider than int on 64-bit targets, hence the cast. */
            printf("AT_RANDOM is: 0x%lx\n", (unsigned long)auxv->a_un.a_val);
            memcpy(rand_bytes, (const uint8_t *)auxv->a_un.a_val,
                   sizeof(rand_bytes));
            printf("it points to: ");
            for (i = 0; i < 16; i++) {
                printf("0x%02x ", rand_bytes[i]);
            }
            printf("\n");
            exit(EXIT_SUCCESS);
        }
    }
    exit(EXIT_FAILURE);
}

Changes introduced in v2 and v3:

* Fix typos + thinko (AT_RANDOM is used for stack canary, not for
  ASLR)

* AT_RANDOM points to 16 random bytes stored inside the user
  stack.

* Add a small test program.

Signed-off-by: Cédric VINCENT cedric.vinc...@st.com
Signed-off-by: Laurent ALFONSI laurent.alfo...@st.com
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/elfload.c |   21 -
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a13eb7b..b2746f2 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -927,7 +927,7 @@ struct exec
 #define TARGET_ELF_PAGESTART(_v) ((_v) & ~(unsigned long)(TARGET_ELF_EXEC_PAGESIZE-1))
 #define TARGET_ELF_PAGEOFFSET(_v) ((_v) & (TARGET_ELF_EXEC_PAGESIZE-1))
 
-#define DLINFO_ITEMS 12
+#define DLINFO_ITEMS 13
 
 static inline void memcpy_fromfs(void * to, const void * from, unsigned long n)
 {
@@ -1202,6 +1202,9 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, 
int envc,
 {
 abi_ulong sp;
 int size;
+int i;
+abi_ulong u_rand_bytes;
+uint8_t k_rand_bytes[16];
 abi_ulong u_platform;
 const char *k_platform;
 const int n = sizeof(elf_addr_t);
@@ -1231,6 +1234,20 @@ static abi_ulong create_elf_tables(abi_ulong p, int 
argc, int envc,
 /* FIXME - check return value of memcpy_to_target() for failure */
 memcpy_to_target(sp, k_platform, len);
 }
+
+/*
+ * Generate 16 random bytes for userspace PRNG seeding (not
+ * cryptographically secure but it's not the aim of QEMU).
+ */
+srand((unsigned int) time(NULL));
+for (i = 0; i < 16; i++) {
+k_rand_bytes[i] = rand();
+}
+sp -= 16;
+u_rand_bytes = sp;
+/* FIXME - check return value of memcpy_to_target() for failure */
+memcpy_to_target(sp, k_rand_bytes, 16);
+
 /*
  * Force 16 byte _final_ alignment here for generality.
  */
@@ -1271,6 +1288,8 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, 
int envc,
 NEW_AUX_ENT(AT_EGID, (abi_ulong) getegid());
 NEW_AUX_ENT(AT_HWCAP, (abi_ulong) ELF_HWCAP);
 NEW_AUX_ENT(AT_CLKTCK, (abi_ulong) sysconf(_SC_CLK_TCK));
+NEW_AUX_ENT(AT_RANDOM, (abi_ulong) u_rand_bytes);
+
 if (k_platform)
 NEW_AUX_ENT(AT_PLATFORM, u_platform);
 #ifdef ARCH_DLINFO
-- 
1.7.4.1
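
On systems with glibc 2.16 or later, the check done by the test program in the AT_RANDOM patch above can probably be written more compactly with getauxval(3) instead of walking past envp. This is only a sketch for convenience, not part of the patch; read_at_random() is a hypothetical helper name.

```c
#include <elf.h>        /* AT_RANDOM */
#include <stdint.h>     /* uint8_t */
#include <string.h>     /* memcpy(3) */
#include <sys/auxv.h>   /* getauxval(3), glibc >= 2.16 */

/* Hypothetical helper: copy the 16 bytes that AT_RANDOM points to into
 * `out`. Returns 0 on success, or -1 if the loader did not provide an
 * AT_RANDOM entry (getauxval() returns 0 in that case). */
static int read_at_random(uint8_t out[16])
{
    unsigned long addr = getauxval(AT_RANDOM);

    if (addr == 0) {
        return -1;
    }
    memcpy(out, (const void *)addr, 16);
    return 0;
}
```

Run under qemu linux-user without the patch, a caller would expect the -1 path; with the patch applied it should get 16 bytes back.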

[Qemu-devel] [PATCH 18/18] linux-user: Fix sync_file_range on 32bit mips

2011-06-20 Thread riku . voipio

From: Riku Voipio riku.voi...@iki.fi

As noticed while looking at the "Bump do_syscall() up to 8 syscall
arguments" patch, sync_file_range uses a pad argument on 32-bit MIPS.
Deal with it by reading the correct arguments when on MIPS.

Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/syscall.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index aa11a2c..beb482c 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7842,8 +7842,13 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 #if defined(TARGET_NR_sync_file_range)
 case TARGET_NR_sync_file_range:
 #if TARGET_ABI_BITS == 32
+#if defined(TARGET_MIPS)
+ret = get_errno(sync_file_range(arg1, target_offset64(arg3, arg4),
+target_offset64(arg5, arg6), arg7));
+#else
 ret = get_errno(sync_file_range(arg1, target_offset64(arg2, arg3),
 target_offset64(arg4, arg5), arg6));
+#endif /* !TARGET_MIPS */
 #else
 ret = get_errno(sync_file_range(arg1, arg2, arg3, arg4));
 #endif
-- 
1.7.4.1
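
The argument shift fixed by the sync_file_range patch above can be illustrated with a small stand-alone sketch. On a 32-bit ABI each 64-bit offset arrives as two 32-bit words, and on 32-bit MIPS a pad argument after the fd moves those pairs one slot further. combine_offset64() and first_offset() are hypothetical names used only for illustration (this is not QEMU's target_offset64()), and the word order per endianness is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: join two 32-bit syscall argument words into one
 * 64-bit offset. Assumption: a big-endian target passes the high word
 * first, a little-endian target passes the low word first. */
static uint64_t combine_offset64(uint32_t word0, uint32_t word1, int big_endian)
{
    if (big_endian) {
        return ((uint64_t)word0 << 32) | word1;
    }
    return ((uint64_t)word1 << 32) | word0;
}

/* Hypothetical illustration of the argument layout: args[0] is the fd.
 * Without padding the first offset pair starts at args[1]; with the
 * MIPS-style pad it starts at args[2]. */
static uint64_t first_offset(const uint32_t *args, int has_pad, int big_endian)
{
    int base = has_pad ? 2 : 1;

    return combine_offset64(args[base], args[base + 1], big_endian);
}
```

Reading the unpadded slots on a padded ABI would mix the pad word into the offset, which is the kind of mismatch the #if defined(TARGET_MIPS) branch avoids.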

[Qemu-devel] [PATCH 17/18] linux-user/signal.c: Remove unused fenab

2011-06-20 Thread riku . voipio

From: Peter Maydell peter.mayd...@linaro.org

Remove fenab as it is only written, never used. Add a FIXME
comment about the discrepancy between our behaviour and that
of the Linux kernel for this routine.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/signal.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 4edd974..7d168e1 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -2228,7 +2228,6 @@ void sparc64_set_context(CPUSPARCState *env)
 target_mc_gregset_t *grp;
 abi_ulong pc, npc, tstate;
 abi_ulong fp, i7, w_addr;
-unsigned char fenab;
 int err;
 unsigned int i;
 
@@ -2293,7 +2292,11 @@ void sparc64_set_context(CPUSPARCState *env)
 if (put_user(i7, w_addr + offsetof(struct target_reg_window, ins[7]), 
  abi_ulong) != 0)
 goto do_sigsegv;
-err |= __get_user(fenab, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_enab));
+/* FIXME this does not match how the kernel handles the FPU in
+ * its sparc64_set_context implementation. In particular the FPU
+ * is only restored if fenab is non-zero in:
+ *   __get_user(fenab, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_enab));
+ */
 err |= __get_user(env->fprs, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fprs));
 {
 uint32_t *src, *dst;
-- 
1.7.4.1

[Qemu-devel] [PATCH 14/18] flatload: memp was a write-only variable

2011-06-20 Thread riku . voipio

From: Juan Quintela quint...@redhat.com

Signed-off-by: Juan Quintela quint...@redhat.com
Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/flatload.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index 6fb78f5..1062da3 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -379,7 +379,6 @@ static int load_flat_file(struct linux_binprm * bprm,
 abi_long result;
 abi_ulong realdatastart = 0;
 abi_ulong text_len, data_len, bss_len, stack_len, flags;
-abi_ulong memp = 0; /* for finding the brk area */
 abi_ulong extra;
 abi_ulong reloc = 0, rp;
 int i, rev, relocs = 0;
@@ -491,7 +490,6 @@ static int load_flat_file(struct linux_binprm * bprm,
 }
 
 reloc = datapos + (ntohl(hdr->reloc_start) - text_len);
-memp = realdatastart;
 
 } else {
 
@@ -506,7 +504,6 @@ static int load_flat_file(struct linux_binprm * bprm,
 realdatastart = textpos + ntohl(hdr->data_start);
 datapos = realdatastart + indx_len;
 reloc = (textpos + ntohl(hdr->reloc_start) + indx_len);
-memp = textpos;
 
 #ifdef CONFIG_BINFMT_ZFLAT
 #error code needs checking
-- 
1.7.4.1

Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol

2011-06-20 Thread Avi Kivity


On 06/20/2011 04:50 PM, Anthony Liguori wrote:

On 06/20/2011 08:40 AM, Avi Kivity wrote:

On 06/14/2011 04:31 PM, Corey Bryant wrote:

- Starting Qemu with a backing file



For this we could tell qemu that a file named xyz is available via fd
n, via an extension of the getfd command.

For example

(qemu) getfd path=/images/my-image.img
(qemu) getfd path=/images/template.img
(qemu) drive-add path=/images/my-image.img

The open() for my-image.img first looks up the name in the getfd
database, and finds it, so it returns the fd from there instead of
opening. It then opens the backing file (template.img) and looks it up
again, and finds the second fd from the session.


The way I've been thinking about this is:

 -blockdev id=hd0-back,file=fd:4,format=raw \
 -blockdev file=fd:3,format=qcow2,backing=hd0-back

While your proposal is clever, it makes me a little nervous about 
subtle security ramifications.


It would need careful explanation in the management tool author's guide, 
yes.


The main advantage is generality.  It doesn't assume that a file format 
has just one backing file, and doesn't require new syntax wherever a 
file is referred to indirectly.


--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 16/18] linux-user/signal.c: Remove only-ever-set variable fpu_save_addr

2011-06-20 Thread riku . voipio

From: Peter Maydell peter.mayd...@linaro.org

Move the access of fpu_save into the commented out skeleton code for
restoring FPU registers on SPARC sigreturn, thus silencing a gcc
4.6 "variable set but never used" warning.
(This doesn't affect the calculation of 'err' because in fact
__get_user() can never fail.)

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Riku Voipio riku.voi...@iki.fi
---
 linux-user/signal.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index cb7138f..4edd974 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -2080,7 +2080,6 @@ long do_sigreturn(CPUState *env)
 uint32_t up_psr, pc, npc;
 target_sigset_t set;
 sigset_t host_set;
-abi_ulong fpu_save_addr;
 int err, i;
 
 sf_addr = env->regwptr[UREG_FP];
@@ -2120,10 +2119,11 @@ long do_sigreturn(CPUState *env)
 err |= __get_user(env->regwptr[i + UREG_I0], &sf->info.si_regs.u_regs[i+8]);
 }
 
-err |= __get_user(fpu_save_addr, &sf->fpu_save);
-
-//if (fpu_save)
-//err |= restore_fpu_state(env, fpu_save);
+/* FIXME: implement FPU save/restore:
+ * __get_user(fpu_save, &sf->fpu_save);
+ * if (fpu_save)
+ *err |= restore_fpu_state(env, fpu_save);
+ */
 
 /* This is pretty much atomic, no amount locking would prevent
  * the races which exist anyways.
-- 
1.7.4.1

Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall

2011-06-20 Thread Daniel Gollub

On Monday, June 20, 2011 05:45:36 pm Avi Kivity wrote:
However, I'm not sure I see the gain.  Most enterprisey guests
already contain in-guest crash dumpers which provide more
information than a qemu memory dump could, since they know exact
load addresses etc. and are integrated with crash analysis tools.
What do you have in mind?

Right kexec/kdump works perfectly already inside the guest. But:

 - in the field a lot of people still manage to set up VM guests without
   kexec/kdump properly set up (even though most enterprisey distributions try
   hard to set this up out-of-the-box .. still people manage to not have
   kexec/kdump loaded once they run into a crash).

 - you don't have to reserve disk space for a crashdump for each guest
   e.g. if you run 4 guests with 60 GB of memory each you would lose
   4*60 GB of space ... just for the (rare) case that each of those
   guests could write a crashdump, uncompressed ...

 - legacy distributions - no or buggy kexec

 - maybe writing a crashdump+reboot with QEMU/libvirt is faster than
   with in-guest kexec/kdump? (haven't tested yet)

 - single place on the VM-host to collect coredumps


  
  Well libvirt can capture a core file by doing 'virsh dump $GUESTNAME'.
  This actually uses the QEMU monitor migration command to capture the
  entire of QEMU memory. The 'crash' command line tool actually knows
  how to analyse this data format as it would a normal kernel crashdump.
 
 Interesting.

Right. I'm using the kvmdump support of the crash utility now and then ... it
could be more often. But unfortunately the people who run KVM in a production
environment with a strict service-level agreement often just reboot due to
time pressure, or run out of disk space in the guest, or simply forget that they
were told to always do a virsh dump on a freeze or crash.


 
  I think having a way for a guest OS to notify the host that it has
  crashed would be useful. libvirt could automatically do a crash
  dump of the QEMU memory, or at least pause the guest CPUs and notify
  the management app of the crash, which can then decide what to do.
  You can also use tools like 'virt-dmesg' which uses libvirt to peek
  into guest memory to extract the most recent kernel dmesg logs (even
  if the guest OS itself has crashed & didn't manage to send them out
  via netconsole or something else).
 
 I agree.  But let's do this via a device, this way kvm need not be changed.

Is a device reliable enough if the guest kernel crashes?
Do you mean something like a hardware watchdog?

 
 Do ILO cards / IPMI support something like this?  We could follow their 
 lead in that case.

The only two things which came to my mind are:

 * NMI (aka. ipmitool diag) - already available in qemu/kvm - but requires
   in-guest kexec/kdump
 * Hardware-Watchdog (also available in qemu/libvirt)


lguest and xen have something similar. They also have a hypercall which gets
called by a function registered in the panic_notifier_list. Not quite sure if
you want to follow their lead.

Something I forgot to mention: This panic hypercall could also sit within an
external kernel module ... to support (legacy) distributions.

 
  This series does need to introduce a QMP event notification upon
  crash, so that the crash notification can be propagated to mgmt
  layers above QEMU.
 
 Yes.

Already done. I posted the QEMU-relevant changes as a separate series to the
KVM list ... since the initial implementation is KVM specific (KVM hypercall)

Best Regards,
Daniel

-- 
Daniel Gollub
Linux Consultant & Developer
Tel.: +49-160 47 73 970 
Mail: gol...@b1-systems.de

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537



[Qemu-devel] unix domain socket communication with guests

2011-06-20 Thread Joel Uckelman

I'm trying to set up a unix domain socket with a guest on one end and
the host on the other, where the server is running on and bound to the
socket on the guest. I've been able to get the reverse, where the
server is running on the host, this way:

qemu-kvm -kernel kernel -initrd initrd -hda root -device virtio-serial
-serial stdio -chardev
socket,path=/home/uckelman/projects/lightbox/supermin/foo,id=channel0,name=org.libguestfs.channel.0

But, when I try to bind(2) on the guest, I get an "Address already in
use" error. Adding the server,nowait options to -chardev doesn't
help---I still get the same error.

What am I doing wrong here?

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support

2011-06-20 Thread Alon Levy

On Mon, Jun 20, 2011 at 05:50:32PM +0200, Gerd Hoffmann wrote:
 On 06/20/11 17:11, Alon Levy wrote:
 On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote:
 What is the difference to one worker-stop() + worker-start() cycle?
 
 
 ok, stop+start won't disconnect any clients either. But does stop render 
 all waiting commands?
 I'll have to look, I don't know if it does.
 
 It does.  This is what qemu uses to flush all spice server state to
 device memory on migration.
 
 What is the reason for deleting all surfaces?
 
 Making sure all references are dropped to pci memory in devram.
 
 Ah, because the spice server keeps a reference to the create command
 until the surface is destroyed, right?

Actually right, so my correction stands corrected.

 
 There is is QXL_IO_DESTROY_ALL_SURFACES + worker-destroy_surfaces() ...
 

Regarding QXL_IO_DESTROY_ALL_SURFACES, it destroys the primary surface too,
which is a little special, that's another difference - update_mem destroys
everything except the primary. I know I tried to destroy the primary but it
didn't work right, don't recall why right now, so I guess I'll have to retry.

 The QXL_IO_UPDATE_MEM command does too much special stuff IMHO.
 I also think we don't need to extend the libspice-server API.
 
 We can add a I/O command which renders everything to device memory
 via stop+start.  We can zap all surfaces with the existing command +
Yes, start+stop work nicely, didn't realize (saw it before, assumed
it wouldn't be good enough), just need to destroy the surfaces too.

 worker call.  We can add a I/O command to ask qxl to push the
 release queue head to the release ring.

So you suggest to replace QXL_IO_UPDATE_MEM with what, two io commands instead
of using the val parameter?
 QXL_IO_UPDATE_MEM
 QXL_IO_FLUSH_RELEASE
?

 
 Comments?
 
 cheers,
   Gerd

Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall

2011-06-20 Thread Avi Kivity


On 06/20/2011 07:26 PM, Daniel Gollub wrote:


  I agree.  But let's do this via a device, this way kvm need not be changed.

Is a device reliable enough if the guest kernel crashes?
Do you mean something like a hardware watchdog?


I'm proposing a 1:1 equivalent.  Instead of issuing a hypercall that 
tells the host about the panic, write to an I/O port that tells the host 
about the panic.




  Do ILO cards / IPMI support something like this?  We could follow their
  lead in that case.

The only two things which came to my mind are:

  * NMI (aka. ipmitool diag) - already available in qemu/kvm - but requires
in-guest kexec/kdump
  * Hardware-Watchdog (also available in qemu/libvirt)


A watchdog has the advantage that it also detects lockups.

In fact you could implement the panic device via the existing 
watchdogs.  Simply program the timer for the minimum interval and 
*don't* service the interrupt.  This would work for non-virt setups as 
well as another way to issue a reset.



lguest and xen have something similar. They also have an hypercall which get
called by a function registered in the panic_notifier_list. Not quite sure if
you want to follow their lead.


We could do the same, except s/hypercall/writel/.


Something I forgot to mention: This panic hypercall could also sit within an
external kernel module ... to support (legacy) distribution.


Yes.



This series does need to introduce a QMP event notification upon
crash, so that the crash notification can be propagated to mgmt
layers above QEMU.

  Yes.

Already done. I posted the QEMU relevant changes as a separated series to the
KVM list ... since the initial implementation is KVM specific (KVM hypercall)


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 1/1] fix operator precedence

2011-06-20 Thread Stefan Hajnoczi

On Mon, Jun 20, 2011 at 2:25 PM, Frediano Ziglio fredd...@gmail.com wrote:
 Signed-off-by: Frediano Ziglio fredd...@gmail.com
 ---
  cmd.c |    6 +++---
  1 files changed, 3 insertions(+), 3 deletions(-)

Thanks for the patch!  cmd.c:timestr() has tabs for indentation but
your patch uses spaces.  I applied your changes manually to the
trivial-patches tree:
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches

For more info on the trivial patches tree, see
http://wiki.qemu.org/Contribute/TrivialPatches.

Please make sure whitespace remains unmodified in the future so that
your patches apply, this is often a mail client issue.  Try
git-send-email(1), it does the right thing.

Stefan

Re: [Qemu-devel] High speed polling

2011-06-20 Thread Clay Andreasen


Thank you for your reply.
I am still a novice with Qemu so pardon me if I don't make any sense.

I tried --enable-io-thread.  I get the error:
cpus.o: In function `qemu_kvm_eat_signal':
cpus.c:(.text+0x111a): undefined reference to `kvm_on_sigbus_vcpu'

so I assume it requires KVM.  I'm not using KVM because I don't have
full control over the host I am running on.
I have 8 host processors running 4 Qemu copies (1 vcpu each) plus my
network simulator.

I have tried polling via a call in vl.c:mainloop and via qemu_mod_timer().
There doesn't appear to be much difference.
The guest is a full-blown x86_64 OS.
I am polling to minimize latency.
I am looking at other ways to tolerate the current latency in case I can't
do much better.

Clay


On 06/15/11 01:22, Stefan Hajnoczi wrote:

On Tue, Jun 14, 2011 at 11:32 PM, Clay Andreasenc...@cray.com  wrote:

I have a network device simulation that I am connecting to multiple
instances of Qemu (nodes) via a shared memory queue.  It works pretty well
as
long as all of the nodes are initiating communication but when one node is
passive, it must poll to get packets.  So far the fastest I have been able
to
get it to poll is about every 2M emulated clocks.
This is with CONFIG_HIGH_RES_TIMERS and CONFIG_NO_HZ on the host.
I also set MIN_TIMER_REARM_NS in qemu-timer.c to 10.
Is there some way to increase the polling rate by about an order of
magnitude?

Without more details it's hard to say what is going on:

Running an x86 guest?  Are you using ./configure --enable-io-thread?
It sounds like you may not be using KVM?  How many vcpus are running
on the host in total compared to the number of logical CPUs on the
host?

You haven't given details on how you are polling in the guest.  Are
you running a polling loop in ring 0 or is the guest running a
full-blown OS and polling from userspace?

Why are you polling in the first place - to minimize latency?

Stefan

Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall

2011-06-20 Thread Avi Kivity


On 06/20/2011 08:13 PM, Jan Kiszka wrote:

  A watchdog has the advantage that is also detects lockups.

  In fact you could implement the panic device via the existing
  watchdogs.  Simply program the timer for the minimum interval and
  *don't* service the interrupt.  This would work for non-virt setups as
  well as another way to issue a reset.

If you manage to bring down the other guest CPUs fast enough. Otherwise,
they may corrupt your crashdump before the host had a chance to collect
all pieces. Synchronous signaling to the hypervisor is a bit safer.


You could NMI-IPI them.  But I agree a synchronous signal is better 
(note it's not race-free itself).


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 0/2] Introduce panic hypercall

2011-06-20 Thread Jan Kiszka

On 2011-06-20 18:34, Avi Kivity wrote:
 
   Do ILO cards / IPMI support something like this?  We could follow
 their
   lead in that case.

 The only two things which came to my mind are:

   * NMI (aka. ipmitool diag) - already available in qemu/kvm - but
 requires
 in-guest kexec/kdump
   * Hardware-Watchdog (also available in qemu/libvirt)
 
 A watchdog has the advantage that is also detects lockups.
 
 In fact you could implement the panic device via the existing
 watchdogs.  Simply program the timer for the minimum interval and
 *don't* service the interrupt.  This would work for non-virt setups as
 well as another way to issue a reset.

If you manage to bring down the other guest CPUs fast enough. Otherwise,
they may corrupt your crashdump before the host had a chance to collect
all pieces. Synchronous signaling to the hypervisor is a bit safer.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH] Support logging xen-guest console

2011-06-20 Thread Stefano Stabellini

On Mon, 20 Jun 2011, Chunyan Liu wrote:
 Add code to support logging xen-domU console, as what xenconsoled does. Log
 info will be saved in /var/log/xen/console/guest-domUname.log.
 
 Signed-off-by: Chunyan Liu cy...@novell.com
 ---
  hw/xen_console.c |   63 
 ++
  1 files changed, 63 insertions(+), 0 deletions(-)
 
 diff --git a/hw/xen_console.c b/hw/xen_console.c
 index c6c8163..ac3208d 100644
 --- a/hw/xen_console.c
 +++ b/hw/xen_console.c
 @@ -36,6 +36,8 @@
  #include "qemu-char.h"
  #include "xen_backend.h"
  
 +static int log_guest = 0;
 +
  struct buffer {
  uint8_t *data;
  size_t consumed;
 @@ -52,8 +54,24 @@ struct XenConsole {
  void  *sring;
  CharDriverState   *chr;
  int   backlog;
 +int   log_fd;
  };
  
 +static int write_all(int fd, const char* buf, size_t len)
 +{
 +while (len) {
 +ssize_t ret = write(fd, buf, len);
 +if (ret == -1 && errno == EINTR)
 +continue;
 +if (ret <= 0)
 +return -1;
 +len -= ret;
 +buf += ret;
 +}
 +
 +return 0;
 +}
 +

If I am not mistaken ret == 0 doesn't always mean an error on write.
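
One way to act on this remark, as a minimal sketch rather than a proposed final form of the patch: retry on EINTR, treat only a negative return as a write(2) error, and handle a zero-byte result separately so the loop cannot spin forever. The name write_all_sketch() is hypothetical, and reporting "no progress" as EIO is an assumed policy, not something the patch or write(2) mandates.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical variant of write_all(): returns 0 once all `len` bytes
 * have been written, -1 on failure with errno describing the cause. */
static int write_all_sketch(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t ret = write(fd, buf, len);

        if (ret < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted before any byte was written */
            }
            return -1;      /* genuine error, errno set by write(2) */
        }
        if (ret == 0) {
            /* Assumption: no progress is reported as EIO, because
             * write(2) does not set errno when it returns 0. */
            errno = EIO;
            return -1;
        }
        buf += ret;
        len -= (size_t)ret;
    }
    return 0;
}
```

With this split, the caller's strerror(errno) message stays meaningful in both failure cases.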


  static void buffer_append(struct XenConsole *con)
  {
  struct buffer *buffer = &con->buffer;
 @@ -81,6 +99,14 @@ static void buffer_append(struct XenConsole *con)
  intf->out_cons = cons;
  xen_be_send_notify(&con->xendev);
  
 +if (con->log_fd != -1) {
 +int logret;
 +logret = write_all(con->log_fd, buffer->data + buffer->size - size, size);
 +if (logret < 0)
 +xen_be_printf(&con->xendev, 1, "Write to log failed on domain %d: %d (%s)\n",
 +  con->xendev.dom, errno, strerror(errno));
 + }

code style: you need braces around the xen_be_printf statement


  if (buffer->max_capacity &&
   buffer->size > buffer->max_capacity) {
   /* Discard the middle of the data. */
 @@ -174,12 +200,36 @@ static void xencons_send(struct XenConsole *con)
  }
  }
  
 +static int create_domain_log(struct XenConsole *con)
 +{
 +char *logfile;
 +char *path, *domname;
 +int fd;
 +
 +path = xs_get_domain_path(xenstore, con->xendev.dom);
 +domname = xenstore_read_str(path, "name");
 +free(path);
 +if (!domname)
 +return -1;
 +
 +asprintf(&logfile, "/var/log/xen/console/guest-%s.log", domname);
 +qemu_free(domname);
 +
 +fd = open(logfile, O_WRONLY|O_CREAT|O_APPEND, 0644);
 +free(logfile);
 +if (fd == -1)
 +xen_be_printf(&con->xendev, 1, "Failed to open log %s: %d (%s)",
 +logfile, errno, strerror(errno));
 +
 +return fd;
 +}
 +

What if the console subdirectory is missing? Maybe we should create
the directory automatically here.
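
A minimal sketch of that suggestion, assuming the parent directory (/var/log/xen) already exists and that mode 0755 is acceptable; ensure_log_dir() is a hypothetical helper name, not part of the patch:

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create `path` if it is missing. Returns 0 if a
 * directory exists at `path` afterwards, -1 otherwise with errno set. */
static int ensure_log_dir(const char *path)
{
    struct stat st;

    if (mkdir(path, 0755) == 0) {
        return 0;               /* freshly created */
    }
    if (errno != EEXIST) {
        return -1;              /* e.g. missing parent or no permission */
    }
    /* Something with that name exists: accept it only if it is a directory. */
    if (stat(path, &st) != 0) {
        return -1;
    }
    if (!S_ISDIR(st.st_mode)) {
        errno = ENOTDIR;
        return -1;
    }
    return 0;
}
```

create_domain_log() could call such a helper on the console directory before open(), and skip logging if it fails.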


  /*  */
  
  static int con_init(struct XenDevice *xendev)
  {
  struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
  char *type, *dom;
 +char *logenv = NULL;
  
  /* setup */
  dom = xs_get_domain_path(xenstore, con->xendev.dom);
 @@ -198,6 +248,10 @@ static int con_init(struct XenDevice *xendev)
  else
  con->chr = serial_hds[con->xendev.dev];
  
 +logenv = getenv("XENCONSOLED_TRACE");
 +if (logenv != NULL && !strcmp(logenv, "guest")) {
 +log_guest = 1;
 +}
  return 0;
  }

please check the length of logenv before using strcmp on it

Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol

2011-06-20 Thread Anthony Liguori


On 06/20/2011 12:35 PM, Avi Kivity wrote:

On 06/20/2011 04:50 PM, Anthony Liguori wrote:

On 06/20/2011 08:40 AM, Avi Kivity wrote:

On 06/14/2011 04:31 PM, Corey Bryant wrote:

- Starting Qemu with a backing file



For this we could tell qemu that a file named xyz is available via fd
n, via an extension of the getfd command.

For example

(qemu) getfd path=/images/my-image.img
(qemu) getfd path=/images/template.img
(qemu) drive-add path=/images/my-image.img

The open() for my-image.img first looks up the name in the getfd
database, and finds it, so it returns the fd from there instead of
opening. It then opens the backing file (template.img) and looks it up
again, and finds the second fd from the session.


The way I've been thinking about this is:

-blockdev id=hd0-back,file=fd:4,format=raw \
-blockdev file=fd:3,format=qcow2,backing=hd0-back

While your proposal is clever, it makes me a little nervous about
subtle security ramifications.


It would need careful explanation in the management tool author's guide,
yes.

The main advantage is generality. It doesn't assume that a file format
has just one backing file, and doesn't require new syntax wherever a
file is referred to indirectly.


FWIW, with blockdev, we need options to control this all anyway.  If you 
go back to my QCFG proposal, the parameters would actually be format 
specific, so if we had:


-block 
file=fd:4,format=fancypantsformat,part0=hd0-back.part1,part1=hd0-back.part2...


Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 00/12] [uq/master] Import linux headers and some cleanups

2011-06-20 Thread Marcelo Tosatti

On Wed, Jun 08, 2011 at 04:10:54PM +0200, Jan Kiszka wrote:
 Licensing of the virtio headers is now clarified. So we can finally
 resolve the clumsy and constantly buggy #ifdef'ery around old KVM and
 virtio headers. Recent example: current qemu-kvm does not build against
 2.6.32 headers.
 
 This series introduces an import mechanism for all required Linux
 headers so that the appropriate versions can be kept safely inside the
 QEMU tree. I've incorporated all the valuable review comments on the
 first version and rebased the result over current uq/master after
 rebasing that one over current QEMU master.
 
 Please note that I had no chance to test-build PPC or s390.
 
 Besides the header topic, this series also includes a few assorted KVM
 cleanup patches so that my queue is empty again.

Applied all, thanks.

[Qemu-devel] REMINDER: Participation Requested: Survey about Open-Source Software Development

2011-06-20 Thread Jeffrey Carver

Hi,

Apologies for any inconvenience and thank you to those who have already
completed the survey. We will keep the survey open for another couple of
weeks. But, we do hope you will consider responding to the email request
below (sent 2 weeks ago).

Thanks,

Dr. Jeffrey Carver
Assistant Professor
University of Alabama
(v) 205-348-9829  (f) 205-348-0219
http://www.cs.ua.edu/~carver

-Original Message-
From: Jeffrey Carver [mailto:opensourcesur...@cs.ua.edu] 
Sent: Monday, June 13, 2011 11:45 AM
To: 'qemu-devel@nongnu.org'
Subject: Participation Requested: Survey about Open-Source Software
Development

Hi,

Drs. Jeffrey Carver, Rosanna Guadagno, Debra McCallum, and Mr. Amiangshu
Bosu,  University of Alabama, and Dr. Lorin Hochstein, University of
Southern California, are conducting a survey of open-source software
developers. This survey seeks to understand how developers on distributed,
virtual teams, like open-source projects, interact with each other to
accomplish their tasks. You must be at least 19 years of age to complete the
survey. The survey should take approximately 15 minutes to complete.

If you are actively participating as a developer, please consider completing
our survey.
 
Here is the link to the survey:   http://goo.gl/HQnux

We apologize for inconvenience and if you receive multiple copies of this
email. This survey has been approved by The University of Alabama IRB board.

Thanks,

Dr. Jeffrey Carver
Assistant Professor
University of Alabama
(v) 205-348-9829  (f) 205-348-0219
http://www.cs.ua.edu/~carver

Re: [Qemu-devel] [PATCH 2/2] qxl: add QXL_IO_UPDATE_MEM for guest S3&S4 support

2011-06-20 Thread Alon Levy

On Mon, Jun 20, 2011 at 06:32:30PM +0200, Alon Levy wrote:
 On Mon, Jun 20, 2011 at 05:50:32PM +0200, Gerd Hoffmann wrote:
  On 06/20/11 17:11, Alon Levy wrote:
  On Mon, Jun 20, 2011 at 04:07:59PM +0200, Gerd Hoffmann wrote:
  What is the difference to one worker->stop() + worker->start() cycle?
  
  
  ok, stop+start won't disconnect any clients either. But does stop render 
  all waiting commands?
  I'll have to look, I don't know if it does.
  
  It does.  This is what qemu uses to flush all spice server state to
  device memory on migration.
  
  What is the reason for deleting all surfaces?
  
  Making sure all references are dropped to pci memory in devram.
  
  Ah, because the spice server keeps a reference to the create command
  until the surface is destroyed, right?
 
 Actually right, so my correction stands corrected.
 
  
  There is QXL_IO_DESTROY_ALL_SURFACES + worker->destroy_surfaces() ...
  
 
 Regarding QXL_IO_DESTROY_ALL_SURFACES, it destroys the primary surface too,
 which is a little special, that's another difference - update_mem destroys
 everything except the primary. I know I tried to destroy the primary but it
 didn't work right, don't recall why right now, so I guess I'll have to retry.
 
  The QXL_IO_UPDATE_MEM command does too much special stuff IMHO.
  I also think we don't need to extend the libspice-server API.
  
  We can add an I/O command which renders everything to device memory
  via stop+start.  We can zap all surfaces with the existing command +
 Yes, start+stop work nicely, didn't realize (saw it before, assumed
 it wouldn't be good enough), just need to destroy the surfaces too.
 

OK, it all works nicely, except that with the current driver patches I get a
double destroy of the primary surface. Removing the assertion with the following
patch makes everything (resolution change/suspend/hibernate) work. I would
really suggest we remove that PANIC_ON, besides of course fixing the driver
patches (I'll do a v2 of the affected patch in the last qxl series; I didn't cc
you since I didn't assume you'd want to review it, but you probably saw it).
Something like:

diff --git a/server/red_worker.c b/server/red_worker.c
index f0a8dfc..3b53a3f 100644
--- a/server/red_worker.c
+++ b/server/red_worker.c
@@ -9684,7 +9684,11 @@ static inline void handle_dev_destroy_primary_surface(RedWorker *worker)
     receive_data(worker->channel, &surface_id, sizeof(uint32_t));
 
     PANIC_ON(surface_id != 0);
-    PANIC_ON(!worker->surfaces[surface_id].context.canvas);
+
+    if (!worker->surfaces[surface_id].context.canvas) {
+        red_printf("warning: double destroy of primary surface\n");
+        goto end;
+    }
 
     if (worker->cursor) {
         red_release_cursor(worker, worker->cursor);
@@ -9711,6 +9715,7 @@ static inline void handle_dev_destroy_primary_surface(RedWorker *worker)
     worker->cursor_position.x = worker->cursor_position.y = 0;
     worker->cursor_trail_length = worker->cursor_trail_frequency = 0;
 
+end:
     message = RED_WORKER_MESSAGE_READY;
     write_message(worker->channel, &message);
 }
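
To show the idea of the patch in isolation (warn and skip on a repeated
destroy, but still send the reply), here is a minimal standalone sketch. It
does not use the real spice-server types: ToySurface, ToyWorker and
toy_destroy_primary() are hypothetical names made up for illustration, and the
"ready" reply is modelled as a simple counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the real spice-server types. */
typedef struct ToySurface {
    void *canvas;            /* NULL once the surface has been destroyed */
} ToySurface;

typedef struct ToyWorker {
    ToySurface surfaces[1];  /* slot 0 plays the role of the primary surface */
    int ready_messages;      /* number of "ready" replies sent so far */
} ToyWorker;

/* Returns true if the surface was torn down, false on a repeated call. */
static bool toy_destroy_primary(ToyWorker *worker)
{
    bool destroyed = false;

    assert(worker != NULL);

    if (!worker->surfaces[0].canvas) {
        /* Already gone: warn instead of aborting, then skip the teardown. */
        fprintf(stderr, "warning: double destroy of primary surface\n");
        goto end;
    }

    worker->surfaces[0].canvas = NULL;
    destroyed = true;

end:
    /* Reply on both paths so the caller is never left waiting. */
    worker->ready_messages++;
    return destroyed;
}
```

The point of jumping to the end label rather than returning early is that the
reply is still written, which is what the real patch does with
RED_WORKER_MESSAGE_READY.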

  worker call.  We can add an I/O command to ask qxl to push the
  release queue head to the release ring.
 
 So you suggest to replace QXL_IO_UPDATE_MEM with what, two io commands instead
 of using the val parameter?
  QXL_IO_UPDATE_MEM
  QXL_IO_FLUSH_RELEASE
 ?
 
  
  Comments?
  
  cheers,
Gerd
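
For reference, the split into separate I/O commands discussed above could look
roughly like the toy model below. This is only a sketch under assumptions: the
ToyQxlWorker struct, the TOY_IO_* command numbers and toy_io_write() are
hypothetical and are not the real qxl device or libspice-server interfaces;
only the stop/start/destroy-surfaces steps are taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worker state; counters stand in for the real side effects. */
typedef struct ToyQxlWorker {
    int running;           /* 1 while the worker is started */
    int flushes;           /* times pending commands were rendered (on stop) */
    int surface_destroys;  /* times all surfaces were destroyed */
    int release_flushes;   /* times the release queue head was pushed */
} ToyQxlWorker;

/* Hypothetical command numbers, not the real QXL I/O port values. */
enum ToyIoCommand {
    TOY_IO_UPDATE_MEM,        /* render everything to device memory */
    TOY_IO_DESTROY_SURFACES,  /* drop all surfaces */
    TOY_IO_FLUSH_RELEASE      /* push release queue head to the release ring */
};

static void toy_stop(ToyQxlWorker *w)
{
    /* Per the thread, stopping the worker renders all waiting commands. */
    w->flushes++;
    w->running = 0;
}

static void toy_start(ToyQxlWorker *w)
{
    w->running = 1;
}

/* Returns 0 on success, -1 for an unknown command. */
static int toy_io_write(ToyQxlWorker *w, enum ToyIoCommand cmd)
{
    assert(w != NULL);

    switch (cmd) {
    case TOY_IO_UPDATE_MEM:
        /* One stop + start cycle flushes state without dropping clients. */
        toy_stop(w);
        toy_start(w);
        return 0;
    case TOY_IO_DESTROY_SURFACES:
        w->surface_destroys++;
        return 0;
    case TOY_IO_FLUSH_RELEASE:
        w->release_flushes++;
        return 0;
    }
    return -1;
}
```

Keeping each command down to one action means the guest driver composes the
suspend sequence itself, instead of one command doing several special things.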

Re: [Qemu-devel] [PATCH 14/18] TCG/PPC: use TCG_REG_CALL_STACK instead of TCG_REG_R1

2011-06-20 Thread Blue Swirl

On Mon, Jun 20, 2011 at 1:14 AM, malc av1...@comtv.ru wrote:
 On Mon, 20 Jun 2011, Blue Swirl wrote:

 Use TCG_REG_CALL_STACK instead of TCG_REG_R1 etc. for consistency.

 You spell it TCG_REG_CALL_STACK in the subject/comment but
 REG_CALL_STACK in the patch, which suggests that it was never
 even compile tested.

Actually I seem to have used both versions. I didn't compile test, but
to make matters even worse, I didn't even read any reference manuals
or ABI descriptions for any of these patches but based all this on
bits gathered from */tcg-target.[ch]. But is the patch otherwise OK?
;-)

Re: [Qemu-devel] [PATCH RFC 0/3] basic support for composing sysbus devices

2011-06-20 Thread Blue Swirl

On Mon, Jun 20, 2011 at 6:23 PM, Paul Brook p...@codesourcery.com wrote:
  Yeah, that's why I said, hard to do well.  It makes it very hard to add
  new socket types.

 PCI, USB, IDE, SCSI, SBus, what else? APICBus? I2C? 8 socket types
 ought to be enough for anybody.

 Off the top of my head: AClink (audio), i2s (audio), SSI/SSP (synchronous
 serial), Firewire, rs232, CAN, FibreChannel, ISA, PS2, ADB (apple desktop bus)
 and probably a bunch of others I've missed.  There's also a bunch of all-but
 extinct system architectures with interesting bus-level features (MCA, NuBus,
 etc.)

Are these really buses with identifiable sockets? For example, it's
not possible to enumerate the users of ISA bus or RS-232.
