Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexander Graf


Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :

> On the real hardware, RTAS is called in real mode and therefore
> ignores top 4 bits of the address passed in the call.

Shouldn't we ignore the upper 4 bits for every memory access in real mode, not 
just that one parameter?

Alex

> 
> This fixes QEMU to do the same thing.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> hw/ppc/spapr_rtas.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index eb542f2..ab03d67 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -240,7 +240,8 @@ target_ulong spapr_rtas_call(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
> struct rtas_call *call = rtas_table + (token - TOKEN_BASE);
> 
> if (call->fn) {
> -call->fn(cpu, spapr, token, nargs, args, nret, rets);
> +call->fn(cpu, spapr, token, nargs, args & 0x0FFFUL,
> + nret, rets);
> return H_SUCCESS;
> }
> }
> -- 
> 1.8.4.rc4
> 



Re: [Qemu-devel] Block Filters

2013-09-05 Thread Kevin Wolf
Am 04.09.2013 um 20:15 hat Benoît Canet geschrieben:
> > Propagate operations like snapshot down the tree.  block.c is designed
> > for bs->file/bs->backing_hd kind of BlockDrivers, perhaps it needs to
> > become a bit more generic to support other types of BlockDrivers
> > properly.
> 
> Shouldn't bs->backing_hd become bs->children[0] and bs->file stay the same ?

Intuitively I'd say that bs->children[] would contain _all_ children,
including bs->file and bs->backing_hd.

In the end it all depends on what bs->children[] would be used for.
I'm not even sure if the generic block layer has any use for this
information.

Kevin



Re: [Qemu-devel] [PATCH RFC] Do not set SO_REUSEADDR on Windows

2013-09-05 Thread Michael Tokarev

04.09.2013 18:34, Jan Kiszka wrote:

On 2013-09-04 16:27, Paolo Bonzini wrote:

Il 04/09/2013 16:22, Sebastian Ottlik ha scritto:

This patchset disabels all use of SO_REUSEADDR on Windows. On Windows systems
the default behavior is equivalent to SO_REUSEADDR on other operating
systems. SO_REUSEADDR can still be set but results in undesired bahvior
instead. It may even lead to situations were system behavior is
unspecified. More information on this can be found at:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms740621.aspx

I originally encountered this issue when accidentally launching two QEMU
instances with identical GDB ports at the same time. In which case QEMU won't
fail as one might expect. I am sending this as RFC as I A) only checked that
this fixes issues for the GDB server and B) am not sure if this is the correct
format for this patchset.

gdbstub: do not set SO_REUSEADDR on Windows
net: do not set SO_REUSEADDR on Windows
slirp: do not set SO_REUSEADDR on Windows
util: do not set SO_REUSEADDR on Windows


Makes sense.

Can you make a different patch that introduces a new function
qemu_set_reuseaddr is include/qemu/sockets.h & util/oslib-*, and makes
it a stub for Windows?


Maybe a better approach will be to introduce something like
qemu_[stream_]listen_socket(int family, sockaddr_t *sa, int salen) function
which opens and prepares a socket with all necessary platform-dependent
stuff?


Yeah, this is definitely better then. IIRC, Michael has some version of
patch 3 in his queue right now - should be dropped in this light.


It's been pulled by Anthony yesterday so it is a bit too late now.
But the change is trivial, and can be reworked easily.

Thanks,

/mjt



Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexey Kardashevskiy
On 09/05/2013 05:08 PM, Alexander Graf wrote:
> 
> 
> Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :
> 
>> On the real hardware, RTAS is called in real mode and therefore
>> ignores top 4 bits of the address passed in the call.
> 
> Shouldn't we ignore the upper 4 bits for every memory access in real mode, 
> not just that one parameter?

We probably should but I just do not see any easy way of doing this. Yet
another "Ignore N bits on the top" memory region type? No idea.

This particular patch was born after discovering GCC 4.8.0 bug with not
doing -0xc000... correctly and this would not be a problem on
the real hardware. So I would think there are kernel somewhere which have
this bug. And there are few (honestly I found only one place and the patch
fixes it) places which can fail because of this GCC bug. So the patch does
make sense for Paul and myself.

btw the patch is wrong, I should do this in a different place, sorry about
that :)


> 
> Alex
> 
>>
>> This fixes QEMU to do the same thing.
>>
>> Signed-off-by: Alexey Kardashevskiy 
>> ---
>> hw/ppc/spapr_rtas.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
>> index eb542f2..ab03d67 100644
>> --- a/hw/ppc/spapr_rtas.c
>> +++ b/hw/ppc/spapr_rtas.c
>> @@ -240,7 +240,8 @@ target_ulong spapr_rtas_call(PowerPCCPU *cpu, 
>> sPAPREnvironment *spapr,
>> struct rtas_call *call = rtas_table + (token - TOKEN_BASE);
>>
>> if (call->fn) {
>> -call->fn(cpu, spapr, token, nargs, args, nret, rets);
>> +call->fn(cpu, spapr, token, nargs, args & 0x0FFFUL,
>> + nret, rets);
>> return H_SUCCESS;
>> }
>> }
>> -- 
>> 1.8.4.rc4
>>


-- 
Alexey



[Qemu-devel] [RFC v4 1/5] hw/arm: add very initial support for Canon DIGIC SoC

2013-09-05 Thread Antony Pavlov
DIGIC is Canon Inc.'s name for a family of SoC
for digital cameras and camcorders.

There is no publicly available specification for
DIGIC chips. All information about DIGIC chip
internals is based on reverse engineering efforts
made by CHDK (http://chdk.wikia.com) and
Magic Lantern (http://www.magiclantern.fm) projects
contributors.

Signed-off-by: Antony Pavlov 
---
 default-configs/arm-softmmu.mak |  1 +
 hw/arm/Makefile.objs|  2 +-
 hw/arm/digic.c  | 70 +
 include/hw/arm/digic.h  | 23 ++
 4 files changed, 95 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/digic.c
 create mode 100644 include/hw/arm/digic.h

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index ac0815d..0d1d783 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -63,6 +63,7 @@ CONFIG_FRAMEBUFFER=y
 CONFIG_XILINX_SPIPS=y
 
 CONFIG_A9SCU=y
+CONFIG_DIGIC=y
 CONFIG_MARVELL_88W8618=y
 CONFIG_OMAP=y
 CONFIG_TSC210X=y
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 3671b42..e140485 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -3,5 +3,5 @@ obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
 obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
 obj-y += tosa.o versatilepb.o vexpress.o xilinx_zynq.o z2.o
 
-obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
+obj-y += armv7m.o digic.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
 obj-y += omap1.o omap2.o strongarm.o
diff --git a/hw/arm/digic.c b/hw/arm/digic.c
new file mode 100644
index 000..95a9fcd
--- /dev/null
+++ b/hw/arm/digic.c
@@ -0,0 +1,70 @@
+/*
+ * QEMU model of the Canon DIGIC SoC.
+ *
+ * Copyright (C) 2013 Antony Pavlov 
+ *
+ * This model is based on reverse engineering efforts
+ * made by CHDK (http://chdk.wikia.com) and
+ * Magic Lantern (http://www.magiclantern.fm) projects
+ * contributors.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ */
+
+#include "hw/sysbus.h"
+#include "target-arm/cpu-qom.h"
+#include "hw/arm/digic.h"
+
+static void digic_init(Object *obj)
+{
+DigicState *s = DIGIC(obj);
+
+object_initialize(&s->cpu, sizeof(s->cpu), "arm946-" TYPE_ARM_CPU);
+object_property_add_child(obj, "cpu", OBJECT(&s->cpu), NULL);
+}
+
+static void digic_realize(DeviceState *dev, Error **errp)
+{
+DigicState *s = DIGIC(dev);
+Error *err = NULL;
+
+object_property_set_bool(OBJECT(&s->cpu), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+}
+
+static void digic_class_init(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+
+dc->realize = digic_realize;
+}
+
+static const TypeInfo digic_type_info = {
+.name = TYPE_DIGIC,
+.parent = TYPE_DEVICE,
+.instance_size = sizeof(DigicState),
+.instance_init = digic_init,
+.class_init = digic_class_init,
+};
+
+static void digic_register_types(void)
+{
+type_register_static(&digic_type_info);
+}
+
+type_init(digic_register_types)
diff --git a/include/hw/arm/digic.h b/include/hw/arm/digic.h
new file mode 100644
index 000..0ef4723
--- /dev/null
+++ b/include/hw/arm/digic.h
@@ -0,0 +1,23 @@
+/*
+ * Misc DIGIC declarations
+ *
+ * Copyright (C) 2013 Antony Pavlov 
+ *
+ */
+
+#ifndef __DIGIC_H__
+#define __DIGIC_H__
+
+#include "cpu-qom.h"
+
+#define TYPE_DIGIC "digic"
+
+#define DIGIC(obj) OBJECT_CHECK(DigicState, (obj), TYPE_DIGIC)
+
+typedef struct DigicState {
+Object parent_obj;
+
+ARMCPU cpu;
+} DigicState;
+
+#endif /* __DIGIC_H__ */
-- 
1.8.4.rc3




[Qemu-devel] [RFC v4 0/5] hw/arm: add initial support for Canon DIGIC SoC

2013-09-05 Thread Antony Pavlov
[RFC v4 1/5] hw/arm: add very initial support for Canon DIGIC SoC
[RFC v4 2/5] hw/arm/digic: prepare DIGIC-based boards support
[RFC v4 3/5] hw/arm/digic: add timer support
[RFC v4 4/5] hw/arm/digic: add UART support
[RFC v4 5/5] hw/arm/digic: add NOR ROM support

 Changes since v3:

  1. fix typos and formatting
  2. digic-timer: drop DPRINTF
  3. digic-timer: fix DIGIC4_TIMER_BASE() macro
  4. digic.c: fix max timer device string

 Changes since v2:

  1. rebase over latest master;
* pass available size to object_initialize().
  2. digic-uart: qemu_log: use LOG_UNIMP instead LOG_GUEST_ERROR;
  3. digic-boards: update rom image load code: introduce digic_load_rom().

 Changes since v1:

  0. drop the "add ARM946E-S CPU" patch;
  1. convert to QOM, split DIGIC SoC code and board code
 (thanks to Andreas Fa:rber, Peter Maydell and Peter Crosthwaite);
  2. fix digic-uart (many thanks to Peter Crosthwaite
 for his comments);
  3. digic-boards: digic4_add_k8p3215uqb_rom(): update
 rom image load code: use the '-bios' option.

DIGIC is Canon Inc.'s name for a family of SoC
for digital cameras and camcorders.

See http://en.wikipedia.org/wiki/DIGIC for details.

There is no publicly available specification for
DIGIC chips. All information about DIGIC chip
internals is based on reverse engineering efforts
made by CHDK (http://chdk.wikia.com) and
Magic Lantern (http://www.magiclantern.fm) projects
contributors.

Also this patch series adds initial support for Canon
PowerShot A1100 IS compact camera (it is my only camera
with connected UART interface). As the DIGIC-based cameras
differences mostly are unsignificant (e.g. RAM-size,
ROM type and size, GPIO usage) the other compact
and DSLR cameras support can be easely added.

This DIGIC support patch series is inspired
by EOS QEMU from Magic Lantern project.
The main differences:
 * EOS QEMU uses home-brew all-in-one monolith design;
 this patch series uses conventional qemu object-centric design;
 * EOS QEMU tries provide simplest emulation for most
 controllers inside SoC to run Magic Lantern firmware;
 this patch series provide more complete support
 only for core devices to run barebox bootloader.
  ** EOS QEMU does not support timer counting
  (this patch series emulate 1 MHz counting);
  ** EOS QEMU support DIGIC UART only for output
  character to stderr; (this patch series emulate
  introduces full blown UART interface);
  ** EOS QEMU has incomplete ROM support;
  (this patch series uses conventional qemu pflash).

This initial DIGIC support can't be used to run
the original camera firmware, but it can successfully
run experimental version of barebox bootloader
(see http://www.barebox.org).

The last sources of barebox for PowerShot A1100 can be
obtained here:
  https://github.com/frantony/barebox/tree/next.digic.20130829

The precompiled ROM image usable with qemu can be
obtained here:

  
https://github.com/frantony/barebox/blob/next.digic.20130829/canon-a1100-rom1.bin

This ROM image (after "dancing bit" encoding) can be run on
real Canon A1100 camera.

The short build instruction for __previous__ DIGIC barebox
version (it can be used with more recent sources too) can
be obtained here:
  http://lists.infradead.org/pipermail/barebox/2013-August/016007.html



[Qemu-devel] [RFC v4 3/5] hw/arm/digic: add timer support

2013-09-05 Thread Antony Pavlov
Signed-off-by: Antony Pavlov 
---
 hw/arm/digic.c |  26 +++
 hw/timer/Makefile.objs |   1 +
 hw/timer/digic-timer.c | 117 +
 hw/timer/digic-timer.h |  19 
 include/hw/arm/digic.h |   7 +++
 5 files changed, 170 insertions(+)
 create mode 100644 hw/timer/digic-timer.c
 create mode 100644 hw/timer/digic-timer.h

diff --git a/hw/arm/digic.c b/hw/arm/digic.c
index 95a9fcd..5932a6a 100644
--- a/hw/arm/digic.c
+++ b/hw/arm/digic.c
@@ -30,21 +30,47 @@
 static void digic_init(Object *obj)
 {
 DigicState *s = DIGIC(obj);
+DeviceState *dev;
+int i;
 
 object_initialize(&s->cpu, sizeof(s->cpu), "arm946-" TYPE_ARM_CPU);
 object_property_add_child(obj, "cpu", OBJECT(&s->cpu), NULL);
+
+for (i = 0; i < DIGIC4_NB_TIMERS; i++) {
+#define DIGIC_TIMER_NAME_MLEN11
+char name[DIGIC_TIMER_NAME_MLEN];
+
+object_initialize(&s->timer[i], sizeof(s->timer[i]), TYPE_DIGIC_TIMER);
+dev = DEVICE(&s->timer[i]);
+qdev_set_parent_bus(dev, sysbus_get_default());
+snprintf(name, DIGIC_TIMER_NAME_MLEN, "timer[%d]", i);
+object_property_add_child(obj, name, OBJECT(&s->timer[i]), NULL);
+}
 }
 
 static void digic_realize(DeviceState *dev, Error **errp)
 {
 DigicState *s = DIGIC(dev);
 Error *err = NULL;
+SysBusDevice *sbd;
+int i;
 
 object_property_set_bool(OBJECT(&s->cpu), true, "realized", &err);
 if (err != NULL) {
 error_propagate(errp, err);
 return;
 }
+
+for (i = 0; i < DIGIC4_NB_TIMERS; i++) {
+object_property_set_bool(OBJECT(&s->timer[i]), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+
+sbd = SYS_BUS_DEVICE(&s->timer[i]);
+sysbus_mmio_map(sbd, 0, DIGIC4_TIMER_BASE(i));
+}
 }
 
 static void digic_class_init(ObjectClass *oc, void *data)
diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs
index eca5905..5479aee 100644
--- a/hw/timer/Makefile.objs
+++ b/hw/timer/Makefile.objs
@@ -25,5 +25,6 @@ obj-$(CONFIG_OMAP) += omap_synctimer.o
 obj-$(CONFIG_PXA2XX) += pxa2xx_timer.o
 obj-$(CONFIG_SH4) += sh_timer.o
 obj-$(CONFIG_TUSB6010) += tusb6010.o
+obj-$(CONFIG_DIGIC) += digic-timer.o
 
 obj-$(CONFIG_MC146818RTC) += mc146818rtc.o
diff --git a/hw/timer/digic-timer.c b/hw/timer/digic-timer.c
new file mode 100644
index 000..75e2713
--- /dev/null
+++ b/hw/timer/digic-timer.c
@@ -0,0 +1,117 @@
+/*
+ * QEMU model of the Canon Digic timer block.
+ *
+ * Copyright (C) 2013 Antony Pavlov 
+ *
+ * This model is based on reverse engineering efforts
+ * made by CHDK (http://chdk.wikia.com) and
+ * Magic Lantern (http://www.magiclantern.fm) projects
+ * contributors.
+ *
+ * See "Timer/Clock Module" docs here:
+ *   http://magiclantern.wikia.com/wiki/Register_Map
+ *
+ * The QEMU model of the OSTimer in PKUnity SoC by Guan Xuetao
+ * is used as a template.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ */
+
+#include "hw/sysbus.h"
+#include "hw/ptimer.h"
+#include "qemu/main-loop.h"
+
+#include "hw/timer/digic-timer.h"
+
+# define DIGIC_TIMER_CONTROL 0x00
+# define DIGIC_TIMER_VALUE 0x0c
+
+static uint64_t digic_timer_read(void *opaque, hwaddr offset, unsigned size)
+{
+DigicTimerState *s = opaque;
+uint32_t ret = 0;
+
+switch (offset) {
+case DIGIC_TIMER_VALUE:
+ret = (uint32_t)ptimer_get_count(s->ptimer);
+ret &= 0x;
+break;
+default:
+qemu_log_mask(LOG_UNIMP,
+  "digic-timer: read access to unknown register 0x"
+  TARGET_FMT_plx, offset);
+}
+
+return ret;
+}
+
+static void digic_timer_write(void *opaque, hwaddr offset,
+  uint64_t value, unsigned size)
+{
+DigicTimerState *s = opaque;
+
+/* FIXME: without documentation every write just starts timer */
+ptimer_set_limit(s->ptimer, 0x, 1);
+ptimer_run(s->ptimer, 1);
+}
+
+static const MemoryRegionOps digic_timer_ops = {
+.read = digic_timer_read,
+.write = digic_timer_write,
+.impl = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void digic_timer_tick(void *opaque)
+{
+DigicTimerState *s = opaque;
+
+ptimer_r

[Qemu-devel] [RFC v4 2/5] hw/arm/digic: prepare DIGIC-based boards support

2013-09-05 Thread Antony Pavlov
Also this patch adds initial support for Canon
PowerShot A1100 IS compact camera.

Signed-off-by: Antony Pavlov 
---
 hw/arm/Makefile.objs  |  2 +-
 hw/arm/digic_boards.c | 63 +++
 2 files changed, 64 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/digic_boards.c

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index e140485..f6e9533 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -1,4 +1,4 @@
-obj-y += boot.o collie.o exynos4_boards.o gumstix.o highbank.o
+obj-y += boot.o collie.o digic_boards.o exynos4_boards.o gumstix.o highbank.o
 obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
 obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
 obj-y += tosa.o versatilepb.o vexpress.o xilinx_zynq.o z2.o
diff --git a/hw/arm/digic_boards.c b/hw/arm/digic_boards.c
new file mode 100644
index 000..0b99227
--- /dev/null
+++ b/hw/arm/digic_boards.c
@@ -0,0 +1,63 @@
+#include "hw/boards.h"
+#include "exec/address-spaces.h"
+#include "hw/arm/digic.h"
+
+typedef struct DigicBoardState {
+DigicState *digic;
+MemoryRegion ram;
+} DigicBoardState;
+
+typedef struct DigicBoard {
+hwaddr ram_size;
+hwaddr start_addr;
+} DigicBoard;
+
+static void digic4_board_setup_ram(DigicBoardState *s, hwaddr ram_size)
+{
+memory_region_init_ram(&s->ram, NULL, "ram", ram_size);
+memory_region_add_subregion(get_system_memory(), 0, &s->ram);
+vmstate_register_ram_global(&s->ram);
+}
+
+static void digic4_board_init(DigicBoard *board)
+{
+Error *err = NULL;
+
+DigicBoardState *s = g_new(DigicBoardState, 1);
+
+s->digic = DIGIC(object_new(TYPE_DIGIC));
+object_property_set_bool(OBJECT(s->digic), true, "realized", &err);
+if (err != NULL) {
+fprintf(stderr, "Couldn't realize DIGIC SoC: %s\n",
+error_get_pretty(err));
+exit(1);
+}
+
+digic4_board_setup_ram(s, board->ram_size);
+
+s->digic->cpu.env.regs[15] = board->start_addr;
+}
+
+static DigicBoard digic4_board_canon_a1100 = {
+.ram_size = 64 * 1024 * 1024,
+/* CHDK recommends this address for ROM disassembly */
+.start_addr = 0xffc0,
+};
+
+static void canon_a1100_init(QEMUMachineInitArgs *args)
+{
+digic4_board_init(&digic4_board_canon_a1100);
+}
+
+static QEMUMachine canon_a1100 = {
+.name = "canon-a1100",
+.desc = "Canon PowerShot A1100 IS",
+.init = &canon_a1100_init,
+};
+
+static void digic_register_machines(void)
+{
+qemu_register_machine(&canon_a1100);
+}
+
+machine_init(digic_register_machines)
-- 
1.8.4.rc3




[Qemu-devel] [RFC v4 4/5] hw/arm/digic: add UART support

2013-09-05 Thread Antony Pavlov
Signed-off-by: Antony Pavlov 
---
 hw/arm/digic.c |  14 
 hw/char/Makefile.objs  |   1 +
 hw/char/digic-uart.c   | 197 +
 hw/char/digic-uart.h   |  27 +++
 include/hw/arm/digic.h |   4 +
 5 files changed, 243 insertions(+)
 create mode 100644 hw/char/digic-uart.c
 create mode 100644 hw/char/digic-uart.h

diff --git a/hw/arm/digic.c b/hw/arm/digic.c
index 5932a6a..d99ffd9 100644
--- a/hw/arm/digic.c
+++ b/hw/arm/digic.c
@@ -46,6 +46,11 @@ static void digic_init(Object *obj)
 snprintf(name, DIGIC_TIMER_NAME_MLEN, "timer[%d]", i);
 object_property_add_child(obj, name, OBJECT(&s->timer[i]), NULL);
 }
+
+object_initialize(&s->uart, sizeof(s->uart), TYPE_DIGIC_UART);
+dev = DEVICE(&s->uart);
+qdev_set_parent_bus(dev, sysbus_get_default());
+object_property_add_child(obj, "uart", OBJECT(&s->uart), NULL);
 }
 
 static void digic_realize(DeviceState *dev, Error **errp)
@@ -71,6 +76,15 @@ static void digic_realize(DeviceState *dev, Error **errp)
 sbd = SYS_BUS_DEVICE(&s->timer[i]);
 sysbus_mmio_map(sbd, 0, DIGIC4_TIMER_BASE(i));
 }
+
+object_property_set_bool(OBJECT(&s->uart), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+
+sbd = SYS_BUS_DEVICE(&s->uart);
+sysbus_mmio_map(sbd, 0, DIGIC_UART_BASE);
 }
 
 static void digic_class_init(ObjectClass *oc, void *data)
diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
index f8f3dbc..00d37ac 100644
--- a/hw/char/Makefile.objs
+++ b/hw/char/Makefile.objs
@@ -14,6 +14,7 @@ obj-$(CONFIG_COLDFIRE) += mcf_uart.o
 obj-$(CONFIG_OMAP) += omap_uart.o
 obj-$(CONFIG_SH4) += sh_serial.o
 obj-$(CONFIG_PSERIES) += spapr_vty.o
+obj-$(CONFIG_DIGIC) += digic-uart.o
 
 common-obj-$(CONFIG_ETRAXFS) += etraxfs_ser.o
 common-obj-$(CONFIG_ISA_DEBUG) += debugcon.o
diff --git a/hw/char/digic-uart.c b/hw/char/digic-uart.c
new file mode 100644
index 000..f41f74a
--- /dev/null
+++ b/hw/char/digic-uart.c
@@ -0,0 +1,197 @@
+/*
+ * QEMU model of the Canon Digic UART block.
+ *
+ * Copyright (C) 2013 Antony Pavlov 
+ *
+ * This model is based on reverse engineering efforts
+ * made by CHDK (http://chdk.wikia.com) and
+ * Magic Lantern (http://www.magiclantern.fm) projects
+ * contributors.
+ *
+ * See "Serial terminal" docs here:
+ *   http://magiclantern.wikia.com/wiki/Register_Map#Misc_Registers
+ *
+ * The QEMU model of the Milkymist UART block by Michael Walle
+ * is used as a template.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ */
+
+#include "hw/hw.h"
+#include "hw/sysbus.h"
+#include "sysemu/char.h"
+
+#include "hw/char/digic-uart.h"
+
+enum {
+ST_RX_RDY = (1 << 0),
+ST_TX_RDY = (1 << 1),
+};
+
+static uint64_t digic_uart_read(void *opaque, hwaddr addr,
+unsigned size)
+{
+DigicUartState *s = opaque;
+
+addr >>= 2;
+
+switch (addr) {
+case R_RX:
+s->regs[R_ST] &= ~(ST_RX_RDY);
+break;
+
+case R_ST:
+break;
+
+default:
+qemu_log_mask(LOG_UNIMP,
+  "digic-uart: read access to unknown register 0x"
+  TARGET_FMT_plx, addr << 2);
+}
+
+return s->regs[addr];
+}
+
+static void digic_uart_write(void *opaque, hwaddr addr, uint64_t value,
+ unsigned size)
+{
+DigicUartState *s = opaque;
+unsigned char ch = value;
+
+addr >>= 2;
+
+switch (addr) {
+case R_TX:
+if (s->chr) {
+qemu_chr_fe_write_all(s->chr, &ch, 1);
+}
+break;
+
+case R_ST:
+/*
+ * Ignore write to R_ST.
+ *
+ * The point is that this register is actively used
+ * during receiving and transmitting symbols,
+ * but we don't know the function of most of bits.
+ *
+ * Ignoring writes to R_ST is only a simplification
+ * of the model. It has no perceptible side effects
+ * for existing guests.
+ */
+break;
+
+default:
+qemu_log_mask(LOG_UNIMP,
+  "digic-uart: write access to unknown register 0x"
+  TARGET_FMT_plx, addr << 2);
+}
+}
+
+static const MemoryRegionOps uart_mmio_ops = {
+.read = digic_uart_read,
+.write = digic_uart_w

[Qemu-devel] [RFC v4 5/5] hw/arm/digic: add NOR ROM support

2013-09-05 Thread Antony Pavlov
Signed-off-by: Antony Pavlov 
---
 hw/arm/digic_boards.c | 74 +++
 1 file changed, 74 insertions(+)

diff --git a/hw/arm/digic_boards.c b/hw/arm/digic_boards.c
index 0b99227..850e320 100644
--- a/hw/arm/digic_boards.c
+++ b/hw/arm/digic_boards.c
@@ -1,6 +1,13 @@
 #include "hw/boards.h"
 #include "exec/address-spaces.h"
 #include "hw/arm/digic.h"
+#include "hw/block/flash.h"
+#include "hw/loader.h"
+#include "sysemu/sysemu.h"
+
+#define DIGIC4_ROM0_BASE  0xf000
+#define DIGIC4_ROM1_BASE  0xf800
+# define DIGIC4_ROM_MAX_SIZE  0x0800
 
 typedef struct DigicBoardState {
 DigicState *digic;
@@ -9,6 +16,10 @@ typedef struct DigicBoardState {
 
 typedef struct DigicBoard {
 hwaddr ram_size;
+void (*add_rom0)(DigicBoardState *, hwaddr, const char *);
+const char *rom0_def_filename;
+void (*add_rom1)(DigicBoardState *, hwaddr, const char *);
+const char *rom1_def_filename;
 hwaddr start_addr;
 } DigicBoard;
 
@@ -35,11 +46,74 @@ static void digic4_board_init(DigicBoard *board)
 
 digic4_board_setup_ram(s, board->ram_size);
 
+if (board->add_rom0) {
+board->add_rom0(s, DIGIC4_ROM0_BASE, board->rom0_def_filename);
+}
+
+if (board->add_rom1) {
+board->add_rom1(s, DIGIC4_ROM1_BASE, board->rom1_def_filename);
+}
+
 s->digic->cpu.env.regs[15] = board->start_addr;
 }
 
+static void digic_load_rom(DigicBoardState *s, hwaddr addr,
+   hwaddr max_size, const char *def_filename)
+{
+
+target_long rom_size;
+const char *filename;
+
+if (bios_name) {
+filename = bios_name;
+} else {
+filename = def_filename;
+}
+
+if (filename) {
+char *fn = qemu_find_file(QEMU_FILE_TYPE_BIOS, filename);
+
+if (!fn) {
+fprintf(stderr, "Couldn't find rom image '%s'.\n", filename);
+exit(1);
+}
+
+rom_size = load_image_targphys(fn, addr, max_size);
+if (rom_size < 0 || rom_size > max_size) {
+fprintf(stderr, "Couldn't load rom image '%s'\n", filename);
+exit(1);
+}
+}
+}
+
+static void digic4_add_k8p3215uqb_rom(DigicBoardState *s, hwaddr addr,
+  const char *def_filename)
+{
+#define FLASH_K8P3215UQB_SIZE (4 * 1024 * 1024)
+#define FLASH_K8P3215UQB_SECTOR_SIZE (64 * 1024)
+
+/*
+ * Samsung K8P3215UQB:
+ *  * AMD command set;
+ *  * multiple sector size: some sectors are 8K the other ones are 64K.
+ * Alas! The pflash_cfi02_register() function creates a flash
+ * device with unified sector size.
+ */
+pflash_cfi02_register(addr, NULL, "pflash", FLASH_K8P3215UQB_SIZE,
+  NULL, FLASH_K8P3215UQB_SECTOR_SIZE,
+  FLASH_K8P3215UQB_SIZE / FLASH_K8P3215UQB_SECTOR_SIZE,
+  DIGIC4_ROM_MAX_SIZE / FLASH_K8P3215UQB_SIZE,
+  4,
+  0x00EC, 0x007E, 0x0003, 0x0001,
+  0x0555, 0x2aa, 0);
+
+digic_load_rom(s, addr, FLASH_K8P3215UQB_SIZE, def_filename);
+}
+
 static DigicBoard digic4_board_canon_a1100 = {
 .ram_size = 64 * 1024 * 1024,
+.add_rom1 = digic4_add_k8p3215uqb_rom,
+.rom1_def_filename = "canon-a1100-rom1.bin",
 /* CHDK recommends this address for ROM disassembly */
 .start_addr = 0xffc0,
 };
-- 
1.8.4.rc3




[Qemu-devel] [RFC 1/3] bdrv: Use "Error" for opening images

2013-09-05 Thread Max Reitz
Add an Error ** parameter to bdrv_open, bdrv_file_open and bdrv_create
to allow more specific error messages.

Signed-off-by: Max Reitz 
---
 block.c   |  6 +++---
 block/blkdebug.c  |  3 ++-
 block/blkverify.c |  3 ++-
 block/bochs.c |  3 ++-
 block/cloop.c |  3 ++-
 block/cow.c   |  6 --
 block/dmg.c   |  3 ++-
 block/nbd.c   |  3 ++-
 block/parallels.c |  3 ++-
 block/qcow.c  |  6 --
 block/qcow2.c |  8 +---
 block/qed.c   |  8 +---
 block/raw-posix.c | 18 --
 block/raw_bsd.c   |  6 --
 block/sheepdog.c  |  6 --
 block/snapshot.c  |  2 +-
 block/vdi.c   |  6 --
 block/vhdx.c  |  3 ++-
 block/vmdk.c  |  6 --
 block/vpc.c   |  6 --
 block/vvfat.c |  3 ++-
 include/block/block_int.h |  9 ++---
 22 files changed, 78 insertions(+), 42 deletions(-)

diff --git a/block.c b/block.c
index 26639e8..f485906 100644
--- a/block.c
+++ b/block.c
@@ -374,7 +374,7 @@ static void coroutine_fn bdrv_create_co_entry(void *opaque)
 CreateCo *cco = opaque;
 assert(cco->drv);
 
-cco->ret = cco->drv->bdrv_create(cco->filename, cco->options);
+cco->ret = cco->drv->bdrv_create(cco->filename, cco->options, NULL);
 }
 
 int bdrv_create(BlockDriver *drv, const char* filename,
@@ -734,7 +734,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 if (drv->bdrv_file_open) {
 assert(file == NULL);
 assert(drv->bdrv_parse_filename || filename != NULL);
-ret = drv->bdrv_file_open(bs, options, open_flags);
+ret = drv->bdrv_file_open(bs, options, open_flags, NULL);
 } else {
 if (file == NULL) {
 qerror_report(ERROR_CLASS_GENERIC_ERROR, "Can't use '%s' as a "
@@ -744,7 +744,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 goto free_and_fail;
 }
 bs->file = file;
-ret = drv->bdrv_open(bs, options, open_flags);
+ret = drv->bdrv_open(bs, options, open_flags, NULL);
 }
 
 if (ret < 0) {
diff --git a/block/blkdebug.c b/block/blkdebug.c
index 5d33e03..52d65ff 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -350,7 +350,8 @@ static QemuOptsList runtime_opts = {
 },
 };
 
-static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags)
+static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
+ Error **errp)
 {
 BDRVBlkdebugState *s = bs->opaque;
 QemuOpts *opts;
diff --git a/block/blkverify.c b/block/blkverify.c
index 1d58cc3..5d716bb 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -116,7 +116,8 @@ static QemuOptsList runtime_opts = {
 },
 };
 
-static int blkverify_open(BlockDriverState *bs, QDict *options, int flags)
+static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
+  Error **errp)
 {
 BDRVBlkverifyState *s = bs->opaque;
 QemuOpts *opts;
diff --git a/block/bochs.c b/block/bochs.c
index d7078c0..51d9a90 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -108,7 +108,8 @@ static int bochs_probe(const uint8_t *buf, int buf_size, 
const char *filename)
 return 0;
 }
 
-static int bochs_open(BlockDriverState *bs, QDict *options, int flags)
+static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
+  Error **errp)
 {
 BDRVBochsState *s = bs->opaque;
 int i;
diff --git a/block/cloop.c b/block/cloop.c
index 6ea7cf4..b907023 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -53,7 +53,8 @@ static int cloop_probe(const uint8_t *buf, int buf_size, 
const char *filename)
 return 0;
 }
 
-static int cloop_open(BlockDriverState *bs, QDict *options, int flags)
+static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
+  Error **errp)
 {
 BDRVCloopState *s = bs->opaque;
 uint32_t offsets_size, max_compressed_block_size = 1, i;
diff --git a/block/cow.c b/block/cow.c
index 1cc2e89..207fea1 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -58,7 +58,8 @@ static int cow_probe(const uint8_t *buf, int buf_size, const 
char *filename)
 return 0;
 }
 
-static int cow_open(BlockDriverState *bs, QDict *options, int flags)
+static int cow_open(BlockDriverState *bs, QDict *options, int flags,
+Error **errp)
 {
 BDRVCowState *s = bs->opaque;
 struct cow_header_v2 cow_header;
@@ -255,7 +256,8 @@ static void cow_close(BlockDriverState *bs)
 {
 }
 
-static int cow_create(const char *filename, QEMUOptionParameter *options)
+static int cow_create(const char *filename, QEMUOptionParameter *options,
+  Error **errp)
 {
 struct cow_header_v2 cow_header;
 struct stat st;
diff --git a/block/dmg.c b/block/dmg.c
index 3141cb5..d5e9b1f 100

[Qemu-devel] [RFC 0/3] block: Error parameter for opening/creating images

2013-09-05 Thread Max Reitz
This RFC adds an Error ** parameter to bdrv_open, bdrv_file_open,
bdrv_create and the respective functions provided by a block driver.

This results in more specific error information than just -errno provided
to the user when opening or creating images (disregarding the fact that
block drivers often already use error_report, which is generally changed
to error_setg through this patch).

The last patch in this series changes the qcow2 block driver to set an
example of usage in a block driver.

Note that several I/O tests break by applying this RFC since they expect
different error messages (generally, previously, an error message on
image opening/creation consisted of two lines; the first of which would be
generated by the driver through error_report, the second by the block
layer itself through strerror(-ret); this patch is designed to merge these
two lines into a single one). This applies to the tests 49, 51, 54 and 60.

Max Reitz (3):
  bdrv: Use "Error" for opening images
  block: Error parameter for opening functions
  qcow2: Use Error parameter

 block.c   | 164 ++
 block/blkdebug.c  |   5 +-
 block/blkverify.c |   7 +-
 block/bochs.c |   3 +-
 block/cloop.c |   3 +-
 block/cow.c   |  10 +--
 block/dmg.c   |   3 +-
 block/mirror.c|   5 +-
 block/nbd.c   |   3 +-
 block/parallels.c |   3 +-
 block/qcow.c  |  10 +--
 block/qcow2.c | 143 ++--
 block/qed.c   |  13 ++--
 block/raw-posix.c |  18 +++--
 block/raw_bsd.c   |   8 ++-
 block/sheepdog.c  |  10 +--
 block/snapshot.c  |   2 +-
 block/vdi.c   |   6 +-
 block/vhdx.c  |   3 +-
 block/vmdk.c  |  11 ++--
 block/vpc.c   |   6 +-
 block/vvfat.c |   7 +-
 blockdev.c|  30 -
 include/block/block.h |  11 ++--
 include/block/block_int.h |   9 ++-
 qemu-img.c|  39 +--
 qemu-io.c |  14 ++--
 qemu-nbd.c|   6 +-
 28 files changed, 343 insertions(+), 209 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [RFC 2/3] block: Error parameter for opening functions

2013-09-05 Thread Max Reitz
Add an Error ** parameter to bdrv_open, bdrv_file_open, bdrv_create and
associated functions to allow more specific error messages.

Signed-off-by: Max Reitz 
---
 block.c   | 164 --
 block/blkdebug.c  |   2 +-
 block/blkverify.c |   4 +-
 block/cow.c   |   4 +-
 block/mirror.c|   5 +-
 block/qcow.c  |   4 +-
 block/qcow2.c |   6 +-
 block/qed.c   |   5 +-
 block/raw_bsd.c   |   2 +-
 block/sheepdog.c  |   4 +-
 block/vmdk.c  |   5 +-
 block/vvfat.c |   4 +-
 blockdev.c|  30 -
 include/block/block.h |  11 ++--
 qemu-img.c|  39 ++--
 qemu-io.c |  14 +++--
 qemu-nbd.c|   6 +-
 17 files changed, 183 insertions(+), 126 deletions(-)

diff --git a/block.c b/block.c
index f485906..d734265 100644
--- a/block.c
+++ b/block.c
@@ -367,18 +367,26 @@ typedef struct CreateCo {
 char *filename;
 QEMUOptionParameter *options;
 int ret;
+Error *err;
 } CreateCo;
 
 static void coroutine_fn bdrv_create_co_entry(void *opaque)
 {
+Error *local_err = NULL;
+int ret;
+
 CreateCo *cco = opaque;
 assert(cco->drv);
 
-cco->ret = cco->drv->bdrv_create(cco->filename, cco->options, NULL);
+ret = cco->drv->bdrv_create(cco->filename, cco->options, &local_err);
+if (error_is_set(&local_err)) {
+error_propagate(&cco->err, local_err);
+}
+cco->ret = ret;
 }
 
 int bdrv_create(BlockDriver *drv, const char* filename,
-QEMUOptionParameter *options)
+QEMUOptionParameter *options, Error **errp)
 {
 int ret;
 
@@ -388,9 +396,11 @@ int bdrv_create(BlockDriver *drv, const char* filename,
 .filename = g_strdup(filename),
 .options = options,
 .ret = NOT_DONE,
+.err = NULL,
 };
 
 if (!drv->bdrv_create) {
+error_setg(errp, "Driver '%s' does not support image creation", 
drv->format_name);
 ret = -ENOTSUP;
 goto out;
 }
@@ -407,22 +417,37 @@ int bdrv_create(BlockDriver *drv, const char* filename,
 }
 
 ret = cco.ret;
+if (ret < 0) {
+if (error_is_set(&cco.err)) {
+error_propagate(errp, cco.err);
+} else {
+error_setg_errno(errp, -ret, "Could not create image");
+}
+}
 
 out:
 g_free(cco.filename);
 return ret;
 }
 
-int bdrv_create_file(const char* filename, QEMUOptionParameter *options)
+int bdrv_create_file(const char* filename, QEMUOptionParameter *options,
+ Error **errp)
 {
 BlockDriver *drv;
+Error *local_err = NULL;
+int ret;
 
 drv = bdrv_find_protocol(filename, true);
 if (drv == NULL) {
+error_setg(errp, "Could not find protocol for file '%s'", filename);
 return -ENOENT;
 }
 
-return bdrv_create(drv, filename, options);
+ret = bdrv_create(drv, filename, options, &local_err);
+if (error_is_set(&local_err)) {
+error_propagate(errp, local_err);
+}
+return ret;
 }
 
 /*
@@ -525,7 +550,7 @@ BlockDriver *bdrv_find_protocol(const char *filename,
 }
 
 static int find_image_format(BlockDriverState *bs, const char *filename,
- BlockDriver **pdrv)
+ BlockDriver **pdrv, Error **errp)
 {
 int score, score_max;
 BlockDriver *drv1, *drv;
@@ -536,6 +561,7 @@ static int find_image_format(BlockDriverState *bs, const 
char *filename,
 if (bs->sg || !bdrv_is_inserted(bs) || bdrv_getlength(bs) == 0) {
 drv = bdrv_find_format("raw");
 if (!drv) {
+error_setg(errp, "Could not find raw image format");
 ret = -ENOENT;
 }
 *pdrv = drv;
@@ -544,6 +570,8 @@ static int find_image_format(BlockDriverState *bs, const 
char *filename,
 
 ret = bdrv_pread(bs, 0, buf, sizeof(buf));
 if (ret < 0) {
+error_setg_errno(errp, -ret, "Could not read image for determining its 
"
+ "format");
 *pdrv = NULL;
 return ret;
 }
@@ -560,6 +588,8 @@ static int find_image_format(BlockDriverState *bs, const 
char *filename,
 }
 }
 if (!drv) {
+error_setg(errp, "Could not determine image format: No compatible "
+   "driver found");
 ret = -ENOENT;
 }
 *pdrv = drv;
@@ -679,10 +709,11 @@ static int bdrv_open_flags(BlockDriverState *bs, int 
flags)
  * Removes all processed options from *options.
  */
 static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file,
-QDict *options, int flags, BlockDriver *drv)
+QDict *options, int flags, BlockDriver *drv, Error **errp)
 {
 int ret, open_flags;
 const char *filename;
+Error *local_err = NULL;
 
 assert(drv != NULL);
 assert(bs->file == NULL);
@@ -711,6 +742,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 bs->read_only = !(open_flags & BDRV_O_RDWR);

[Qemu-devel] [RFC 3/3] qcow2: Use Error parameter

2013-09-05 Thread Max Reitz
Employ usage of the new Error ** parameter in qcow2_open, qcow2_create
and associated functions.

Signed-off-by: Max Reitz 
---
 block/qcow2.c | 135 ++
 1 file changed, 88 insertions(+), 47 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index e16d352..9a96188 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1,4 +1,5 @@
 /*
+ * error_setg(errp, "
  * Block driver for the QCOW version 2 format
  *
  * Copyright (c) 2004-2006 Fabrice Bellard
@@ -79,7 +80,8 @@ static int qcow2_probe(const uint8_t *buf, int buf_size, 
const char *filename)
  * return 0 upon success, non-0 otherwise
  */
 static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
- uint64_t end_offset, void **p_feature_table)
+ uint64_t end_offset, void **p_feature_table,
+ Error **errp)
 {
 BDRVQcowState *s = bs->opaque;
 QCowExtension ext;
@@ -100,10 +102,10 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 printf("attempting to read extended header in offset %lu\n", offset);
 #endif
 
-if (bdrv_pread(bs->file, offset, &ext, sizeof(ext)) != sizeof(ext)) {
-fprintf(stderr, "qcow2_read_extension: ERROR: "
-"pread fail from offset %" PRIu64 "\n",
-offset);
+ret = bdrv_pread(bs->file, offset, &ext, sizeof(ext));
+if (ret < 0) {
+error_setg_errno(errp, -ret, "qcow2_read_extension: ERROR: "
+ "pread fail from offset %" PRIu64, offset);
 return 1;
 }
 be32_to_cpus(&ext.magic);
@@ -113,7 +115,7 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 printf("ext.magic = 0x%x\n", ext.magic);
 #endif
 if (ext.len > end_offset - offset) {
-error_report("Header extension too large");
+error_setg(errp, "Header extension too large");
 return -EINVAL;
 }
 
@@ -123,17 +125,19 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 
 case QCOW2_EXT_MAGIC_BACKING_FORMAT:
 if (ext.len >= sizeof(bs->backing_format)) {
-fprintf(stderr, "ERROR: ext_backing_format: len=%u too large"
-" (>=%zu)\n",
-ext.len, sizeof(bs->backing_format));
+error_setg(errp, "ERROR: ext_backing_format: len=%u too large"
+   " (>=%zu)", ext.len, sizeof(bs->backing_format));
 return 2;
 }
-if (bdrv_pread(bs->file, offset , bs->backing_format,
-   ext.len) != ext.len)
+ret = bdrv_pread(bs->file, offset, bs->backing_format, ext.len);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "ERROR: ext_backing_format: "
+ "Could not read format name");
 return 3;
+}
 bs->backing_format[ext.len] = '\0';
 #ifdef DEBUG_EXT
-printf("Qcow2: Got format extension %s\n", bs->backing_format);
+printf("Qcow2: Got format extension %s", bs->backing_format);
 #endif
 break;
 
@@ -142,6 +146,8 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 void* feature_table = g_malloc0(ext.len + 2 * 
sizeof(Qcow2Feature));
 ret = bdrv_pread(bs->file, offset , feature_table, ext.len);
 if (ret < 0) {
+error_setg_errno(errp, -ret, "ERROR: ext_feature_table: "
+ "Could not read table");
 return ret;
 }
 
@@ -161,6 +167,8 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 
 ret = bdrv_pread(bs->file, offset , uext->data, uext->len);
 if (ret < 0) {
+error_setg_errno(errp, -ret, "ERROR: unknown extension: "
+ "Could not read data");
 return ret;
 }
 }
@@ -184,8 +192,8 @@ static void cleanup_unknown_header_ext(BlockDriverState *bs)
 }
 }
 
-static void GCC_FMT_ATTR(2, 3) report_unsupported(BlockDriverState *bs,
-const char *fmt, ...)
+static void GCC_FMT_ATTR(3, 4) report_unsupported(BlockDriverState *bs,
+Error **errp, const char *fmt, ...)
 {
 char msg[64];
 va_list ap;
@@ -194,17 +202,17 @@ static void GCC_FMT_ATTR(2, 3) 
report_unsupported(BlockDriverState *bs,
 vsnprintf(msg, sizeof(msg), fmt, ap);
 va_end(ap);
 
-qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
-bs->device_name, "qcow2", msg);
+error_set(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, bs->device_name, 
"qcow2",
+  msg);
 }
 
 static void report_unsupported_feature(BlockD

[Qemu-devel] [PATCH 0/2] tcg-ppc: use new return-argument ld/st helpers

2013-09-05 Thread Paolo Bonzini
Last month I revived my old PowerBook, and here are the resulting patches
to use the new return-argument ld/st helpers.  I have a few more tcg-ppc
patches but they have a much smaller performance impact so I'll wait
till I have some more free time before posting.  But the impact of the
new helpers is huge, and AIUI Richard wants to get rid of GETPC_LDST
so here are these.

Paolo Bonzini (2):
  tcg-ppc: fix qemu_ld/qemu_st for AIX ABI
  tcg-ppc: use new return-argument ld/st helpers

 include/exec/exec-all.h |  4 +---
 tcg/ppc/tcg-target.c| 56 -
 2 files changed, 29 insertions(+), 31 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH 1/2] tcg-ppc: fix qemu_ld/qemu_st for AIX ABI

2013-09-05 Thread Paolo Bonzini
For the AIX ABI, the function pointer and small area pointer need
to be loaded in the trampoline.  The trampoline instead is called
with a normal BL instruction.

Signed-off-by: Paolo Bonzini 
---
 tcg/ppc/tcg-target.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/tcg/ppc/tcg-target.c b/tcg/ppc/tcg-target.c
index 2595556..204ffbe 100644
--- a/tcg/ppc/tcg-target.c
+++ b/tcg/ppc/tcg-target.c
@@ -490,7 +490,8 @@ static void tcg_out_b (TCGContext *s, int mask, 
tcg_target_long target)
 }
 }
 
-static void tcg_out_call (TCGContext *s, tcg_target_long arg, int const_arg)
+static void tcg_out_call (TCGContext *s, tcg_target_long arg, int const_arg,
+  int lk)
 {
 #ifdef _CALL_AIX
 int reg;
@@ -504,14 +505,14 @@ static void tcg_out_call (TCGContext *s, tcg_target_long 
arg, int const_arg)
 tcg_out32 (s, LWZ | RT (0) | RA (reg));
 tcg_out32 (s, MTSPR | RA (0) | CTR);
 tcg_out32 (s, LWZ | RT (2) | RA (reg) | 4);
-tcg_out32 (s, BCCTR | BO_ALWAYS | LK);
+tcg_out32 (s, BCCTR | BO_ALWAYS | lk);
 #else
 if (const_arg) {
-tcg_out_b (s, LK, arg);
+tcg_out_b (s, lk, arg);
 }
 else {
 tcg_out32 (s, MTSPR | RS (arg) | LR);
-tcg_out32 (s, BCLR | BO_ALWAYS | LK);
+tcg_out32 (s, BCLR | BO_ALWAYS | lk);
 }
 #endif
 }
@@ -860,7 +861,7 @@ static void tcg_out_qemu_ld_slow_path (TCGContext *s, 
TCGLabelQemuLdst *label)
 tcg_out_mov (s, TCG_TYPE_I32, ir++, addr_reg);
 #endif
 tcg_out_movi (s, TCG_TYPE_I32, ir, mem_index);
-tcg_out_call (s, (tcg_target_long) ld_trampolines[s_bits], 1);
+tcg_out_b (s, LK, (tcg_target_long) ld_trampolines[s_bits]);
 tcg_out32 (s, (tcg_target_long) raddr);
 switch (opc) {
 case 0|4:
@@ -954,7 +955,7 @@ static void tcg_out_qemu_st_slow_path (TCGContext *s, 
TCGLabelQemuLdst *label)
 ir++;
 
 tcg_out_movi (s, TCG_TYPE_I32, ir, mem_index);
-tcg_out_call (s, (tcg_target_long) st_trampolines[opc], 1);
+tcg_out_b (s, LK, (tcg_target_long) st_trampolines[opc]);
 tcg_out32 (s, (tcg_target_long) raddr);
 tcg_out_b (s, 0, (tcg_target_long) raddr);
 }
@@ -984,7 +985,7 @@ static void emit_ldst_trampoline (TCGContext *s, const void 
*ptr)
 tcg_out32 (s, ADDI | RT (3) | RA (3) | 4);
 tcg_out32 (s, MTSPR | RS (3) | LR);
 tcg_out_mov (s, TCG_TYPE_I32, 3, TCG_AREG0);
-tcg_out_b (s, 0, (tcg_target_long) ptr);
+tcg_out_call (s, (tcg_target_long) ptr, 1, 0);
 }
 #endif
 
@@ -1493,7 +1494,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, 
const TCGArg *args,
 }
 break;
 case INDEX_op_call:
-tcg_out_call (s, args[0], const_args[0]);
+tcg_out_call (s, args[0], const_args[0], LK);
 break;
 case INDEX_op_movi_i32:
 tcg_out_movi(s, TCG_TYPE_I32, args[0], args[1]);
-- 
1.8.3.1





[Qemu-devel] [PATCH 2/2] tcg-ppc: use new return-argument ld/st helpers

2013-09-05 Thread Paolo Bonzini
These use a 32-bit load-of-immediate to save a mflr+addi+mtlr sequence.
Tested with a Windows 98 guest (pretty much the most recent thing I
could run on my PPC machine) and kvm-unit-tests's sieve.flat.  The
speed up for sieve.flat is as high as 10% for qemu-system-i386, 25%
(no kidding) for qemu-system-x86_64 on my PowerBook G4.

Signed-off-by: Paolo Bonzini 
---
 include/exec/exec-all.h |  4 +---
 tcg/ppc/tcg-target.c| 41 -
 2 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index beb4149..a81e805 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -324,9 +324,7 @@ extern uintptr_t tci_tb_ptr;
In some implementations, we pass the "logical" return address manually;
in others, we must infer the logical return from the true return.  */
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
-# if defined (_ARCH_PPC) && !defined (_ARCH_PPC64)
-#  define GETRA_LDST(RA)   (*(int32_t *)((RA) - 4))
-# elif defined(__arm__)
+# if defined(__arm__)
 /* We define two insns between the return address and the branch back to
straight-line.  Find and decode that branch insn.  */
 #  define GETRA_LDST(RA)   tcg_getra_ldst(RA)
diff --git a/tcg/ppc/tcg-target.c b/tcg/ppc/tcg-target.c
index 204ffbe..24a8621 100644
--- a/tcg/ppc/tcg-target.c
+++ b/tcg/ppc/tcg-target.c
@@ -550,22 +550,24 @@ static void add_qemu_ldst_label (TCGContext *s,
 label->label_ptr[0] = label_ptr;
 }
 
-/* helper signature: helper_ld_mmu(CPUState *env, target_ulong addr,
-   int mmu_idx) */
+/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
+ * int mmu_idx, uintptr_t ra)
+ */
 static const void * const qemu_ld_helpers[4] = {
-helper_ldb_mmu,
-helper_ldw_mmu,
-helper_ldl_mmu,
-helper_ldq_mmu,
+helper_ret_ldub_mmu,
+helper_ret_lduw_mmu,
+helper_ret_ldul_mmu,
+helper_ret_ldq_mmu,
 };
 
-/* helper signature: helper_st_mmu(CPUState *env, target_ulong addr,
-   uintxx_t val, int mmu_idx) */
+/* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr,
+ * uintxx_t val, int mmu_idx, uintptr_t ra)
+ */
 static const void * const qemu_st_helpers[4] = {
-helper_stb_mmu,
-helper_stw_mmu,
-helper_stl_mmu,
-helper_stq_mmu,
+helper_ret_stb_mmu,
+helper_ret_stw_mmu,
+helper_ret_stl_mmu,
+helper_ret_stq_mmu,
 };
 
 static void *ld_trampolines[4];
@@ -860,9 +862,9 @@ static void tcg_out_qemu_ld_slow_path (TCGContext *s, 
TCGLabelQemuLdst *label)
 tcg_out_mov (s, TCG_TYPE_I32, ir++, label->addrhi_reg);
 tcg_out_mov (s, TCG_TYPE_I32, ir++, addr_reg);
 #endif
-tcg_out_movi (s, TCG_TYPE_I32, ir, mem_index);
+tcg_out_movi (s, TCG_TYPE_I32, ir++, mem_index);
+tcg_out_movi (s, TCG_TYPE_I32, ir, (tcg_target_long) raddr);
 tcg_out_b (s, LK, (tcg_target_long) ld_trampolines[s_bits]);
-tcg_out32 (s, (tcg_target_long) raddr);
 switch (opc) {
 case 0|4:
 tcg_out32 (s, EXTSB | RA (data_reg) | RS (3));
@@ -954,10 +956,10 @@ static void tcg_out_qemu_st_slow_path (TCGContext *s, 
TCGLabelQemuLdst *label)
 }
 ir++;
 
-tcg_out_movi (s, TCG_TYPE_I32, ir, mem_index);
-tcg_out_b (s, LK, (tcg_target_long) st_trampolines[opc]);
-tcg_out32 (s, (tcg_target_long) raddr);
-tcg_out_b (s, 0, (tcg_target_long) raddr);
+tcg_out_movi (s, TCG_TYPE_I32, ir++, mem_index);
+tcg_out_movi (s, TCG_TYPE_I32, ir, (tcg_target_long) raddr);
+tcg_out32 (s, MTSPR | RS (ir) | LR);
+tcg_out_b (s, 0, (tcg_target_long) st_trampolines[opc]);
 }
 
 void tcg_out_tb_finalize(TCGContext *s)
@@ -981,9 +983,6 @@ void tcg_out_tb_finalize(TCGContext *s)
 #ifdef CONFIG_SOFTMMU
 static void emit_ldst_trampoline (TCGContext *s, const void *ptr)
 {
-tcg_out32 (s, MFSPR | RT (3) | LR);
-tcg_out32 (s, ADDI | RT (3) | RA (3) | 4);
-tcg_out32 (s, MTSPR | RS (3) | LR);
 tcg_out_mov (s, TCG_TYPE_I32, 3, TCG_AREG0);
 tcg_out_call (s, (tcg_target_long) ptr, 1, 0);
 }
-- 
1.8.3.1




[Qemu-devel] [ARM][regression][bisect] ARM target broken: test v5 image does not start kernel

2013-09-05 Thread Claudio Fontana
Hello all,

I just finished bisecting a regression I am experiencing on ARM 32bit target,
while testing qemu-system-arm built for the aarch64 host running the ARMv5 
integrator image from our test page at

http://wiki.qemu.org/Testing

The breakage assumes the form of failure to start the kernel after

Uncompressing Linux done, booting the kernel.

With info registers showing registers stuck, with PC=c002f6bc.
Worked fine in July.

Can you help me get this to work again?

After a painful bisection, I got a first bad commit, which when reverted on the 
latest QEMU fixes the issue for me. Maybe something needs to be adapted to the 
return value (back-)change?

Bisection logs follow.

Thank you all,

Claudio

9b8c69243585a32d14b9bb9fcd52c37b0b5a1b71 is the first bad commit
commit 9b8c69243585a32d14b9bb9fcd52c37b0b5a1b71
Author: Jan Kiszka 
Date:   Tue Jul 16 14:45:16 2013 +0200

memory: Return -1 again on reads from unsigned regions

This restore the behavior prior to b018ddf633 which accidentally changed
the return code to 0. Specifically guests probing for register existence
were affected by this.

Signed-off-by: Jan Kiszka 
Signed-off-by: Paolo Bonzini 

:100644 100644 9938b6ba458e6e3caeb18d64b9fac2c64bc04c34 
34a088e90f4cb745e6fd462c6fc9b14bc6b25f41 M  memory.c

git bisect start
# bad: [aaa6a40194e9f204cb853f64ef3c1e170bb014e8] Merge remote-tracking branch 
'afaerber/tags/qom-cpu-for-anthony' into staging
git bisect bad aaa6a40194e9f204cb853f64ef3c1e170bb014e8
# good: [1acd5a373905ddb28957842256a038956941f332] Merge remote-tracking branch 
'agraf/ppc-for-upstream' into staging
git bisect good 1acd5a373905ddb28957842256a038956941f332
# bad: [f9e741903982e55c0dc138ab2e61059a4f3c9a76] cs4231: QOM cast cleanup
git bisect bad f9e741903982e55c0dc138ab2e61059a4f3c9a76
# bad: [f9e741903982e55c0dc138ab2e61059a4f3c9a76] cs4231: QOM cast cleanup
git bisect bad f9e741903982e55c0dc138ab2e61059a4f3c9a76
# bad: [c9fea5d701f8fd33f0843728ec264d95cee3ed37] Merge remote-tracking branch 
'bonzini/iommu-for-anthony' into staging
git bisect bad c9fea5d701f8fd33f0843728ec264d95cee3ed37
# good: [91b1df8cf9e1ecaa8679c9ea8713d1e25c28e6c4] cpu: Move reset logging to 
CPUState
git bisect good 91b1df8cf9e1ecaa8679c9ea8713d1e25c28e6c4
# good: [91b1df8cf9e1ecaa8679c9ea8713d1e25c28e6c4] cpu: Move reset logging to 
CPUState
git bisect good 91b1df8cf9e1ecaa8679c9ea8713d1e25c28e6c4
# good: [a34001fab5da2d0df605a8b83880c917c8aa0606] Merge remote-tracking branch 
'rth/axp-next' into staging
git bisect good a34001fab5da2d0df605a8b83880c917c8aa0606
# good: [a34001fab5da2d0df605a8b83880c917c8aa0606] Merge remote-tracking branch 
'rth/axp-next' into staging
git bisect good a34001fab5da2d0df605a8b83880c917c8aa0606
# good: [fdd26fca3ce66863e547560fbde1a444fc5d71b7] libqtest: Plug fd and memory 
leaks in qtest_quit()
git bisect good fdd26fca3ce66863e547560fbde1a444fc5d71b7
# good: [293706dd682f578b457d052988cf3c20b4eab82d] Merge remote-tracking branch 
'rth/axp-next' into staging
git bisect good 293706dd682f578b457d052988cf3c20b4eab82d
# good: [293706dd682f578b457d052988cf3c20b4eab82d] Merge remote-tracking branch 
'rth/axp-next' into staging
git bisect good 293706dd682f578b457d052988cf3c20b4eab82d
# good: [a23fdf355969d331f60593fa5b857d43aec25aac] block/raw: add .bdrv_get_info
git bisect good a23fdf355969d331f60593fa5b857d43aec25aac
# good: [be022d61f4938bb051e8af8e6cb470ec1602c488] doc: monitor multiplexing 
rewording
git bisect good be022d61f4938bb051e8af8e6cb470ec1602c488
# bad: [9b8c69243585a32d14b9bb9fcd52c37b0b5a1b71] memory: Return -1 again on 
reads from unsigned regions
git bisect bad 9b8c69243585a32d14b9bb9fcd52c37b0b5a1b71
# good: [b4afea11aafe85975e74dd562bb94f7ce3de1ef1] memory: actually set the 
owner
git bisect good b4afea11aafe85975e74dd562bb94f7ce3de1ef1


-- 
Claudio Fontana
Server OS Architect
Huawei Technologies Duesseldorf GmbH




Re: [Qemu-devel] [RFC 1/3] bdrv: Use "Error" for opening images

2013-09-05 Thread Kevin Wolf
Am 05.09.2013 um 10:10 hat Max Reitz geschrieben:
> Add an Error ** parameter to bdrv_open, bdrv_file_open and bdrv_create
> to allow more specific error messages.
> 
> Signed-off-by: Max Reitz 
> ---
>  block.c   |  6 +++---
>  block/blkdebug.c  |  3 ++-
>  block/blkverify.c |  3 ++-
>  block/bochs.c |  3 ++-
>  block/cloop.c |  3 ++-
>  block/cow.c   |  6 --
>  block/dmg.c   |  3 ++-
>  block/nbd.c   |  3 ++-
>  block/parallels.c |  3 ++-
>  block/qcow.c  |  6 --
>  block/qcow2.c |  8 +---
>  block/qed.c   |  8 +---
>  block/raw-posix.c | 18 --
>  block/raw_bsd.c   |  6 --
>  block/sheepdog.c  |  6 --
>  block/snapshot.c  |  2 +-
>  block/vdi.c   |  6 --
>  block/vhdx.c  |  3 ++-
>  block/vmdk.c  |  6 --
>  block/vpc.c   |  6 --
>  block/vvfat.c |  3 ++-
>  include/block/block_int.h |  9 ++---
>  22 files changed, 78 insertions(+), 42 deletions(-)

Don't rely on compiler errors for changing the prototypes: There are
more block drivers which just aren't compiled on your system, either
because they are for a different platform (like raw-win32.c) or they
require headers that you don't have installed, so configure disabled
them.

Kevin



[Qemu-devel] [PATCH] qemu-iotests: New test case in 061

2013-09-05 Thread Max Reitz
Add one test case for zero cluster expansion on qcow2 version downgrade
in shared L2 tables (i.e., L2 tables with a refcount > 1) and one for
zero expansion on backed clusters in shared L2 tables.

Signed-off-by: Max Reitz 
---
Depends on (follow-up to):
 - block/qcow2: Image file option amendment (series, v5)
---
 tests/qemu-iotests/061 | 28 
 tests/qemu-iotests/061.out | 27 +++
 2 files changed, 55 insertions(+)

diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
index 86404e6..5f04bfa 100755
--- a/tests/qemu-iotests/061
+++ b/tests/qemu-iotests/061
@@ -146,6 +146,19 @@ _check_test_img
 $QEMU_IO -c "read -P 0 0 128k" "$TEST_IMG" | _filter_qemu_io
 
 echo
+echo "=== Testing zero expansion on shared L2 table ==="
+echo
+IMGOPTS="compat=1.1" _make_test_img 64M
+$QEMU_IO -c "write -z 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -c foo "$TEST_IMG"
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -a foo "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0 0 128k" "$TEST_IMG" | _filter_qemu_io
+
+echo
 echo "=== Testing zero expansion on backed image ==="
 echo
 IMGOPTS="compat=1.1" TEST_IMG="$TEST_IMG.base" _make_test_img 64M
@@ -172,6 +185,21 @@ $QEMU_IMG snapshot -a foo "$TEST_IMG"
 _check_test_img
 $QEMU_IO -c "read -P 0 0 64k" -c "read -P 0x2a 64k 64k" "$TEST_IMG" | 
_filter_qemu_io
 
+echo
+echo "=== Testing zero expansion on backed image with shared L2 table ==="
+echo
+IMGOPTS="compat=1.1" TEST_IMG="$TEST_IMG.base" _make_test_img 64M
+$QEMU_IO -c "write -P 0x2a 0 128k" "$TEST_IMG.base" | _filter_qemu_io
+IMGOPTS="compat=1.1,backing_file=$TEST_IMG.base" _make_test_img 64M
+$QEMU_IO -c "write -z 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -c foo "$TEST_IMG"
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -a foo "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0 0 128k" "$TEST_IMG" | _filter_qemu_io
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index 05bd1d5..d42127f 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -312,6 +312,18 @@ No errors were found on the image.
 read 131072/131072 bytes at offset 0
 128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
+=== Testing zero expansion on shared L2 table ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
 === Testing zero expansion on backed image ===
 
 Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
@@ -346,4 +358,19 @@ read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 65536/65536 bytes at offset 65536
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Testing zero expansion on backed image with shared L2 table ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
backing_file='TEST_DIR/t.IMGFMT.base' 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 *** done
-- 
1.8.3.1




Re: [Qemu-devel] [ARM][regression][bisect] ARM target broken: test v5 image does not start kernel

2013-09-05 Thread Peter Maydell
On 5 September 2013 09:31, Claudio Fontana  wrote:

> I just finished bisecting a regression I am experiencing on ARM 32bit target,

> After a painful bisection, I got a first bad commit, which when reverted on 
> the latest QEMU fixes the issue for me. Maybe something needs to be adapted 
> to the return value (back-)change?

> 9b8c69243585a32d14b9bb9fcd52c37b0b5a1b71 is the first bad commit
> commit 9b8c69243585a32d14b9bb9fcd52c37b0b5a1b71
> Author: Jan Kiszka 
> Date:   Tue Jul 16 14:45:16 2013 +0200
>
> memory: Return -1 again on reads from unsigned regions

This one again! Claudio, can you test with these patches:
http://patchwork.ozlabs.org/patch/272011/
http://patchwork.ozlabs.org/patch/272012/

which Jan posted earlier this week and should fix
this regression?

thanks
-- PMM



Re: [Qemu-devel] [Qemu-ppc] [PATCH 16/16] target-ppc: Convert to new ldst opcodes

2013-09-05 Thread Alexander Graf

On 04.09.2013, at 23:05, Richard Henderson wrote:

> This lets us change "le_mode" to "end_mode" and fold away nearly all
> of the tests for the current cpu endianness, and removing all of the
> explicitly generated bswap opcodes.
> 
> Cc: qemu-...@nongnu.org
> Signed-off-by: Richard Henderson 

No complaints from me, apart from the usual "LE mode isn't necessarily what you 
think it is on PPC" one. But the code would be as broken as before IIUC.

Ben, you had some insight in how LE mode on different PPC flavors work. Could 
you please make sure we're not walking into the wrong direction here?


Alex




Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 06:54, Alexey Kardashevskiy wrote:

> On 09/05/2013 02:30 PM, David Gibson wrote:
>> On Tue, Sep 03, 2013 at 05:31:42PM +1000, Alexey Kardashevskiy wrote:
>>> This allows guests to have a different timebase origin from the host.
>>> 
>>> This is needed for migration, where a guest can migrate from one host
>>> to another and the two hosts might have a different timebase origin.
>>> However, the timebase seen by the guest must not go backwards, and
>>> should go forwards only by a small amount corresponding to the time
>>> taken for the migration.
>>> 
>>> This is only supported for recent POWER hardware which has the TBU40
>>> (timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
>>> 970.
>>> 
>>> This adds kvm_access_one_reg() to access a special register which is not
>>> in env->spr.
>>> 
>>> The feature must be present in the host kernel.
>>> 
>>> Signed-off-by: Alexey Kardashevskiy 
>>> ---
>>> 
>>> This is an RFC but not a final patch. Can break something but I just do not 
>>> see what.
>>> 
>>> ---
>>> hw/ppc/ppc.c | 49 +
>>> include/hw/ppc/ppc.h |  4 
>>> target-ppc/kvm.c | 23 +++
>>> target-ppc/machine.c | 44 
>>> trace-events |  3 +++
>>> 5 files changed, 123 insertions(+)
>>> 
>>> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
>>> index 1e3cab3..7d08c9a 100644
>>> --- a/hw/ppc/ppc.c
>>> +++ b/hw/ppc/ppc.c
>>> @@ -31,6 +31,7 @@
>>> #include "hw/loader.h"
>>> #include "sysemu/kvm.h"
>>> #include "kvm_ppc.h"
>>> +#include "trace.h"
>>> 
>>> //#define PPC_DEBUG_IRQ
>>> #define PPC_DEBUG_TB
>>> @@ -796,6 +797,54 @@ static void cpu_ppc_set_tb_clk (void *opaque, uint32_t 
>>> freq)
>>> cpu_ppc_store_purr(cpu, 0xULL);
>>> }
>>> 
>>> +/*
>>> + * Calculate timebase on the destination side of migration
>> 
>>> + * We calculate new timebase offset as shown below:
>>> + * 1) Gtb2 = Gtb1 + max(tod2 - tod1, 0)
>>> + *Gtb2 = tb2 + off2
>>> + *Gtb1 = tb1 + off1
>>> + * 2) tb2 + off2 = tb1 + off1 + max(tod2 - tod1, 0)
>>> + * 3) off2 = tb1 - tb2 + off1 + max(tod2 - tod1, 0)
>>> + *
>>> + * where:
>>> + * Gtb2 - destination guest timebase
>>> + * tb2 - destination host timebase
>>> + * off2 - destination timebase offset
>>> + * tod2 - destination time of the day
>>> + * Gtb1 - source guest timebase
>>> + * tb1 - source host timebase
>>> + * off1 - source timebase offset
>>> + * tod1 - source time of the day
>>> + *
>>> + * The result we want is in @off2
>>> + *
>>> + * Two conditions must be met for @off2:
>>> + * 1) off2 must be multiple of 2^24 ticks as it will be set via TBU40 SPR
>>> + * 2) Gtb2 >= Gtb1
>> 
>> What about the TCG case, where there is not host timebase, only a a
>> host system clock?
> 
> 
> cpu_get_real_ticks() returns ticks, this is what the patch cares about.
> What is the difference between KVM and TCG here?
> 
> 
>>> + */
>>> +void cpu_ppc_adjust_tb_offset(ppc_tb_t *tb_env)
>>> +{
>>> +uint64_t tb2, tod2, off2;
>>> +int ratio = tb_env->tb_freq / 100;
>>> +struct timeval tv;
>>> +
>>> +tb2 = cpu_get_real_ticks();
>>> +gettimeofday(&tv, NULL);
>>> +tod2 = tv.tv_sec * 100 + tv.tv_usec;
>>> +
>>> +off2 = tb_env->timebase - tb2 + tb_env->tb_offset;
>>> +if (tod2 > tb_env->time_of_the_day) {
>>> +off2 += (tod2 - tb_env->time_of_the_day) * ratio;
>>> +}
>>> +off2 = ROUND_UP(off2, 1 << 24);
>>> +
>>> +trace_ppc_tb_adjust(tb_env->tb_offset, off2,
>>> +(int64_t)off2 - tb_env->tb_offset);
>>> +
>>> +tb_env->tb_offset = off2;
>>> +}
>>> +
>>> /* Set up (once) timebase frequency (in Hz) */
>>> clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq)
>>> {
>>> diff --git a/include/hw/ppc/ppc.h b/include/hw/ppc/ppc.h
>>> index 132ab97..235871c 100644
>>> --- a/include/hw/ppc/ppc.h
>>> +++ b/include/hw/ppc/ppc.h
>>> @@ -32,6 +32,9 @@ struct ppc_tb_t {
>>> uint64_t purr_start;
>>> void *opaque;
>>> uint32_t flags;
>>> +/* Cached values for live migration purposes */
>>> +uint64_t timebase;
>>> +uint64_t time_of_the_day;
>> 
>> How is the time of day encoded here?
> 
> 
> Microseconds. I'll put a comment here, I just thought it is quite obvious
> as gettimeofday() returns microseconds.
> 
> 
>>> };
>>> 
>>> /* PPC Timers flags */
>>> @@ -46,6 +49,7 @@ struct ppc_tb_t {
>>>*/
>>> 
>>> uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk, int64_t 
>>> tb_offset);
>>> +void cpu_ppc_adjust_tb_offset(ppc_tb_t *tb_env);
>>> clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq);
>>> /* Embedded PowerPC DCR management */
>>> typedef uint32_t (*dcr_read_cb)(void *opaque, int dcrn);
>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>> index 7af9e3d..93df955 100644
>>> --- a/target-ppc/kvm.c
>>> +++ b/target-ppc/kvm.c
>>> @@ -35,6 +35,7 @@
>>> #inclu

Re: [Qemu-devel] [RFC] qcow2 journalling draft

2013-09-05 Thread Stefan Hajnoczi
On Wed, Sep 04, 2013 at 11:39:51AM +0200, Kevin Wolf wrote:
> First of all, excuse any inconsistencies in the following mail. I wrote
> it from top to bottom, and there was some thought process involved in
> almost every paragraph...

I should add this disclaimer to all my emails ;-).

> Am 04.09.2013 um 10:03 hat Stefan Hajnoczi geschrieben:
> > On Tue, Sep 03, 2013 at 03:45:52PM +0200, Kevin Wolf wrote:
> > > @@ -103,7 +107,11 @@ in the description of a field.
> > >  write to an image with unknown auto-clear features 
> > > if it
> > >  clears the respective bits from this field first.
> > >  
> > > -Bits 0-63:  Reserved (set to 0)
> > > +Bit 0:  Journal valid bit. This bit indicates 
> > > that the
> > > +image contains a valid main journal 
> > > starting at
> > > +journal_offset.
> > 
> > Whether the journal is used can be determined from the journal_offset
> > value (header length must be large enough and journal offset must be
> > valid).
> > 
> > Why do we need this autoclear bit?
> 
> Hm, I introduced this one first and the journal dirty incompatible bit
> later, perhaps it's unnecessary now. Let's check...
> 
> The obvious thing we need to protect against is applying stale journal
> data to an image that has been changed by an older version. As long as
> the journal is clean, this can't happen, and the journal dirty bit will
> ensure that the old version can only open the image if it is clean.
> 
> However, what if we run 'qemu-img check -r leaks' with an old qemu-img
> version? It will reclaim the clusters used by the journal, and if we
> continue using the journal we'll corrupt whatever new data is there
> now.
> 
> Can we protect against this without using an autoclear bit?

You are right.  It's a weird case I didn't think of but it could happen.
An autoclear bit sounds like the simplest solution.

Please document this scenario.

> > > +A journal is organised in journal blocks, all of which have a reference 
> > > count
> > > +of exactly 1. It starts with a block containing the following journal 
> > > header:
> > > +
> > > +Byte  0 -  7:   Magic ("qjournal" ASCII string)
> > > +
> > > +  8 - 11:   Journal size in bytes, including the header
> > > +
> > > + 12 - 15:   Journal block size order (block size in bytes = 1 << 
> > > order)
> > > +The block size must be at least 512 bytes and must 
> > > not
> > > +exceed the cluster size.
> > > +
> > > + 16 - 19:   Journal block index of the descriptor for the last
> > > +transaction that has been synced, starting with 1 
> > > for the
> > > +journal block after the header. 0 is used for empty
> > > +journals.
> > > +
> > > + 20 - 23:   Sequence number of the last transaction that has been
> > > +synced. 0 is recommended as the initial value.
> > > +
> > > + 24 - 27:   Sequence number of the last transaction that has been
> > > +committed. When replaying a journal, all transactions
> > > +after the last synced one up to the last commit one 
> > > must be
> > > +synced. Note that this may include a wraparound of 
> > > sequence
> > > +numbers.
> > > +
> > > + 28 -  31:  Checksum (one's complement of the sum of all bytes 
> > > in the
> > > +header journal block except those of the checksum 
> > > field)
> > > +
> > > + 32 - 511:  Reserved (set to 0)
> > 
> > I'm not sure if these fields are necessary.  They require updates (and
> > maybe flush) after every commit and sync.
> > 
> > The fewer metadata updates, the better, not just for performance but
> > also to reduce the risk of data loss.  If any metadata required to
> > access the journal is corrupted, the image will be unavailable.
> > 
> > It should be possible to determine this information by scanning the
> > journal transactions.
> 
> This is rather handwavy. Can you elaborate how this would work in detail?
> 
> 
> For example, let's assume we get to read this journal (a journal can be
> rather large, too, so I'm not sure if we want to read it in completely):
> 
>  - Descriptor, seq 42, 2 data blocks
>  - Data block
>  - Data block
>  - Data block starting with "qjbk"
>  - Data block
>  - Descriptor, seq 7, 0 data blocks
>  - Descriptor, seq 8, 1 data block
>  - Data block
> 
> Which of these have already been synced? Which have been committed?
> 
> 
> I guess we could introduce an is_commited flag in the descriptor, but
> wouldn't correct operation look like this then:
> 
> 1. Write out descriptor commit flag clear and any data blocks
> 2. Flush
> 3. Rewrite descriptor with commit flag set
> 
> This ensures that the commit flag is only set if all the required data
> is ind

Re: [Qemu-devel] [RFC] qcow2 journalling draft

2013-09-05 Thread Stefan Hajnoczi
On Wed, Sep 04, 2013 at 11:55:23AM +0200, Benoît Canet wrote:
> > > I'm not sure if multiple journals will work in practice.  Doesn't this
> > > re-introduce the need to order update steps and flush between them?
> > 
> > This is a question for Benoît, who made this requirement. I asked him
> > the same a while ago and apparently his explanation made some sense to
> > me, or I would have remembered that I don't want it. ;-)
> 
> The reason behind the multiple journal requirement is that if a block get
> created and deleted in a cyclic way it can generate cyclic 
> insertions/deletions
> journal entries.
> The journal could easilly be filled if this pathological corner case happen.
> When it happen the dedup code repack the journal by writting only the non
> redundant information into a new journal and then use the new one.
> It would not be easy to do so if non dedup journal entries are present in the
> journal hence the multiple journal requirement.
> 
> The deduplication also need two journals because when the first one is frozen 
> it
> take some time to write the hash table to disk and anyway new entries must be
> stored somewhere at the same time. The code cannot block.
> 
> > It might have something to do with the fact that deduplication uses the
> > journal more as a kind of cache for hash values that can be dropped and
> > rebuilt after a crash.
> 
> For dedupe the journal is more a "resume after exit" tool.

I'm not sure anymore if dedupe needs the same kind of "journal" as a
metadata journal for qcow2.

Since you have a dirty flag to discard the "journal" on crash, the
journal is not used for data integrity.

That makes me wonder if the metadata journal is the right structure for
dedupe?  Maybe your original proposal was fine for dedupe and we just
misinterpreted it because we thought this needs to be a safe journal.

Stefan



Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 09:40, Alexey Kardashevskiy wrote:

> On 09/05/2013 05:08 PM, Alexander Graf wrote:
>> 
>> 
>> Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :
>> 
>>> On the real hardware, RTAS is called in real mode and therefore
>>> ignores top 4 bits of the address passed in the call.
>> 
>> Shouldn't we ignore the upper 4 bits for every memory access in real mode, 
>> not just that one parameter?
> 
> We probably should but I just do not see any easy way of doing this. Yet
> another "Ignore N bits on the top" memory region type? No idea.

Well, it already works for code that runs inside of guest context, because 
there the softmmu code for real mode strips the upper 4 bits.

I basically see 2 ways of fixing this "correctly":

1) Don't access memory through cpu_physical_memory_rw or ldx_phys but instead 
through real mode wrappers that strip the upper 4 bits, similar to how we 
handle virtual memory differently from physical memory

2) Create 15 aliases to system_memory at the upper 4 bits of address space. 
That should at the end of the day give you the same effect

The fix as you're proposing it wouldn't work for indirect memory descriptors. 
Imagine you have an "address" parameter that gives you a pointer to a struct in 
memory that again contains a pointer. You still want that pointer be 
interpreted correctly, no?


Alex




Re: [Qemu-devel] [RFC] qcow2 journalling draft

2013-09-05 Thread Stefan Hajnoczi
On Tue, Sep 03, 2013 at 03:45:52PM +0200, Kevin Wolf wrote:
> This contains an extension of the qcow2 spec that introduces journalling
> to the image format, plus some preliminary type definitions and
> function prototypes in the qcow2 code.
> 
> Journalling functionality is a crucial feature for the design of data
> deduplication, and it will improve the core part of qcow2 by avoiding
> cluster leaks on crashes as well as provide an easier way to get a
> reliable implementation of performance features like Delayed COW.
> 
> At this point of the RFC, it would be most important to review the
> on-disk structure. Once we're confident that it can do everything we
> want, we can start going into more detail on the qemu side of things.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/Makefile.objs   |   2 +-
>  block/qcow2-journal.c |  55 ++
>  block/qcow2.h |  78 +++
>  docs/specs/qcow2.txt  | 204 
> +-
>  4 files changed, 337 insertions(+), 2 deletions(-)
>  create mode 100644 block/qcow2-journal.c

Although we are still discussing details of the on-disk layout, the
general design is clear enough to discuss how the journal will be used.

Today qcow2 uses Qcow2Cache to do lazy, ordered metadata updates.  The
performance is pretty good with two exceptions that I can think of:

1. The delayed CoW problem that Kevin has been working on.  Guests
   perform sequential writes that are smaller than a qcow2 cluster.  The
   first write triggers a copy-on-write of the full cluster.  Later
   writes then overwrite the copied data.  It would be more efficient to
   anticipate sequential writes and hold off on CoW where possible.

2. Lazy metadata updates lead to bursty behavior and expensive flushes.
   We do not take advantage of disk bandwidth since metadata updates
   stay in the Qcow2Cache until the last possible second.  When the
   guest issues a flush we must write out dirty Qcow2Cache entries and
   possibly fsync between them if dependencies have been set (e.g.
   refcount before L2).

How will the journal change this situation?  Writes that go through the
journal are doubled - they must first be journalled, fsync, and then
they can be applied to the actual image.

How do we benefit by using the journal?

Stefan



Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/2] tcg-ppc: use new return-argument ld/st helpers

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 10:22, Paolo Bonzini wrote:

> Last month I revived my old PowerBook, and here are the resulting patches
> to use the new return-argument ld/st helpers.  I have a few more tcg-ppc
> patches but they have a much smaller performance impact so I'll wait
> till I have some more free time before posting.  But the impact of the
> new helpers is huge, and AIUI Richard wants to get rid of GETPC_LDST
> so here are these.

Reviewed-by: Alexander Graf 


Alex

> 
> Paolo Bonzini (2):
>  tcg-ppc: fix qemu_ld/qemu_st for AIX ABI
>  tcg-ppc: use new return-argument ld/st helpers
> 
> include/exec/exec-all.h |  4 +---
> tcg/ppc/tcg-target.c| 56 -
> 2 files changed, 29 insertions(+), 31 deletions(-)
> 
> -- 
> 1.8.3.1
> 
> 




Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-05 Thread Alexey Kardashevskiy
On 09/05/2013 07:16 PM, Alexander Graf wrote:
> 
> On 05.09.2013, at 06:54, Alexey Kardashevskiy wrote:
> 
>> On 09/05/2013 02:30 PM, David Gibson wrote:
>>> On Tue, Sep 03, 2013 at 05:31:42PM +1000, Alexey Kardashevskiy wrote:
 This allows guests to have a different timebase origin from the host.

 This is needed for migration, where a guest can migrate from one host
 to another and the two hosts might have a different timebase origin.
 However, the timebase seen by the guest must not go backwards, and
 should go forwards only by a small amount corresponding to the time
 taken for the migration.

 This is only supported for recent POWER hardware which has the TBU40
 (timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
 970.

 This adds kvm_access_one_reg() to access a special register which is not
 in env->spr.

 The feature must be present in the host kernel.

 Signed-off-by: Alexey Kardashevskiy 
 ---

 This is an RFC but not a final patch. Can break something but I just do 
 not see what.

 ---
 hw/ppc/ppc.c | 49 +
 include/hw/ppc/ppc.h |  4 
 target-ppc/kvm.c | 23 +++
 target-ppc/machine.c | 44 
 trace-events |  3 +++
 5 files changed, 123 insertions(+)

 diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
 index 1e3cab3..7d08c9a 100644
 --- a/hw/ppc/ppc.c
 +++ b/hw/ppc/ppc.c
 @@ -31,6 +31,7 @@
 #include "hw/loader.h"
 #include "sysemu/kvm.h"
 #include "kvm_ppc.h"
 +#include "trace.h"

 //#define PPC_DEBUG_IRQ
 #define PPC_DEBUG_TB
 @@ -796,6 +797,54 @@ static void cpu_ppc_set_tb_clk (void *opaque, 
 uint32_t freq)
 cpu_ppc_store_purr(cpu, 0xULL);
 }

 +/*
 + * Calculate timebase on the destination side of migration
>>>
 + * We calculate new timebase offset as shown below:
 + * 1) Gtb2 = Gtb1 + max(tod2 - tod1, 0)
 + *Gtb2 = tb2 + off2
 + *Gtb1 = tb1 + off1
 + * 2) tb2 + off2 = tb1 + off1 + max(tod2 - tod1, 0)
 + * 3) off2 = tb1 - tb2 + off1 + max(tod2 - tod1, 0)
 + *
 + * where:
 + * Gtb2 - destination guest timebase
 + * tb2 - destination host timebase
 + * off2 - destination timebase offset
 + * tod2 - destination time of the day
 + * Gtb1 - source guest timebase
 + * tb1 - source host timebase
 + * off1 - source timebase offset
 + * tod1 - source time of the day
 + *
 + * The result we want is in @off2
 + *
 + * Two conditions must be met for @off2:
 + * 1) off2 must be multiple of 2^24 ticks as it will be set via TBU40 SPR
 + * 2) Gtb2 >= Gtb1
>>>
>>> What about the TCG case, where there is not host timebase, only a a
>>> host system clock?
>>
>>
>> cpu_get_real_ticks() returns ticks, this is what the patch cares about.
>> What is the difference between KVM and TCG here?
>>
>>
 + */
 +void cpu_ppc_adjust_tb_offset(ppc_tb_t *tb_env)
 +{
 +uint64_t tb2, tod2, off2;
 +int ratio = tb_env->tb_freq / 100;
 +struct timeval tv;
 +
 +tb2 = cpu_get_real_ticks();
 +gettimeofday(&tv, NULL);
 +tod2 = tv.tv_sec * 100 + tv.tv_usec;
 +
 +off2 = tb_env->timebase - tb2 + tb_env->tb_offset;
 +if (tod2 > tb_env->time_of_the_day) {
 +off2 += (tod2 - tb_env->time_of_the_day) * ratio;
 +}
 +off2 = ROUND_UP(off2, 1 << 24);
 +
 +trace_ppc_tb_adjust(tb_env->tb_offset, off2,
 +(int64_t)off2 - tb_env->tb_offset);
 +
 +tb_env->tb_offset = off2;
 +}
 +
 /* Set up (once) timebase frequency (in Hz) */
 clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq)
 {
 diff --git a/include/hw/ppc/ppc.h b/include/hw/ppc/ppc.h
 index 132ab97..235871c 100644
 --- a/include/hw/ppc/ppc.h
 +++ b/include/hw/ppc/ppc.h
 @@ -32,6 +32,9 @@ struct ppc_tb_t {
 uint64_t purr_start;
 void *opaque;
 uint32_t flags;
 +/* Cached values for live migration purposes */
 +uint64_t timebase;
 +uint64_t time_of_the_day;
>>>
>>> How is the time of day encoded here?
>>
>>
>> Microseconds. I'll put a comment here, I just thought it is quite obvious
>> as gettimeofday() returns microseconds.
>>
>>
 };

 /* PPC Timers flags */
 @@ -46,6 +49,7 @@ struct ppc_tb_t {
*/

 uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk, int64_t 
 tb_offset);
 +void cpu_ppc_adjust_tb_offset(ppc_tb_t *tb_env);
 clk_setup_cb cpu_ppc_tb_init (CPUPPCState *env, uint32_t freq);
 /* Embedded PowerPC DCR management */
 typedef uint32_t (*dcr_read_cb)(void *opaque, int dcrn);
>>>

Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 11:48, Alexey Kardashevskiy wrote:

> On 09/05/2013 07:16 PM, Alexander Graf wrote:
>> 
>> On 05.09.2013, at 06:54, Alexey Kardashevskiy wrote:
>> 
>>> On 09/05/2013 02:30 PM, David Gibson wrote:

[...]

>>> 
> #endif /* TARGET_PPC64 */
>}
> 
> @@ -1082,6 +1102,9 @@ int kvm_arch_get_registers(CPUState *cs)
>DPRINTF("Warning: Unable to get VPA information from 
> KVM\n");
>}
>}
> +
> +kvm_access_one_reg(cs, 0, KVM_REG_PPC_TB_OFFSET,
> +   &env->tb_env->tb_offset);
> #endif
>}
> 
> diff --git a/target-ppc/machine.c b/target-ppc/machine.c
> index 12e1512..d1ffc7f 100644
> --- a/target-ppc/machine.c
> +++ b/target-ppc/machine.c
> @@ -1,5 +1,6 @@
> #include "hw/hw.h"
> #include "hw/boards.h"
> +#include "hw/ppc/ppc.h"
> #include "sysemu/kvm.h"
> #include "helper_regs.h"
> 
> @@ -459,6 +460,45 @@ static const VMStateDescription vmstate_tlbmas = {
>}
> };
> 
> +static void timebase_pre_save(void *opaque)
> +{
> +ppc_tb_t *tb_env = opaque;
> +struct timeval tv;
> +
> +gettimeofday(&tv, NULL);
> +tb_env->time_of_the_day = tv.tv_sec * 100 + tv.tv_usec;
> +tb_env->timebase = cpu_get_real_ticks();
> +}
> +
> +static int timebase_post_load(void *opaque, int version_id)
> +{
> +ppc_tb_t *tb_env = opaque;
> +
> +if (!tb_env) {
> +printf("NO TB!\n");
> +return -1;
> +}
> +cpu_ppc_adjust_tb_offset(tb_env);
> +
> +return 0;
> +}
> +
> +static const VMStateDescription vmstate_timebase = {
> +.name = "cpu/timebase",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.minimum_version_id_old = 1,
> +.pre_save = timebase_pre_save,
> +.post_load = timebase_post_load,
> +.fields  = (VMStateField []) {
> +VMSTATE_UINT64(timebase, ppc_tb_t),
> +VMSTATE_INT64(tb_offset, ppc_tb_t),
 
 tb_offset is inherently a local concept, since it depends on the host
 timebase.  So how can it belong in the migration stream?
>>> 
>>> 
>>> I do not have pure guest timebase in QEMU and I need it on the destination.
>>> But I have host timebase + offset to calculate it. And tb_offset is already
>>> in ppc_tb_t. It looked logical to me to send the existing field and add
>>> only the missing part.
>> 
>> I still don't understand. You want the guest visible timebase in the 
>> migration stream, no?
> 
> 
> Yes. I do not really understand the problem here (and I am not playing
> dump). Do you suggest sending just the guest timebase and do not send the
> host timebase and the offset (one number instead of two)? I can do that,
> makes sense, no problem, thanks for the idea.

Yup, pretty much :). The receiving end should have no business in knowing how 
far off the guest and the host timebase were skewed on the sending end :).


Alex




Re: [Qemu-devel] Block Filters

2013-09-05 Thread Stefan Hajnoczi
On Wed, Sep 04, 2013 at 08:15:36PM +0200, Benoît Canet wrote:
> > Propagate operations like snapshot down the tree.  block.c is designed
> > for bs->file/bs->backing_hd kind of BlockDrivers, perhaps it needs to
> > become a bit more generic to support other types of BlockDrivers
> > properly.
> 
> Shouldn't bs->backing_hd become bs->children[0] and bs->file stay the same ?

bs->backing_hd and bs->file exist so that block.c can implement generic
functionality shared by a lot of block drivers.  They are for code
reuse.

Many places in the block layer have come to assume that there is only
one ->file, for example.  That's not true for quorum or even vmdk.

If we forget about code reuse for a second, and just think of BDS trees,
then both ->backing_hd and ->file should be in ->children[].

The block driver can put constraints on ->children[], e.g. how many
children are allowed, read/write access, etc.  .bdrv_open() can look at
->children[] and set things up.

I think the problem we have is that too much of QEMU uses ->file and
->backing_hd.  It's kind of baked in now to the point where more
flexible block drivers like quorum are hard to represent.

->backing_hd and ->file are mostly image format concepts.  Filters and
protocols could do without them.

After saying all that, I don't have a design that makes everything
better :P.

Stefan



Re: [Qemu-devel] [RFC PATCH] spapr: add initial ibm, client-architecture-support rtas call support

2013-09-05 Thread Paul Mackerras
On Wed, Sep 04, 2013 at 04:32:20PM -0500, Anthony Liguori wrote:
> On Wed, Sep 4, 2013 at 8:37 AM, Alexander Graf  wrote:
> >
> >>
> >>> So IMHO this whole thing should be orthogonal to -cpu.
> >>
> >> Well, since we cannot change CPU class on the fly, yes, it should be a
> >> "compatibility" flags/properties/methods/whatever of the default CPU for
> >> the specific family.
> >
> > Since it's machine global, it could just as well be a machine option, no? 
> > Or can you have multiple CPUs with different compat modes in a single 
> > system?
> 
> AFAIK, this has nothing to do with CPUs.

I'm not sure what you mean by that; it has to do with CPUs since it
means changing the CPUs' behaviour, at least for user-mode programs.

> >>> Why? Just because you're on POWER7 as default doesn't mean you can't bump 
> >>> to a newer compat later on, no?
> >>
> >> Bump when exactly? And it won't be a new compat, it will be a native CPU. I
> >
> > If you configure your guest to boot in POWER7-compat mode on your POWER8 
> > box and it then tells you through ibm,client-architecture-support that it 
> > can do POWER8, we can just remove all the compat bits and be happy, no?

Answering Alex here -- if we want to preserve the option of migrating
to a POWER7 host in future, we would run the guest in POWER7 compat
mode even if the current host is a POWER8 and the guest knows about
POWER8.

> POWER8 is compatible with POWER7, right?  There's no magic bits that
> need to be disabled AFAIK.

POWER8 is a superset of POWER7, yes, but what we want to do when
running an old OS and applications that don't know about POWER8 is to
disable the new POWER8 features in usermode (e.g., transactional
memory, various other new instructions, etc.).  The reason is to make
it more likely that applications won't misbehave even if they do
silly things like trying instructions that aren't defined on POWER7
(or at least, they will behave the same as they would on a real
POWER7).

To this end, the processor has a Processor Compatibility Register
(PCR) which has bits which disables the new instructions and SPRs when
the processor is in user mode - actually 2 bits, one which disables
the features that were new for POWER8 compared to POWER7, and one
which disables the features that were new for POWER7 compared to
POWER6.

> The effective version if exposed through device tree.  The reason a
> reboot is needed is because the device tree needs to be updated (which
> can't be done without a reboot).
> 
> The problem to solve is delaying device tree generation in order to
> avoid the reboots, no?

No, we can't delay generating the device tree to that point.  SLOF
needs the device tree, and the kernel does look at the device tree
before calling the ibm,client-architecture-support method.

> (Maybe it's time to start thinking of non-PAPR interfaces that Linux
> KVM guests can use to avoid a lot of this silliness...)

That won't help us boot (e.g.) RHEL6 or SLES11 on a POWER8, which is
what this is ultimately about.

In any case, there are changes we might need to make that will
definitely need a reboot - changing the size of the hashed page table,
for instance.

Paul.



Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexey Kardashevskiy
On 09/05/2013 07:27 PM, Alexander Graf wrote:
> 
> On 05.09.2013, at 09:40, Alexey Kardashevskiy wrote:
> 
>> On 09/05/2013 05:08 PM, Alexander Graf wrote:
>>>
>>>
>>> Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :
>>>
 On the real hardware, RTAS is called in real mode and therefore
 ignores top 4 bits of the address passed in the call.
>>>
>>> Shouldn't we ignore the upper 4 bits for every memory access in real mode, 
>>> not just that one parameter?
>>
>> We probably should but I just do not see any easy way of doing this. Yet
>> another "Ignore N bits on the top" memory region type? No idea.
> 
> Well, it already works for code that runs inside of guest context, because 
> there the softmmu code for real mode strips the upper 4 bits.
> 
> I basically see 2 ways of fixing this "correctly":
> 

> 1) Don't access memory through cpu_physical_memory_rw or ldx_phys but
> instead through real mode wrappers that strip the upper 4 bits, similar
> to how we handle virtual memory differently from physical memory

But there is no a ready wrapper for this, correct? I could not find any. I
would rather do this, looks nicer than 2).


> 2) Create 15 aliases to system_memory at the upper 4 bits of address
> space. That should at the end of the day give you the same effect

Wow. Is not that too much?
Ooor since I am normally making bad decisions, I should do this :)


> The fix as you're proposing it wouldn't work for indirect memory
> descriptors. Imagine you have an "address" parameter that gives you a
> pointer to a struct in memory that again contains a pointer. You still
> want that pointer be interpreted correctly, no?

Yes I do. I just think that having non zero bits at the top is a bug and I
would not want the guest to continue sending bad addresses to the host. Or
at least I want to know if it still happening.

Now we know that the only occasion of this misbehaviour is the "stop-self"
call and others works just fine. If something new comes up (what is pretty
unlikely, otherwise we would have noticed this issue a loong time ago AND
Paul already made&posted a patch for the host to fix __pa() so it is not
going to happen on new kernels either), ok, we will think of fixing this.

Doing in QEMU what the hardware does is a good thing but here I would think
twice.


-- 
Alexey



Re: [Qemu-devel] Block Filters

2013-09-05 Thread Fam Zheng
On Thu, 09/05 12:01, Stefan Hajnoczi wrote:
> On Wed, Sep 04, 2013 at 08:15:36PM +0200, Benoît Canet wrote:
> > > Propagate operations like snapshot down the tree.  block.c is designed
> > > for bs->file/bs->backing_hd kind of BlockDrivers, perhaps it needs to
> > > become a bit more generic to support other types of BlockDrivers
> > > properly.
> > 
> > Shouldn't bs->backing_hd become bs->children[0] and bs->file stay the same ?
> 
> bs->backing_hd and bs->file exist so that block.c can implement generic
> functionality shared by a lot of block drivers.  They are for code
> reuse.
> 
> Many places in the block layer have come to assume that there is only
> one ->file, for example.  That's not true for quorum or even vmdk.
> 
> If we forget about code reuse for a second, and just think of BDS trees,
> then both ->backing_hd and ->file should be in ->children[].
> 
> I think the problem we have is that too much of QEMU uses ->file and
> ->backing_hd.  It's kind of baked in now to the point where more
> flexible block drivers like quorum are hard to represent.
> 
> ->backing_hd and ->file are mostly image format concepts.  Filters and
> protocols could do without them.
> 
> After saying all that, I don't have a design that makes everything
> better :P.
> 

Maybe we could start from a generic scheme and add specific operations upon:

I propose we let bs->children[] keep all the node connections, including
backing_hd and file, then leave the BlockDriver, BlockFilter or BlockProtocol
to implement ->get_backing_hd(), ->get_file_hd(), or even ->get_files(), or
mark an operation as NULL.  These operations give semantics to its children (of
course they need some semantics to actually be useful), but it's orthogonal to
management of elements in the object tree.

Fam



Re: [Qemu-devel] [RFC PATCH] spapr: add initial ibm, client-architecture-support rtas call support

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 12:16, Paul Mackerras wrote:

> On Wed, Sep 04, 2013 at 04:32:20PM -0500, Anthony Liguori wrote:
>> On Wed, Sep 4, 2013 at 8:37 AM, Alexander Graf  wrote:
>>> 
 
> So IMHO this whole thing should be orthogonal to -cpu.
 
 Well, since we cannot change CPU class on the fly, yes, it should be a
 "compatibility" flags/properties/methods/whatever of the default CPU for
 the specific family.
>>> 
>>> Since it's machine global, it could just as well be a machine option, no? 
>>> Or can you have multiple CPUs with different compat modes in a single 
>>> system?
>> 
>> AFAIK, this has nothing to do with CPUs.
> 
> I'm not sure what you mean by that; it has to do with CPUs since it
> means changing the CPUs' behaviour, at least for user-mode programs.
> 
> Why? Just because you're on POWER7 as default doesn't mean you can't bump 
> to a newer compat later on, no?
 
 Bump when exactly? And it won't be a new compat, it will be a native CPU. I
>>> 
>>> If you configure your guest to boot in POWER7-compat mode on your POWER8 
>>> box and it then tells you through ibm,client-architecture-support that it 
>>> can do POWER8, we can just remove all the compat bits and be happy, no?
> 
> Answering Alex here -- if we want to preserve the option of migrating
> to a POWER7 host in future, we would run the guest in POWER7 compat
> mode even if the current host is a POWER8 and the guest knows about
> POWER8.

Yes, so we boot the guest with compat mode set to POWER7, then the guest calls 
ibm,client-architecutre-support including POWER8 and then we can reconfigure 
the guest to be POWER8, right?


Alex




[Qemu-devel] [RFC PATCH 0/6] Shared Library Module Support

2013-09-05 Thread Fam Zheng
This series implements feature of shared object building as described in:

http://wiki.qemu.org/Features/Modules

It's achieved in three steps, with extra bonus to change curl to a shared
library module in the end (only to demonstrate the usage, no "make install"
support of .so files yet).

1. Allow per object cflags and libs:

[01/06] make.rule: fix $(obj) to a real relative path
[02/06] rule.mak: allow per object cflags and libs

2. Rules for building .so:

[03/06] Makefile: define curl cflags and libs with object

3. Code to load module. All .so files are scanned and loaded when program
   starts:

[04/06] Makefile: introduce common-obj-m and block-obj-m for DSO

4. curl adoption:

[05/06] module: load modules at start
[06/06] curl: build as shared library



Fam Zheng (6):
  make.rule: fix $(obj) to a real relative path
  rule.mak: allow per object cflags and libs
  Makefile: define curl cflags and libs with object
  Makefile: introduce common-obj-m and block-obj-m for DSO
  module: load modules at start
  curl: build as shared library

 Makefile  | 24 +---
 Makefile.objs | 10 +-
 Makefile.target   |  3 ++-
 block/Makefile.objs   |  3 ++-
 configure | 28 
 include/qemu/module.h |  2 ++
 qemu-img.c|  2 ++
 qemu-io.c |  1 +
 qemu-nbd.c|  1 +
 rules.mak | 20 ++--
 scripts/create_config |  3 +++
 util/Makefile.objs|  2 ++
 util/module.c | 40 
 vl.c  |  1 +
 14 files changed, 116 insertions(+), 24 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 1/6] make.rule: fix $(obj) to a real relative path

2013-09-05 Thread Fam Zheng
Makefile.target includes rule.mak and unnested common-obj-y, then prefix
them with '../', this will ignore object specific QEMU_CFLAGS in subdir
Makefile.objs:

$(obj)/curl.o: QEMU_CFLAGS += $(CURL_CFLAGS)

Because $(obj) here is './block', instead of '../block'. This doesn't
hurt compiling because we basically build all .o from top Makefile,
before entering Makefile.target, but it will affact arriving per-object
libs support.

The starting point of $(obj) is fixed before including ./Makefile.objs,
to get consistency with nested Makefile rules in target rule and
variable definition.

Signed-off-by: Fam Zheng 
---
 Makefile.target | 3 ++-
 rules.mak   | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index 9a49852..b35e7c1 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -144,12 +144,13 @@ endif # CONFIG_SOFTMMU
 %/translate.o: QEMU_CFLAGS += $(TRANSLATE_OPT_CFLAGS)
 
 nested-vars += obj-y
+obj := ..
 
 # This resolves all nested paths, so it must come last
 include $(SRC_PATH)/Makefile.objs
 
 all-obj-y = $(obj-y)
-all-obj-y += $(addprefix ../, $(common-obj-y))
+all-obj-y += $(addprefix $(obj)/, $(common-obj-y))
 
 ifndef CONFIG_HAIKU
 LIBS+=-lm
diff --git a/rules.mak b/rules.mak
index 4499745..5758137 100644
--- a/rules.mak
+++ b/rules.mak
@@ -103,7 +103,6 @@ clean: clean-timestamp
 
 # magic to descend into other directories
 
-obj := .
 old-nested-dirs :=
 
 define push-var
@@ -119,9 +118,10 @@ endef
 
 define unnest-dir
 $(foreach var,$(nested-vars),$(call push-var,$(var),$1/))
-$(eval obj := $(obj)/$1)
+$(eval old-obj := $(obj))
+$(eval obj := $(if $(obj),$(obj)/$1,$1))
 $(eval include $(SRC_PATH)/$1/Makefile.objs)
-$(eval obj := $(patsubst %/$1,%,$(obj)))
+$(eval obj := $(old-obj))
 $(foreach var,$(nested-vars),$(call pop-var,$(var),$1/))
 endef
 
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 2/6] rule.mak: allow per object cflags and libs

2013-09-05 Thread Fam Zheng
Adds extract-libs in LINK to expand any "per object libs", the syntax to define
such a libs options is like:

$(obj)/curl.o-libs = $(CURL_LIBS)

in block/Makefile.objs.

Similarly,

$(obj)foo.o-cflags = $(FOO_CFLAGS)

is also supported.

Signed-off-by: Fam Zheng 
---
 rules.mak | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/rules.mak b/rules.mak
index 5758137..8276421 100644
--- a/rules.mak
+++ b/rules.mak
@@ -17,15 +17,17 @@ QEMU_DGFLAGS += -MMD -MP -MT $@ -MF $(*D)/$(*F).d
 # Same as -I$(SRC_PATH) -I., but for the nested source/object directories
 QEMU_INCLUDES += -I$(

Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 12:17, Alexey Kardashevskiy wrote:

> On 09/05/2013 07:27 PM, Alexander Graf wrote:
>> 
>> On 05.09.2013, at 09:40, Alexey Kardashevskiy wrote:
>> 
>>> On 09/05/2013 05:08 PM, Alexander Graf wrote:
 
 
 Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :
 
> On the real hardware, RTAS is called in real mode and therefore
> ignores top 4 bits of the address passed in the call.
 
 Shouldn't we ignore the upper 4 bits for every memory access in real mode, 
 not just that one parameter?
>>> 
>>> We probably should but I just do not see any easy way of doing this. Yet
>>> another "Ignore N bits on the top" memory region type? No idea.
>> 
>> Well, it already works for code that runs inside of guest context, because 
>> there the softmmu code for real mode strips the upper 4 bits.
>> 
>> I basically see 2 ways of fixing this "correctly":
>> 
> 
>> 1) Don't access memory through cpu_physical_memory_rw or ldx_phys but
>> instead through real mode wrappers that strip the upper 4 bits, similar
>> to how we handle virtual memory differently from physical memory
> 
> But there is no a ready wrapper for this, correct? I could not find any. I
> would rather do this, looks nicer than 2).
> 
> 
>> 2) Create 15 aliases to system_memory at the upper 4 bits of address
>> space. That should at the end of the day give you the same effect
> 
> Wow. Is not that too much?
> Ooor since I am normally making bad decisions, I should do this :)
> 
> 
>> The fix as you're proposing it wouldn't work for indirect memory
>> descriptors. Imagine you have an "address" parameter that gives you a
>> pointer to a struct in memory that again contains a pointer. You still
>> want that pointer be interpreted correctly, no?
> 
> Yes I do. I just think that having non zero bits at the top is a bug and I
> would not want the guest to continue sending bad addresses to the host. Or
> at least I want to know if it still happening.
> 
> Now we know that the only occasion of this misbehaviour is the "stop-self"
> call and others works just fine. If something new comes up (what is pretty
> unlikely, otherwise we would have noticed this issue a loong time ago AND
> Paul already made&posted a patch for the host to fix __pa() so it is not
> going to happen on new kernels either), ok, we will think of fixing this.
> 
> Doing in QEMU what the hardware does is a good thing but here I would think
> twice.

Well, the idea behind RTAS is that everything RTAS does is usually run in IR=0 
DR=0 inside of guest context, so that's the view of the world we should expose.

Which makes me think.

Couldn't we just set IR=0 DR=0 when getting an RTAS call and use the virtual 
memory access functions? Those will already strip the upper 4 bits.


Alex




[Qemu-devel] [RFC PATCH 3/6] Makefile: define curl cflags and libs with object

2013-09-05 Thread Fam Zheng
We have per object cflags and libs support now, move CURL_CFLAGS and
CURL_LIBS from global option variables to a per object basis.

Signed-off-by: Fam Zheng 
---
 block/Makefile.objs | 3 ++-
 configure   | 3 +--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 4cf9aa4..b1e1520 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -23,4 +23,5 @@ common-obj-y += commit.o
 common-obj-y += mirror.o
 common-obj-y += backup.o
 
-$(obj)/curl.o: QEMU_CFLAGS+=$(CURL_CFLAGS)
+$(obj)/curl.o-cflags := $(CURL_CFLAGS)
+$(obj)/curl.o-libs := $(CURL_LIBS)
diff --git a/configure b/configure
index 0a55c20..48e14b8 100755
--- a/configure
+++ b/configure
@@ -2199,8 +2199,6 @@ EOF
   curl_libs=`$curlconfig --libs 2>/dev/null`
   if compile_prog "$curl_cflags" "$curl_libs" ; then
 curl=yes
-libs_tools="$curl_libs $libs_tools"
-libs_softmmu="$curl_libs $libs_softmmu"
   else
 if test "$curl" = "yes" ; then
   feature_not_found "curl"
@@ -3878,6 +3876,7 @@ fi
 if test "$curl" = "yes" ; then
   echo "CONFIG_CURL=y" >> $config_host_mak
   echo "CURL_CFLAGS=$curl_cflags" >> $config_host_mak
+  echo "CURL_LIBS=$curl_libs" >> $config_host_mak
 fi
 if test "$brlapi" = "yes" ; then
   echo "CONFIG_BRLAPI=y" >> $config_host_mak
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 5/6] module: load modules at start

2013-09-05 Thread Fam Zheng
Add module_load_all to load all DSO modules under:
/usr/lib/qemu/block/
/usr/lib/qemu/net/
/usr/lib/qemu/ui/
when starting process.

Requires gmodule-2.0 from glib.

Signed-off-by: Fam Zheng 
---
 configure | 20 +++-
 include/qemu/module.h |  2 ++
 qemu-img.c|  2 ++
 qemu-io.c |  1 +
 qemu-nbd.c|  1 +
 scripts/create_config |  3 +++
 util/Makefile.objs|  2 ++
 util/module.c | 40 
 vl.c  |  1 +
 9 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/configure b/configure
index 8f0a882..975853e 100755
--- a/configure
+++ b/configure
@@ -2238,15 +2238,17 @@ if test "$mingw32" = yes; then
 else
 glib_req_ver=2.12
 fi
-if $pkg_config --atleast-version=$glib_req_ver gthread-2.0 > /dev/null 2>&1
-then
-glib_cflags=`$pkg_config --cflags gthread-2.0 2>/dev/null`
-glib_libs=`$pkg_config --libs gthread-2.0 2>/dev/null`
-LIBS="$glib_libs $LIBS"
-libs_qga="$glib_libs $libs_qga"
-else
-error_exit "glib-$glib_req_ver required to compile QEMU"
-fi
+for i in gthread-2.0 gmodule-2.0; do
+if $pkg_config --atleast-version=$glib_req_ver $i > /dev/null 2>&1
+then
+glib_cflags=`$pkg_config --cflags $i 2>/dev/null`
+glib_libs=`$pkg_config --libs $i 2>/dev/null`
+LIBS="$glib_libs $LIBS"
+libs_qga="$glib_libs $libs_qga"
+else
+error_exit "glib-$glib_req_ver required to compile QEMU"
+fi
+done
 
 ##
 # pixman support probe
diff --git a/include/qemu/module.h b/include/qemu/module.h
index c4ccd57..d3500be 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -37,4 +37,6 @@ void register_module_init(void (*fn)(void), module_init_type 
type);
 
 void module_call_init(module_init_type type);
 
+void module_load_all(void);
+
 #endif
diff --git a/qemu-img.c b/qemu-img.c
index b9a848d..e761027 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include "qemu/module.h"
 
 #ifdef _WIN32
 #include 
@@ -2328,6 +2329,7 @@ int main(int argc, char **argv)
 
 error_set_progname(argv[0]);
 
+module_load_all();
 qemu_init_main_loop();
 bdrv_init();
 if (argc < 2)
diff --git a/qemu-io.c b/qemu-io.c
index d54dc86..d02e8f5 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -398,6 +398,7 @@ int main(int argc, char **argv)
 exit(1);
 }
 
+module_load_all();
 qemu_init_main_loop();
 bdrv_init();
 
diff --git a/qemu-nbd.c b/qemu-nbd.c
index f044546..ecea543 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -558,6 +558,7 @@ int main(int argc, char **argv)
 snprintf(sockpath, 128, SOCKET_PATH, basename(device));
 }
 
+module_load_all();
 qemu_init_main_loop();
 bdrv_init();
 atexit(bdrv_close_all);
diff --git a/scripts/create_config b/scripts/create_config
index b1adbf5..3ecb789 100755
--- a/scripts/create_config
+++ b/scripts/create_config
@@ -104,6 +104,9 @@ case $line in
 value=${line#*=}
 echo "#define $name $value"
 ;;
+ DSOSUF=*)
+echo "#define HOST_DSOSUF \"${line#*=}\""
+;;
 esac
 
 done # read
diff --git a/util/Makefile.objs b/util/Makefile.objs
index dc72ab0..33e56b0 100644
--- a/util/Makefile.objs
+++ b/util/Makefile.objs
@@ -11,3 +11,5 @@ util-obj-y += iov.o aes.o qemu-config.o qemu-sockets.o uri.o 
notify.o
 util-obj-y += qemu-option.o qemu-progress.o
 util-obj-y += hexdump.o
 util-obj-y += crc32c.o
+
+$(obj)/module.o-libs := -lglib
diff --git a/util/module.c b/util/module.c
index 7acc33d..29d3a4f 100644
--- a/util/module.c
+++ b/util/module.c
@@ -13,6 +13,8 @@
  * GNU GPL, version 2 or (at your option) any later version.
  */
 
+#include 
+#include 
 #include "qemu-common.h"
 #include "qemu/queue.h"
 #include "qemu/module.h"
@@ -79,3 +81,41 @@ void module_call_init(module_init_type type)
 e->init();
 }
 }
+
+void module_load_all(void)
+{
+static const char *module_dir_list[] = {
+"/usr/lib/qemu/block/",
+"/usr/lib/qemu/net/",
+"/usr/lib/qemu/ui/",
+NULL
+};
+const char **path;
+const char *dsosuf = HOST_DSOSUF;
+char fname[1024];
+int suf_len = strlen(dsosuf);
+DIR *dp;
+struct dirent *ep = NULL;
+GModule *g_module;
+for (path = &module_dir_list[0]; *path != NULL; path++) {
+dp = opendir(*path);
+if (!dp) {
+fprintf(stderr, "Failed to open dir %s\n", *path);
+}
+for (ep = readdir(dp); ep; ep = readdir(dp)) {
+int len = strlen(ep->d_name);
+if (len > suf_len &&
+!strcmp(&ep->d_name[len - suf_len], dsosuf)) {
+pstrcpy(fname, sizeof(fname), *path);
+pstrcat(fname, sizeof(fname), ep->d_name);
+g_module = g_module_open(fname, G_MODULE_BIND_LAZY);
+if (!g_module) {
+fprintf(stderr, "Failed to open module fil

[Qemu-devel] [RFC PATCH 6/6] curl: build as shared library

2013-09-05 Thread Fam Zheng
Produce block/curl.so with --enable-curl. "make install" is not
installing it yet, manually copy it to /usr/lib/qemu/block/curl.so to
make it loaded.

Signed-off-by: Fam Zheng 
---
 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index 975853e..ac81a7c 100755
--- a/configure
+++ b/configure
@@ -3878,7 +3878,7 @@ if test "$bswap_h" = "yes" ; then
   echo "CONFIG_MACHINE_BSWAP_H=y" >> $config_host_mak
 fi
 if test "$curl" = "yes" ; then
-  echo "CONFIG_CURL=y" >> $config_host_mak
+  echo "CONFIG_CURL=m" >> $config_host_mak
   echo "CURL_CFLAGS=$curl_cflags" >> $config_host_mak
   echo "CURL_LIBS=$curl_libs" >> $config_host_mak
 fi
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 4/6] Makefile: introduce common-obj-m and block-obj-m for DSO

2013-09-05 Thread Fam Zheng
Add necessary rules and flags for shared object generation.
common-obj-m will include block-obj-m, as common-obj-y for block-obj-y.
The rules introduced here are:

  QEMU_CFLAGS += -shared -fPIC, for all %.o of shared objects.

  1) %.o in $(common-obj-m) is compiled to %.o, with
 "QEMU_CFLAGS += -shared -fPIC". Then "linked" to %.mo, which is an
 incremental object with "ln -r". This step is for consistency with
 %.mod case and has no effect.

  2) %.mod in $(common-obj-m) is generated by "ln -r", with all its
 dependencies (multiple *.o) as input. Which means the list of
 dependency objects must be ruled out in each sub-Makefile.objs
 like:

$(obj)/foo.mod: $(addprefix $(obj)/,bar.o baz.o qux.o)

 Notice that $(obj)/ is required for both target and dependency in
 the rule.

Signed-off-by: Fam Zheng 
---
 Makefile  | 24 +---
 Makefile.objs | 10 +-
 configure |  3 +++
 rules.mak |  6 ++
 4 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 806946e..130a497 100644
--- a/Makefile
+++ b/Makefile
@@ -56,7 +56,7 @@ Makefile: ;
 configure: ;
 
 .PHONY: all clean cscope distclean dvi html info install install-doc \
-   pdf recurse-all speed test dist
+   pdf recurse-all speed test dist modules
 
 $(call set-vpath, $(SRC_PATH))
 
@@ -121,7 +121,22 @@ ifeq ($(CONFIG_SMARTCARD_NSS),y)
 include $(SRC_PATH)/libcacard/Makefile
 endif
 
-all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all
+all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
+
+mod-obj-m = $(patsubst %.o,%$(DSOSUF),$(filter %.o,$(common-obj-m))) \
+  $(patsubst %.mo,%$(DSOSUF),$(filter %.mo,$(common-obj-m)))
+mod-obj-y = $(patsubst %.mo,%.mo.o,$(filter %.mo,$(common-obj-y)))
+
+# Generate rules for single file modules (%.mo: %.o).
+# For multi file modules, dependencies should be listed explicitly in
+# Makefile.objs
+$(foreach o,$(mod-obj-m) $(mod-obj-y),$(eval \
+   $(patsubst %$(DSOSUF),%.mo,$o): $(patsubst %.c,%.o,$(wildcard 
$(patsubst %$(DSOSUF),%.c,$o))) \
+   ))
+
+modules: $(mod-obj-m)
+modules: BUILD_DYNAMIC = 1
+modules: QEMU_CFLAGS += -shared -fPIC
 
 config-host.h: config-host.h-timestamp
 config-host.h-timestamp: config-host.mak
@@ -155,7 +170,7 @@ subdir-dtc:dtc/libfdt dtc/tests
 dtc/%:
mkdir -p $@
 
-$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y)
+$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) $(common-obj-m)
 
 ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
 romsubdir-%:
@@ -235,6 +250,9 @@ clean:
rm -f qemu-options.def
find . -name '*.[oda]' -type f -exec rm -f {} +
find . -name '*.l[oa]' -type f -exec rm -f {} +
+   find . -name '*'$(DSOSUF) -type f -exec rm -f {} +
+   find . -name '*.mo' -type f -exec rm -f {} +
+
rm -f $(TOOLS) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -Rf .libs
rm -f qemu-img-cmds.h
diff --git a/Makefile.objs b/Makefile.objs
index f46a4cd..2f178af 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -19,6 +19,8 @@ block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o 
qemu-coroutine-io.o
 block-obj-y += qemu-coroutine-sleep.o
 block-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
 
+block-obj-m = block/
+
 ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
 # Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
 # only pull in the actual virtio-9p device if we also enabled virtio.
@@ -83,6 +85,9 @@ common-obj-$(CONFIG_SMARTCARD_NSS) += $(libcacard-y)
 
 common-obj-y += qmp-marshal.o
 common-obj-y += qmp.o hmp.o
+
+common-obj-m = $(block-obj-m)
+
 endif
 
 ##
@@ -121,5 +126,8 @@ nested-vars += \
util-obj-y \
qga-obj-y \
block-obj-y \
-   common-obj-y
+   block-obj-m \
+   common-obj-y \
+   common-obj-m
+
 dummy := $(call unnest-vars)
diff --git a/configure b/configure
index 48e14b8..8f0a882 100755
--- a/configure
+++ b/configure
@@ -190,6 +190,7 @@ mingw32="no"
 gcov="no"
 gcov_tool="gcov"
 EXESUF=""
+DSOSUF=".so"
 prefix="/usr/local"
 mandir="\${prefix}/share/man"
 datadir="\${prefix}/share"
@@ -580,6 +581,7 @@ fi
 
 if test "$mingw32" = "yes" ; then
   EXESUF=".exe"
+  DSOSUF=".dll"
   QEMU_CFLAGS="-DWIN32_LEAN_AND_MEAN -DWINVER=0x501 $QEMU_CFLAGS"
   # enable C99/POSIX format strings (needs mingw32-runtime 3.15 or later)
   QEMU_CFLAGS="-D__USE_MINGW_ANSI_STDIO=1 $QEMU_CFLAGS"
@@ -4165,6 +4167,7 @@ echo "LIBTOOLFLAGS=$LIBTOOLFLAGS" >> $config_host_mak
 echo "LIBS+=$LIBS" >> $config_host_mak
 echo "LIBS_TOOLS+=$libs_tools" >> $config_host_mak
 echo "EXESUF=$EXESUF" >> $config_host_mak
+echo "DSOSUF=$DSOSUF" >> $config_host_mak
 echo "LIBS_QGA+=$libs_qga" >> $config_host_mak
 echo "POD2MAN=$POD2MAN" >> $config_host_mak
 echo "TRANSLATE_OPT_CFLAGS=$TRANSLATE_OPT_CFLAGS" >> $config_host_mak
diff --git a/rules.mak b/rules.mak
index 8276421..090eee1 1

Re: [Qemu-devel] [RFC PATCH 0/6] Shared Library Module Support

2013-09-05 Thread Fam Zheng
On Thu, 09/05 18:20, Fam Zheng wrote:
> This series implements feature of shared object building as described in:
> 
> http://wiki.qemu.org/Features/Modules
> 
> It's achieved in three steps, with extra bonus to change curl to a shared
> library module in the end (only to demonstrate the usage, no "make install"
> support of .so files yet).
> 
> 1. Allow per object cflags and libs:
> 
> [01/06] make.rule: fix $(obj) to a real relative path
> [02/06] rule.mak: allow per object cflags and libs
> 
> 2. Rules for building .so:
> 
> [03/06] Makefile: define curl cflags and libs with object

Sorry, misleading. 04 should be this step, 03, 06 is for curl enablement.

> 
> 3. Code to load module. All .so files are scanned and loaded when program
>starts:
> 
> [04/06] Makefile: introduce common-obj-m and block-obj-m for DSO
> 

And this should be patch 05.

> 4. curl adoption:
> 
> [05/06] module: load modules at start
> [06/06] curl: build as shared library
> 



Re: [Qemu-devel] [PATCH 0/8] [PATCH RFC v3] s390 cpu hotplug

2013-09-05 Thread Christian Borntraeger
On 04/09/13 14:45, Andreas Färber wrote:
> Hello,
> 
> Am 01.08.2013 16:12, schrieb Jason J. Herne:
>> From: "Jason J. Herne" 
>>
>> Latest code for cpu Hotplug on S390 architecture.   This one is vastly 
>> simpler
>> than v2 as we have decided to avoid the command line specification 
>> of -device s390-cpu.
>>
>> The last version can be found here:
>> http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01183.html
>>
>> There is also a patch in this series to add cpu-add to the Qemu monitor
>> interface.
>>
>> Hotplugged cpus are created in the configured state and can be used by the
>> guest after the guest onlines the cpu by: 
>> "echo 1 > /sys/bus/cpu/devices/cpuN/online"
>>
>> Hot unplugging is currently not implemented by this code. 
> 
> We have been having several off-list discussions since then that I'll
> try to briefly summarize here, please correct or extend as needed:
> 
> 1) CPU topology for QOM
> 
> Physically a System z machine may have an MCM with, e.g., 6 CPUs with 6
> cores each. But unlike x86, there is PR/SM, LPAR and possibly z/VM in
> between Linux and hardware, so we do actually want to be able to
> hot-plug in quantities of 1 and not by 6 on s390x for the foreseeable
> future. We seem willing to set a QOM ABI in stone based on that assumption.

Just stepping in, Jason is on vacation this week.

To summarize my understanding:
You were thinking if CPU model needs topology (e.g. -device mcm,id=m1, -device 
cpu,mcm=m1)
and s390 was the only arch left, that you were not sure about if topology is 
needed? 
All other platforms dont need topology for cpu hotplug?

Yes, we want to be able to hotplug single cores (not chips, not MCMs). 
It is pretty hard to pin the vCPUs to a given real topology for KVM. You need to
pin on LPAR and KVM. Libvirt could  do some pinning of guest vCPUs to host CPUs 
and
LPAR can have dedicated CPUs. But pinning a full chip (6cores) would only make
sense in very rare cases.


> 
> => s390-cpu (or future subtypes) to be used with device_add.
> => Flat /machine/cpu[n] list in composition tree a possibility.
> 
> 1a) CPU topology for guests
> 
> STSI instruction topology support not implemented yet.

Right not implemented yet, but we certainly want to be able to define the guest
visible topology at some point in time (grouping of cores basically). 
But I guess this does not mean that we have to go away from the flat list of 
CPUs.


> 
> => Guest unaware of any emulated topology today.

An additional problem is, that for the normal case (linux scheduler, no 
pinning, also
no gang scheduling) the topology would change too fast. The guest would be busy 
rebuilding
the scheduler domains all the time.

> 
> hyptop tool requires hypfs implementation for KVM.
> 
> => Guest unaware of sibling VMs today, unlike z/VM and LPAR.

Right. But we will look into hyptop at some point in time. 

> 
> 2) CPU hot-unplug
> 
> Hotplug will always use a unique linear CPU address, even if hot-unplug
> leads to a sparse address space.
> 
> => cpu_num != cpu_index
> 
> 
> With all that in mind, I'll now need to review the s390 patches again.
> 
> For the HMP patch I am waiting on feedback from Igor once he returns
> from his vacation and, if there are no objections, would like to see
> that patch go through Luiz' queue since unrelated to s390x.

Yes, would be cool if the HMP patch could go upstream via that path. That would 
reduce the rework/patch burden of Jason.

> 
> Regards,
> Andreas
> 
>>
>> Jason J. Herne (8):
>>   s390-qemu: cpu hotplug - Define New SCLP Codes
>>   s390-qemu: cpu hotplug - SCLP CPU Info
>>   s390-qemu: cpu hotplug - SCLP Event integration
>>   s390-qemu: cpu hotplug - Storage key global access
>>   s390-qemu: cpu hotplug - ipi_states enhancements
>>   s390-qemu: cpu hotplug - s390 cpu init improvements for hotplug
>>   s390-qemu: cpu hotplug - Implement hot_add_cpu hook
>>   qemu-monitor: HMP cpu-add wrapper
>>
>>  hmp-commands.hx   |   13 
>>  hmp.c |   10 
>>  hmp.h |1 +
>>  hw/s390x/Makefile.objs|2 +-
>>  hw/s390x/event-facility.c |7 +++
>>  hw/s390x/s390-virtio-ccw.c|8 ++-
>>  hw/s390x/s390-virtio.c|   47 +--
>>  hw/s390x/s390-virtio.h|2 +-
>>  hw/s390x/sclp.c   |   53 +++-
>>  hw/s390x/sclpcpu.c|  120 
>> +
>>  include/hw/s390x/event-facility.h |3 +
>>  include/hw/s390x/sclp.h   |   41 +
>>  target-s390x/cpu.c|   36 ++-
>>  target-s390x/cpu.h|7 +++
>>  target-s390x/helper.c |   12 
>>  15 files changed, 336 insertions(+), 26 deletions(-)
>>  create mode 100644 hw/s390x/sclpcpu.c
>>
> 
> 




Re: [Qemu-devel] [RFC 0/3] block: Error parameter for opening/creating images

2013-09-05 Thread Kevin Wolf
Am 05.09.2013 um 10:10 hat Max Reitz geschrieben:
> This RFC adds an Error ** parameter to bdrv_open, bdrv_file_open,
> bdrv_create and the respective functions provided by a block driver.
> 
> This results in more specific error information than just -errno provided
> to the user when opening or creating images (disregarding the fact that
> block drivers often already use error_report, which is generally changed
> to error_setg through this patch).
> 
> The last patch in this series changes the qcow2 block driver to set an
> example of usage in a block driver.
> 
> Note that several I/O tests break by applying this RFC since they expect
> different error messages (generally, previously, an error message on
> image opening/creation consisted of two lines; the first of which would be
> generated by the driver through error_report, the second by the block
> layer itself through strerror(-ret); this patch is designed to merge these
> two lines into a single one). This applies to the tests 49, 51, 54 and 60.
> 
> Max Reitz (3):
>   bdrv: Use "Error" for opening images
>   block: Error parameter for opening functions
>   qcow2: Use Error parameter

One thing I'd like to suggest is to keep the conversions of
bdrv_(file_)open and bdrv_create separate. I don't think there are
dependencies between them, so two logical changes should come in two
separate patches.

The other point is that you do a lot of "dummy" conversions where you
pass NULL as errp. This is generally not what we want to have in the end
because it swallows error messages that would previously be printed by
error_report(). I guess most of them are changed into real error
handling, but it's hard to keep track of the places that need to be
fixed in later patches.

In order to address this, I think it would be better to use code like
this when you can't forward the errors yet:

Error *local_err;
foo(&local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
}

So that the output isn't lost in any intermediate commit of your series.

Kevin



[Qemu-devel] [PATCH RESEND] configure: Undefine _FORTIFY_SOURCE prior using it

2013-09-05 Thread Michal Privoznik
Currently, we are enforcing the _FORTIFY_SOURCE=2 without any
previous detection if the macro has been already defined, e.g.
by environment, or is just enabled by compiler by default.

Signed-off-by: Michal Privoznik 
---

The issue still happens thus the patch is still valid.

 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index e989609..24e5b00 100755
--- a/configure
+++ b/configure
@@ -3419,7 +3419,7 @@ if test "$gcov" = "yes" ; then
   CFLAGS="-fprofile-arcs -ftest-coverage -g $CFLAGS"
   LDFLAGS="-fprofile-arcs -ftest-coverage $LDFLAGS"
 elif test "$debug" = "no" ; then
-  CFLAGS="-O2 -D_FORTIFY_SOURCE=2 $CFLAGS"
+  CFLAGS="-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 $CFLAGS"
 fi
 
 
-- 
1.8.1.5




Re: [Qemu-devel] [ARM][regression][bisect] ARM target broken: test v5 image does not start kernel

2013-09-05 Thread Claudio Fontana
Hi Peter,

On 05.09.2013 10:59, Peter Maydell wrote:
> On 5 September 2013 09:31, Claudio Fontana  wrote:
> 
>> I just finished bisecting a regression I am experiencing on ARM 32bit target,
> 
>> After a painful bisection, I got a first bad commit, which when reverted on 
>> the latest QEMU fixes the issue for me. Maybe something needs to be adapted 
>> to the return value (back-)change?
> 
>> 9b8c69243585a32d14b9bb9fcd52c37b0b5a1b71 is the first bad commit
>> commit 9b8c69243585a32d14b9bb9fcd52c37b0b5a1b71
>> Author: Jan Kiszka 
>> Date:   Tue Jul 16 14:45:16 2013 +0200
>>
>> memory: Return -1 again on reads from unsigned regions
> 
> This one again! Claudio, can you test with these patches:
> http://patchwork.ozlabs.org/patch/272011/
> http://patchwork.ozlabs.org/patch/272012/
> 
> which Jan posted earlier this week and should fix
> this regression?
> 
> thanks
> -- PMM

Yes, this fixes it. I hope the fix lands in mainline soon.

Thanks,

Claudio

-- 
Claudio Fontana
Server OS Architect
Huawei Technologies Duesseldorf GmbH





Re: [Qemu-devel] [PATCH RESEND] configure: Undefine _FORTIFY_SOURCE prior using it

2013-09-05 Thread Peter Maydell
On 5 September 2013 11:54, Michal Privoznik  wrote:
> Currently, we are enforcing the _FORTIFY_SOURCE=2 without any
> previous detection if the macro has been already defined, e.g.
> by environment, or is just enabled by compiler by default.
>
> Signed-off-by: Michal Privoznik 

Reviewed-by: Peter Maydell 

-- PMM



Re: [Qemu-devel] Problems with QEMU sdcard while using glib 2.33.8

2013-09-05 Thread Taimoor Mirza
Hi Stefan,

I am using released 1.5.0 version from http://wiki.qemu.org/Download.
I think it should be same as commit ID
"295d81c62414a63c625fa2e78175573d4b3f5ba4"

I have observed some interesting behavior. This problem does not come
if I use MinGW with GCC version 4.7.2. I was originally using 4.6.2. I
spent some time looking at what can cause this problem and found out
an interesting thing. If I change optimization flag from O2 to O1
while building coroutine-win32 then it works fine even with 4.6.2.
For this I first built QEMU binary with default O2 flag, remove
coroutine-win32.o, changed CFLAG in makefile to O1 and rerun make.
Generated binary works fine without any problem.

Thanks,
Taimoor

On Thu, Aug 29, 2013 at 1:28 PM, Stefan Hajnoczi  wrote:
> On Thu, Aug 29, 2013 at 12:10:48AM +0500, Taimoor Mirza wrote:
>> Hi Stefan,
>>
>> Below is result of bt:
>>
>> Breakpoint 1, 0x006ac304 in abort ()
>> (gdb) bt
>> #0  0x006ac304 in abort ()
>> #1  0x00553052 in _fu10846stack_chk_guard () at qemu-coroutine.c:111
>> #2  0x0040d746 in _fu473stack_chk_guard () at block.c:4294
>> #3  0x00413cc7 in _fu805stack_chk_guard () at block.c:2530
>> #4  0x00413cc7 in _fu805stack_chk_guard () at block.c:2530
>> #5  0x00414875 in bdrv_rw_co_entry (opaque=0xa90f8d8) at block.c:2172
>> #6  _fu836stack_chk_guard () at block.c:2167
>> #7  0x004439f8 in _fu1936stack_chk_guard () at coroutine-win32.c:57
>> #8  0x767dbff2 in KERNEL32!GetQueuedCompletionStatus ()
>>from C:\windows\syswow64\kernel32.dll
>> #9  0x035e3be8 in ?? ()
>> #10 0x767dbfaa in KERNEL32!GetQueuedCompletionStatus ()
>>from C:\windows\syswow64\kernel32.dll
>> #11 0x019c3c70 in ?? ()
>> Cannot access memory at address 0xa08
>> (gdb)
>
> Great, thanks.  Please post the exact git commit ID you have built, I'm
> having trouble tracking down the line numbers in qemu.git/master:
>
>   $ git rev-parse HEAD
>
> Stefan



Re: [Qemu-devel] [RFC] qcow2 journalling draft

2013-09-05 Thread Kevin Wolf
Am 05.09.2013 um 11:21 hat Stefan Hajnoczi geschrieben:
> On Wed, Sep 04, 2013 at 11:39:51AM +0200, Kevin Wolf wrote:
> > However, what if we run 'qemu-img check -r leaks' with an old qemu-img
> > version? It will reclaim the clusters used by the journal, and if we
> > continue using the journal we'll corrupt whatever new data is there
> > now.
> > 
> > Can we protect against this without using an autoclear bit?
> 
> You are right.  It's a weird case I didn't think of but it could happen.
> An autoclear bit sounds like the simplest solution.
> 
> Please document this scenario.

Okay, I've updated the description as follows:

Bit 0:  Journal valid bit. This bit indicates that the
image contains a valid main journal starting at
journal_offset; it is used to mark journals
invalid if the image was opened by older
implementations that may have "reclaimed" the
journal clusters that would appear as leaked
clusters to them.

> > > > +A journal is organised in journal blocks, all of which have a 
> > > > reference count
> > > > +of exactly 1. It starts with a block containing the following journal 
> > > > header:
> > > > +
> > > > +Byte  0 -  7:   Magic ("qjournal" ASCII string)
> > > > +
> > > > +  8 - 11:   Journal size in bytes, including the header
> > > > +
> > > > + 12 - 15:   Journal block size order (block size in bytes = 1 
> > > > << order)
> > > > +The block size must be at least 512 bytes and must 
> > > > not
> > > > +exceed the cluster size.
> > > > +
> > > > + 16 - 19:   Journal block index of the descriptor for the last
> > > > +transaction that has been synced, starting with 1 
> > > > for the
> > > > +journal block after the header. 0 is used for empty
> > > > +journals.
> > > > +
> > > > + 20 - 23:   Sequence number of the last transaction that has 
> > > > been
> > > > +synced. 0 is recommended as the initial value.
> > > > +
> > > > + 24 - 27:   Sequence number of the last transaction that has 
> > > > been
> > > > +committed. When replaying a journal, all 
> > > > transactions
> > > > +after the last synced one up to the last commit 
> > > > one must be
> > > > +synced. Note that this may include a wraparound of 
> > > > sequence
> > > > +numbers.
> > > > +
> > > > + 28 -  31:  Checksum (one's complement of the sum of all bytes 
> > > > in the
> > > > +header journal block except those of the checksum 
> > > > field)
> > > > +
> > > > + 32 - 511:  Reserved (set to 0)
> > > 
> > > I'm not sure if these fields are necessary.  They require updates (and
> > > maybe flush) after every commit and sync.
> > > 
> > > The fewer metadata updates, the better, not just for performance but
> > > also to reduce the risk of data loss.  If any metadata required to
> > > access the journal is corrupted, the image will be unavailable.
> > > 
> > > It should be possible to determine this information by scanning the
> > > journal transactions.
> > 
> > This is rather handwavy. Can you elaborate how this would work in detail?
> > 
> > 
> > For example, let's assume we get to read this journal (a journal can be
> > rather large, too, so I'm not sure if we want to read it in completely):
> > 
> >  - Descriptor, seq 42, 2 data blocks
> >  - Data block
> >  - Data block
> >  - Data block starting with "qjbk"
> >  - Data block
> >  - Descriptor, seq 7, 0 data blocks
> >  - Descriptor, seq 8, 1 data block
> >  - Data block
> > 
> > Which of these have already been synced? Which have been committed?

So what's your algorithm for this?

> > I guess we could introduce an is_commited flag in the descriptor, but
> > wouldn't correct operation look like this then:
> > 
> > 1. Write out descriptor commit flag clear and any data blocks
> > 2. Flush
> > 3. Rewrite descriptor with commit flag set
> > 
> > This ensures that the commit flag is only set if all the required data
> > is indeed stable on disk. What has changed compared to this proposal is
> > just the offset at which you write in step 3 (header vs. descriptor).
> 
> A commit flag cannot be relied upon.  A transaction can be corrupted
> after being committed, or it can be corrupted due to power failure while
> writing the transaction.  In both cases we have an invalid transaction
> and we must discard it.

No, I believe it is vitally important to distinguish these two cases.

If a transaction was corrupted due to power failure while writing the
transaction, then we can simply discard it indeed.

If, however, a transaction was committed and gets corrupted after the
fact, then we have a problem because the data on the disk is laid out as
described by on-disk metadat (e.g. L2 tables) _with the j

Re: [Qemu-devel] Questions about hvm domU default devices with upstream qemu

2013-09-05 Thread Stefano Stabellini
On Wed, 4 Sep 2013, Fabio Fantoni wrote:
> Il 04/09/2013 15:17, Stefano Stabellini ha scritto:
> > On Wed, 4 Sep 2013, Fabio Fantoni wrote:
> > > Il 04/07/2013 15:51, Fabio Fantoni ha scritto:
> > > > Last year I posted a question about default devices of upstream qemu
> > > > that
> > > > differ from qemu traditional, like empty floppy and cdrom in particular.
> > > > About empty floppy now is disabled as workaround.
> > > > About empty cdrom Stabellini tells that is useful, so I tried to use it
> > > > but
> > > > cd-insert was failing,and I must define other empty cdrom on xl
> > > > configuration file, for example with: ',raw,hdb,ro,cdrom'.
> > > > It seem that there isn't default devices usable without define them in
> > > > xen,
> > > > and I think that probably need to revise the definition of the devices
> > > > to
> > > > avoid empty and unusable ones and to do everything as best as possible
> > > > (also
> > > > with possibly reviewing qemu side).
> > > > Another good thing is also consider pv domU case and possible future
> > > > support
> > > > of spice and/or other useful features.
> > > > Thanks for any reply.
> > > I seen no replies about this, I think is important improve upstream qemu
> > > support and remove the traditional ASA upstream will be equivalent or
> > > superior
> > > in all features.
> > > 
> > > I also think is good idea add q35 support on xen.
> > > Seem that hvm domUs have performance limits, in particular on windows
> > > domUs
> > > (also with pv drivers) caused probably by old hardware emulation by qemu
> > > and/or xen support limited on some instruction sets.
> > > I don't know how to add essential q35 support on hvmloader to do some
> > > tests
> > > and have effective data about improvements.
> > > Anyone on this?
> > I agree that this is important and thanks for raising the issue.
> > It would be nice if somebody stepped up to do the q35 work.
> 
> Thanks for reply, can you reply also on questions about default devices with
> upstream qemu please?

I think it might be a good idea to keep the same set of default devices
between qemu-traditional and qemu-xen.

We always had an empty cdrom and I don't see why we should remove it. If
cd-insert doesn't work, it's a bug and should be fixed.

Supporting spice in PV guests would be hard, because spice needs QXL
that is a PCI device. PV guests don't have a PCI bus. Of course if
somebody did the work to make QXL and spice work with PV guests I would
be happy to accept it.

I agree that we should keep improving upstream QEMU, however removing
qemu-traditional is not going to be easy because it's needed for
compatibility when a guest started with qemu-traditional is
live-migrated to a new system. But we can make sure that this remains
the only use case for it.



Re: [Qemu-devel] [PATCH 0/8] [PATCH RFC v3] s390 cpu hotplug

2013-09-05 Thread Andreas Färber
Am 05.09.2013 12:40, schrieb Christian Borntraeger:
> On 04/09/13 14:45, Andreas Färber wrote:
>> Hello,
>>
>> Am 01.08.2013 16:12, schrieb Jason J. Herne:
>>> From: "Jason J. Herne" 
>>>
>>> Latest code for cpu Hotplug on S390 architecture.   This one is vastly 
>>> simpler
>>> than v2 as we have decided to avoid the command line specification 
>>> of -device s390-cpu.
>>>
>>> The last version can be found here:
>>> http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01183.html
>>>
>>> There is also a patch in this series to add cpu-add to the Qemu monitor
>>> interface.
>>>
>>> Hotplugged cpus are created in the configured state and can be used by the
>>> guest after the guest onlines the cpu by: 
>>> "echo 1 > /sys/bus/cpu/devices/cpuN/online"
>>>
>>> Hot unplugging is currently not implemented by this code. 
>>
>> We have been having several off-list discussions since then that I'll
>> try to briefly summarize here, please correct or extend as needed:
>>
>> 1) CPU topology for QOM
>>
>> Physically a System z machine may have an MCM with, e.g., 6 CPUs with 6
>> cores each. But unlike x86, there is PR/SM, LPAR and possibly z/VM in
>> between Linux and hardware, so we do actually want to be able to
>> hot-plug in quantities of 1 and not by 6 on s390x for the foreseeable
>> future. We seem willing to set a QOM ABI in stone based on that assumption.
> 
> Just stepping in, Jason is on vacation this week.

Everyone is welcome to comment. :)

> To summarize my understanding:
> You were thinking if CPU model needs topology (e.g. -device mcm,id=m1, 
> -device cpu,mcm=m1)
> and s390 was the only arch left, that you were not sure about if topology is 
> needed? 
> All other platforms dont need topology for cpu hotplug?

No, on the contrary: I don't want s390x to blindly copy x86 cpu-add,
because for x86 we know that what we have is a hack to make it work
today, but there we know we want to do device_add Xeon-X42-4242 instead,
which then hot-plugs the 6 cores x 2 threads at once that a physical
hot-plug would do and not hot-add individual threads.

So the question of topology is not about what is below KVM but about
what is inside QEMU, since x86 emulates i440fx/q35 based hardware.
The understanding I reached on IRC is that s390x (similar to sPAPR)
tries to emulate LPAR / z/VM layer rather than the hardware below them,
thus no applicable concept of "real" hardware and arbitrary quantities.

> Yes, we want to be able to hotplug single cores (not chips, not MCMs). 
> It is pretty hard to pin the vCPUs to a given real topology for KVM. You need 
> to
> pin on LPAR and KVM. Libvirt could  do some pinning of guest vCPUs to host 
> CPUs and
> LPAR can have dedicated CPUs. But pinning a full chip (6cores) would only make
> sense in very rare cases.

Last time I looked into this, the post-add hook was solely for overall
ccw initialization. So we can use device_add s390-cpu today, can't we?

The question that I still need to investigate is how the
always-incrementing CPU address interacts with maxcpus. Consider
maxcpus=6 and smp_cpus=2. 4x device_add should work. Now if we did 1x
device_del, then 1x device_add should work again IMO. cpu-add checks the
user-supplied id against maxcpus though iirc.

Therefore my saying in multiple contexts that we should get the QEMU and
KVM CPU count checks into the CPU realizefn so that we get the checks
irrespective of the call site with nice error reporting.

>> => s390-cpu (or future subtypes) to be used with device_add.
>> => Flat /machine/cpu[n] list in composition tree a possibility.
>>
>> 1a) CPU topology for guests
>>
>> STSI instruction topology support not implemented yet.
> 
> Right not implemented yet, but we certainly want to be able to define the 
> guest
> visible topology at some point in time (grouping of cores basically). 
> But I guess this does not mean that we have to go away from the flat list of 
> CPUs.

So STSI would show what real LPAR/CPU we are running on? But QEMU would
have /machine/cpu[0]? Or do we need /machine/cpugroup[0]/cpu[0]? The
latter is my concern here, to decide about child<> vs. link<> properties.

To cope with device_add s390-cpu adding the device to
/machine/peripheral/ or /machine/peripheral-anon/device[0] I *think*
we'll need link<>, which would then translate back to ipi_states array
as backend and the remaining question would be where to expose those
properties in the composition tree - i.e. /machine/cpu[n] or
/machine/ipi/cpu[n] or something - please suggest. Similarly if those
become link<> properties then the CPUs created by the machine via
smp_cpus need a canonical path as well; quite obviously both cannot be
the same.

Background is that long-term Anthony would like x86 CPU hot-plug to
become setting/unsetting some /machine/cpu-socket[n] link<> property of
the machine, and the ipi_states array seems a close equivalent on s390x.

>> => Guest unaware of any emulated topology today.
> 
> An additional problem is, that for the normal case

Re: [Qemu-devel] [PATCH 1/8] [PATCH RFC v3] s390-qemu: cpu hotplug - Define New SCLP Codes

2013-09-05 Thread Alexander Graf

On 01.08.2013, at 16:12, Jason J. Herne wrote:

> From: "Jason J. Herne" 
> 
> Define new SCLP codes to improve code readability.
> 
> Signed-off-by: Jason J. Herne 
> ---
> hw/s390x/sclp.c |2 +-
> include/hw/s390x/sclp.h |8 
> 2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
> index 86d6ae0..cb53d7e 100644
> --- a/hw/s390x/sclp.c
> +++ b/hw/s390x/sclp.c
> @@ -45,7 +45,7 @@ static void sclp_execute(SCCB *sccb, uint64_t code)
> {
> S390SCLPDevice *sdev = get_event_facility();
> 
> -switch (code) {
> +switch (code & SCLP_NO_CMD_PARM) {

switch (code & ~SCLP_CMD_PARM)

Or are the upper bits parm as well? In fact, what about the upper 32 bits?


Alex

> case SCLP_CMDW_READ_SCP_INFO:
> case SCLP_CMDW_READ_SCP_INFO_FORCED:
> read_SCP_info(sccb);
> diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
> index 231a38a..174097d 100644
> --- a/include/hw/s390x/sclp.h
> +++ b/include/hw/s390x/sclp.h
> @@ -26,6 +26,14 @@
> #define SCLP_CMD_WRITE_EVENT_DATA   0x00760005
> #define SCLP_CMD_WRITE_EVENT_MASK   0x00780005
> 
> +/* CPU hotplug SCLP codes */
> +#define SCLP_NO_CMD_PARM0x00ff
> +#define SCLP_HAS_CPU_INFO   0x0C00ULL
> +#define SCLP_CMDW_READ_CPU_INFO 0x00010001
> +#define SCLP_CMDW_CONFIGURE_CPU 0x00110001
> +#define SCLP_CMDW_DECONFIGURE_CPU   0x0011
> +#define SCLP_CMDW_CPU_CMD_PARM  0xff00
> +
> /* SCLP response codes */
> #define SCLP_RC_NORMAL_READ_COMPLETION  0x0010
> #define SCLP_RC_NORMAL_COMPLETION   0x0020
> -- 
> 1.7.10.4
> 




Re: [Qemu-devel] [PATCH 1/8] [PATCH RFC v3] s390-qemu: cpu hotplug - Define New SCLP Codes

2013-09-05 Thread Andreas Färber
Am 01.08.2013 16:12, schrieb Jason J. Herne:
> From: "Jason J. Herne" 
> 
> Define new SCLP codes to improve code readability.
> 
> Signed-off-by: Jason J. Herne 

"s390-qemu:" is really bad. For one, all QEMU patches are somehow about
QEMU, so that's redundant. For another, "sclp:" would be much more
telling to me which patches to look at than text disappearing at the end
of the subject line in the mail client. :)

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH 2/8] [PATCH RFC v3] s390-qemu: cpu hotplug - SCLP CPU Info

2013-09-05 Thread Andreas Färber
Am 01.08.2013 16:12, schrieb Jason J. Herne:
> From: "Jason J. Herne" 
> 
> Implement the CPU data in SCLP "Read SCP Info".  And implement "Read CPU Info"
> SCLP command. This data will be used by the guest to get information about hot
> plugged cpus.
> 
> Signed-off-by: Jason J. Herne 
> ---
>  hw/s390x/sclp.c |   51 
> +++
>  include/hw/s390x/sclp.h |   32 +
>  2 files changed, 83 insertions(+)
> 
> diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
> index cb53d7e..da8cf7a 100644
> --- a/hw/s390x/sclp.c
> +++ b/hw/s390x/sclp.c
> @@ -15,6 +15,7 @@
>  #include "cpu.h"
>  #include "sysemu/kvm.h"
>  #include "exec/memory.h"
> +#include "sysemu/sysemu.h"
>  
>  #include "hw/s390x/sclp.h"
>  
> @@ -31,7 +32,26 @@ static inline S390SCLPDevice *get_event_facility(void)
>  static void read_SCP_info(SCCB *sccb)
>  {
>  ReadInfo *read_info = (ReadInfo *) sccb;
> +CPUState *cpu;
>  int shift = 0;
> +int cpu_count = 0;
> +int i = 0;
> +
> +for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
> +cpu_count++;
> +}
> +
> +/* CPU information */
> +read_info->entries_cpu = cpu_to_be16(cpu_count);
> +read_info->offset_cpu = cpu_to_be16(offsetof(ReadInfo, entries));
> +read_info->highest_cpu = cpu_to_be16(max_cpus);
> +
> +for (i = 0; i < cpu_count; i++) {
> +read_info->entries[i].address = i;
> +read_info->entries[i].type = 0;
> +}
> +
> +read_info->facilities = cpu_to_be64(SCLP_HAS_CPU_INFO);
>  
>  while ((ram_size >> (20 + shift)) > 65535) {
>  shift++;
> @@ -41,6 +61,34 @@ static void read_SCP_info(SCCB *sccb)
>  sccb->h.response_code = cpu_to_be16(SCLP_RC_NORMAL_READ_COMPLETION);
>  }
>  
> +/* Provide information about the CPU */
> +static void sclp_read_cpu_info(SCCB *sccb)
> +{
> +ReadCpuInfo *cpu_info = (ReadCpuInfo *) sccb;
> +CPUState *cpu;
> +int cpu_count = 0;
> +int i = 0;
> +
> +for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {

This becomes CPU_FOREACH(cpu) { now.

> +cpu_count++;
> +}
> +
> +cpu_info->nr_configured = cpu_to_be16(cpu_count);
> +cpu_info->offset_configured = cpu_to_be16(offsetof(ReadCpuInfo, 
> entries));
> +cpu_info->nr_standby = cpu_to_be16(0);
> +
> +/* The standby offset is 16-byte for each CPU */
> +cpu_info->offset_standby = cpu_to_be16(cpu_info->offset_configured
> ++ cpu_info->nr_configured*sizeof(CpuEntry));
> +
> +for (i = 0; i < cpu_count; i++) {
> +cpu_info->entries[i].address = i;
> +cpu_info->entries[i].type = 0;
> +}
> +
> +sccb->h.response_code = cpu_to_be16(SCLP_RC_NORMAL_READ_COMPLETION);
> +}
> +
>  static void sclp_execute(SCCB *sccb, uint64_t code)
>  {
>  S390SCLPDevice *sdev = get_event_facility();
> @@ -50,6 +98,9 @@ static void sclp_execute(SCCB *sccb, uint64_t code)
>  case SCLP_CMDW_READ_SCP_INFO_FORCED:
>  read_SCP_info(sccb);
>  break;
> +case SCLP_CMDW_READ_CPU_INFO:
> +sclp_read_cpu_info(sccb);
> +break;
>  default:
>  sdev->sclp_command_handler(sdev->ef, sccb, code);
>  break;
> diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
> index 174097d..89ae7d1 100644
> --- a/include/hw/s390x/sclp.h
> +++ b/include/hw/s390x/sclp.h
> @@ -79,12 +79,44 @@ typedef struct SCCBHeader {
>  
>  #define SCCB_DATA_LEN (SCCB_SIZE - sizeof(SCCBHeader))
>  
> +/* CPU information */
> +typedef struct CpuEntry {
> +uint8_t address;
> +uint8_t reserved0[13];
> +uint8_t type;
> +uint8_t reserved1;
> +} QEMU_PACKED CpuEntry;

Feel free to use CPUEntry capitalization if this is not copied from a
Linux struct of that name - your choice.

Andreas

> +
>  typedef struct ReadInfo {
>  SCCBHeader h;
>  uint16_t rnmax;
>  uint8_t rnsize;
> +uint8_t  _reserved1[16 - 11];   /* 11-15 */
> +uint16_t entries_cpu;   /* 16-17 */
> +uint16_t offset_cpu;/* 18-19 */
> +uint8_t  _reserved2[24 - 20];   /* 20-23 */
> +uint8_t  loadparm[8];   /* 24-31 */
> +uint8_t  _reserved3[48 - 32];   /* 32-47 */
> +uint64_t facilities;/* 48-55 */
> +uint8_t  _reserved0[100 - 56];
> +uint32_t rnsize2;
> +uint64_t rnmax2;
> +uint8_t  _reserved4[120-112];   /* 112-119 */
> +uint16_t highest_cpu;
> +uint8_t  _reserved5[128 - 122]; /* 122-127 */
> +struct CpuEntry entries[0];
>  } QEMU_PACKED ReadInfo;
>  
> +typedef struct ReadCpuInfo {
> +SCCBHeader h;
> +uint16_t nr_configured; /* 8-9 */
> +uint16_t offset_configured; /* 10-11 */
> +uint16_t nr_standby;/* 12-13 */
> +uint16_t offset_standby;/* 14-15 */
> +uint8_t reserved0[24-16];   /* 16-23 */
> +struct CpuEntry entries[0];
> +} QEMU_PACKED ReadCpuInfo;
> +
>  typedef struct SCCB {
>  SCCBHeader h;
>  

[Qemu-devel] [QEMU-1.6 & QEMU-Upstream PATCH] vl.c: Output error on invalid machine type provided

2013-09-05 Thread Michal Novotny
Output error message using qemu's error_report() function when user
provides the invalid machine type on the command line. This also saves
time to find what issue is when you downgrade from one version of qemu
to another that doesn't support required machine type yet (the version
user downgraded to have to have this patch applied too, of course).

(This has been posted a while ago and reviewed however not applied yet
 so this is basically just a reminder e-mail to ask for pushing it.
 It also applies cleanly to QEMU-1.6 so I'm sending to qemu-stable as
 well.)

Signed-off-by: Michal Novotny 
---
 vl.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/vl.c b/vl.c
index f422a1c..9b4a3f9 100644
--- a/vl.c
+++ b/vl.c
@@ -2671,6 +2671,11 @@ static QEMUMachine *machine_parse(const char *name)
 if (machine) {
 return machine;
 }
+
+if (name && !is_help_option(name)) {
+error_report("Unsupported machine type");
+}
+
 printf("Supported machines are:\n");
 for (m = first_machine; m != NULL; m = m->next) {
 if (m->alias) {
-- 
1.7.11.7




Re: [Qemu-devel] [QEMU-1.6 & QEMU-Upstream PATCH] vl.c: Output error on invalid machine type provided

2013-09-05 Thread Daniel P. Berrange
On Thu, Sep 05, 2013 at 01:36:09PM +0200, Michal Novotny wrote:
> Output error message using qemu's error_report() function when user
> provides the invalid machine type on the command line. This also saves
> time to find what issue is when you downgrade from one version of qemu
> to another that doesn't support required machine type yet (the version
> user downgraded to have to have this patch applied too, of course).
> 
> (This has been posted a while ago and reviewed however not applied yet
>  so this is basically just a reminder e-mail to ask for pushing it.
>  It also applies cleanly to QEMU-1.6 so I'm sending to qemu-stable as
>  well.)
> 
> Signed-off-by: Michal Novotny 
> ---
>  vl.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/vl.c b/vl.c
> index f422a1c..9b4a3f9 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2671,6 +2671,11 @@ static QEMUMachine *machine_parse(const char *name)
>  if (machine) {
>  return machine;
>  }
> +
> +if (name && !is_help_option(name)) {
> +error_report("Unsupported machine type");
> +}

It would be nicer to do

 error_report("Unsupported machine type '%s'", name);


So when people report error messages in bugs / mailing lists, without
providing their CLI args, it'll be obvious what they requested.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [Qemu-ppc] [PATCH 16/16] target-ppc: Convert to new ldst opcodes

2013-09-05 Thread Benjamin Herrenschmidt
On Thu, 2013-09-05 at 11:08 +0200, Alexander Graf wrote:
> On 04.09.2013, at 23:05, Richard Henderson wrote:
> 
> > This lets us change "le_mode" to "end_mode" and fold away nearly all
> > of the tests for the current cpu endianness, and removing all of the
> > explicitly generated bswap opcodes.

Only nit: I find "end_mode" a very confusing identifier :-) "end"
usually means something else ! Why not endian_mode ?
> > 
> > Cc: qemu-...@nongnu.org
> > Signed-off-by: Richard Henderson 
> 
> No complaints from me, apart from the usual "LE mode isn't necessarily what 
> you think it is on PPC" one. But the code would be as broken as before IIUC.
> 
> Ben, you had some insight in how LE mode on different PPC flavors work. Could 
> you please make sure we're not walking into the wrong direction here?

I haven't seen the patch itself for some reason (and I'm about to go off
for a few days). The early day powerpc endian mode can be safely ignored
I think, I don't even remember the details myself, I think it induced
some changes to the byte lanes ordering on the bus and thus required the
host bridge to be adjusted.

The embedded PPCs have simply a per-page E bit in the TLB controlling
the endianness of accesses through the translation, the endianness is
"clean" in that case, and the bus doesn't flip around so it's akin to
what P7 does but with a finer granularity.

Cheers,
Ben.





[Qemu-devel] [PATCH] abitypes.h: Remove incorrect ARM ABI_LLONG_ALIGNMENT

2013-09-05 Thread Peter Maydell
The ARM EABI specifies that 64 bit integers should be
8 aligned; remove our incorrect setting of 4 alignment.
This has no actual effect since it only set the alignment
for the 'abi_ullong' and 'abi_llong' types, which are used
only inside code which is MIPS-specific, but it will
avoid problems later if we use the types elsewhere.

Signed-off-by: Peter Maydell 
---
 include/exec/user/abitypes.h |4 
 1 file changed, 4 deletions(-)

diff --git a/include/exec/user/abitypes.h b/include/exec/user/abitypes.h
index 008501b..80eedac 100644
--- a/include/exec/user/abitypes.h
+++ b/include/exec/user/abitypes.h
@@ -14,10 +14,6 @@
 #define ABI_LLONG_ALIGNMENT 2
 #endif
 
-#ifdef TARGET_ARM
-#define ABI_LLONG_ALIGNMENT 4
-#endif
-
 #ifndef ABI_SHORT_ALIGNMENT
 #define ABI_SHORT_ALIGNMENT 2
 #endif
-- 
1.7.9.5




Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-05 Thread Benjamin Herrenschmidt
On Thu, 2013-09-05 at 19:48 +1000, Alexey Kardashevskiy wrote:
> >> I do not have pure guest timebase in QEMU and I need it on the destination.
> >> But I have host timebase + offset to calculate it. And tb_offset is already
> >> in ppc_tb_t. It looked logical to me to send the existing field and add
> >> only the missing part.
> > 
> > I still don't understand. You want the guest visible timebase in the 
> > migration stream, no?
> 
> 
> Yes. I do not really understand the problem here (and I am not playing
> dump). Do you suggest sending just the guest timebase and do not send the
> host timebase and the offset (one number instead of two)? I can do that,
> makes sense, no problem, thanks for the idea.

No, we want to send the guest TB and the corresponding "time of day" so that
the target can adjust the TB based on how long the migration took.

Cheers,
Ben.





Re: [Qemu-devel] [PATCH 3/8] [PATCH RFC v3] s390-qemu: cpu hotplug - SCLP Event integration

2013-09-05 Thread Andreas Färber
Am 01.08.2013 16:12, schrieb Jason J. Herne:
> From: "Jason J. Herne" 
> 
> Add an sclp event for "cpu was hot plugged".  This allows Qemu to deliver an
> SCLP interrupt to the guest stating that the requested cpu hotplug was
> completed.
> 
> Signed-off-by: Jason J. Herne 
> ---
>  hw/s390x/Makefile.objs|2 +-
>  hw/s390x/event-facility.c |7 +++
>  hw/s390x/sclpcpu.c|  120 
> +
>  include/hw/s390x/event-facility.h |3 +
>  include/hw/s390x/sclp.h   |1 +
>  5 files changed, 132 insertions(+), 1 deletion(-)
>  create mode 100644 hw/s390x/sclpcpu.c
> 
> diff --git a/hw/s390x/Makefile.objs b/hw/s390x/Makefile.objs
> index 77e1218..104ae8e 100644
> --- a/hw/s390x/Makefile.objs
> +++ b/hw/s390x/Makefile.objs
> @@ -2,7 +2,7 @@ obj-y = s390-virtio-bus.o s390-virtio.o
>  obj-y += s390-virtio-hcall.o
>  obj-y += sclp.o
>  obj-y += event-facility.o
> -obj-y += sclpquiesce.o
> +obj-y += sclpquiesce.o sclpcpu.o

On a line of its own for consistency and to avoid '-' line?

>  obj-y += ipl.o
>  obj-y += css.o
>  obj-y += s390-virtio-ccw.o
> diff --git a/hw/s390x/event-facility.c b/hw/s390x/event-facility.c
> index 0faade0..aec35cb 100644
> --- a/hw/s390x/event-facility.c
> +++ b/hw/s390x/event-facility.c
> @@ -317,6 +317,7 @@ static int init_event_facility(S390SCLPDevice *sdev)
>  {
>  SCLPEventFacility *event_facility;
>  DeviceState *quiesce;
> +DeviceState *cpu_hotplug;
>  
>  event_facility = g_malloc0(sizeof(SCLPEventFacility));
>  sdev->ef = event_facility;
> @@ -335,6 +336,12 @@ static int init_event_facility(S390SCLPDevice *sdev)
>  }
>  qdev_init_nofail(quiesce);
>  
> +cpu_hotplug = qdev_create(&event_facility->sbus.qbus, "sclpcpuhotplug");

Please don't create devices in such an init function. Also don't access
.qbus please.

Instead, please use object_initialize() followed by
qdev_set_parent_bus() in an instance_init function.

Conversion of the initfn to a realizefn will be a bit more involved so I
won't ask that for this series, but if there were volunteers among your
colleagues that would be appreciated. The effect would be to propagate
errors to the caller of the realizefn rather than asserting here.

> +if (!cpu_hotplug) {
> +return -1;
> +}
> +qdev_init_nofail(cpu_hotplug);
> +
>  return 0;
>  }
>  
> diff --git a/hw/s390x/sclpcpu.c b/hw/s390x/sclpcpu.c
> new file mode 100644
> index 000..5b4139e
> --- /dev/null
> +++ b/hw/s390x/sclpcpu.c
> @@ -0,0 +1,120 @@
> +/*
> + * SCLP event type
> + *Signal CPU - Trigger SCLP interrupt for system CPU configure or
> + *de-configure
> + *
> + * Copyright IBM, Corp. 2013
> + *
> + * Authors:
> + *  Thang Pham 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or (at 
> your
> + * option) any later version.  See the COPYING file in the top-level 
> directory.
> + *
> + */
> +#include 

"hw/qdev.h", but I'm guessing that's redundant with either sclp.h or
event-facility.h?

> +#include "sysemu/sysemu.h"
> +#include "hw/s390x/sclp.h"
> +#include "hw/s390x/event-facility.h"
> +#include "cpu.h"
> +#include "sysemu/cpus.h"
> +#include "sysemu/kvm.h"
> +
> +typedef struct ConfigMgtData {
> +EventBufferHeader ebh;
> +uint8_t reserved;
> +uint8_t event_qualifier;
> +} QEMU_PACKED ConfigMgtData;
> +
> +static qemu_irq irq_cpu_hotplug; /* Only used in this file */
> +
> +#define EVENT_QUAL_CPU_CHANGE  1
> +
> +void raise_irq_cpu_hotplug(void)
> +{
> +qemu_irq_raise(irq_cpu_hotplug);
> +}
> +
> +static int event_type(void)
> +{
> +return SCLP_EVENT_CONFIG_MGT_DATA;
> +}
> +
> +static unsigned int send_mask(void)
> +{
> +return SCLP_EVENT_MASK_CONFIG_MGT_DATA;
> +}
> +
> +static unsigned int receive_mask(void)
> +{
> +return 0;
> +}
> +
> +static int read_event_data(SCLPEvent *event, EventBufferHeader *evt_buf_hdr,
> +   int *slen)
> +{
> +ConfigMgtData *cdata = (ConfigMgtData *) evt_buf_hdr;
> +if (*slen < sizeof(ConfigMgtData)) {
> +return 0;
> +}
> +
> +/* Event is no longer pending */
> +if (!event->event_pending) {
> +return 0;
> +}
> +event->event_pending = false;
> +
> +/* Event header data */
> +cdata->ebh.length = cpu_to_be16(sizeof(ConfigMgtData));
> +cdata->ebh.type = SCLP_EVENT_CONFIG_MGT_DATA;
> +cdata->ebh.flags |= SCLP_EVENT_BUFFER_ACCEPTED;
> +
> +/* Trigger a rescan of CPUs by setting event qualifier */
> +cdata->event_qualifier = EVENT_QUAL_CPU_CHANGE;
> +*slen -= sizeof(ConfigMgtData);
> +
> +return 1;
> +}
> +
> +static void trigger_signal(void *opaque, int n, int level)
> +{
> +SCLPEvent *event = opaque;
> +event->event_pending = true;
> +
> +/* Trigger SCLP read operation */
> +sclp_service_interrupt(0);
> +}
> +
> +static int irq_cpu_hotplug_init(SCLPEvent *event)
> +{
> +irq_cpu_hotplug = *qemu_allocate_irqs(trigger_signal, e

Re: [Qemu-devel] [RFC PATCH 5/6] module: load modules at start

2013-09-05 Thread Lluís Vilanova
Fam Zheng writes:

> Add module_load_all to load all DSO modules under:
> /usr/lib/qemu/block/
> /usr/lib/qemu/net/
> /usr/lib/qemu/ui/
> when starting process.

This should probably be based on a define with the prefix set at configure time.

Adding directories from command-line arguments into that list would also be
nice.


Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth



Re: [Qemu-devel] [QEMU-1.6 & QEMU-Upstream PATCH] vl.c: Output error on invalid machine type provided

2013-09-05 Thread Michal Novotny

On 09/05/2013 01:39 PM, Daniel P. Berrange wrote:
> On Thu, Sep 05, 2013 at 01:36:09PM +0200, Michal Novotny wrote:
>> Output error message using qemu's error_report() function when user
>> provides the invalid machine type on the command line. This also saves
>> time to find what issue is when you downgrade from one version of qemu
>> to another that doesn't support required machine type yet (the version
>> user downgraded to have to have this patch applied too, of course).
>>
>> (This has been posted a while ago and reviewed however not applied yet
>>  so this is basically just a reminder e-mail to ask for pushing it.
>>  It also applies cleanly to QEMU-1.6 so I'm sending to qemu-stable as
>>  well.)
>>
>> Signed-off-by: Michal Novotny 
>> ---
>>  vl.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/vl.c b/vl.c
>> index f422a1c..9b4a3f9 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -2671,6 +2671,11 @@ static QEMUMachine *machine_parse(const char *name)
>>  if (machine) {
>>  return machine;
>>  }
>> +
>> +if (name && !is_help_option(name)) {
>> +error_report("Unsupported machine type");
>> +}
> It would be nicer to do
>
>  error_report("Unsupported machine type '%s'", name);

No need Daniel, as the output is having this already (as mentioned by
Markus in the previous thread):

$ ./x86_64-softmmu/qemu-system-x86_64 -M bogus
-M bogus: Unsupported machine type
$

So the CLI arguments are in the beginning of the line already. If we
consider long CLI args, like:

$ ./x86_64-softmmu/qemu-system-x86_64 -S -M bogus -cpu
core2duo,+lahf_lm,+rdtscp,+avx,+osxsave,+xsave,+aes,+tsc-deadline,+popcnt,+x2apic,+sse4.2,+sse4.1,+pdcm,+xtpr,+cx16,+tm2,+est,+smx,+vmx,+ds_cpl,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds
-enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name debian
-uuid 374677db-d9b1-3326-a097-5f2b79d3fca0 -nodefconfig -nodefaults
-chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/debian.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -device
lsi,id=scsi0,bus=pci.0,addr=0x7 -drive
if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device
ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive
file=/home/mig/images/kvm/debian.qcow2,if=none,id=drive-virtio-disk0,format=qcow2
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=20,id=hostnet0 -device
rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:4d:d9:c9,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
spicevmc,id=charchannel0,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
-device usb-tablet,id=input0 -spice
port=5900,addr=127.0.0.1,disable-ticketing -vga qxl -global
qxl-vga.vram_size=67108864 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

then it can make sense, however to change it from:

-S -M bogus -cpu
core2duo,+lahf_lm,+rdtscp,+avx,+osxsave,+xsave,+aes,+tsc-deadline,+popcnt,+x2apic,+sse4.2,+sse4.1,+pdcm,+xtpr,+cx16,+tm2,+est,+smx,+vmx,+ds_cpl,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds
-enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name debian
-uuid 374677db-d9b1-3326-a097-5f2b79d3fca0 -nodefconfig -nodefaults
-chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/debian.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -device
lsi,id=scsi0,bus=pci.0,addr=0x7 -drive
if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device
ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive
file=/home/mig/images/kvm/debian.qcow2,if=none,id=drive-virtio-disk0,format=qcow2
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=20,id=hostnet0 -device
rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:4d:d9:c9,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
spicevmc,id=charchannel0,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
-device usb-tablet,id=input0 -spice
port=5900,addr=127.0.0.1,disable-ticketing -vga qxl -global
qxl-vga.vram_size=67108864 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6: Unsupported machine type

to:

-S -M bogus -cpu
core2duo,+lahf_lm,+rdtscp,+avx,+osxsave,+xsave,+aes,+tsc-deadline,+popcnt,+x2apic,+sse4.2,+sse4.1,+pdcm,+xtpr,+cx16,+tm2,+est,+smx,+vmx,+ds_cpl,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds
-enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name debian
-uuid 374677db-d9b1-3326-a097-5f2b79d3fca0

Re: [Qemu-devel] [PATCH 4/8] [PATCH RFC v3] s390-qemu: cpu hotplug - Storage key global access

2013-09-05 Thread Andreas Färber
Am 01.08.2013 16:12, schrieb Jason J. Herne:
> From: "Jason J. Herne" 
> 
> Introduces global access to storage key data so we can set it for each cpu in
> the S390 cpu initialization routine.
> 
> Signed-off-by: Jason J. Herne 
> ---
>  hw/s390x/s390-virtio-ccw.c |5 ++---
>  hw/s390x/s390-virtio.c |   21 -
>  hw/s390x/s390-virtio.h |2 +-
>  target-s390x/cpu.h |4 
>  4 files changed, 23 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index aebbbf1..b469960 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -65,7 +65,6 @@ static void ccw_init(QEMUMachineInitArgs *args)
>  MemoryRegion *sysmem = get_system_memory();
>  MemoryRegion *ram = g_new(MemoryRegion, 1);
>  int shift = 0;
> -uint8_t *storage_keys;
>  int ret;
>  VirtualCssBus *css_bus;
>  
> @@ -94,10 +93,10 @@ static void ccw_init(QEMUMachineInitArgs *args)
>  memory_region_add_subregion(sysmem, 0, ram);
>  
>  /* allocate storage keys */
> -storage_keys = g_malloc0(my_ram_size / TARGET_PAGE_SIZE);
> +s390_alloc_storage_keys(my_ram_size);
>  
>  /* init CPUs */
> -s390_init_cpus(args->cpu_model, storage_keys);
> +s390_init_cpus(args->cpu_model);
>  
>  if (kvm_enabled()) {
>  kvm_s390_enable_css_support(s390_cpu_addr2state(0));
> diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
> index 439d732..21e9124 100644
> --- a/hw/s390x/s390-virtio.c
> +++ b/hw/s390x/s390-virtio.c
> @@ -123,6 +123,18 @@ static void s390_virtio_register_hcalls(void)
> s390_virtio_hcall_set_status);
>  }
>  
> +static uint8_t *storage_keys;
> +
> +uint8_t *s390_get_storage_keys(void)
> +{
> +return storage_keys;
> +}
> +
> +void s390_alloc_storage_keys(ram_addr_t ram_size)
> +{
> +storage_keys = g_malloc0(ram_size / TARGET_PAGE_SIZE);
> +}
> +
>  /*
>   * The number of running CPUs. On s390 a shutdown is the state of all CPUs
>   * being either stopped or disabled (for interrupts) waiting. We have to
> @@ -176,7 +188,7 @@ void s390_init_ipl_dev(const char *kernel_filename,
>  qdev_init_nofail(dev);
>  }
>  
> -void s390_init_cpus(const char *cpu_model, uint8_t *storage_keys)
> +void s390_init_cpus(const char *cpu_model)
>  {
>  int i;
>  
> @@ -196,7 +208,7 @@ void s390_init_cpus(const char *cpu_model, uint8_t 
> *storage_keys)
>  ipi_states[i] = cpu;
>  cs->halted = 1;
>  cpu->env.exception_index = EXCP_HLT;
> -cpu->env.storage_keys = storage_keys;
> +cpu->env.storage_keys = s390_get_storage_keys();

Why not go this from the CPU initfn? Is there any ccw- vs. non-ccw
difference? Thinking about -device s390-cpu here. I believe it's safe to
assume that machine init and thus allocation has run before the CPU is
instantiated - possibly assert that.

Andreas

>  }
>  }
>  
> @@ -231,7 +243,6 @@ static void s390_init(QEMUMachineInitArgs *args)
>  MemoryRegion *sysmem = get_system_memory();
>  MemoryRegion *ram = g_new(MemoryRegion, 1);
>  int shift = 0;
> -uint8_t *storage_keys;
>  void *virtio_region;
>  hwaddr virtio_region_len;
>  hwaddr virtio_region_start;
> @@ -270,10 +281,10 @@ static void s390_init(QEMUMachineInitArgs *args)
>virtio_region_len);
>  
>  /* allocate storage keys */
> -storage_keys = g_malloc0(my_ram_size / TARGET_PAGE_SIZE);
> +s390_alloc_storage_keys(my_ram_size);
>  
>  /* init CPUs */
> -s390_init_cpus(args->cpu_model, storage_keys);
> +s390_init_cpus(args->cpu_model);
>  
>  /* Create VirtIO network adapters */
>  s390_create_virtio_net((BusState *)s390_bus, "virtio-net-s390");
> diff --git a/hw/s390x/s390-virtio.h b/hw/s390x/s390-virtio.h
> index 5c405e7..c1cb042 100644
> --- a/hw/s390x/s390-virtio.h
> +++ b/hw/s390x/s390-virtio.h
> @@ -20,7 +20,7 @@
>  typedef int (*s390_virtio_fn)(const uint64_t *args);
>  void s390_register_virtio_hypercall(uint64_t code, s390_virtio_fn fn);
>  
> -void s390_init_cpus(const char *cpu_model, uint8_t *storage_keys);
> +void s390_init_cpus(const char *cpu_model);
>  void s390_init_ipl_dev(const char *kernel_filename,
> const char *kernel_cmdline,
> const char *initrd_filename,
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 65bef86..877eac7 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -374,6 +374,10 @@ static inline void kvm_s390_interrupt_internal(S390CPU 
> *cpu, int type,
>  {
>  }
>  #endif
> +
> +uint8_t *s390_get_storage_keys(void);
> +void s390_alloc_storage_keys(ram_addr_t ram_size);
> +
>  S390CPU *s390_cpu_addr2state(uint16_t cpu_addr);
>  void s390_add_running_cpu(S390CPU *cpu);
>  unsigned s390_del_running_cpu(S390CPU *cpu);
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Feli

Re: [Qemu-devel] [RFC PATCH 5/6] module: load modules at start

2013-09-05 Thread Michael Tokarev
05.09.2013 14:20, Fam Zheng wrote:
> Add module_load_all to load all DSO modules under:
> /usr/lib/qemu/block/
> /usr/lib/qemu/net/
> /usr/lib/qemu/ui/
> when starting process.

NACK.

This is wrong, as has been mentioned already.
For example, you can't expect to load ui/*
from qemu-img, because qemu-img provides no
symbols needed to link ui stuff.

Much better approach is to hook this stuff into
object/module registration framework, in module.c.
And either load every object of a given kind at
the init time, or try to load it at lookup time
if the requested object isn't found in current
list.

I implemented this approach in my tree more than
a month ago.

Thanks,

/mjt




Re: [Qemu-devel] [RFC] qcow2 journalling draft

2013-09-05 Thread Kevin Wolf
Am 05.09.2013 um 11:35 hat Stefan Hajnoczi geschrieben:
> Although we are still discussing details of the on-disk layout, the
> general design is clear enough to discuss how the journal will be used.
> 
> Today qcow2 uses Qcow2Cache to do lazy, ordered metadata updates.  The
> performance is pretty good with two exceptions that I can think of:
> 
> 1. The delayed CoW problem that Kevin has been working on.  Guests
>perform sequential writes that are smaller than a qcow2 cluster.  The
>first write triggers a copy-on-write of the full cluster.  Later
>writes then overwrite the copied data.  It would be more efficient to
>anticipate sequential writes and hold off on CoW where possible.

To be clear, "more efficient" can mean a plus of 50% and more. COW
overhead is the only major overhead compared to raw when looking at
normal cluster allocations. So this is something that is really
important for cluster allocation performance.

The patches that I posted a while ago showed that it's possible to do
this without a journal, however the flush operation became very complex
(which we all found rather scary) and required that the COW be completed
before signalling flush completion.

With a journal, the only thing that you need to do on a flush is to
commit all transactions, i.e. write them out and bdrv_flush(bs->file).
The actualy data copy of the COW (i.e. the sync) can be further delayed
and doesn't have to happen at commit type as it would have without a
journal.

> 2. Lazy metadata updates lead to bursty behavior and expensive flushes.
>We do not take advantage of disk bandwidth since metadata updates
>stay in the Qcow2Cache until the last possible second.  When the
>guest issues a flush we must write out dirty Qcow2Cache entries and
>possibly fsync between them if dependencies have been set (e.g.
>refcount before L2).

Hm, have we ever measured the impact of this?

I don't think a journal can make a fundamental difference here - either
you write only at the last possible second (today flush, with a journal
commit), or you write out more data than strictly necessary.

> How will the journal change this situation?  Writes that go through the
> journal are doubled - they must first be journalled, fsync, and then
> they can be applied to the actual image.
> 
> How do we benefit by using the journal?

I believe Delayed COW is a pretty strong one. But there are more cases
in which performance isn't that great.

I think you refer to the simple case with a normal empty image where new
clusters are allocated, which is pretty good indeed if we ignore COW.
Trouble starts when you also free clusters, which happens for example
with internal COW (internal snapshots, compressed images) or discard.
Deduplication as well in the future, I suppose.

Then you get very quickly alternating sequences of "L2 depends on
refcount update" (for allocation) and "refcount update depends on L2
update" (for freeing), which means that Qcow2Cache starts flushing all
the time without accumulating many requests. These are cases that would
benefit as well from the atomicity of journal transactions.

And then, of course, we still leak clusters on failed operations. With a
journal, this wouldn't happen any more and the image would always stay
consistent (instead of only corruption-free).

Kevin



Re: [Qemu-devel] [RFC PATCH] spapr: add initial ibm, client-architecture-support rtas call support

2013-09-05 Thread Paul Mackerras
On Thu, Sep 05, 2013 at 12:19:09PM +0200, Alexander Graf wrote:
> 
> On 05.09.2013, at 12:16, Paul Mackerras wrote:
> 
> > On Wed, Sep 04, 2013 at 04:32:20PM -0500, Anthony Liguori wrote:
> >> On Wed, Sep 4, 2013 at 8:37 AM, Alexander Graf  wrote:
> >>> 
>  
> > So IMHO this whole thing should be orthogonal to -cpu.
>  
>  Well, since we cannot change CPU class on the fly, yes, it should be a
>  "compatibility" flags/properties/methods/whatever of the default CPU for
>  the specific family.
> >>> 
> >>> Since it's machine global, it could just as well be a machine option, no? 
> >>> Or can you have multiple CPUs with different compat modes in a single 
> >>> system?
> >> 
> >> AFAIK, this has nothing to do with CPUs.
> > 
> > I'm not sure what you mean by that; it has to do with CPUs since it
> > means changing the CPUs' behaviour, at least for user-mode programs.
> > 
> > Why? Just because you're on POWER7 as default doesn't mean you can't 
> > bump to a newer compat later on, no?
>  
>  Bump when exactly? And it won't be a new compat, it will be a native 
>  CPU. I
> >>> 
> >>> If you configure your guest to boot in POWER7-compat mode on your POWER8 
> >>> box and it then tells you through ibm,client-architecture-support that it 
> >>> can do POWER8, we can just remove all the compat bits and be happy, no?
> > 
> > Answering Alex here -- if we want to preserve the option of migrating
> > to a POWER7 host in future, we would run the guest in POWER7 compat
> > mode even if the current host is a POWER8 and the guest knows about
> > POWER8.
> 
> Yes, so we boot the guest with compat mode set to POWER7, then the guest 
> calls ibm,client-architecutre-support including POWER8 and then we can 
> reconfigure the guest to be POWER8, right?

Not if we want to be able to migrate to a real POWER7 later.  If we
tell the guest it's a POWER8, it will start using POWER8 features, and
then break when we migrate it to a POWER7 where those features don't
exist.  If we run the POWER8 in POWER7 compatibility mode (and more
importantly, the device tree says it's a 2.06 architecture processor),
it should only use POWER7 features and then work just fine when
migrated to a real POWER7.

Paul.



Re: [Qemu-devel] [PATCH] tci: Add implementation of rotl_i64, rotr_i64

2013-09-05 Thread Jay Foad
> diff --git a/tci.c b/tci.c
> index 18c888e..94b7851 100644
> --- a/tci.c
> +++ b/tci.c
> @@ -952,8 +952,16 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t 
> *tb_ptr)
>  break;
>  #if TCG_TARGET_HAS_rot_i64
>  case INDEX_op_rotl_i64:
> +t0 = *tb_ptr++;
> +t1 = tci_read_ri64(&tb_ptr);
> +t2 = tci_read_ri64(&tb_ptr);
> +tci_write_reg64(t0, (t1 << t2) | (t1 >> (64 - t2)));
> +break;
>  case INDEX_op_rotr_i64:
> -TODO();
> +t0 = *tb_ptr++;
> +t1 = tci_read_ri64(&tb_ptr);
> +t2 = tci_read_ri64(&tb_ptr);
> +tci_write_reg64(t0, (t1 >> t2) | (t1 << (64 - t2)));

<< (64 - t2) is undefined behaviour in C when t2 is 0. How about << (-t2 & 63) ?

Jay.



Re: [Qemu-devel] [PATCH 5/8] [PATCH RFC v3] s390-qemu: cpu hotplug - ipi_states enhancements

2013-09-05 Thread Andreas Färber
Am 01.08.2013 16:12, schrieb Jason J. Herne:
> From: "Jason J. Herne" 
> 
> Modify s390_cpu_addr2state to allow fetching state information for cpu 
> addresses
> above smp_cpus.  Hotplug requires this capability.
> 
> Also add s390_cpu_set_state function to allow modification of ipi_state 
> entries
> during hotplug.
> 
> Signed-off-by: Jason J. Herne 
> ---
>  hw/s390x/s390-virtio.c |9 +
>  target-s390x/cpu.h |2 +-
>  2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
> index 21e9124..5ad9cf3 100644
> --- a/hw/s390x/s390-virtio.c
> +++ b/hw/s390x/s390-virtio.c
> @@ -54,12 +54,13 @@
>  static VirtIOS390Bus *s390_bus;
>  static S390CPU **ipi_states;
>  
> -S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
> +void s390_cpu_set_ipistate(uint16_t cpu_addr, S390CPU *state)
>  {
> -if (cpu_addr >= smp_cpus) {
> -return NULL;
> -}
> +ipi_states[cpu_addr] = state;
> +}
>  
> +S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
> +{
>  return ipi_states[cpu_addr];
>  }
>  

This is what got us into the link<> discussion last time. If we do

for (i = 0; i < ARRAY_SIZE(ipi_states); i++) {
name = g_strdup_printf("cpu[%i]", i);
object_property_add_link(qdev_get_machine(), name, TYPE_S390_CPU,
 &ipi_states[i], &err);
}

then we get said /machine/cpu[n] link<> properties, at a QMP level
either returning nothing or the canonical path to the CPU object.

On IRC I didn't get an answer of whether it was being done the above way
because there is infrastructure missing, and a look at object.h now
confirms that suspicion. CC'ing Anthony and Paolo.

Since object_property_add_link() uses a NULL opaque, my idea would be to
add a single setter hook argument passed through as opaque to
object_set_link_property(), which would call it with the old and the new
value.

The purpose would be to avoid growing our own internal setter API, which
is disjoint from the QMP qom-set we are targetting at.

Regards,
Andreas

> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 877eac7..62eb810 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -377,7 +377,7 @@ static inline void kvm_s390_interrupt_internal(S390CPU 
> *cpu, int type,
>  
>  uint8_t *s390_get_storage_keys(void);
>  void s390_alloc_storage_keys(ram_addr_t ram_size);
> -
> +void s390_cpu_set_ipistate(uint16_t cpu_addr, S390CPU *state);
>  S390CPU *s390_cpu_addr2state(uint16_t cpu_addr);
>  void s390_add_running_cpu(S390CPU *cpu);
>  unsigned s390_del_running_cpu(S390CPU *cpu);
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexey Kardashevskiy
On 09/05/2013 08:21 PM, Alexander Graf wrote:
> 
> On 05.09.2013, at 12:17, Alexey Kardashevskiy wrote:
> 
>> On 09/05/2013 07:27 PM, Alexander Graf wrote:
>>>
>>> On 05.09.2013, at 09:40, Alexey Kardashevskiy wrote:
>>>
 On 09/05/2013 05:08 PM, Alexander Graf wrote:
>
>
> Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :
>
>> On the real hardware, RTAS is called in real mode and therefore
>> ignores top 4 bits of the address passed in the call.
>
> Shouldn't we ignore the upper 4 bits for every memory access in real 
> mode, not just that one parameter?

 We probably should but I just do not see any easy way of doing this. Yet
 another "Ignore N bits on the top" memory region type? No idea.
>>>
>>> Well, it already works for code that runs inside of guest context, because 
>>> there the softmmu code for real mode strips the upper 4 bits.
>>>
>>> I basically see 2 ways of fixing this "correctly":
>>>
>>
>>> 1) Don't access memory through cpu_physical_memory_rw or ldx_phys but
>>> instead through real mode wrappers that strip the upper 4 bits, similar
>>> to how we handle virtual memory differently from physical memory
>>
>> But there is no a ready wrapper for this, correct? I could not find any. I
>> would rather do this, looks nicer than 2).
>>
>>
>>> 2) Create 15 aliases to system_memory at the upper 4 bits of address
>>> space. That should at the end of the day give you the same effect
>>
>> Wow. Is not that too much?
>> Ooor since I am normally making bad decisions, I should do this :)
>>
>>
>>> The fix as you're proposing it wouldn't work for indirect memory
>>> descriptors. Imagine you have an "address" parameter that gives you a
>>> pointer to a struct in memory that again contains a pointer. You still
>>> want that pointer be interpreted correctly, no?
>>
>> Yes I do. I just think that having non zero bits at the top is a bug and I
>> would not want the guest to continue sending bad addresses to the host. Or
>> at least I want to know if it still happening.
>>
>> Now we know that the only occasion of this misbehaviour is the "stop-self"
>> call and others works just fine. If something new comes up (what is pretty
>> unlikely, otherwise we would have noticed this issue a loong time ago AND
>> Paul already made&posted a patch for the host to fix __pa() so it is not
>> going to happen on new kernels either), ok, we will think of fixing this.
>>
>> Doing in QEMU what the hardware does is a good thing but here I would think
>> twice.
> 
> Well, the idea behind RTAS is that everything RTAS does is usually run in 
> IR=0 DR=0 inside of guest context, so that's the view of the world we should 
> expose.
> 
> Which makes me think.
> 

> Couldn't we just set IR=0 DR=0 when getting an RTAS call and use the
> virtual memory access functions? Those will already strip the upper 4
> bits.

Ok. We reached the border where my ignorance starts :) Never could
understand the concept of the guest virtual memory in QEMU.

So we clear IR/DR and call what API? This is not address_space_rw() and
company, right?



-- 
Alexey



[Qemu-devel] [PATCH 1/3] block: Additional info string in ImageInfo and BDI

2013-09-05 Thread Max Reitz
Add a string for additional information to ImageInfo and
BlockDriverInfo. Also, use this string to emit the compatibility level
and lazy_refcount value (on compat=1.1) for qcow2.

Signed-off-by: Max Reitz 
---
 block.c   |  3 ++-
 block/mirror.c|  6 --
 block/qapi.c  | 10 +-
 block/qcow2.c |  6 ++
 include/block/block.h |  2 ++
 qapi-schema.json  |  5 -
 qemu-img.c|  3 ++-
 qemu-io-cmds.c|  7 ++-
 8 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index 26639e8..9fd9f3a 100644
--- a/block.c
+++ b/block.c
@@ -1921,7 +1921,7 @@ void bdrv_round_to_clusters(BlockDriverState *bs,
 int64_t *cluster_sector_num,
 int *cluster_nb_sectors)
 {
-BlockDriverInfo bdi;
+BlockDriverInfo bdi = { .info_string = NULL };
 
 if (bdrv_get_info(bs, &bdi) < 0 || bdi.cluster_size == 0) {
 *cluster_sector_num = sector_num;
@@ -1932,6 +1932,7 @@ void bdrv_round_to_clusters(BlockDriverState *bs,
 *cluster_nb_sectors = QEMU_ALIGN_UP(sector_num - *cluster_sector_num +
 nb_sectors, c);
 }
+g_free(bdi.info_string);
 }
 
 static bool tracked_request_overlaps(BdrvTrackedRequest *req,
diff --git a/block/mirror.c b/block/mirror.c
index 86de458..296ad54 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -295,7 +295,7 @@ static void coroutine_fn mirror_run(void *opaque)
 BlockDriverState *bs = s->common.bs;
 int64_t sector_num, end, sectors_per_chunk, length;
 uint64_t last_pause_ns;
-BlockDriverInfo bdi;
+BlockDriverInfo bdi = { .info_string = NULL };
 char backing_filename[1024];
 int ret = 0;
 int n;
@@ -325,6 +325,7 @@ static void coroutine_fn mirror_run(void *opaque)
 s->buf_size = MAX(s->buf_size, bdi.cluster_size);
 s->cow_bitmap = bitmap_new(length);
 }
+g_free(bdi.info_string);
 }
 
 end = s->common.len >> BDRV_SECTOR_BITS;
@@ -544,13 +545,14 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
*target,
 if (granularity == 0) {
 /* Choose the default granularity based on the target file's cluster
  * size, clamped between 4k and 64k.  */
-BlockDriverInfo bdi;
+BlockDriverInfo bdi = { .info_string = NULL };
 if (bdrv_get_info(target, &bdi) >= 0 && bdi.cluster_size != 0) {
 granularity = MAX(4096, bdi.cluster_size);
 granularity = MIN(65536, granularity);
 } else {
 granularity = 65536;
 }
+g_free(bdi.info_string);
 }
 
 assert ((granularity & (granularity - 1)) == 0);
diff --git a/block/qapi.c b/block/qapi.c
index a4bc411..b63376c 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -110,7 +110,7 @@ void bdrv_query_image_info(BlockDriverState *bs,
 uint64_t total_sectors;
 const char *backing_filename;
 char backing_filename2[1024];
-BlockDriverInfo bdi;
+BlockDriverInfo bdi = { .info_string = NULL };
 int ret;
 Error *err = NULL;
 ImageInfo *info = g_new0(ImageInfo, 1);
@@ -133,6 +133,10 @@ void bdrv_query_image_info(BlockDriverState *bs,
 }
 info->dirty_flag = bdi.is_dirty;
 info->has_dirty_flag = true;
+if (bdi.info_string) {
+info->info_string = bdi.info_string;
+info->has_info_string = true;
+}
 }
 backing_filename = bs->backing_file;
 if (backing_filename[0] != '\0') {
@@ -467,4 +471,8 @@ void bdrv_image_info_dump(fprintf_function func_fprintf, 
void *f,
 func_fprintf(f, "\n");
 }
 }
+
+if (info->has_info_string) {
+func_fprintf(f, "additional information: %s\n", info->info_string);
+}
 }
diff --git a/block/qcow2.c b/block/qcow2.c
index 4bc679a..7351ab1 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1757,6 +1757,12 @@ static int qcow2_get_info(BlockDriverState *bs, 
BlockDriverInfo *bdi)
 BDRVQcowState *s = bs->opaque;
 bdi->cluster_size = s->cluster_size;
 bdi->vm_state_offset = qcow2_vm_state_offset(s);
+if (s->qcow_version == 2) {
+bdi->info_string = g_strdup("compat=0.10");
+} else if (s->qcow_version == 3) {
+bdi->info_string = g_strdup_printf("compat=1.1,lazy_refcounts=%s",
+s->use_lazy_refcounts ? "on" : "off");
+}
 return 0;
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index e6b391c..f8335a2 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -18,6 +18,8 @@ typedef struct BlockDriverInfo {
 /* offset at which the VM state can be saved (0 if not possible) */
 int64_t vm_state_offset;
 bool is_dirty;
+/* additional information; NULL if none */
+char *info_string;
 } BlockDriverInfo;
 
 typedef struct BlockFragInfo {
diff --git a/qapi-schema.json b/qapi-schema.json
index a51f7d2..15c4f02 100644
--- a/qapi-s

[Qemu-devel] [PATCH 3/3] qemu-iotests: Additional info from qemu-img info

2013-09-05 Thread Max Reitz
Add a test for the additional information now provided by qemu-img info
when used on qcow2 images.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/064 | 72 ++
 tests/qemu-iotests/064.out | 17 +++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 90 insertions(+)
 create mode 100755 tests/qemu-iotests/064
 create mode 100644 tests/qemu-iotests/064.out

diff --git a/tests/qemu-iotests/064 b/tests/qemu-iotests/064
new file mode 100755
index 000..1ab2a64
--- /dev/null
+++ b/tests/qemu-iotests/064
@@ -0,0 +1,72 @@
+#!/bin/bash
+#
+# Test for additional information emitted by qemu-img info on qcow2
+# images
+#
+# Copyright (C) 2013 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# This tests qocw2-specific low-level functionality
+_supported_fmt qcow2
+_supported_proto generic
+_supported_os Linux
+
+IMG_SIZE=64M
+
+echo
+echo "=== Testing qcow2 image with -o compat=0.10 ==="
+echo
+IMGOPTS="compat=0.10" _make_test_img $IMG_SIZE
+# don't use _img_info, since that function will filter out the
+# info-string
+$QEMU_IMG info "$TEST_IMG" | grep "additional information"
+
+echo
+echo "=== Testing qcow2 image with -o compat=1.1,lazy_refcounts=off ==="
+echo
+IMGOPTS="compat=1.1,lazy_refcounts=off" _make_test_img $IMG_SIZE
+$QEMU_IMG info "$TEST_IMG" | grep "additional information"
+
+echo
+echo "=== Testing qcow2 image with -o compat=1.1,lazy_refcounts=on ==="
+echo
+IMGOPTS="compat=1.1,lazy_refcounts=on" _make_test_img $IMG_SIZE
+$QEMU_IMG info "$TEST_IMG" | grep "additional information"
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/064.out b/tests/qemu-iotests/064.out
new file mode 100644
index 000..fcaa670
--- /dev/null
+++ b/tests/qemu-iotests/064.out
@@ -0,0 +1,17 @@
+QA output created by 064
+
+=== Testing qcow2 image with -o compat=0.10 ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+additional information: compat=0.10
+
+=== Testing qcow2 image with -o compat=1.1,lazy_refcounts=off ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+additional information: compat=1.1,lazy_refcounts=off
+
+=== Testing qcow2 image with -o compat=1.1,lazy_refcounts=on ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+additional information: compat=1.1,lazy_refcounts=on
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index b696242..740cd84 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -66,3 +66,4 @@
 059 rw auto
 060 rw auto
 062 rw auto
+064 rw auto
-- 
1.8.3.1




[Qemu-devel] [PATCH 2/3] qemu-iotests: info-string filter in _img_info

2013-09-05 Thread Max Reitz
Filter out additional information specific to the image format provided
by qemu-img info in _img_info.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/common.rc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 5e077c3..d2dae4b 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -186,7 +186,9 @@ _img_info()
 -e "s#$TEST_DIR#TEST_DIR#g" \
 -e "s#$IMGFMT#IMGFMT#g" \
 -e "/^disk size:/ D" \
--e "/actual-size/ D"
+-e "/actual-size/ D" \
+-e "/additional information:/ D" \
+-e "/info-string/ D"
 }
 
 _get_pids_by_name()
-- 
1.8.3.1




[Qemu-devel] [PATCH 0/3] Provide additional info through qemu-img info

2013-09-05 Thread Max Reitz
qemu-img info provides only pretty general information about an image.
For any image format, there might be specific options which cannot be
represented in a universal way; for instance, qcow2 provides the
compatibility and lazy_refcount options whose values are certainly
interesting but currently cannot be output by qemu-img info.

Therefore, this series adds an info-string field to ImageInfo resp.
info_string to BlockDriverInfo which may be used by block drivers to
hand an uninterpreted string to the user. It also adds support to
qemu-img info and qemu-io -c info to dump that field's value.

Max Reitz (3):
  block: Additional info string in ImageInfo and BDI
  qemu-iotests: info-string filter in _img_info
  qemu-iotests: Additional info from qemu-img info

 block.c  |  3 +-
 block/mirror.c   |  6 ++--
 block/qapi.c | 10 +-
 block/qcow2.c|  6 
 include/block/block.h|  2 ++
 qapi-schema.json |  5 ++-
 qemu-img.c   |  3 +-
 qemu-io-cmds.c   |  7 -
 tests/qemu-iotests/064   | 72 
 tests/qemu-iotests/064.out   | 17 +++
 tests/qemu-iotests/common.rc |  4 ++-
 tests/qemu-iotests/group |  1 +
 12 files changed, 128 insertions(+), 8 deletions(-)
 create mode 100755 tests/qemu-iotests/064
 create mode 100644 tests/qemu-iotests/064.out

-- 
1.8.3.1




Re: [Qemu-devel] [RFC] qcow2 journalling draft

2013-09-05 Thread Benoît Canet
> Then you get very quickly alternating sequences of "L2 depends on
> refcount update" (for allocation) and "refcount update depends on L2
> update" (for freeing), which means that Qcow2Cache starts flushing all
> the time without accumulating many requests. These are cases that would
> benefit as well from the atomicity of journal transactions.

True, Deduplication can hit this case on delete if I remember correctly
and it slow down everything.

Best regards

Benoît



Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-05 Thread Alexey Kardashevskiy
On 09/05/2013 09:42 PM, Benjamin Herrenschmidt wrote:
> On Thu, 2013-09-05 at 19:48 +1000, Alexey Kardashevskiy wrote:
 I do not have pure guest timebase in QEMU and I need it on the destination.
 But I have host timebase + offset to calculate it. And tb_offset is already
 in ppc_tb_t. It looked logical to me to send the existing field and add
 only the missing part.
>>>
>>> I still don't understand. You want the guest visible timebase in the 
>>> migration stream, no?
>>
>>
>> Yes. I do not really understand the problem here (and I am not playing
>> dump). Do you suggest sending just the guest timebase and do not send the
>> host timebase and the offset (one number instead of two)? I can do that,
>> makes sense, no problem, thanks for the idea.
> 
> No, we want to send the guest TB and the corresponding "time of day" so that
> the target can adjust the TB based on how long the migration took.

Nobody is discussing time_of_the_day, it is there and transmitted. The only
question is whether to send host_timebase and tb_offset (2 values) or just
guest timebase (1 value).


-- 
Alexey



Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 14:04, Alexey Kardashevskiy wrote:

> On 09/05/2013 08:21 PM, Alexander Graf wrote:
>> 
>> On 05.09.2013, at 12:17, Alexey Kardashevskiy wrote:
>> 
>>> On 09/05/2013 07:27 PM, Alexander Graf wrote:
 
 On 05.09.2013, at 09:40, Alexey Kardashevskiy wrote:
 
> On 09/05/2013 05:08 PM, Alexander Graf wrote:
>> 
>> 
>> Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :
>> 
>>> On the real hardware, RTAS is called in real mode and therefore
>>> ignores top 4 bits of the address passed in the call.
>> 
>> Shouldn't we ignore the upper 4 bits for every memory access in real 
>> mode, not just that one parameter?
> 
> We probably should but I just do not see any easy way of doing this. Yet
> another "Ignore N bits on the top" memory region type? No idea.
 
 Well, it already works for code that runs inside of guest context, because 
 there the softmmu code for real mode strips the upper 4 bits.
 
 I basically see 2 ways of fixing this "correctly":
 
>>> 
 1) Don't access memory through cpu_physical_memory_rw or ldx_phys but
 instead through real mode wrappers that strip the upper 4 bits, similar
 to how we handle virtual memory differently from physical memory
>>> 
>>> But there is no a ready wrapper for this, correct? I could not find any. I
>>> would rather do this, looks nicer than 2).
>>> 
>>> 
 2) Create 15 aliases to system_memory at the upper 4 bits of address
 space. That should at the end of the day give you the same effect
>>> 
>>> Wow. Is not that too much?
>>> Ooor since I am normally making bad decisions, I should do this :)
>>> 
>>> 
 The fix as you're proposing it wouldn't work for indirect memory
 descriptors. Imagine you have an "address" parameter that gives you a
 pointer to a struct in memory that again contains a pointer. You still
 want that pointer be interpreted correctly, no?
>>> 
>>> Yes I do. I just think that having non zero bits at the top is a bug and I
>>> would not want the guest to continue sending bad addresses to the host. Or
>>> at least I want to know if it still happening.
>>> 
>>> Now we know that the only occasion of this misbehaviour is the "stop-self"
>>> call and others works just fine. If something new comes up (what is pretty
>>> unlikely, otherwise we would have noticed this issue a loong time ago AND
>>> Paul already made&posted a patch for the host to fix __pa() so it is not
>>> going to happen on new kernels either), ok, we will think of fixing this.
>>> 
>>> Doing in QEMU what the hardware does is a good thing but here I would think
>>> twice.
>> 
>> Well, the idea behind RTAS is that everything RTAS does is usually run in 
>> IR=0 DR=0 inside of guest context, so that's the view of the world we should 
>> expose.
>> 
>> Which makes me think.
>> 
> 
>> Couldn't we just set IR=0 DR=0 when getting an RTAS call and use the
>> virtual memory access functions? Those will already strip the upper 4
>> bits.
> 
> Ok. We reached the border where my ignorance starts :) Never could
> understand the concept of the guest virtual memory in QEMU.
> 
> So we clear IR/DR and call what API? This is not address_space_rw() and
> company, right?

Nono, we basically route things through the same accesses that instructions 
inside of guest context would call. Something like

  cpu_ldl_data()

for example. IIRC there is also an #ifdef that allows you to just run ldl().

It automatically uses the current virtual layout the same way that the 
instruction emulator would do it - which is pretty much what we want.

IIRC you also have to enter RTAS calls with DR=0, so we wouldn't even need to 
flip any MSR bits when emulating RTAS calls, right?


Alex




[Qemu-devel] [PATCH] vl.c: Implement SIGILL signal handler for triggering SIGSEGV

2013-09-05 Thread Michal Novotny
This is the patch to introduce SIGILL handler to be able to trigger
SIGSEGV signal in qemu. This has been written to help debugging
state when qemu crashes by SIGSEGV as a simple reproducer to
emulate such situation in case of need.

Signed-off-by: Michal Novotny 
---
 vl.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/vl.c b/vl.c
index 7e04641..3966271 100644
--- a/vl.c
+++ b/vl.c
@@ -2897,6 +2897,26 @@ static int object_create(QemuOpts *opts, void *opaque)
 return 0;
 }
 
+#ifdef CONFIG_POSIX
+static void signal_handler(int signal)
+{
+int *p = NULL;
+
+*p = 0xDEADBEEF;
+}
+
+static void setup_signal_handlers(void)
+{
+struct sigaction action;
+
+memset(&action, 0, sizeof(action));
+sigfillset(&action.sa_mask);
+action.sa_handler = signal_handler;
+action.sa_flags = 0;
+sigaction(SIGILL, &action, NULL);
+}
+#endif
+
 int main(int argc, char **argv, char **envp)
 {
 int i;
@@ -2945,6 +2965,10 @@ int main(int argc, char **argv, char **envp)
 #endif
 }
 
+#ifdef CONFIG_POSIX
+setup_signal_handlers();
+#endif
+
 module_call_init(MODULE_INIT_QOM);
 
 qemu_add_opts(&qemu_drive_opts);
-- 
1.7.11.7




Re: [Qemu-devel] Questions about hvm domU default devices with upstream qemu

2013-09-05 Thread Fabio Fantoni

Il 05/09/2013 13:23, Stefano Stabellini ha scritto:

On Wed, 4 Sep 2013, Fabio Fantoni wrote:

Il 04/09/2013 15:17, Stefano Stabellini ha scritto:

On Wed, 4 Sep 2013, Fabio Fantoni wrote:

Il 04/07/2013 15:51, Fabio Fantoni ha scritto:

Last year I posted a question about default devices of upstream qemu
that
differ from qemu traditional, like empty floppy and cdrom in particular.
About empty floppy now is disabled as workaround.
About empty cdrom Stabellini tells that is useful, so I tried to use it
but
cd-insert was failing,and I must define other empty cdrom on xl
configuration file, for example with: ',raw,hdb,ro,cdrom'.
It seem that there isn't default devices usable without define them in
xen,
and I think that probably need to revise the definition of the devices
to
avoid empty and unusable ones and to do everything as best as possible
(also
with possibly reviewing qemu side).
Another good thing is also consider pv domU case and possible future
support
of spice and/or other useful features.
Thanks for any reply.

I seen no replies about this, I think is important improve upstream qemu
support and remove the traditional ASA upstream will be equivalent or
superior
in all features.

I also think is good idea add q35 support on xen.
Seem that hvm domUs have performance limits, in particular on windows
domUs
(also with pv drivers) caused probably by old hardware emulation by qemu
and/or xen support limited on some instruction sets.
I don't know how to add essential q35 support on hvmloader to do some
tests
and have effective data about improvements.
Anyone on this?

I agree that this is important and thanks for raising the issue.
It would be nice if somebody stepped up to do the q35 work.

Thanks for reply, can you reply also on questions about default devices with
upstream qemu please?

I think it might be a good idea to keep the same set of default devices
between qemu-traditional and qemu-xen.

We always had an empty cdrom and I don't see why we should remove it. If
cd-insert doesn't work, it's a bug and should be fixed.

Supporting spice in PV guests would be hard, because spice needs QXL
that is a PCI device. PV guests don't have a PCI bus. Of course if
somebody did the work to make QXL and spice work with PV guests I would
be happy to accept it.

I agree that we should keep improving upstream QEMU, however removing
qemu-traditional is not going to be easy because it's needed for
compatibility when a guest started with qemu-traditional is
live-migrated to a new system. But we can make sure that this remains
the only use case for it.


Thanks for reply.

About qemu default empty and unusable floppy and cdrom I have already 
reply at start of year that qemu traditional does not have them.
The tests were with xl only if I remember good, now I checked also on 
old production server with xend and qemu traditional. I confirm that hvm 
domUs is without empty floppy and cdrom also in that case.
cd-insert works (tested with xl and upstream qemu). cd-insert does not 
works only with qemu default empty cdrom because is undefined on xen side.




Re: [Qemu-devel] [PATCH 1/3] block: Additional info string in ImageInfo and BDI

2013-09-05 Thread Eric Blake
On 09/05/2013 06:05 AM, Max Reitz wrote:
> Add a string for additional information to ImageInfo and
> BlockDriverInfo. Also, use this string to emit the compatibility level
> and lazy_refcount value (on compat=1.1) for qcow2.
> 
> Signed-off-by: Max Reitz 
> ---

> +++ b/qapi-schema.json
> @@ -238,6 +238,9 @@
>  #
>  # @backing-image: #optional info of the backing image (since 1.6)
>  #
> +# @info-string: #optional string supplying additional format-specific
> +# information (since 1.7)
> +#
>  # Since: 1.3
>  #
>  ##
> @@ -248,7 +251,7 @@
> '*cluster-size': 'int', '*encrypted': 'bool',
> '*backing-filename': 'str', '*full-backing-filename': 'str',
> '*backing-filename-format': 'str', '*snapshots': ['SnapshotInfo'],
> -   '*backing-image': 'ImageInfo' } }
> +   '*backing-image': 'ImageInfo', '*info-string': 'str' } }

This may work for HMP, but is LOUSY for use by QMP clients.  If you are
passing back more than a single piece of information, you are now
requiring the QMP client to do a parse of a free-form string to learn
those pieces of information.  I'd much rather see a full JSON schema
where EVERY piece of information passed back gets its own optional
field, or even where the additional information is a union type
discriminated by the image, so that we have full structure of the
information being returned rather than just an ad-hoc blobbed string.

Please rework this so that QMP clients like libvirt can easily probe
what compat mode a qcow2 image uses, without having to parse a free-form
string.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 6/8] [PATCH RFC v3] s390-qemu: cpu hotplug - s390 cpu init improvements for hotplug

2013-09-05 Thread Andreas Färber
Am 01.08.2013 16:12, schrieb Jason J. Herne:
> From: "Jason J. Herne" 
> 
> s390_new_cpu is created to encapsulate the creation of a new QOM S390CPU
> object given a cpuid and a model string.
> 
> All actual cpu initialization code is moved from boot time specific 
> functions to
> s390_cpu_initfn (qom init routine) or to s390_new_cpu. This is done to 
> allow us
> to use the same basic code path for a cpu created at boot time and one 
> created
> during a hotplug operation.

Intentionally indented?

> 
> Signed-off-by: Jason J. Herne 
> ---
>  hw/s390x/s390-virtio.c |   25 -
>  target-s390x/cpu.c |4 ++--
>  target-s390x/cpu.h |1 +
>  target-s390x/helper.c  |   12 
>  4 files changed, 27 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
> index 5ad9cf3..103f32e 100644
> --- a/hw/s390x/s390-virtio.c
> +++ b/hw/s390x/s390-virtio.c
> @@ -56,11 +56,16 @@ static S390CPU **ipi_states;
>  
>  void s390_cpu_set_ipistate(uint16_t cpu_addr, S390CPU *state)
>  {
> -ipi_states[cpu_addr] = state;
> +if (cpu_addr < max_cpus) {
> +ipi_states[cpu_addr] = state;
> +}
>  }
>  
>  S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
>  {
> +if (cpu_addr >= max_cpus) {
> +return NULL;
> +}
>  return ipi_states[cpu_addr];
>  }
>  
> @@ -197,19 +202,13 @@ void s390_init_cpus(const char *cpu_model)
>  cpu_model = "host";
>  }
>  
> -ipi_states = g_malloc(sizeof(S390CPU *) * smp_cpus);
> -
> -for (i = 0; i < smp_cpus; i++) {
> -S390CPU *cpu;
> -CPUState *cs;
> +ipi_states = g_malloc(sizeof(S390CPU *) * max_cpus);
>  
> -cpu = cpu_s390x_init(cpu_model);
> -cs = CPU(cpu);
> -
> -ipi_states[i] = cpu;
> -cs->halted = 1;
> -cpu->env.exception_index = EXCP_HLT;
> -cpu->env.storage_keys = s390_get_storage_keys();
> +for (i = 0; i < max_cpus; i++) {
> +ipi_states[i] = NULL;

Using g_malloc0() above would hopefully be more efficient and would
allow to leave the loop untouched for easier review.

> +if (i < smp_cpus) {
> +s390_new_cpu(cpu_model, i);
> +}
>  }
>  }
>  
> diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
> index 6be6c08..c90a91c 100644
> --- a/target-s390x/cpu.c
> +++ b/target-s390x/cpu.c
> @@ -116,7 +116,6 @@ static void s390_cpu_initfn(Object *obj)
>  S390CPU *cpu = S390_CPU(obj);
>  CPUS390XState *env = &cpu->env;
>  static bool inited;
> -static int cpu_num = 0;
>  #if !defined(CONFIG_USER_ONLY)
>  struct tm tm;
>  #endif
> @@ -135,8 +134,9 @@ static void s390_cpu_initfn(Object *obj)
>   * cpu counter in s390_cpu_reset to a negative number at
>   * initial ipl */
>  cs->halted = 1;
> +cpu->env.exception_index = EXCP_HLT;
> +env->storage_keys = s390_get_storage_keys();

4/8?

>  #endif
> -env->cpu_num = cpu_num++;
>  env->ext_index = -1;
>  
>  if (tcg_enabled() && !inited) {
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 62eb810..0f68dd0 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -315,6 +315,7 @@ static inline int get_ilen(uint8_t opc)
>  #endif
>  
>  S390CPU *cpu_s390x_init(const char *cpu_model);
> +S390CPU *s390_new_cpu(const char *cpu_model, int64_t cpuid);
>  void s390x_translate_init(void);
>  int cpu_s390x_exec(CPUS390XState *s);
>  
> diff --git a/target-s390x/helper.c b/target-s390x/helper.c
> index 61abfd7..a39b148 100644
> --- a/target-s390x/helper.c
> +++ b/target-s390x/helper.c
> @@ -70,6 +70,18 @@ void s390x_cpu_timer(void *opaque)
>  }
>  #endif
>  
> +S390CPU *s390_new_cpu(const char *cpu_model, int64_t cpuid)

Like I said on IRC, I'm not so fond of copying x86 workarounds here...
x86 does not have a fully QOM'ified CPU, s390x does.

> +{
> +S390CPU *cpu;
> +
> +cpu = cpu_s390x_init(cpu_model);
> +cpu->env.cpu_num = cpuid;

linux-user never calls s390_new_cpu(), so it will change behavior in
always having cpu_num of 0. I guess we can live with that but such a
change needs to be mentioned in the commit message at least.

Why is this moved to after CPU init? Can't we just override the field if
need be? Either Jens or Christian said that we would not want to fill up
holes in ipi_tables to have the CPU address be always unique; which
would mean that it would always be counting as before. if we need to
tweak it, we should add a property to be able to set it from command
line and QMP.

This affects migration btw: We would need to migrate the current or next
CPU address since the last CPU might've been hot-unplugged so that next
CPU address != last non-NULL ipi_states[] slot plus one.

> +#if !defined(CONFIG_USER_ONLY)
> +s390_cpu_set_ipistate(cpuid, cpu);
> +#endif

...leaving only this then. Why not do this from the CPU realizefn so
that errors actually can be propagated? If cpuid >= max_cpus the above
will silently 

Re: [Qemu-devel] [PATCH] Change email address

2013-09-05 Thread Peter Maydell
On 19 August 2013 14:51, Anthony Liguori  wrote:
> My IBM email address will be unaccessible after August 23rd, 2013.
>
> Signed-off-by: Anthony Liguori 

> --- a/.mailmap
> +++ b/.mailmap
> @@ -2,7 +2,7 @@
>  # into proper addresses so that they are counted properly in git shortlog 
> output.
>  #
>  Andrzej Zaborowski  balrog 
> 
> -Anthony Liguori  aliguori 
> 
> +Anthony Liguori  aliguori 
> 

I've only just noticed this, but you probably want to also add a
line mapping your IBM address to your new one:

Anthony Liguori  Anthony Liguori 

(perhaps in a new section at the bottom for "people
who've changed their addresses").

As well as making the "git shortlog -nse" stats look
better, scripts/get_maintainer.pl consults mailmap, so
it will mean people using its "grab emails from
affected commits" feature won't use your old email
by mistake.

(me, I mostly care about the stats scoreboard ;-))

thanks
-- PMM



Re: [Qemu-devel] [PATCH 7/8] [PATCH RFC v3] s390-qemu: cpu hotplug - Implement hot_add_cpu hook

2013-09-05 Thread Andreas Färber
Am 01.08.2013 16:12, schrieb Jason J. Herne:
> From: "Jason J. Herne" 
> 
> Implement hot_add_cpu for S390 to allow hot plugging of cpus.
> 
> Signed-off-by: Jason J. Herne 
> ---
>  hw/s390x/s390-virtio-ccw.c |3 +++
>  target-s390x/cpu.c |   32 
>  target-s390x/cpu.h |2 ++
>  3 files changed, 37 insertions(+)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index b469960..30b6a48 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -117,6 +117,9 @@ static QEMUMachine ccw_machine = {
>  .alias = "s390-ccw",
>  .desc = "VirtIO-ccw based S390 machine",
>  .init = ccw_init,
> +#if !defined(CONFIG_USER_ONLY)
> +.hot_add_cpu = ccw_hot_add_cpu,
> +#endif

I doubt this #ifdeffery is necessary here?

>  .block_default_type = IF_VIRTIO,
>  .no_cdrom = 1,
>  .no_floppy = 1,
> diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
> index c90a91c..60029d9 100644
> --- a/target-s390x/cpu.c
> +++ b/target-s390x/cpu.c
> @@ -27,6 +27,8 @@
>  #include "qemu-common.h"
>  #include "qemu/timer.h"
>  #include "hw/hw.h"
> +#include "hw/s390x/sclp.h"
> +#include "sysemu/sysemu.h"
>  #ifndef CONFIG_USER_ONLY
>  #include "sysemu/arch_init.h"
>  #endif
> @@ -154,6 +156,36 @@ static void s390_cpu_finalize(Object *obj)
>  #endif
>  }
>  
> +#if !defined(CONFIG_USER_ONLY)
> +void ccw_hot_add_cpu(const int64_t id, Error **errp)
> +{
> +S390CPU *new_cpu;
> +CPUState *cpu;
> +const char *model_str;
> +int cpu_count = 0;
> +
> +for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {

CPU_FOREACH(cpu) {

> +cpu_count++;
> +}
> +
> +if (cpu_count == max_cpus) {
> +error_setg(errp, "Maximum number of cpus already defined");
> +return;
> +}
> +
> +if (id != cpu_count) {
> +error_setg(errp, "Unable to add CPU: %" PRIi64
> +   ", The next available id is %d", id, cpu_count);
> +return;
> +}

This logic seems wrong according to your colleagues. It should be
checking against the static cpu_num counter or not checking at all if we
want to allow explicit device_add s390-cpu,cpu-num=42.

> +
> +model_str = s390_cpu_addr2state(0)->env.cpu_model_str;
> +new_cpu = s390_new_cpu(model_str, id);

As announced, a patch in my large series finally sent out removes
cpu_model_str field. Since we don't have any for s390x, I suggest that
you use the QOM constructs so that device_add works as well, i.e.
new_cpu = object_new(TYPE_S390_CPU).

> +object_property_set_bool(OBJECT(new_cpu), true, "realized", NULL);
> +raise_irq_cpu_hotplug();

This would mean moving this line into the realizefn, conditional on
dev->hotplugged (and probably #ifndef CONFIG_USER_ONLY).

Regards,
Andreas

> +}
> +#endif
> +
>  static const VMStateDescription vmstate_s390_cpu = {
>  .name = "cpu",
>  .unmigratable = 1,
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 0f68dd0..711aad4 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -383,6 +383,8 @@ S390CPU *s390_cpu_addr2state(uint16_t cpu_addr);
>  void s390_add_running_cpu(S390CPU *cpu);
>  unsigned s390_del_running_cpu(S390CPU *cpu);
>  
> +void ccw_hot_add_cpu(const int64_t id, Error **errp);
> +
>  /* service interrupts are floating therefore we must not pass an cpustate */
>  void s390_sclp_extint(uint32_t parm);
>  
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 13:44, Benjamin Herrenschmidt wrote:

> On Thu, 2013-09-05 at 11:58 +0200, Alexander Graf wrote:
> 
>>> Yes. I do not really understand the problem here (and I am not
>> playing
>>> dump). Do you suggest sending just the guest timebase and do not
>> send the
>>> host timebase and the offset (one number instead of two)? I can do
>> that,
>>> makes sense, no problem, thanks for the idea.
>> 
>> Yup, pretty much :). The receiving end should have no business in
>> knowing how far off the guest and the host timebase were skewed on the
>> sending end :).
> 
> Well, yes and no ... we'd like to account for the migration latency on
> the timebase or the guest view of real time will get skewed since it
> uses the TB to maintain it's clock.
> 
> So assuming both hosts are reasonably synchronized with NTP, we want a
> correlation guest TB / TOD in order to properly make the adjustment on
> the target.

Hrm, I think I'm starting to understand what this is about. So what we want is

  - timebase in guest
  - timebase frequency in guest
  - wall clock time in host

That way the receiving end can then take the timebase and add (new_timebase - 
old_timebase) * tb_freq to the guest's time base.

Which gets me to the next question. Can we modify the tb frequency in guests?


Alex




Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-05 Thread Benjamin Herrenschmidt
On Thu, 2013-09-05 at 11:58 +0200, Alexander Graf wrote:

> > Yes. I do not really understand the problem here (and I am not
> playing
> > dump). Do you suggest sending just the guest timebase and do not
> send the
> > host timebase and the offset (one number instead of two)? I can do
> that,
> > makes sense, no problem, thanks for the idea.
> 
> Yup, pretty much :). The receiving end should have no business in
> knowing how far off the guest and the host timebase were skewed on the
> sending end :).

Well, yes and no ... we'd like to account for the migration latency on
the timebase or the guest view of real time will get skewed since it
uses the TB to maintain it's clock.

So assuming both hosts are reasonably synchronized with NTP, we want a
correlation guest TB / TOD in order to properly make the adjustment on
the target.

This may not be what Alexey implemented but I think that's what Paulus
and I asked for :-)

Cheers,
Ben.





Re: [Qemu-devel] [PATCH 4/8] [PATCH RFC v3] s390-qemu: cpu hotplug - Storage key global access

2013-09-05 Thread Alexander Graf

On 01.08.2013, at 16:12, Jason J. Herne wrote:

> From: "Jason J. Herne" 
> 
> Introduces global access to storage key data so we can set it for each cpu in
> the S390 cpu initialization routine.
> 
> Signed-off-by: Jason J. Herne 
> ---
> hw/s390x/s390-virtio-ccw.c |5 ++---
> hw/s390x/s390-virtio.c |   21 -
> hw/s390x/s390-virtio.h |2 +-
> target-s390x/cpu.h |4 
> 4 files changed, 23 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index aebbbf1..b469960 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -65,7 +65,6 @@ static void ccw_init(QEMUMachineInitArgs *args)
> MemoryRegion *sysmem = get_system_memory();
> MemoryRegion *ram = g_new(MemoryRegion, 1);
> int shift = 0;
> -uint8_t *storage_keys;
> int ret;
> VirtualCssBus *css_bus;
> 
> @@ -94,10 +93,10 @@ static void ccw_init(QEMUMachineInitArgs *args)
> memory_region_add_subregion(sysmem, 0, ram);
> 
> /* allocate storage keys */
> -storage_keys = g_malloc0(my_ram_size / TARGET_PAGE_SIZE);
> +s390_alloc_storage_keys(my_ram_size);
> 
> /* init CPUs */
> -s390_init_cpus(args->cpu_model, storage_keys);
> +s390_init_cpus(args->cpu_model);
> 
> if (kvm_enabled()) {
> kvm_s390_enable_css_support(s390_cpu_addr2state(0));
> diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
> index 439d732..21e9124 100644
> --- a/hw/s390x/s390-virtio.c
> +++ b/hw/s390x/s390-virtio.c
> @@ -123,6 +123,18 @@ static void s390_virtio_register_hcalls(void)
>s390_virtio_hcall_set_status);
> }
> 
> +static uint8_t *storage_keys;
> +
> +uint8_t *s390_get_storage_keys(void)
> +{
> +return storage_keys;

This is basically the same as having a global in my point of view. I would 
personally just make it a global and call it a day.

However, it might make sense to make storage keys be a device. That way you get 
direct access for free (you can ask for the device by-path) and you have 
somewhere to hook migration off of.


Alex

> +}
> +
> +void s390_alloc_storage_keys(ram_addr_t ram_size)
> +{
> +storage_keys = g_malloc0(ram_size / TARGET_PAGE_SIZE);
> +}
> +
> /*
>  * The number of running CPUs. On s390 a shutdown is the state of all CPUs
>  * being either stopped or disabled (for interrupts) waiting. We have to
> @@ -176,7 +188,7 @@ void s390_init_ipl_dev(const char *kernel_filename,
> qdev_init_nofail(dev);
> }
> 
> -void s390_init_cpus(const char *cpu_model, uint8_t *storage_keys)
> +void s390_init_cpus(const char *cpu_model)
> {
> int i;
> 
> @@ -196,7 +208,7 @@ void s390_init_cpus(const char *cpu_model, uint8_t 
> *storage_keys)
> ipi_states[i] = cpu;
> cs->halted = 1;
> cpu->env.exception_index = EXCP_HLT;
> -cpu->env.storage_keys = storage_keys;
> +cpu->env.storage_keys = s390_get_storage_keys();
> }
> }
> 
> @@ -231,7 +243,6 @@ static void s390_init(QEMUMachineInitArgs *args)
> MemoryRegion *sysmem = get_system_memory();
> MemoryRegion *ram = g_new(MemoryRegion, 1);
> int shift = 0;
> -uint8_t *storage_keys;
> void *virtio_region;
> hwaddr virtio_region_len;
> hwaddr virtio_region_start;
> @@ -270,10 +281,10 @@ static void s390_init(QEMUMachineInitArgs *args)
>   virtio_region_len);
> 
> /* allocate storage keys */
> -storage_keys = g_malloc0(my_ram_size / TARGET_PAGE_SIZE);
> +s390_alloc_storage_keys(my_ram_size);
> 
> /* init CPUs */
> -s390_init_cpus(args->cpu_model, storage_keys);
> +s390_init_cpus(args->cpu_model);
> 
> /* Create VirtIO network adapters */
> s390_create_virtio_net((BusState *)s390_bus, "virtio-net-s390");
> diff --git a/hw/s390x/s390-virtio.h b/hw/s390x/s390-virtio.h
> index 5c405e7..c1cb042 100644
> --- a/hw/s390x/s390-virtio.h
> +++ b/hw/s390x/s390-virtio.h
> @@ -20,7 +20,7 @@
> typedef int (*s390_virtio_fn)(const uint64_t *args);
> void s390_register_virtio_hypercall(uint64_t code, s390_virtio_fn fn);
> 
> -void s390_init_cpus(const char *cpu_model, uint8_t *storage_keys);
> +void s390_init_cpus(const char *cpu_model);
> void s390_init_ipl_dev(const char *kernel_filename,
>const char *kernel_cmdline,
>const char *initrd_filename,
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 65bef86..877eac7 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -374,6 +374,10 @@ static inline void kvm_s390_interrupt_internal(S390CPU 
> *cpu, int type,
> {
> }
> #endif
> +
> +uint8_t *s390_get_storage_keys(void);
> +void s390_alloc_storage_keys(ram_addr_t ram_size);
> +
> S390CPU *s390_cpu_addr2state(uint16_t cpu_addr);
> void s390_add_running_cpu(S390CPU *cpu);
> unsigned s390_del_running_cpu(S390CPU *cpu);
> -- 
> 1.7.10.4
> 




Re: [Qemu-devel] [PATCH 5/8] [PATCH RFC v3] s390-qemu: cpu hotplug - ipi_states enhancements

2013-09-05 Thread Alexander Graf

On 01.08.2013, at 16:12, Jason J. Herne wrote:

> From: "Jason J. Herne" 
> 
> Modify s390_cpu_addr2state to allow fetching state information for cpu 
> addresses
> above smp_cpus.  Hotplug requires this capability.
> 
> Also add s390_cpu_set_state function to allow modification of ipi_state 
> entries
> during hotplug.
> 
> Signed-off-by: Jason J. Herne 
> ---
> hw/s390x/s390-virtio.c |9 +
> target-s390x/cpu.h |2 +-
> 2 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
> index 21e9124..5ad9cf3 100644
> --- a/hw/s390x/s390-virtio.c
> +++ b/hw/s390x/s390-virtio.c
> @@ -54,12 +54,13 @@
> static VirtIOS390Bus *s390_bus;
> static S390CPU **ipi_states;
> 
> -S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
> +void s390_cpu_set_ipistate(uint16_t cpu_addr, S390CPU *state)
> {
> -if (cpu_addr >= smp_cpus) {
> -return NULL;
> -}
> +ipi_states[cpu_addr] = state;

We should still ensure that we don't access anything beyond the size of the 
array, no?


Alex

> +}
> 
> +S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
> +{
> return ipi_states[cpu_addr];
> }
> 
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 877eac7..62eb810 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -377,7 +377,7 @@ static inline void kvm_s390_interrupt_internal(S390CPU 
> *cpu, int type,
> 
> uint8_t *s390_get_storage_keys(void);
> void s390_alloc_storage_keys(ram_addr_t ram_size);
> -
> +void s390_cpu_set_ipistate(uint16_t cpu_addr, S390CPU *state);
> S390CPU *s390_cpu_addr2state(uint16_t cpu_addr);
> void s390_add_running_cpu(S390CPU *cpu);
> unsigned s390_del_running_cpu(S390CPU *cpu);
> -- 
> 1.7.10.4
> 




Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexey Kardashevskiy
On 09/05/2013 10:16 PM, Alexander Graf wrote:
> 
> On 05.09.2013, at 14:04, Alexey Kardashevskiy wrote:
> 
>> On 09/05/2013 08:21 PM, Alexander Graf wrote:
>>>
>>> On 05.09.2013, at 12:17, Alexey Kardashevskiy wrote:
>>>
 On 09/05/2013 07:27 PM, Alexander Graf wrote:
>
> On 05.09.2013, at 09:40, Alexey Kardashevskiy wrote:
>
>> On 09/05/2013 05:08 PM, Alexander Graf wrote:
>>>
>>>
>>> Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :
>>>
 On the real hardware, RTAS is called in real mode and therefore
 ignores top 4 bits of the address passed in the call.
>>>
>>> Shouldn't we ignore the upper 4 bits for every memory access in real 
>>> mode, not just that one parameter?
>>
>> We probably should but I just do not see any easy way of doing this. Yet
>> another "Ignore N bits on the top" memory region type? No idea.
>
> Well, it already works for code that runs inside of guest context, 
> because there the softmmu code for real mode strips the upper 4 bits.
>
> I basically see 2 ways of fixing this "correctly":
>

> 1) Don't access memory through cpu_physical_memory_rw or ldx_phys but
> instead through real mode wrappers that strip the upper 4 bits, similar
> to how we handle virtual memory differently from physical memory

 But there is no a ready wrapper for this, correct? I could not find any. I
 would rather do this, looks nicer than 2).


> 2) Create 15 aliases to system_memory at the upper 4 bits of address
> space. That should at the end of the day give you the same effect

 Wow. Is not that too much?
 Ooor since I am normally making bad decisions, I should do this :)


> The fix as you're proposing it wouldn't work for indirect memory
> descriptors. Imagine you have an "address" parameter that gives you a
> pointer to a struct in memory that again contains a pointer. You still
> want that pointer be interpreted correctly, no?

 Yes I do. I just think that having non zero bits at the top is a bug and I
 would not want the guest to continue sending bad addresses to the host. Or
 at least I want to know if it still happening.

 Now we know that the only occasion of this misbehaviour is the "stop-self"
 call and others works just fine. If something new comes up (what is pretty
 unlikely, otherwise we would have noticed this issue a loong time ago AND
 Paul already made&posted a patch for the host to fix __pa() so it is not
 going to happen on new kernels either), ok, we will think of fixing this.

 Doing in QEMU what the hardware does is a good thing but here I would think
 twice.
>>>
>>> Well, the idea behind RTAS is that everything RTAS does is usually run in 
>>> IR=0 DR=0 inside of guest context, so that's the view of the world we 
>>> should expose.
>>>
>>> Which makes me think.
>>>
>>
>>> Couldn't we just set IR=0 DR=0 when getting an RTAS call and use the
>>> virtual memory access functions? Those will already strip the upper 4
>>> bits.
>>
>> Ok. We reached the border where my ignorance starts :) Never could
>> understand the concept of the guest virtual memory in QEMU.
>>
>> So we clear IR/DR and call what API? This is not address_space_rw() and
>> company, right?
> 
> Nono, we basically route things through the same accesses that instructions 
> inside of guest context would call. Something like
> 
>   cpu_ldl_data()
> 
> for example. IIRC there is also an #ifdef that allows you to just run ldl().

cpu_ldl_data() is defined for CONFIG_USER_ONLY. But ok, it is defined
simply as ldl_p():

#define cpu_ldl_data(env, addr) ldl_raw(addr)
#define g2h(x) ((void *)((unsigned long)(target_ulong)(x) + GUEST_BASE))
#define laddr(x) g2h(x)
#define ldl_raw(p) ldl_p(laddr((p)))

static inline int ldl_p(const void *ptr)
{
int32_t r;
memcpy(&r, ptr, sizeof(r));
return r;
}

So it tries accessing memory @ptr (which is the guest physical) and -
crashes :) So I need an address converter which is not there.

What do I miss? Thanks.


> It automatically uses the current virtual layout the same way that the 
> instruction emulator would do it - which is pretty much what we want.
> 

> IIRC you also have to enter RTAS calls with DR=0, so we wouldn't even
> need to flip any MSR bits when emulating RTAS calls, right?

Probably. Right now cpu->env.msr==0x0 in rtas handler but not sure that I
see the real value.


-- 
Alexey



Re: [Qemu-devel] [PATCH 6/8] [PATCH RFC v3] s390-qemu: cpu hotplug - s390 cpu init improvements for hotplug

2013-09-05 Thread Alexander Graf

On 01.08.2013, at 16:12, Jason J. Herne wrote:

> From: "Jason J. Herne" 
> 
>s390_new_cpu is created to encapsulate the creation of a new QOM S390CPU
>object given a cpuid and a model string.
> 
>All actual cpu initialization code is moved from boot time specific 
> functions to
>s390_cpu_initfn (qom init routine) or to s390_new_cpu. This is done to 
> allow us
>to use the same basic code path for a cpu created at boot time and one 
> created
>during a hotplug operation.
> 
> Signed-off-by: Jason J. Herne 
> ---
> hw/s390x/s390-virtio.c |   25 -
> target-s390x/cpu.c |4 ++--
> target-s390x/cpu.h |1 +
> target-s390x/helper.c  |   12 
> 4 files changed, 27 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
> index 5ad9cf3..103f32e 100644
> --- a/hw/s390x/s390-virtio.c
> +++ b/hw/s390x/s390-virtio.c
> @@ -56,11 +56,16 @@ static S390CPU **ipi_states;
> 
> void s390_cpu_set_ipistate(uint16_t cpu_addr, S390CPU *state)
> {
> -ipi_states[cpu_addr] = state;
> +if (cpu_addr < max_cpus) {

Ah, here you add the checks back in. Works for me.

> +ipi_states[cpu_addr] = state;
> +}
> }
> 
> S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
> {
> +if (cpu_addr >= max_cpus) {
> +return NULL;
> +}
> return ipi_states[cpu_addr];
> }
> 
> @@ -197,19 +202,13 @@ void s390_init_cpus(const char *cpu_model)
> cpu_model = "host";
> }
> 
> -ipi_states = g_malloc(sizeof(S390CPU *) * smp_cpus);
> -
> -for (i = 0; i < smp_cpus; i++) {
> -S390CPU *cpu;
> -CPUState *cs;
> +ipi_states = g_malloc(sizeof(S390CPU *) * max_cpus);

g_new is easier :).


Alex




Re: [Qemu-devel] [PATCH 1/3] block: Additional info string in ImageInfo and BDI

2013-09-05 Thread Max Reitz

On 2013-09-05 14:25, Eric Blake wrote:

On 09/05/2013 06:05 AM, Max Reitz wrote:

Add a string for additional information to ImageInfo and
BlockDriverInfo. Also, use this string to emit the compatibility level
and lazy_refcount value (on compat=1.1) for qcow2.

Signed-off-by: Max Reitz 
---
+++ b/qapi-schema.json
@@ -238,6 +238,9 @@
  #
  # @backing-image: #optional info of the backing image (since 1.6)
  #
+# @info-string: #optional string supplying additional format-specific
+# information (since 1.7)
+#
  # Since: 1.3
  #
  ##
@@ -248,7 +251,7 @@
 '*cluster-size': 'int', '*encrypted': 'bool',
 '*backing-filename': 'str', '*full-backing-filename': 'str',
 '*backing-filename-format': 'str', '*snapshots': ['SnapshotInfo'],
-   '*backing-image': 'ImageInfo' } }
+   '*backing-image': 'ImageInfo', '*info-string': 'str' } }

This may work for HMP, but is LOUSY for use by QMP clients.  If you are
passing back more than a single piece of information, you are now
requiring the QMP client to do a parse of a free-form string to learn
those pieces of information.  I'd much rather see a full JSON schema
where EVERY piece of information passed back gets its own optional
field, or even where the additional information is a union type
discriminated by the image, so that we have full structure of the
information being returned rather than just an ad-hoc blobbed string.

Please rework this so that QMP clients like libvirt can easily probe
what compat mode a qcow2 image uses, without having to parse a free-form
string.


Seems very reasonable; I'll do my best.

Max



Re: [Qemu-devel] [PATCH 0/8] [PATCH RFC v3] s390 cpu hotplug

2013-09-05 Thread Alexander Graf

On 01.08.2013, at 16:12, Jason J. Herne wrote:

> From: "Jason J. Herne" 
> 
> Latest code for cpu Hotplug on S390 architecture.   This one is vastly simpler
> than v2 as we have decided to avoid the command line specification 
> of -device s390-cpu.
> 
> The last version can be found here:
> http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01183.html
> 
> There is also a patch in this series to add cpu-add to the Qemu monitor
> interface.
> 
> Hotplugged cpus are created in the configured state and can be used by the
> guest after the guest onlines the cpu by: 
> "echo 1 > /sys/bus/cpu/devices/cpuN/online"
> 
> Hot unplugging is currently not implemented by this code. 

Very simple and clean patch set. I don't think it deserves the RFC tag.

Apart from the minor comments I had consider it

Reviewed-by: Alexander Graf 


Alex




Re: [Qemu-devel] [Qemu-ppc] [PATCH 16/16] target-ppc: Convert to new ldst opcodes

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 13:40, Benjamin Herrenschmidt wrote:

> On Thu, 2013-09-05 at 11:08 +0200, Alexander Graf wrote:
>> On 04.09.2013, at 23:05, Richard Henderson wrote:
>> 
>>> This lets us change "le_mode" to "end_mode" and fold away nearly all
>>> of the tests for the current cpu endianness, and removing all of the
>>> explicitly generated bswap opcodes.
> 
> Only nit: I find "end_mode" a very confusing identifier :-) "end"
> usually means something else ! Why not endian_mode ?
>>> 
>>> Cc: qemu-...@nongnu.org
>>> Signed-off-by: Richard Henderson 
>> 
>> No complaints from me, apart from the usual "LE mode isn't necessarily what 
>> you think it is on PPC" one. But the code would be as broken as before IIUC.
>> 
>> Ben, you had some insight in how LE mode on different PPC flavors work. 
>> Could you please make sure we're not walking into the wrong direction here?
> 
> I haven't seen the patch itself for some reason (and I'm about to go off
> for a few days). The early day powerpc endian mode can be safely ignored
> I think, I don't even remember the details myself, I think it induced
> some changes to the byte lanes ordering on the bus and thus required the
> host bridge to be adjusted.
> 
> The embedded PPCs have simply a per-page E bit in the TLB controlling
> the endianness of accesses through the translation, the endianness is
> "clean" in that case, and the bus doesn't flip around so it's akin to
> what P7 does but with a finer granularity.

So on P7 basically everything that goes from registers out is byte-swapped, 
including any RAM access and MMIOs? I think that's basically what the current 
little endian mode implements (though it might miss a few places, like FPU or 
Altivec, but I'd consider that bugs).


Alex




Re: [Qemu-devel] Questions about hvm domU default devices with upstream qemu

2013-09-05 Thread Stefano Stabellini
On Thu, 5 Sep 2013, Fabio Fantoni wrote:
> Il 05/09/2013 13:23, Stefano Stabellini ha scritto:
> > On Wed, 4 Sep 2013, Fabio Fantoni wrote:
> > > Il 04/09/2013 15:17, Stefano Stabellini ha scritto:
> > > > On Wed, 4 Sep 2013, Fabio Fantoni wrote:
> > > > > Il 04/07/2013 15:51, Fabio Fantoni ha scritto:
> > > > > > Last year I posted a question about default devices of upstream qemu
> > > > > > that
> > > > > > differ from qemu traditional, like empty floppy and cdrom in
> > > > > > particular.
> > > > > > About empty floppy now is disabled as workaround.
> > > > > > About empty cdrom Stabellini tells that is useful, so I tried to use
> > > > > > it
> > > > > > but
> > > > > > cd-insert was failing,and I must define other empty cdrom on xl
> > > > > > configuration file, for example with: ',raw,hdb,ro,cdrom'.
> > > > > > It seem that there isn't default devices usable without define them
> > > > > > in
> > > > > > xen,
> > > > > > and I think that probably need to revise the definition of the
> > > > > > devices
> > > > > > to
> > > > > > avoid empty and unusable ones and to do everything as best as
> > > > > > possible
> > > > > > (also
> > > > > > with possibly reviewing qemu side).
> > > > > > Another good thing is also consider pv domU case and possible future
> > > > > > support
> > > > > > of spice and/or other useful features.
> > > > > > Thanks for any reply.
> > > > > I seen no replies about this, I think is important improve upstream
> > > > > qemu
> > > > > support and remove the traditional ASA upstream will be equivalent or
> > > > > superior
> > > > > in all features.
> > > > > 
> > > > > I also think is good idea add q35 support on xen.
> > > > > Seem that hvm domUs have performance limits, in particular on windows
> > > > > domUs
> > > > > (also with pv drivers) caused probably by old hardware emulation by
> > > > > qemu
> > > > > and/or xen support limited on some instruction sets.
> > > > > I don't know how to add essential q35 support on hvmloader to do some
> > > > > tests
> > > > > and have effective data about improvements.
> > > > > Anyone on this?
> > > > I agree that this is important and thanks for raising the issue.
> > > > It would be nice if somebody stepped up to do the q35 work.
> > > Thanks for reply, can you reply also on questions about default devices
> > > with
> > > upstream qemu please?
> > I think it might be a good idea to keep the same set of default devices
> > between qemu-traditional and qemu-xen.
> > 
> > We always had an empty cdrom and I don't see why we should remove it. If
> > cd-insert doesn't work, it's a bug and should be fixed.
> > 
> > Supporting spice in PV guests would be hard, because spice needs QXL
> > that is a PCI device. PV guests don't have a PCI bus. Of course if
> > somebody did the work to make QXL and spice work with PV guests I would
> > be happy to accept it.
> > 
> > I agree that we should keep improving upstream QEMU, however removing
> > qemu-traditional is not going to be easy because it's needed for
> > compatibility when a guest started with qemu-traditional is
> > live-migrated to a new system. But we can make sure that this remains
> > the only use case for it.
> 
> Thanks for reply.
> 
> About qemu default empty and unusable floppy and cdrom I have already reply at
> start of year that qemu traditional does not have them.
> The tests were with xl only if I remember good, now I checked also on old
> production server with xend and qemu traditional. I confirm that hvm domUs is
> without empty floppy and cdrom also in that case.
>
> cd-insert works (tested with xl and upstream qemu). cd-insert does not works
> only with qemu default empty cdrom because is undefined on xen side.

So HVM domUs do NOT have a cdrom drive by default with qemu-traditional.

Do they have an empty cdrom drive by default with qemu-xen?

Is the lack of a cdrom drive the reason why cd-insert doesn't work by
default (unless you specify an empty cdrom drive in the VM config file)?



Re: [Qemu-devel] [RFC PATCH] spapr: add initial ibm, client-architecture-support rtas call support

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 13:55, Paul Mackerras wrote:

> On Thu, Sep 05, 2013 at 12:19:09PM +0200, Alexander Graf wrote:
>> 
>> On 05.09.2013, at 12:16, Paul Mackerras wrote:
>> 
>>> On Wed, Sep 04, 2013 at 04:32:20PM -0500, Anthony Liguori wrote:
 On Wed, Sep 4, 2013 at 8:37 AM, Alexander Graf  wrote:
> 
>> 
>>> So IMHO this whole thing should be orthogonal to -cpu.
>> 
>> Well, since we cannot change CPU class on the fly, yes, it should be a
>> "compatibility" flags/properties/methods/whatever of the default CPU for
>> the specific family.
> 
> Since it's machine global, it could just as well be a machine option, no? 
> Or can you have multiple CPUs with different compat modes in a single 
> system?
 
 AFAIK, this has nothing to do with CPUs.
>>> 
>>> I'm not sure what you mean by that; it has to do with CPUs since it
>>> means changing the CPUs' behaviour, at least for user-mode programs.
>>> 
>>> Why? Just because you're on POWER7 as default doesn't mean you can't 
>>> bump to a newer compat later on, no?
>> 
>> Bump when exactly? And it won't be a new compat, it will be a native 
>> CPU. I
> 
> If you configure your guest to boot in POWER7-compat mode on your POWER8 
> box and it then tells you through ibm,client-architecture-support that it 
> can do POWER8, we can just remove all the compat bits and be happy, no?
>>> 
>>> Answering Alex here -- if we want to preserve the option of migrating
>>> to a POWER7 host in future, we would run the guest in POWER7 compat
>>> mode even if the current host is a POWER8 and the guest knows about
>>> POWER8.
>> 
>> Yes, so we boot the guest with compat mode set to POWER7, then the guest 
>> calls ibm,client-architecutre-support including POWER8 and then we can 
>> reconfigure the guest to be POWER8, right?
> 
> Not if we want to be able to migrate to a real POWER7 later.  If we
> tell the guest it's a POWER8, it will start using POWER8 features, and
> then break when we migrate it to a POWER7 where those features don't
> exist.  If we run the POWER8 in POWER7 compatibility mode (and more
> importantly, the device tree says it's a 2.06 architecture processor),
> it should only use POWER7 features and then work just fine when
> migrated to a real POWER7.

Ah, ok. So we want to be able to specify 2 things:

  - current compat level
  - max compat level

That way we can boot a POWER8 virtual machine in POWER7 compat mode straight on 
boot, but allow that machine to later reconfigure itself if it isn't configured 
to POWER7 as max compat level.

The current level is mostly a convenience thing to not have to reboot SLOF 
every time you boot up a guest. The max level is what gives you compatibility 
throughout a cluster of machines.

Or do I miss anything here?


Alex




Re: [Qemu-devel] [PATCH 0/8] [PATCH RFC v3] s390 cpu hotplug

2013-09-05 Thread Andreas Färber
Am 05.09.2013 14:54, schrieb Alexander Graf:
> 
> On 01.08.2013, at 16:12, Jason J. Herne wrote:
> 
>> From: "Jason J. Herne" 
>>
>> Latest code for cpu Hotplug on S390 architecture.   This one is vastly 
>> simpler
>> than v2 as we have decided to avoid the command line specification 
>> of -device s390-cpu.
>>
>> The last version can be found here:
>> http://lists.gnu.org/archive/html/qemu-devel/2013-06/msg01183.html
>>
>> There is also a patch in this series to add cpu-add to the Qemu monitor
>> interface.
>>
>> Hotplugged cpus are created in the configured state and can be used by the
>> guest after the guest onlines the cpu by: 
>> "echo 1 > /sys/bus/cpu/devices/cpuN/online"
>>
>> Hot unplugging is currently not implemented by this code. 
> 
> Very simple and clean patch set. I don't think it deserves the RFC tag.

Negative, see my review. If you want to fix up and queue patches 1-2
that's fine with me, but the others need a respin. No major blocker
though, just some more footwork mostly related to QOM and Jason's
shifted focus on cpu-add rather than device_add.

Open issues:
* Might ipi_states need to become a device due to migration?
* QOM properties considerations
* Device creation in qdev initfn
* Parent field access
* QOM-unfriendly creation and reliance upon helper function

Andreas

> 
> Apart from the minor comments I had consider it
> 
> Reviewed-by: Alexander Graf 
> 
> 
> Alex
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



[Qemu-devel] [PATCH uq/master 1/2] x86: fix migration from pre-version 12

2013-09-05 Thread Paolo Bonzini
On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
and not restore anything.

Since FP and SSE data are always valid, set them in xstate_bv at reset
time.  In fact, that value is the same that KVM_GET_XSAVE returns on
pre-XSAVE hosts.

Signed-off-by: Paolo Bonzini 
---
 target-i386/cpu.c | 1 +
 target-i386/cpu.h | 5 +
 2 files changed, 6 insertions(+)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index c36345e..ac83106 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2386,6 +2386,7 @@ static void x86_cpu_reset(CPUState *s)
 env->fpuc = 0x37f;
 
 env->mxcsr = 0x1f80;
+env->xstate_bv = XSTATE_FP | XSTATE_SSE;
 
 env->pat = 0x0007040600070406ULL;
 env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 5723eff..a153078 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -380,6 +380,11 @@
 
 #define MSR_VM_HSAVE_PA 0xc0010117
 
+#define XSTATE_SUPPORTED   (XSTATE_FP|XSTATE_SSE|XSTATE_YMM)
+#define XSTATE_FP  1
+#define XSTATE_SSE 2
+#define XSTATE_YMM 4
+
 /* CPUID feature words */
 typedef enum FeatureWord {
 FEAT_1_EDX, /* CPUID[1].EDX */
-- 
1.8.3.1





[Qemu-devel] [PATCH uq/master 0/2] KVM: issues with XSAVE support

2013-09-05 Thread Paolo Bonzini
This series fixes two migration bugs concerning KVM's XSAVE ioctls,
both found by code inspection (the second in fact is just theoretical
until AVX512 or MPX support is added to KVM).

Please review.

Paolo Bonzini (2):
  x86: fix migration from pre-version 12
  KVM: make XSAVE support more robust

 target-i386/cpu.c | 1 +
 target-i386/cpu.h | 5 +
 target-i386/kvm.c | 3 ++-
 target-i386/machine.c | 4 
 4 files changed, 12 insertions(+), 1 deletion(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH uq/master 2/2] KVM: make XSAVE support more robust

2013-09-05 Thread Paolo Bonzini
QEMU moves state from CPUArchState to struct kvm_xsave and back when it
invokes the KVM_*_XSAVE ioctls.  Because it doesn't treat the XSAVE
region as an opaque blob, it might be impossible to set some state on
the destination if migrating to an older version.

This patch blocks migration if it finds that unsupported bits are set
in the XSTATE_BV header field.  To make this work robustly, QEMU should
only report in env->xstate_bv those fields that will actually end up
in the migration stream.

Signed-off-by: Paolo Bonzini 
---
 target-i386/kvm.c | 3 ++-
 target-i386/machine.c | 4 
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 749aa09..df08a4b 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1291,7 +1291,8 @@ static int kvm_get_xsave(X86CPU *cpu)
 sizeof env->fpregs);
 memcpy(env->xmm_regs, &xsave->region[XSAVE_XMM_SPACE],
 sizeof env->xmm_regs);
-env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
+env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] &
+XSTATE_SUPPORTED;
 memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
 sizeof env->ymmh_regs);
 return 0;
diff --git a/target-i386/machine.c b/target-i386/machine.c
index dc81cde..9e2cfcf 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -278,6 +278,10 @@ static int cpu_post_load(void *opaque, int version_id)
 CPUX86State *env = &cpu->env;
 int i;
 
+if (env->xstate_bv & ~XSTATE_SUPPORTED) {
+return -EINVAL;
+}
+ 
 /*
  * Real mode guest segments register DPL should be zero.
  * Older KVM version were setting it wrongly.
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] spapr-rtas: reset top 4 bits in parameters address

2013-09-05 Thread Alexander Graf

On 05.09.2013, at 14:49, Alexey Kardashevskiy wrote:

> On 09/05/2013 10:16 PM, Alexander Graf wrote:
>> 
>> On 05.09.2013, at 14:04, Alexey Kardashevskiy wrote:
>> 
>>> On 09/05/2013 08:21 PM, Alexander Graf wrote:
 
 On 05.09.2013, at 12:17, Alexey Kardashevskiy wrote:
 
> On 09/05/2013 07:27 PM, Alexander Graf wrote:
>> 
>> On 05.09.2013, at 09:40, Alexey Kardashevskiy wrote:
>> 
>>> On 09/05/2013 05:08 PM, Alexander Graf wrote:
 
 
 Am 05.09.2013 um 07:58 schrieb Alexey Kardashevskiy :
 
> On the real hardware, RTAS is called in real mode and therefore
> ignores top 4 bits of the address passed in the call.
 
 Shouldn't we ignore the upper 4 bits for every memory access in real 
 mode, not just that one parameter?
>>> 
>>> We probably should but I just do not see any easy way of doing this. Yet
>>> another "Ignore N bits on the top" memory region type? No idea.
>> 
>> Well, it already works for code that runs inside of guest context, 
>> because there the softmmu code for real mode strips the upper 4 bits.
>> 
>> I basically see 2 ways of fixing this "correctly":
>> 
> 
>> 1) Don't access memory through cpu_physical_memory_rw or ldx_phys but
>> instead through real mode wrappers that strip the upper 4 bits, similar
>> to how we handle virtual memory differently from physical memory
> 
> But there is no a ready wrapper for this, correct? I could not find any. I
> would rather do this, looks nicer than 2).
> 
> 
>> 2) Create 15 aliases to system_memory at the upper 4 bits of address
>> space. That should at the end of the day give you the same effect
> 
> Wow. Is not that too much?
> Ooor since I am normally making bad decisions, I should do this :)
> 
> 
>> The fix as you're proposing it wouldn't work for indirect memory
>> descriptors. Imagine you have an "address" parameter that gives you a
>> pointer to a struct in memory that again contains a pointer. You still
>> want that pointer be interpreted correctly, no?
> 
> Yes I do. I just think that having non zero bits at the top is a bug and I
> would not want the guest to continue sending bad addresses to the host. Or
> at least I want to know if it still happening.
> 
> Now we know that the only occasion of this misbehaviour is the "stop-self"
> call and others works just fine. If something new comes up (what is pretty
> unlikely, otherwise we would have noticed this issue a loong time ago AND
> Paul already made&posted a patch for the host to fix __pa() so it is not
> going to happen on new kernels either), ok, we will think of fixing this.
> 
> Doing in QEMU what the hardware does is a good thing but here I would 
> think
> twice.
 
 Well, the idea behind RTAS is that everything RTAS does is usually run in 
 IR=0 DR=0 inside of guest context, so that's the view of the world we 
 should expose.
 
 Which makes me think.
 
>>> 
 Couldn't we just set IR=0 DR=0 when getting an RTAS call and use the
 virtual memory access functions? Those will already strip the upper 4
 bits.
>>> 
>>> Ok. We reached the border where my ignorance starts :) Never could
>>> understand the concept of the guest virtual memory in QEMU.
>>> 
>>> So we clear IR/DR and call what API? This is not address_space_rw() and
>>> company, right?
>> 
>> Nono, we basically route things through the same accesses that instructions 
>> inside of guest context would call. Something like
>> 
>>  cpu_ldl_data()
>> 
>> for example. IIRC there is also an #ifdef that allows you to just run ldl().
> 
> cpu_ldl_data() is defined for CONFIG_USER_ONLY. But ok, it is defined
> simply as ldl_p():
> 
> #define cpu_ldl_data(env, addr) ldl_raw(addr)
> #define g2h(x) ((void *)((unsigned long)(target_ulong)(x) + GUEST_BASE))
> #define laddr(x) g2h(x)
> #define ldl_raw(p) ldl_p(laddr((p)))
> 
> static inline int ldl_p(const void *ptr)
> {
>int32_t r;
>memcpy(&r, ptr, sizeof(r));
>return r;
> }
> 
> So it tries accessing memory @ptr (which is the guest physical) and -
> crashes :) So I need an address converter which is not there.
> 
> What do I miss? Thanks.

It should be defined through a bunch of macros and incomprehensible #include's 
and glue()'s for softmmu too. Just try and see if it works for you.

> 
> 
>> It automatically uses the current virtual layout the same way that the 
>> instruction emulator would do it - which is pretty much what we want.
>> 
> 
>> IIRC you also have to enter RTAS calls with DR=0, so we wouldn't even
>> need to flip any MSR bits when emulating RTAS calls, right?
> 
> Probably. Right now cpu->env.msr==0x0 in rtas handler but not sure that I
> see the real value.

Make sure you run cpu_synchronize_state() before you look at cpu->env.msr.


Alex




  1   2   3   >